Open learning- Text analysis basics

Up2Universe
Open Learning [0]
Text analysis Basics
[Innovation and Sustainability]
2019_04_11
Text Wandering, exchanging experiences: crowdsourcing-based
interactive awareness.
Stefano Lariccia (Sapienza Università di Roma) stefano.lariccia@uniroma1.it
Fernando Martínez de Carnero Calzada (Sapienza University) fernando.martinez@uniroma1.it
UROMA
28/02/2019 Territoires Innovants: Essaouira 1
Creating a new protocol framework for the collection of data and
direct personal experiences.
Authors:
Stefano Lariccia (Sapienza University) stefano.lariccia@uniroma1.it
Fernando Martínez de Carnero Calzada (Sapienza University) fernando.martinez@uniroma1.it
Giovanni Toffoli (Link R&D) toffoli@linkroma.it
Computing with Python: what is Python?
Computing with Python: why Python?
Computing with Python: what can you do with Python?
Computing with Python: how to proceed with Python?
● >>> from nltk.book import *
*** Introductory Examples for the NLTK Book ***
Loading text1, ..., text9 and sent1, ..., sent9
Type the name of the text or sentence to view it.
Type: 'texts()' or 'sents()' to list the materials.
text1: Moby Dick by Herman Melville 1851
text2: Sense and Sensibility by Jane Austen 1811
text3: The Book of Genesis
text4: Inaugural Address Corpus
text5: Chat Corpus
text6: Monty Python and the Holy Grail
text7: Wall Street Journal
text8: Personals Corpus
text9: The Man Who Was Thursday by G . K . Chesterton 1908
>>>
Computing with Language: Texts and Words
● Any time we want to find out about these texts, we just have to enter their names at the Python
prompt:
○ >>> text1
○ >>> text2
○ >>>
● Now that we can use the Python interpreter, and have some data to work with, we’re ready to get
started.
Computing with Language: Texts and Words
● Searching Text There are many ways to examine the context of a text apart from simply reading it.
A concordance view shows us every occurrence of a given word, together with some context.
● Here we look up the word monstrous in Moby Dick by entering text1 followed by a period, then the
term concordance, and then placing "monstrous" in parentheses:
>>> text1.concordance("monstrous")
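To make the idea of a concordance view concrete, here is a minimal pure-Python sketch. It is not NLTK's implementation: the function name concordance, the width parameter, and the sample sentence are all assumptions made for illustration. It lists every occurrence of a word together with a few tokens of context on each side.

```python
def concordance(tokens, word, width=3):
    """Return each occurrence of `word` with up to `width` tokens of context per side."""
    target = word.lower()
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() == target:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            # Bracket the matched word so it stands out in the line.
            lines.append(f"{left} [{token}] {right}".strip())
    return lines


# Hypothetical sample sentence, split on whitespace (a naive tokenizer).
sample = "the monstrous whale rose and a most monstrous size it was".split()
hits = concordance(sample, "monstrous")
for line in hits:
    print(line)
```

With NLTK and its book data installed, the call described on the slide, text1.concordance("monstrous"), gives the same kind of view over the full text of Moby Dick.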
Computing with Language: Texts and Words
● Let’s begin by finding out the length of a text from start to finish, in terms of the words and
punctuation symbols that appear. We use the term len to get the length of something, which we’ll
apply here to the book of Genesis:
>>> len(text3)
44764
>>>
● So Genesis has 44,764 words and punctuation symbols, or “tokens.”
● A token is the technical name for a sequence of characters—such as hairy, his, or :)—that we
want to treat as a group.
● When we count the number of tokens in a text, say, the phrase to be or not to be, we are counting
occurrences of these sequences. Thus, in our example phrase there are two occurrences of to,
two of be, and one each of or and not.
● But there are only four distinct vocabulary items in this phrase. How many distinct words does the
book of Genesis contain?
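The difference between tokens and distinct vocabulary items can be checked on the example phrase itself. The sketch below assumes a naive whitespace split as the tokenizer (NLTK's texts are already tokenized); the variable names are illustrative only.

```python
from collections import Counter

phrase = "to be or not to be"
tokens = phrase.split()      # every occurrence counts as one token
counts = Counter(tokens)     # occurrences of each distinct token
vocabulary = set(tokens)     # distinct vocabulary items

print(len(tokens))           # number of tokens
print(dict(counts))          # occurrences per item
print(len(vocabulary))       # number of distinct items
```

The same pattern, len(set(text3)), answers the question about Genesis once the NLTK texts are loaded.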
Computing with Language: Counting Vocabulary
● Now, let’s calculate a measure of the lexical richness of the text. The next example shows us that
each word is used 16 times on average (we need to make sure Python uses floating-point
division):
>>> from __future__ import division
>>> len(text3) / len(set(text3))
16.050197203298673
>>>
● The __future__ module makes features of later Python versions available in the current one. The import is only needed in Python 2; in Python 3 the / operator already performs floating-point division.
● Now let’s calculate the variance of each of these texts:
○ Text1
○ Text2
○ Text3
○ Text4
○ Text5
○ Text6
○ Text7
○ Text8
○ Text9
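The exercise above can be sketched as a small helper applied to several texts in turn. This reads the slide's "variance" as the same richness measure (tokens divided by distinct tokens), which is an assumption; the function name lexical_richness and the two sample texts are hypothetical stand-ins for text1 to text9.

```python
def lexical_richness(tokens):
    """Average number of times each distinct token is used (tokens / distinct tokens)."""
    if not tokens:
        raise ValueError("cannot measure an empty text")
    return len(tokens) / len(set(tokens))


# Hypothetical stand-ins; with NLTK loaded these would be text1 ... text9.
texts = {
    "sample_a": "to be or not to be".split(),
    "sample_b": "the cat sat on the mat".split(),
}

scores = {name: lexical_richness(tokens) for name, tokens in texts.items()}
for name, score in scores.items():
    print(name, round(score, 3))
```

A higher score means words are repeated more often on average; comparing the scores across texts is what makes the measure informative.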
Computing with Language: Counting Vocabulary
Innovation in creativity: Up2U, an Open Flexible ecosystem
(Screenshot of the Up2U gateway)
CommonSpaces Social Learning Platform
Learning Experiences Traceability in Up2U
(Learning Analytics, Learning Locker)
• Demo Learning Units preparation (part of Up2U Pedagogical WP5)
• Unit 1: New technologies, new learning methods
• Unit 2: Language Technologies and Learning Assessment
• Unit 3: Language analysis and Critical Thinking. How to increase your
ability to understand texts and documents
• Unit 4: GDPR (General Data Protection Regulation)
• Unit 5: Climate Change risk mitigation
• Unit 6: Education to environmental and seismic emergency and to ()
• Unit 7: Valorisation of Archaeological sites
• Unit 8: Landscape monitoring and navigation
1. Social movements promoting open-source and open-data initiatives are trying to alert
research centres, universities, governments and other institutions to the risk that people
will be alienated from the oil of the future: Big Data.
2. What analogies are there between the alienation of data (information alienation) from local
communities and the alienation of commodities produced by workers? Is there any theoretical,
systematic comparison?
3. Developing countries are reluctant to accept limits on their growth in response to Climate Change
alerts: they argue that, since Western countries did as they pleased for centuries, they should not
now have to accept limits in the name of the planet.
1. Why should Europe express a specifically European position?
2. Open Source movements are more and more visible and authoritative
3. Linux case
4. Galileo case
5. Up2U case
6. How Up2U could support European and international schools in raising
awareness of Climate Change problems and mitigation policies
7. A pan-European community of geographical monitoring and survey could be
enlarged to a wider community of countries (Mediterranean area, Middle East)
● Why a citizen science?
Academic level
Alan Irwin (1995): collaboratively determine the objectives of the research.
Bonney (1996): collective investigations at the Cornell Lab of Ornithology.
Institutional level
Holdren (2015): voluntary participation of the public in the scientific process.
● Causes of the shift in participation from the scientist to the experimenter
○ Management of business and financial models at the public level: users and quality control.
○ Semantic Web and Web 3.0: the user as a data source. From hunter to hunted.
○ The cult of storytelling, micro-stories and emotionality.
○ Interactive devices, smartphones and social networks.
○ Traceability and GPS.
○ Big data, processing capacity and interpretation algorithms.
○ Ethical and unethical uses, accepted or ignored. Data and its theft.
Cambridge Analytica as the limit case.
○ Profiles and behavioral patterns.
● Areas of application of citizen science.
○ A digital revolution tailored to technology or a use of
technology tailored to a cultural model?
○ Transversality and interdisciplinarity.
○ Impact factors: social and educational.
○ Technologies and social networks as development
instruments.
○ Applications for tourist studies.
○ Ecotourism, food and wine tourism and cultural tourism:
sustainability, diversification and complementarity of the
offer.
○ Growing interest in financing research.
● The Ten Principles of Citizen Science.
○ 1. Citizen science projects actively involve citizens in scientific endeavour that generates new
knowledge or understanding.
○ 2. Citizen science projects have a genuine science outcome.
○ 3. Both the professional scientists and the citizen scientists benefit from taking part.
○ 4. Citizen scientists may, if they wish, participate in multiple stages of the scientific process.
○ 5. Citizen scientists receive feedback from the project.
○ 6. Citizen science is considered a research approach like any other, with limitations and biases that
should be considered and controlled for.
○ 7. Citizen science project data and metadata are made publicly available and where possible, results
are published in an open-access format.
○ 8. Citizen scientists are acknowledged in project results and publications.
○ 9. Citizen science programmes are evaluated for their scientific output, data quality, participant
experience and wider societal or policy impact.
○ 10. The leaders of citizen science projects take into consideration legal and ethical issues
surrounding copyright, intellectual property, data-sharing agreements, confidentiality, attribution and
the environmental impact of any activities.
● Tourist experience and experiential tourism.
○ Is a non-experiential tourism possible?
○ Experiential tourism as an emerging
collaborative form.
○ The gaps in the tourism: the demand that
does not cover the offer.
○ Educational instruments and appreciation of
the learning process.
[Marco Ramazzotti]
1. What kinds of tools can we put to work to automate the analysis of landscape
photographs and satellite images, with the goal of monitoring trends of change
on our planet?
○ Climate Change Emergency: the lack of time can boost the new adaptation process
2. Neural Network and Machine Learning methods to analyse images (aerial
photogrammetry, drone images and satellite images)
○ Much progress has been made in image analysis and pattern recognition
○ Another big step forward can come from human-machine interaction
3. Mixed methods (human and automata collaboration)
○ Secondary-school students are invited to collaborate in communities among themselves and
in communities with well-trained automata
• Mentoring Pilot organization: Italy
• CYC2 Module 1: 1-2 weeks; CYC2 Module 2: 12 weeks, Online
• Distribution of Learning Path 1, 2 and 3 (the last 2 need to be translated)
• Schools engagement: GARR mailing the schools, and regional districts
• Supporting Pilot organization: Greece
• Schools mailing: expected engagement of 100 schools
• Modeling a common collection of data
• Supporting Pilot organization: Lithuania
• Schools engagement: active engagement of 80 schools
• Modeling a common collection of data
• Supporting Pilot organization: Hungary
• Schools engagement: expected engagement of 50 schools
Essaouira 28 January 2019
• Supporting Pilots, general overview:
• Supporting Pilot organization: Poland
■ Schools mailing: expected engagement of 100 schools
■ Modeling a common collection of data
• Supporting Pilot organization: Portugal
■ Schools mailing: expected engagement of 30 schools
• Supporting Pilot organization: Spain
■ Schools mailing: expected engagement of 30 schools
• Supporting Pilot organization: Switzerland
■ Schools recruitment: expected engagement of 5 schools
Essaouira Up2U January 2019
• Further piloting activities in the Up2U ecosystem
• “Light” integration of existing Jupyter notebooks based on
CommonSpaces integrated for formal/informal in Up2U;
• University as a Hub is planning the first “flipped” class for Italian secondary
schools based on CommonSpaces. One pilot will start in the classroom in February
• Planning has started for the May meeting in Rome at SLERD 2019
• In Italy (to be integrated into WP7): negotiations have started with the Heritage
Ministry to co-author a pilot in Italian secondary schools about
“Emergency and seismic education and risk mitigation” and “Valorisation
of Archaeological sites”; the latter activity will converge during SLERD
2019 in a demonstration of the Up2U system used in Heritage studies.
January - February 2019 [3]
• Planned piloting activities in the Up2U ecosystem
• “Light” integration of existing Jupyter notebooks based on
CommonSpaces integrated for formal/informal in Up2U;
• UaH is planning the first “flipped” class for Italian secondary schools based
on CommonSpaces. One pilot will start in the classroom in February / March
• Planning the May meeting in Rome at SLERD 2019
• Negotiation with the Italian Heritage Ministry to co-author a pilot in
Italian secondary schools about “Emergency and seismic education and
risk mitigation” and “Valorisation of Archaeological sites”
March - December 2019
1 of 35

Recommended

Pankaj Gupta CV / Resume by
Pankaj Gupta CV / ResumePankaj Gupta CV / Resume
Pankaj Gupta CV / ResumePankaj Gupta, PhD
1.5K views6 slides
Introduction to automated text analyses in the Political Sciences by
Introduction to automated text analyses in the Political SciencesIntroduction to automated text analyses in the Political Sciences
Introduction to automated text analyses in the Political SciencesChristianRauh2
254 views65 slides
Re-engineering the Uptake of ICT in Schools by
Re-engineering  the Uptake of  ICT in SchoolsRe-engineering  the Uptake of  ICT in Schools
Re-engineering the Uptake of ICT in SchoolsSergio González Moreau
433 views214 slides
IDSP19C#F - B - Mingjun Lan - Updated - What ideologies and realities can be ... by
IDSP19C#F - B - Mingjun Lan - Updated - What ideologies and realities can be ...IDSP19C#F - B - Mingjun Lan - Updated - What ideologies and realities can be ...
IDSP19C#F - B - Mingjun Lan - Updated - What ideologies and realities can be ...IDSP - IE Dissertation Support Project
76 views17 slides
NYSCATE 2010 by
NYSCATE 2010NYSCATE 2010
NYSCATE 2010Charles Profitt
194 views68 slides
International projects by
International projectsInternational projects
International projectsmiride
712 views26 slides

More Related Content

Similar to Open learning- Text analysis basics

2009 horizon-report by
2009 horizon-report2009 horizon-report
2009 horizon-reportifsslideacc
514 views36 slides
Conole Japan Abstract by
Conole Japan AbstractConole Japan Abstract
Conole Japan Abstractgrainne
447 views3 slides
Project by
ProjectProject
Projectguest0cc6fbd9
885 views40 slides
Opening up Higher Education in Europe by
Opening up Higher Education in EuropeOpening up Higher Education in Europe
Opening up Higher Education in EuropeAndreia Inamorato dos Santos
1.3K views55 slides
Opening up Higher Education in Europe by
Opening up Higher Education in EuropeOpening up Higher Education in Europe
Opening up Higher Education in Europeiptsedu
523 views55 slides
ticEDUCA2010 presentation (Andrews) by
ticEDUCA2010 presentation (Andrews)ticEDUCA2010 presentation (Andrews)
ticEDUCA2010 presentation (Andrews)ticEDUCA2010
693 views23 slides

Similar to Open learning- Text analysis basics(20)

2009 horizon-report by ifsslideacc
2009 horizon-report2009 horizon-report
2009 horizon-report
ifsslideacc514 views
Conole Japan Abstract by grainne
Conole Japan AbstractConole Japan Abstract
Conole Japan Abstract
grainne447 views
Opening up Higher Education in Europe by iptsedu
Opening up Higher Education in EuropeOpening up Higher Education in Europe
Opening up Higher Education in Europe
iptsedu523 views
ticEDUCA2010 presentation (Andrews) by ticEDUCA2010
ticEDUCA2010 presentation (Andrews)ticEDUCA2010 presentation (Andrews)
ticEDUCA2010 presentation (Andrews)
ticEDUCA2010693 views
Final Assignment Tsci 2009 2010 C.P. Rozenberg by Digital Learning
Final Assignment Tsci 2009 2010 C.P. RozenbergFinal Assignment Tsci 2009 2010 C.P. Rozenberg
Final Assignment Tsci 2009 2010 C.P. Rozenberg
Digital Learning566 views
2017-06-09 WLS LINQ Evidence-based Inclusive School Education Stracke by Christian M. Stracke
2017-06-09 WLS LINQ Evidence-based Inclusive School Education Stracke2017-06-09 WLS LINQ Evidence-based Inclusive School Education Stracke
2017-06-09 WLS LINQ Evidence-based Inclusive School Education Stracke
Formen von studentischer Collaboration mit neuen Medien und Open Educational ... by Stian Håklev
Formen von studentischer Collaboration mit neuen Medien und Open Educational ...Formen von studentischer Collaboration mit neuen Medien und Open Educational ...
Formen von studentischer Collaboration mit neuen Medien und Open Educational ...
Stian Håklev79 views
Michela Insenga: 1.3) INSTEM – Innovation Network in STEM by Brussels, Belgium
Michela Insenga: 1.3)	INSTEM – Innovation Network in STEM Michela Insenga: 1.3)	INSTEM – Innovation Network in STEM
Michela Insenga: 1.3) INSTEM – Innovation Network in STEM
Brussels, Belgium584 views
Perl 111223 my college tomorrow article by Stéphane VINCENT
Perl 111223 my college tomorrow articlePerl 111223 my college tomorrow article
Perl 111223 my college tomorrow article
Stéphane VINCENT1.4K views
Open Education: OER, MOOCs e pratiche pedagogiche aperte by Fabio Nascimbeni
Open Education: OER, MOOCs e pratiche pedagogiche aperteOpen Education: OER, MOOCs e pratiche pedagogiche aperte
Open Education: OER, MOOCs e pratiche pedagogiche aperte
Fabio Nascimbeni155 views
Introduction to NEXT-TELL project for schools by Peter Reimann
Introduction to NEXT-TELL project for schoolsIntroduction to NEXT-TELL project for schools
Introduction to NEXT-TELL project for schools
Peter Reimann634 views
Social Software and Web2.0 in Teacher Education and Teacher Training (Report) by Marion R. Gruber
Social Software and Web2.0 in Teacher Education and Teacher Training (Report)Social Software and Web2.0 in Teacher Education and Teacher Training (Report)
Social Software and Web2.0 in Teacher Education and Teacher Training (Report)
Marion R. Gruber592 views
OEF presentation Open Education week 2017 by Fabio Nascimbeni
OEF presentation Open Education week 2017OEF presentation Open Education week 2017
OEF presentation Open Education week 2017
Fabio Nascimbeni452 views
Supporting educators as designers of complex blended learning scenarios: visu... by Laia Albó
Supporting educators as designers of complex blended learning scenarios: visu...Supporting educators as designers of complex blended learning scenarios: visu...
Supporting educators as designers of complex blended learning scenarios: visu...
Laia Albó132 views

More from Up2Universe

Up2U Pedagogical evaluation by
Up2U Pedagogical evaluationUp2U Pedagogical evaluation
Up2U Pedagogical evaluationUp2Universe
638 views5 slides
Continuous professional development for secondary education teachers to adopt... by
Continuous professional development for secondary education teachers to adopt...Continuous professional development for secondary education teachers to adopt...
Continuous professional development for secondary education teachers to adopt...Up2Universe
77 views11 slides
Up2U brand manual by
Up2U brand manualUp2U brand manual
Up2U brand manualUp2Universe
105 views19 slides
openUp2U booklet by
openUp2U bookletopenUp2U booklet
openUp2U bookletUp2Universe
297 views4 slides
Why choose Up2U? by
Why choose Up2U?Why choose Up2U?
Why choose Up2U?Up2Universe
99 views1 slide
Up2U step by step guides for NRENs by
Up2U step by step guides for NRENsUp2U step by step guides for NRENs
Up2U step by step guides for NRENsUp2Universe
92 views1 slide

More from Up2Universe(20)

Up2U Pedagogical evaluation by Up2Universe
Up2U Pedagogical evaluationUp2U Pedagogical evaluation
Up2U Pedagogical evaluation
Up2Universe638 views
Continuous professional development for secondary education teachers to adopt... by Up2Universe
Continuous professional development for secondary education teachers to adopt...Continuous professional development for secondary education teachers to adopt...
Continuous professional development for secondary education teachers to adopt...
Up2Universe77 views
Up2U brand manual by Up2Universe
Up2U brand manualUp2U brand manual
Up2U brand manual
Up2Universe105 views
openUp2U booklet by Up2Universe
openUp2U bookletopenUp2U booklet
openUp2U booklet
Up2Universe297 views
Up2U step by step guides for NRENs by Up2Universe
Up2U step by step guides for NRENsUp2U step by step guides for NRENs
Up2U step by step guides for NRENs
Up2Universe92 views
Up2U for schools booklet by Up2Universe
Up2U for schools bookletUp2U for schools booklet
Up2U for schools booklet
Up2Universe165 views
Open Educational Resources for Bridging High School – University Gaps in Acad... by Up2Universe
Open Educational Resources for Bridging High School – University Gaps in Acad...Open Educational Resources for Bridging High School – University Gaps in Acad...
Open Educational Resources for Bridging High School – University Gaps in Acad...
Up2Universe96 views
Greek IT security flyer by Up2Universe
Greek IT security flyerGreek IT security flyer
Greek IT security flyer
Up2Universe403 views
Edulearn2019_Up2U_Presentation_G.Cibulskis_A.Urbaityte by Up2Universe
Edulearn2019_Up2U_Presentation_G.Cibulskis_A.UrbaityteEdulearn2019_Up2U_Presentation_G.Cibulskis_A.Urbaityte
Edulearn2019_Up2U_Presentation_G.Cibulskis_A.Urbaityte
Up2Universe44 views
Pilots results- lessons learned Up2University project by Up2Universe
Pilots results- lessons learned Up2University projectPilots results- lessons learned Up2University project
Pilots results- lessons learned Up2University project
Up2Universe192 views
Praktyczny przewodnik po bezpieczeństwie teleinformatycznym Up2U by Up2Universe
Praktyczny przewodnik po bezpieczeństwie teleinformatycznym Up2UPraktyczny przewodnik po bezpieczeństwie teleinformatycznym Up2U
Praktyczny przewodnik po bezpieczeństwie teleinformatycznym Up2U
Up2Universe70 views
IT biztonsági kisokos by Up2Universe
IT biztonsági kisokosIT biztonsági kisokos
IT biztonsági kisokos
Up2Universe1.8K views
Guida pratica alla sicurezza ICT per il progetto Up2U by Up2Universe
Guida pratica alla sicurezza ICT per il progetto Up2UGuida pratica alla sicurezza ICT per il progetto Up2U
Guida pratica alla sicurezza ICT per il progetto Up2U
Up2Universe123 views
Una guía práctica para la seguridad TIC-Up2U by Up2Universe
Una guía práctica para la seguridad TIC-Up2UUna guía práctica para la seguridad TIC-Up2U
Una guía práctica para la seguridad TIC-Up2U
Up2Universe66 views
A practical guide to IT security-Up to University project by Up2Universe
A practical guide to IT security-Up to University projectA practical guide to IT security-Up to University project
A practical guide to IT security-Up to University project
Up2Universe347 views
Facilitating curation of open educational resources through the use of an app... by Up2Universe
Facilitating curation of open educational resources through the use of an app...Facilitating curation of open educational resources through the use of an app...
Facilitating curation of open educational resources through the use of an app...
Up2Universe67 views
Up2U Learning Community interactions by Up2Universe
Up2U Learning Community interactionsUp2U Learning Community interactions
Up2U Learning Community interactions
Up2Universe82 views
Up2U webinar for NRENs by Up2Universe
Up2U webinar for NRENsUp2U webinar for NRENs
Up2U webinar for NRENs
Up2Universe88 views

Recently uploaded

Meet the Bible by
Meet the BibleMeet the Bible
Meet the BibleSteve Thomason
76 views80 slides
Guess Papers ADC 1, Karachi University by
Guess Papers ADC 1, Karachi UniversityGuess Papers ADC 1, Karachi University
Guess Papers ADC 1, Karachi UniversityKhalid Aziz
83 views17 slides
MIXING OF PHARMACEUTICALS.pptx by
MIXING OF PHARMACEUTICALS.pptxMIXING OF PHARMACEUTICALS.pptx
MIXING OF PHARMACEUTICALS.pptxAnupkumar Sharma
117 views35 slides
Parts of Speech (1).pptx by
Parts of Speech (1).pptxParts of Speech (1).pptx
Parts of Speech (1).pptxmhkpreet001
43 views20 slides
Guidelines & Identification of Early Sepsis DR. NN CHAVAN 02122023.pptx by
Guidelines & Identification of Early Sepsis DR. NN CHAVAN 02122023.pptxGuidelines & Identification of Early Sepsis DR. NN CHAVAN 02122023.pptx
Guidelines & Identification of Early Sepsis DR. NN CHAVAN 02122023.pptxNiranjan Chavan
38 views48 slides
Retail Store Scavenger Hunt.pptx by
Retail Store Scavenger Hunt.pptxRetail Store Scavenger Hunt.pptx
Retail Store Scavenger Hunt.pptxjmurphy154
52 views10 slides

Recently uploaded(20)

Guess Papers ADC 1, Karachi University by Khalid Aziz
Guess Papers ADC 1, Karachi UniversityGuess Papers ADC 1, Karachi University
Guess Papers ADC 1, Karachi University
Khalid Aziz83 views
Parts of Speech (1).pptx by mhkpreet001
Parts of Speech (1).pptxParts of Speech (1).pptx
Parts of Speech (1).pptx
mhkpreet00143 views
Guidelines & Identification of Early Sepsis DR. NN CHAVAN 02122023.pptx by Niranjan Chavan
Guidelines & Identification of Early Sepsis DR. NN CHAVAN 02122023.pptxGuidelines & Identification of Early Sepsis DR. NN CHAVAN 02122023.pptx
Guidelines & Identification of Early Sepsis DR. NN CHAVAN 02122023.pptx
Niranjan Chavan38 views
Retail Store Scavenger Hunt.pptx by jmurphy154
Retail Store Scavenger Hunt.pptxRetail Store Scavenger Hunt.pptx
Retail Store Scavenger Hunt.pptx
jmurphy15452 views
The Accursed House by Émile Gaboriau by DivyaSheta
The Accursed House  by Émile GaboriauThe Accursed House  by Émile Gaboriau
The Accursed House by Émile Gaboriau
DivyaSheta246 views
Nelson_RecordStore.pdf by BrynNelson5
Nelson_RecordStore.pdfNelson_RecordStore.pdf
Nelson_RecordStore.pdf
BrynNelson546 views
JQUERY.pdf by ArthyR3
JQUERY.pdfJQUERY.pdf
JQUERY.pdf
ArthyR3103 views
Class 9 lesson plans by TARIQ KHAN
Class 9 lesson plansClass 9 lesson plans
Class 9 lesson plans
TARIQ KHAN68 views
Payment Integration using Braintree Connector | MuleSoft Mysore Meetup #37 by MysoreMuleSoftMeetup
Payment Integration using Braintree Connector | MuleSoft Mysore Meetup #37Payment Integration using Braintree Connector | MuleSoft Mysore Meetup #37
Payment Integration using Braintree Connector | MuleSoft Mysore Meetup #37
Education of marginalized and socially disadvantages segments.pptx by GarimaBhati5
Education of marginalized and socially disadvantages segments.pptxEducation of marginalized and socially disadvantages segments.pptx
Education of marginalized and socially disadvantages segments.pptx
GarimaBhati540 views
Career Building in AI - Technologies, Trends and Opportunities by WebStackAcademy
Career Building in AI - Technologies, Trends and OpportunitiesCareer Building in AI - Technologies, Trends and Opportunities
Career Building in AI - Technologies, Trends and Opportunities
WebStackAcademy41 views
Narration lesson plan by TARIQ KHAN
Narration lesson planNarration lesson plan
Narration lesson plan
TARIQ KHAN69 views
EILO EXCURSION PROGRAMME 2023 by info33492
EILO EXCURSION PROGRAMME 2023EILO EXCURSION PROGRAMME 2023
EILO EXCURSION PROGRAMME 2023
info33492181 views
SURGICAL MANAGEMENT OF CERVICAL CANCER DR. NN CHAVAN 28102023.pptx by Niranjan Chavan
SURGICAL MANAGEMENT OF CERVICAL CANCER DR. NN CHAVAN 28102023.pptxSURGICAL MANAGEMENT OF CERVICAL CANCER DR. NN CHAVAN 28102023.pptx
SURGICAL MANAGEMENT OF CERVICAL CANCER DR. NN CHAVAN 28102023.pptx
Niranjan Chavan43 views

Open learning- Text analysis basics

  • 1. Open Learning [0] Text analysis Basics [Innovation and Sustainability] 2019_04_11 Text Wandering exchanging experiences: crowdsourced based interactive awareness. Stefano Lariccia (Sapienza Università di Roma) stefano.lariccia@uniroma1.it Fernando Martínez de Carnero Calzada (Sapienza University) fernando.martinez@uniroma1.it UROMA 28/02/2019 Territoires Innovants: Essaouira 1
  • 2. Creating a new protocol framework for the collection of data and direct personal experiences. Authors: Stefano Lariccia (Sapienza University ) stefano.lariccia@uniroma1.it Fernando Martínez de Carnero Calzada (Sapienza University) fernando.martinez@uniroma1.it Giovanni Toffoli (Link R&D) toffoli@linkroma.it
  • 3. Computing with Python: what is Python?
  • 6. Computing with Python: what you can do with Python?
  • 7. Computing with Python: what you can do with Python?
  • 8. Computing with Python: how to proceed with Python?
  • 9. ● >>> from nltk.book import * *** Introductory Examples for the NLTK Book *** Loading text1, ..., text9 and sent1, ..., sent9 Type the name of the text or sentence to view it. Type: 'texts()' or 'sents()' to list the materials. ○ text1: Moby Dick by Herman Melville 1851 ○ text2: Sense and Sensibility by Jane Austen 1811 ○ text3: The Book of Genesis ○ text4: Inaugural Address Corpus ○ text5: Chat Corpus ○ text6: Monty Python and the Holy Grail ○ text7: Wall Street Journal ○ text8: Personals Corpus ○ text9: The Man Who Was Thursday by G . K . Chesterton 1908 >>> Computing with Language: Texts and Words
  • 10. ● Any time we want to find out about these texts, we just have to enter their names at the Python prompt: ○ >>> text1 ○ >>> text2 ○ >>> ● Now that we can use the Python interpreter, and have some data to work with, we’re ready to get started. Computing with Language: Texts and Words
  • 11. ● Searching Text There are many ways to examine the context of a text apart from simply reading it. A concordance view shows us every occurrence of a given word, together with some context. ● Here we look up the word monstrous in Moby Dick by entering text1 followed by a period, then the term concordance, and then placing "monstrous" in parentheses: ● Computing with Language: Texts and Words
  • 12. ● Let’s begin by finding out the length of a text from start to finish, in terms of the words and punctuation symbols that appear. We use the term len to get the length of something, which we’ll apply here to the book of Genesis: >>> len(text3) 44764 >>> ● So Genesis has 44,764 words and punctuation symbols, or “tokens.” ● A token is the technical name for a sequence of characters—such as hairy, his, or :)—that we want to treat as a group. ● When we count the number of tokens in a text, say, the phrase to be or not to be, we are counting occurrences of these sequences. Thus, in our example phrase there are two occurrences of to, two of be, and one each of or and not. ● But there are only four distinct vocabulary items in this phrase. How many distinct words does the book of Genesis contain? Computing with Language: Counting Vocabulary
  • 13. ● There are many ways to examine the context of a text apart from simply reading it. A concordance view shows us every occurrence of a given word, together with some context. ● Here we look up the word monstrous in Moby Dick by entering text1 followed by a period, then the term concordance, and then placing "monstrous" in parentheses: ● Computing with Language: Searching Text
  • 14. ● Now, let’s calculate a measure of the lexical richness of the text. The next example shows us that each word is used 16 times on average (we need to make sure Python uses floating-point division): >>> from __future__ import division >>> len(text3) / len(set(text3)) 16.050197203298673 >>> The constant __future__ makes it possible usage of future version constant in a today version of python) ● Now let’s calculate the variance of each of these texts: ○ Text1 ○ Text2 ○ Text3 ○ Text4 ○ Text5 ○ Text6 ○ Text7 ○ Text8 ○ Text9 Computing with Language: Counting Vocabulary
  • 17. Innovation in creativity: Up2U an Open Flexible ecosystem (Screenshot of the Up2U gateway)
  • 18. Innovation in creativity: Up2U an Open Flexible ecosystem CommonSpaces Social Learning Platform
  • 19. Innovation in creativity: Up2U an Open Flexible ecosystem Learning Experiences Traceability in Up2U (Learning Analytics, Learning Locker)
  • 20. Innovation in creativity: Up2U an Open Flexible ecosystem Learning Experiences Traceability in Up2U (Learning Analytics, Learning Locker)
  • 21. Innovation in creativity: Up2U an Open Flexible ecosystem Learning Experiences Traceability in Up2U (Learning Analytics, Learning Locker)
  • 22. • Demo Learning Units preparation (part of Up2U Pedagogical WP5) • Unit 1: New technologies, new learning methods • Unit 2: Language Technologies and Learning Assessment • Unit 3: Language analysis and Critical Thinking. How to increase your ability to understand texts and documents • Unit 4: GDPR (General Data Protection Regulation) • Unit 5: Climate Change risk mitigation • Unit 6: Education to environmental and seismic emergency and to • Unit 7: Valorisation of Archaeological sites • Unit 8: Landscape monitoring and navigation Innovation in creativity: Up2U an Open Flexible ecosystem
  • 23. 1. Social movements that promote Open Source data initiatives are trying to alert research centres, universities, governments and other institutions to the risk that people will be alienated from the oil of the future: Big Data. 2. What analogies are there between the alienation of data (information alienation) from local communities and the alienation of commodities produced by workers? Is there any theoretical, systematic comparison? 3. Developing countries are reluctant to accept limits on their growth in response to Climate Change alerts: they argue that, since Western countries did all they wanted for centuries, developing countries should not now have to accept limitations in the name of the planet.
  • 24. 1. Why should Europe express a specifically European position? 2. Open Source movements are more and more visible and authoritative 3. The Linux case 4. The Galileo case 5. The Up2U case 6. How Up2U could help European and international schools to increase awareness of Climate Change problems and mitigation policies 7. A pan-European community of geographical monitoring and survey could be enlarged to a wider community of countries (Mediterranean area, Middle East)
  • 25. ● Why citizen science? ○ Academic level: Alan Irwin (1995): collaboratively determining the objectives of the research. Bonney (1996): collective investigations at the Cornell Ornithology Laboratory. ○ Institutional level: Holdren (2015): voluntary participation of the public in the scientific process. ● Causes of the shift in participation from the scientist to the experimenter ○ Management of business and financial models at the public level: users and quality control. ○ Semantic Web and Web 3.0: the user as a data source. From hunter to hunted. ○ The cult of storytelling, micro-stories and emotionality. ○ Interactive devices, smartphones and social networks. ○ Traceability and GPS. ○ Big data, processing capacity and interpretation algorithms. ○ Ethical and unethical uses, whether accepted or ignored. Data and its theft. Cambridge Analytica as the limit. ○ Profiles and behavioral patterns.
  • 26. ● Areas of application of citizen science. ○ A digital revolution tailored to technology or a use of technology tailored to a cultural model? ○ Transversality and interdisciplinarity. ○ Impact factors: social and educational. ○ Technologies and social networks as development instruments. ○ Applications for tourist studies. ○ Ecotourism, food and wine tourism and cultural tourism: sustainability, diversification and complementarity of the offer. ○ Growing interest in financing research.
  • 29. ● The Ten Principles of Citizen Science. ○ 1. Citizen science projects actively involve citizens in scientific endeavour that generates new knowledge or understanding. ○ 2. Citizen science projects have a genuine science outcome. ○ 3. Both the professional scientists and the citizen scientists benefit from taking part. ○ 4. Citizen scientists may, if they wish, participate in multiple stages of the scientific process. ○ 5. Citizen scientists receive feedback from the project. ○ 6. Citizen science is considered a research approach like any other, with limitations and biases that should be considered and controlled for. ○ 7. Citizen science project data and metadata are made publicly available and where possible, results are published in an open-access format. ○ 8. Citizen scientists are acknowledged in project results and publications. ○ 9. Citizen science programmes are evaluated for their scientific output, data quality, participant experience and wider societal or policy impact. ○ 10. The leaders of citizen science projects take into consideration legal and ethical issues surrounding copyright, intellectual property, data-sharing agreements, confidentiality, attribution and the environmental impact of any activities.
  • 30. ● Tourist experience and experiential tourism. ○ Is non-experiential tourism possible? ○ Experiential tourism as an emerging collaborative form. ○ Gaps in tourism: demand that does not cover the offer. ○ Educational instruments and appreciation of the learning process.
  • 31. [Marco Ramazzotti] 1. What kinds of tools can we put to work to automate the analysis of landscape photographs and satellite images, in order to monitor trends of change on our planet? ○ Climate Change Emergency: the lack of time can boost the new adaptation process 2. Neural Network and Machine Learning methods to analyse images (aerial photogrammetry, drone images and satellite images) ○ Much progress has been made in image analysis and pattern recognition ○ Another big step forward can come from man-machine interaction 3. Mixed methods (human and automata collaboration) ○ Secondary school students are invited to collaborate in communities among themselves and in communities with well-trained automata
  • 32. 26/02/2019 • Mentoring Pilot organization: Italy ■ CYC2 Module 1: 1-2 weeks; CYC2 Module 2: 12 weeks, online ■ Distribution of Learning Paths 1, 2 and 3 (the last two need to be translated) ■ Schools engagement: GARR mailing the schools and regional districts • Supporting Pilot organization: Greece ■ Schools mailing: expected engagement of 100 schools ■ Modeling a common collection of data • Supporting Pilot organization: Lithuania ■ Schools engagement: active engagement of 80 schools ■ Modeling a common collection of data • Supporting Pilot organization: Hungary ■ Schools engagement: expected engagement of 50 schools Essaouira 28 January 2019
  • 33. • Supporting Pilots, general overview: • Supporting Pilot organization: Poland ■ Schools mailing: expected engagement of 100 schools ■ Modeling a common collection of data • Supporting Pilot organization: Portugal ■ Schools mailing: expected engagement of 30 schools • Supporting Pilot organization: Spain ■ Schools mailing: expected engagement of 30 schools • Supporting Pilot organization: Switzerland ■ Schools recruitment: expected engagement of 5 schools Essaouira Up2U January 2019
  • 34. • Further piloting activities in the Up2U ecosystem • “Light” integration of existing Jupyter notebooks, based on CommonSpaces, for formal/informal use in Up2U; • University as a Hub is planning the first “flipped” class for Italian secondary schools based on CommonSpaces. 1 pilot will start in the classroom in February • Planning has started for the May meeting in Rome at SLERD 2019 • In Italy (to be integrated into WP7): negotiations have started with the Heritage Ministry to co-author a pilot in Italian secondary schools about “Emergency and seismic education and risk mitigation” and “Valorisation of Archaeological sites”; the latter activity will converge during SLERD 2019 in a demonstration of the Up2U system used in Heritage studies. January - February 2019 [3]
  • 35. • Planned piloting activities in the Up2U ecosystem • “Light” integration of existing Jupyter notebooks, based on CommonSpaces, for formal/informal use in Up2U; • UaH is planning the first “flipped” class for Italian secondary schools based on CommonSpaces. 1 pilot will start in the classroom in February / March • Planning the May meeting in Rome at SLERD 2019 • Negotiation with the Italian Heritage Ministry to co-author a pilot in Italian secondary schools about “Emergency and seismic education and risk mitigation” and “Valorisation of Archaeological sites” March - December 2019