Monitoring Consistency using Spatial Features and Tag Semantics
https://sotm-eu.org/talk/slots/70
Made in close collaboration with DCL FBK Trento:
Maurizio Napolitano, Cristian Consonni and Francesca De Chiara.
1. Monitoring data consistency in OpenStreetMap using its spatial features and tag semantics
Alfonso Crisci - IBIMET CNR
Maurizio Napolitano - DCL FBK Trento
Francesca De Chiara - DCL FBK Trento
Cristian Consonni - DCL FBK Trento
George Kingsley Zipf
(PD, https://commons.wikimedia.org/wiki/File:George_Kingsley_Zipf_1917.jpg)
2. Background #osm3words
OpenStreetMap is a language for the free representation of real geographical entities, used to build visual patterns called maps, on which user communities work in a participatory style.
3. Aims: build a local areal approach
Do metrics exist to manage the spatial and textual informative complexity of OSM?
Which are the candidates?
Targets
● build customized guidelines for thematic mapping
● support areal OSM gap-filling strategies
● detect spatial and informative gaps
4. Parameters: looking for the most interesting ones
• Fractal dimension: is it possible to measure the spatial complexity of OSM features?
• Lacunarity: is it possible to identify where OSM contributions have spatial gaps and how they change over time?
• Textual informative density: are the semantics of textual descriptors (keys and tags) informative?
• Diversity and dissimilarity: is it possible to detect semantic differences among different areas/communities at various spatial scales?
5. What is lacunarity: a measure of voids in spatial patterns
Gliding-box lacunarity (Allain and Cloitre, 1991)
Images and definitions: Marco Diego Dominietto, ETH Zurich, "Multimodality Approach To Study The Fractal Physiology Of Tumor Angiogenesis"
Figure: same image complexity, different lacunarity
Lacunarity
It is an analytical tool for pattern design and can be defined as a complementary measure to fractal dimension.
It makes it possible to distinguish spatial patterns by analysing the distribution of their gaps (void pixels) at different scales.
It is rotationally invariant but is a function of scale.
6. Information entropy & Zipf plot: semantic analysis of OSM word sets
Textual information density
Zipf's law states that, given some corpus of natural language utterances, the frequency of any word is inversely proportional to its rank in the frequency table. http://en.wikipedia.org/wiki/Zipf%27s_law
In a Zipf plot, x is the rank of a word in the frequency table and y is the total number of the word's occurrences (frequency).
From OSM data it is possible to retrieve a textual corpus (a set of words: keys, tags, key-values) for every bounded area. Two actions are possible:
Zipf plot: describes the word set in terms of the distribution of its terms and detects rare terms.
Information entropy: indirectly measures textual information density (Shannon entropy).
http://en.wikipedia.org/wiki/Entropy_%28information_theory%29
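As a rough Python sketch of the two actions (the slides themselves rely on R packages), the rank-frequency table behind a Zipf plot and the Shannon entropy of a word set can be computed as below. The function names and the sample list of keys are made up for illustration; entropy is expressed in bits.

```python
from collections import Counter
import math

def rank_frequency(words):
    """Return (rank, word, frequency) rows, most frequent word first."""
    counts = Counter(words)
    # Ties are broken alphabetically so the table is deterministic.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, word, freq)
            for rank, (word, freq) in enumerate(ordered, start=1)]

def shannon_entropy(words):
    """Shannon entropy (in bits) of the word distribution."""
    counts = Counter(words)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total)
                for c in counts.values())

# Hypothetical corpus of OSM keys for one bounded area.
keys = ["highway"] * 4 + ["building"] * 2 + ["amenity", "shop"]
for row in rank_frequency(keys):
    print(row)                    # x = rank, y = frequency
print(shannon_entropy(keys))      # 1.75
```

Plotting frequency against rank on log-log axes gives the Zipf plot; terms in the long tail (frequency 1) are the rare ones. A higher entropy means the occurrences are spread more evenly over more distinct terms.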
7. Tools: analytical framework for OSM data
Osmconvert
Osmfilter
Nepal Civic Hacker @prabhasp: http://prabhasp.github.io/OSMTimeLapseR
R packages: tm & ZipfR & LanguageR & qdap & wordcloud; Openstreetmap & Osmar & fractaldim; raster & rgdal & spatstat
Spatial-tools library (GDAL, lacunarity and fractal dimensions), Christian Kaiser: http://github.com/christiankaiser/spatial-tools
Urbanisation Regime and Environmental Impact: Analysis and Modelling of Urban Patterns, Clustering and Metamorphoses
http://github.com/alfcrisci/osm_analitics
8. Areas: zoom level 12, scale 1:150,000, centered on the administrative centre (OSMTimeLapseR)
Trento (Northern Italy): medium city, large community, high density of features
Florence (Central Italy): large urban area, large community, high density of features
Matera (Southern Italy): small urban area, young community, recent mapping
9. OSM history data preview
Raster density maps
A. Feature density
B. Users density (at least one edit)
C. Version count density
D. Local complexity: fractal dimension, isotropic method, Davies and Hall (1999)
Lexical analysis
a. Zipf plot of keys
b. Word cloud of keys
c. Histogram of keys / N_users
d. Venn diagram of keys / user
e. Clustering of users by key
Temporal evolution
I. Feature amount by year
II. Lacunarity index by year
Tag lexical analysis
a. Zipf plot of selected key-values
b. Lexical diversity by key
c. Treemap of users by key
d. Treemap of values by key
e. Word network of users by key
10. Aerial view, Trento, spatial resolution 20 m: feature density and users density
11. Aerial view, Trento, spatial resolution 20 m: version count density and local complexity (pixel areas where complexity is lower than 2)
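A raster density map of this kind can be approximated by counting features per grid cell. The Python sketch below assumes that feature coordinates are already projected to metres and that each feature is reduced to a single point; the function name `density_raster` and the sample coordinates are hypothetical, and the slides' own maps were produced with R tooling.

```python
import numpy as np

def density_raster(x, y, bounds, cell_size=20.0):
    """Count points per square cell; row 0 is the northern edge.

    bounds is (xmin, ymin, xmax, ymax) in the same metric units as x, y.
    """
    xmin, ymin, xmax, ymax = bounds
    nx = int(np.ceil((xmax - xmin) / cell_size))
    ny = int(np.ceil((ymax - ymin) / cell_size))
    counts, _, _ = np.histogram2d(
        y, x,
        bins=[ny, nx],
        range=[[ymin, ymin + ny * cell_size],
               [xmin, xmin + nx * cell_size]],
    )
    # histogram2d puts the lowest y in row 0; flip so north is on top.
    return counts[::-1]

# Three made-up feature locations in a 40 m x 40 m window.
x = np.array([5.0, 15.0, 25.0])
y = np.array([5.0, 5.0, 35.0])
print(density_raster(x, y, bounds=(0, 0, 40, 40), cell_size=20.0))
```

The same binning applied to distinct user ids or to version numbers per cell would give the users density and version count density layers.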
26. Information entropy: diversity
MATERA, FIRENZE
Key sets shown: "shop","amenity","tourism","man_made","natural","leisure","landuse","wikipedia" and "shop","amenity","tourism","man_made"
Applying several diversity indices to the corpus of values for each key makes it possible to see how tag usage differs between cities.
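The slide does not name the diversity indices that were used, so the following Python sketch picks two common ones as an assumption: the Gini-Simpson diversity of a single corpus of values, and the Bray-Curtis dissimilarity between the corpora of two areas. Function names and sample values are made up for illustration.

```python
from collections import Counter

def simpson_diversity(values):
    """Gini-Simpson index: probability that two random draws differ."""
    counts = Counter(values)
    n = sum(counts.values())
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def bray_curtis(values_a, values_b):
    """Bray-Curtis dissimilarity: 0 = identical counts, 1 = no overlap."""
    a, b = Counter(values_a), Counter(values_b)
    total = sum(a.values()) + sum(b.values())
    if total == 0:
        return 0.0
    shared = sum(min(a[k], b[k]) for k in a.keys() & b.keys())
    return 1.0 - 2.0 * shared / total

# Hypothetical values of the "shop" key in two areas.
area_1 = ["bakery", "bakery", "clothes"]
area_2 = ["bakery", "kiosk", "kiosk"]
print(simpson_diversity(area_1))
print(bray_curtis(area_1, area_2))
```

Computing such indices per key and per city gives a compact table in which differences in tagging practice between communities stand out.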
31. Main Findings
• Spatial complexity in OSM for a specific area can be detected and monitored in space and time using complexity metrics.
• The decay of lacunarity tracks the growth of OSM informativity well, but its reliability depends on the spatial scale used. Densely mapped areas require fine resolutions (20 m or 10 m).
• Lacunarity thresholds for OSM quality assessment need further investigation in relation to the zoom level involved and the keys (tags) monitored.
• Local fractal dimension indicates well where areas of low complexity are located.
32. Main Findings
• Lexical statistical frameworks work with OSM data and describe their informativity and the differences that exist among areal communities.
• Textual informativity parameters show the general abundance of terms in OSM and demonstrate that it is a really rich informative environment.
• Areal key sets, though only for certain tags, follow a natural language distribution (Zipf's law emerges!). By integrating information entropy analysis at different spatial scales (zoom levels), it is possible to draw inferences about the information suitability of the mapping done in an area by the OSM user community.
• Every kind of investigation must always take population and user density into account.
33. Conclusions
• Spatial and textual “complexity” parameters seem to be promising tools to help assess data quality in a specific area.
• Their main role is to quantify the amount, but they need to be linked with other areal metrics (population and mapper density, OSM feature density).
• Proper metrics linked to the parameters presented here still need to be defined ... also in order to create OSM services.
• Suggestions are welcome!
35. Appendix: fractal dimension, a measure of the state of spatial complexity
A fractal dimension is a ratio providing a statistical index of complexity comparing how detail in a pattern (strictly speaking, a fractal pattern) changes with the scale at which it is measured.
http://en.wikipedia.org/wiki/Fractal_Dimension
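To make the "detail versus scale" idea concrete, here is a small Python sketch of a box-counting estimate. Note that this is a swapped-in technique chosen for its simplicity, not the Davies and Hall (1999) method cited in the slides; it assumes a binary raster of mapped pixels, and the function name and default box sizes are illustrative.

```python
import numpy as np

def box_counting_dimension(mask, box_sizes=(1, 2, 4, 8, 16)):
    """Estimate the box-counting dimension of a 2-D binary mask."""
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        raise ValueError("mask has no occupied pixels")
    counts = []
    for s in box_sizes:
        # Pad with empty pixels so both sides are multiples of s.
        h = -(-mask.shape[0] // s) * s
        w = -(-mask.shape[1] // s) * s
        padded = np.zeros((h, w), dtype=bool)
        padded[:mask.shape[0], :mask.shape[1]] = mask
        # Count the s x s boxes that contain at least one occupied pixel.
        blocks = padded.reshape(h // s, s, w // s, s)
        counts.append(int(blocks.any(axis=(1, 3)).sum()))
    sizes = np.array(box_sizes, dtype=float)
    # Slope of log N(s) against log(1/s) is the dimension estimate.
    slope, _ = np.polyfit(np.log(1.0 / sizes), np.log(counts), 1)
    return float(slope)

# Sanity checks on shapes with known dimension (synthetic data).
square = np.ones((32, 32), dtype=bool)
line = np.zeros((32, 32), dtype=bool)
line[0, :] = True
print(box_counting_dimension(square))  # close to 2
print(box_counting_dimension(line))    # close to 1
```

Real OSM rasters fall between these two extremes; values near 2 indicate space-filling patterns, lower values sparser or more linear ones.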
Images: Marco Diego Dominietto, ETH Zurich, "Multimodality Approach To Study The Fractal Physiology Of Tumor Angiogenesis"
Batty, M., and Longley, P. (1994). Fractal Cities: A Geometry of Form and Function, Academic Press, San Diego, CA, at www.fractalcities.org
"We need much better statistics that pertain to the different kinds of dynamics and their variation over time and space." (Batty, 1994)