1. Ch. Eick: Spatial Data Mining (inspired by a talk given at UH by Shashi Shekhar (UMN))
Organization Spatial Data Mining Fall 2011
1. Introduction
2. Region Discovery—Finding Interesting Places in Spatial
Datasets
3. Project3
4. CLEVER: a Spatial Clustering Algorithm Supporting Plug-in
Fitness Functions
5. [Spatial Regression]
2. Brief Introduction
to Spatial Data Mining
Spatial data mining is the process of discovering
interesting, useful, non-trivial patterns from large spatial
datasets
Reading Material: http://en.wikipedia.org/wiki/Spatial_analysis
Spatial Statistics Software: http://www.spatial-statistics.com/
3. Examples of Spatial Patterns
Historic Examples (section 7.1.5, p. 186)
1854 Asiatic cholera outbreak in London: a water pump identified as the source
Fluoride and healthy gums near Colorado river
Theory of Gondwanaland - continents fit like pieces of a jigsaw puzzle
Modern Examples
Cancer clusters to investigate environment health hazards
Crime hotspots for planning police patrol routes
Bald eagles nest on tall trees near open water
West Nile virus spreading from north east USA to south and west
Unusual warming of Pacific ocean (El Nino) affects weather in USA
4. Why Learn about Spatial Data Mining?
Two basic reasons for new work
Consideration of use in certain application domains
Provide fundamental new understanding
Application domains
Scale up secondary spatial (statistical) analysis to very large datasets
• Describe/explain locations of human settlements in last 5000 years
• Find cancer clusters to locate hazardous environments
• Prepare land-use maps from satellite imagery
• Predict habitat suitable for endangered species
Find new spatial patterns
• Find groups of co-located geographic features
Exercise. Name 2 application domains not listed above.
5. Why Learn about Spatial Data Mining? - 2
New understanding of geographic processes for critical questions
Ex. How is the health of planet Earth?
Ex. Characterize effects of human activity on environment and ecology
Ex. Predict effect of El Nino on weather, and economy
Traditional approach: manually generate and test hypotheses
But, spatial data is growing too fast to analyze manually
• Satellite imagery, GPS tracks, sensors on highways, …
Number of possible geographic hypotheses too large to explore manually
• Large number of geographic features and locations
• Number of interacting subsets of features grows exponentially
• Ex. Find teleconnections between weather events across ocean and land areas
SDM may reduce the set of plausible hypotheses
Identify hypotheses supported by the data
For further exploration using traditional statistical methods
6. Autocorrelation
Items in traditional data are assumed to be independent of each other,
whereas properties of locations in a map are often “auto-correlated”.
First law of geography [Tobler]:
Everything is related to everything else, but near things are more related
than distant things.
People with similar backgrounds tend to live in the same area
Economies of nearby regions tend to be similar
Changes in temperature occur gradually over space (and time)
Waldo Tobler in 2000
Papers on “Laws in Geography”: http://www.geog.ucsb.edu/~good/papers/393.pdf
http://www.cs.uh.edu/~ceick/DM/GOO10.pdf
7. Characteristics of Spatial Data Mining
Autocorrelation
Patterns usually have to be defined in the spatial attribute subspace
and not in the complete attribute space
Longitude and latitude (or other coordinate systems) are the glue that
links different data collections together
People are used to maps in GIS; therefore, data mining results have
to be summarized on top of maps
Patterns not only refer to points, but can also refer to lines, or
polygons or other higher order geometrical objects
Patterns exist at different levels of granularity
Large number of patterns, large dataset sizes
Spatial patterns, e.g. spatial clusters, can have arbitrary shapes
Regional knowledge is of particular importance due to lack of global
knowledge in geography (spatial heterogeneity)
8. Why Is Regional Knowledge Important in Spatial Data Mining?
A special challenge in spatial data mining is that
information is usually not uniformly distributed in spatial
datasets.
It has been pointed out in the literature that “whole map
statistics are seldom useful”, that “most relationships in
spatial data sets are geographically regional, rather than
global”, and that “there is no average place on the Earth’s
surface” [Goodchild03, Openshaw99].
Therefore, it is not surprising that domain experts are
mostly interested in discovering hidden patterns at a
regional scale rather than a global scale.
Michael Frank Goodchild
9. Spatial Autocorrelation: Distance-based measure
K-function definition (http://dhf.ddc.moph.go.th/abstract/s22.pdf):
$K(h) = \lambda^{-1} \, E[\text{number of events within distance } h \text{ of an arbitrary event}]$
• $\lambda$ is the intensity of events
Test against randomness for point pattern
Model departure from randomness in a wide range of scales
Inference
For Poisson complete spatial randomness (CSR): $K(h) = \pi h^2$
Plot $\hat{K}(h)$ against $h$, compare to Poisson CSR
• $\hat{K}(h) > \pi h^2$: cluster
• $\hat{K}(h) < \pi h^2$: decluster/regularity
[Figure: K-function based spatial autocorrelation]
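A minimal sketch of the K-function test described on this slide, assuming a study area of known size and applying no edge correction (statistical packages normally correct for the boundary). The names `k_hat` and `csr_k` and both sample point sets are hypothetical and chosen only for illustration.

```python
import math

def k_hat(points, h, area):
    """Naive estimate of K(h) = (1/lambda) * E[# other events within h of an event].

    No edge correction is applied, so estimates are biased low near the boundary.
    """
    n = len(points)
    intensity = n / area  # lambda: events per unit area
    ordered_pairs = 0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i != j and math.hypot(xi - xj, yi - yj) <= h:
                ordered_pairs += 1
    mean_neighbours = ordered_pairs / n  # average count around one event
    return mean_neighbours / intensity

def csr_k(h):
    """K(h) under Poisson complete spatial randomness."""
    return math.pi * h * h

# Hypothetical data in a unit square: one tight cluster and one regular grid.
clustered = [(0.5 + 0.001 * i, 0.5) for i in range(10)]
regular = [(0.125 + 0.25 * i, 0.125 + 0.25 * j) for i in range(4) for j in range(4)]

k_clustered = k_hat(clustered, 0.1, 1.0)  # compare with csr_k(0.1)
k_regular = k_hat(regular, 0.2, 1.0)      # compare with csr_k(0.2)
```

With these inputs the clustered pattern gives an estimate above the CSR value and the regular grid gives one below it, matching the cluster/regularity reading of the plot.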
10. Associations, Spatial Associations, Co-location
Find patterns in the following sample dataset.
[Figures: sample dataset, followed by the answers]
11. Illustration of Cross-Correlation
Illustration of cross K-function for example data
[Figure: Cross-K function for example data]
12. Colocation Rules – Spatial Interest Measures
http://www.youtube.com/watch?v=RPyJwYqyBuI
13. Cross-Correlation
Cross K-function definition
Cross K-function of some pair of spatial feature types:
$K_{ij}(h) = \lambda_j^{-1} \, E[\text{number of type } j \text{ events within distance } h \text{ of a randomly chosen type } i \text{ event}]$
Example
• Which pairs are frequently co-located
• Statistical significance
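The cross K-function definition can be sketched in the same naive way, again assuming a known study area and no edge correction. The function name `cross_k_hat` and the two small point sets are hypothetical.

```python
import math

def cross_k_hat(points_i, points_j, h, area):
    """Naive estimate of K_ij(h): (1/lambda_j) * mean number of type-j events
    within distance h of a type-i event. No edge correction is applied."""
    lambda_j = len(points_j) / area  # intensity of type-j events
    total = 0
    for (xi, yi) in points_i:
        for (xj, yj) in points_j:
            if math.hypot(xi - xj, yi - yj) <= h:
                total += 1
    mean_j_near_i = total / len(points_i)
    return mean_j_near_i / lambda_j

# Hypothetical data in a unit square: each type-i event has one type-j event close by.
type_i = [(0.2, 0.2), (0.8, 0.8)]
type_j = [(0.21, 0.2), (0.8, 0.81), (0.5, 0.1)]

k_ij = cross_k_hat(type_i, type_j, 0.05, 1.0)
k_csr = math.pi * 0.05 ** 2  # value expected if the two types were independent
```

An estimate well above the independence value suggests that the pair of feature types is co-located at that distance; judging statistical significance needs a further step, such as simulation, which this sketch leaves out.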
14. Spatial Association Rules
• A special reference spatial feature
• Transactions are defined around instances of the special reference feature
• Item-types = spatial predicates
• Example: Table 7.5 (p. 204)
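As a rough illustration of how transactions can be built around a reference feature, the sketch below creates one transaction per reference instance and uses a single assumed predicate, `close_to(X)`, which holds when some instance of feature X lies within a chosen radius. The function names, the radius, and the nest/water/tree coordinates are all hypothetical; they are not taken from Table 7.5.

```python
import math

def build_transactions(reference_points, features, radius):
    """One transaction per reference instance; items are spatial predicates.

    features maps a feature-type name to a list of (x, y) locations.
    """
    transactions = []
    for (rx, ry) in reference_points:
        items = set()
        for name, locations in features.items():
            if any(math.hypot(rx - x, ry - y) <= radius for (x, y) in locations):
                items.add(f"close_to({name})")
        transactions.append(items)
    return transactions

def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

# Hypothetical example: nests are the reference feature.
nests = [(0, 0), (10, 10), (20, 0)]
features = {
    "water": [(1, 0), (10, 11)],
    "tall_tree": [(0, 1), (20, 1), (50, 50)],
}

transactions = build_transactions(nests, features, radius=2)
water_support = support(transactions, {"close_to(water)"})
both_support = support(transactions, {"close_to(water)", "close_to(tall_tree)"})
```

Once the transactions exist, any standard association rule miner can be run on them unchanged.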
15. Co-location rules vs. traditional association rules

                                 Association rules        Co-location rules
Underlying space                 discrete sets            continuous space
Item-types                       item-types               events / Boolean spatial features
Collection                       Transaction (T)          Neighborhood (N)
Prevalence measure               support                  participation index
Conditional probability metric   Pr.[ A in T | B in T ]   Pr.[ A in N(L) | B at location L ]

Participation index = min{pr(fi, c)}
where pr(fi, c), the participation ratio of feature fi in co-location c = {f1, f2, …, fk},
= fraction of instances of fi with features {f1, …, fi-1, fi+1, …, fk} nearby
N(L) = neighborhood of location L
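A minimal sketch of the participation index, under a simplifying assumption: an instance of fi counts as participating when at least one instance of every other feature in c lies within a fixed radius. The co-location mining literature usually requires the instances to be mutual neighbours, which is stricter. The function names, the radius, and the sample coordinates are hypothetical.

```python
import math

def participation_ratio(feature, colocation, instances, radius):
    """pr(fi, c): fraction of instances of `feature` that have an instance of
    every other feature in `colocation` within `radius` (simplified notion of nearby)."""
    others = [f for f in colocation if f != feature]
    own = instances[feature]
    participating = 0
    for (x, y) in own:
        near_all = all(
            any(math.hypot(x - ox, y - oy) <= radius for (ox, oy) in instances[o])
            for o in others
        )
        if near_all:
            participating += 1
    return participating / len(own)

def participation_index(colocation, instances, radius):
    """Minimum participation ratio over the features of the co-location."""
    return min(participation_ratio(f, colocation, instances, radius) for f in colocation)

# Hypothetical instances of two feature types A and B.
instances = {
    "A": [(0, 0), (5, 5), (9, 9)],
    "B": [(0, 1), (5, 6)],
}

pr_a = participation_ratio("A", ["A", "B"], instances, radius=1.5)
pr_b = participation_ratio("B", ["A", "B"], instances, radius=1.5)
pi_ab = participation_index(["A", "B"], instances, radius=1.5)
```

Taking the minimum means a co-location is only as prevalent as its least involved feature: here one instance of A has no B nearby, so A limits the index.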
16. Conclusions Spatial Data Mining
Spatial patterns are the opposite of random
Common spatial patterns: location prediction, feature interaction, hot spots,
geographically referenced statistical patterns, co-location, emergent patterns,…
SDM = search for unexpected interesting patterns in large spatial databases
Spatial patterns may be discovered using
Techniques like classification, associations, clustering and outlier detection
New techniques are needed for SDM due to
• Spatial Auto-correlation
• Importance of non-point data types (e.g. polygons)
• Continuity of space
• Regional knowledge; also establishes a need for scoping
• Separation between spatial and non-spatial subspace—in traditional
approaches clusters are usually defined over the complete attribute space
Knowledge sources are available now
Raw knowledge to perform spatial data mining is mostly available online now
(e.g. relational databases, Google Earth)
GIS tools are available that facilitate integrating knowledge from different
sources
17. Spatial Regression
Spatial Regression.pptx
18. Example Videos Discussing Spatial Analysis
http://www.esri.com/what-is-gis/index.html What is GIS?
http://www.youtube.com/watch?v=ZqMul3OIQNI&feature=related
(Geographically weighted regression software advertisement video)
http://www.youtube.com/watch?v=RhDdtqgIy9Q&feature=related
(Spatial Analysis and Remote Sensing Degree at UA)
http://www.youtube.com/watch?v=_SBLBkP9O9I&feature=related
ArcGIS Spatial Analyst Overview
http://www.youtube.com/watch?v=mBSXBqEP-7Y&feature=related
(ArcGIS 9.3: Advanced planning and analysis - Part 1)
http://www.youtube.com/watch?v=agzjyi0rnOo&feature=related
(Example using Spatial Analysis to Analyze Medical Data; the video is not
really that “great”; if you know a better one, share it with us!)
http://acmgis2011.cs.umn.edu/ ACM GIS Conference, discusses advances
in Geographical Information Systems and related areas
http://www.houstonareagisday.org/ Houston Area GIS Day Nov. 10, 2011