2. WHY PYTHON?
• The main benefit of Python is that it cuts down on repetitive manual work.
• For instance, running multiple viewshed analyses one at a time through the native interface
of GRASS or QGIS takes time. With a simple loop over the starting points, plus simple rules
that control each viewshed, many areas can be analysed in a single run.
• Python also lets GIS users write specific programs inside their larger applications to run
batches, access a wide array of open source tools, and use other utilities, without compiling
code or allocating memory by hand.
• Most GIS users use Python as a scripting language rather than applying its object-oriented
or other principles. In other words, Python is often applied to solve specific but limited
problems as part of a wider application or analysis.
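The looping idea above can be sketched in plain Python. This is a minimal illustration only: it works on a made-up 1-D terrain profile rather than a raster, and `visible_cells` is a hypothetical helper written for this example, not a GRASS or QGIS function.

```python
def visible_cells(profile, observer_idx, observer_height=1.7):
    """Return indices of cells visible from an observer on a 1-D terrain profile.

    A cell is visible if the slope of the sight line to it is at least as
    steep as the slope to every cell between it and the observer.
    """
    eye = profile[observer_idx] + observer_height
    visible = [observer_idx]
    for step in (1, -1):                      # look right, then left
        max_slope = float("-inf")
        j = observer_idx + step
        while 0 <= j < len(profile):
            slope = (profile[j] - eye) / abs(j - observer_idx)
            if slope >= max_slope:
                visible.append(j)
                max_slope = slope
            j += step
    return sorted(visible)


# Made-up elevations (metres) and observer positions (cell indices).
profile = [10, 12, 11, 15, 9, 8, 14, 7]
observers = [0, 3, 7]

# One loop replaces three separate manual runs.
results = {obs: visible_cells(profile, obs) for obs in observers}

for obs, cells in results.items():
    print(f"observer at cell {obs}: {len(cells)} visible cells -> {cells}")
```

In a real project the body of the loop would call the GIS platform's own viewshed tool; the point here is only that the starting points and rules are data, and the loop does the repetition.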
3. WHY PYTHON?
• Recognizing that many users simply want an easy script to use within a program, the main
advantage of Python relative to other languages, such as C and C++, is that Python is
relatively easy to learn: its syntax reads more like human language, and memory management
is automated through garbage collection.
• Nevertheless, Python, with its numerous libraries, is relatively powerful, despite its easy
syntax, and today it has enabled new types of applications to be made, such as GIS for
mobile devices, integration of mapping features with web programs, and other areas that
require server and cloud based services for many new tools.
• Python allows access to well-known services such as Google Maps and other popular Google
software, as one example.
• In effect, Python has allowed a wide range of programmers to more easily combine a variety
of software and to integrate GIS and mapping tools with other popular tools and devices.
This largely explains the growth seen today in mobile and other applications that use GIS
and mapping tools.
4. ADVANTAGES OF PYTHON
• 1. Less restricted data types;
• 2. Open source support;
• 3. Cross-platform;
• 4. Object-oriented (an object is a data structure that combines data with a set of methods
for accessing and managing those data).
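Two of these points can be shown in a few lines. The sketch below is illustrative only; `SurveyPoint` is a hypothetical class invented for this example.

```python
import math


class SurveyPoint:
    """An object: data (name, x, y) bundled with a method that uses the data."""

    def __init__(self, name, x, y):
        self.name = name
        self.x = x
        self.y = y

    def distance_to(self, other):
        """Straight-line distance to another SurveyPoint, in coordinate units."""
        return math.hypot(other.x - self.x, other.y - self.y)


# Less restricted data types: a name can be rebound to a value of another type.
value = 42
value = "forty-two"

a = SurveyPoint("A", 0.0, 0.0)
b = SurveyPoint("B", 3.0, 4.0)
dist = a.distance_to(b)
print(f"{a.name} to {b.name}: {dist}")
```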
5. APPLICATIONS OF PYTHON
• 1. Automate workflows;
• 2. Batch process data;
• 3. Manipulate data tables, geometry, and map docs;
• 4. Use functions accessible only by scripts.
6. GEOPANDAS
• Python itself does not include vectors, matrices, or data frames as fundamental data types. As
Python became an increasingly popular language, however, it was quickly realized that this was a
major shortcoming, and new libraries were created that added these data types to Python (and did
so in a very high-performance manner).
• The original library that added vectors and matrices to Python was called NumPy. But NumPy,
while very powerful, was a no-frills library. You couldn’t do things like mix data types or label
your columns. To remedy this shortcoming, a new library was created, built on top of NumPy, that
added all the nice features we’ve come to expect from modern languages: pandas.
• GeoPandas is an open source project to make working with geospatial data in Python easier.
GeoPandas extends the data types used by pandas to allow spatial operations on geometric
types. Geometric operations are performed by Shapely. GeoPandas further depends on Fiona for
file access and on Descartes and matplotlib for plotting.
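The difference between a NumPy array and a pandas data frame described above can be seen in a short sketch. The site names and values are made up for illustration, and the example stops at pandas; GeoPandas adds a geometry column on top of the same structure.

```python
import numpy as np
import pandas as pd

# A NumPy array holds a single data type.
heights = np.array([1.5, 2.0, 3.5])
print(heights.dtype)

# Mixing types in one array forces everything to a common type (here, text).
mixed = np.array([1, "a", 2.5])
print(mixed.dtype)

# A pandas DataFrame adds labelled columns, each with its own data type.
sites = pd.DataFrame({
    "name": ["north", "south", "east"],
    "elevation_m": [120.5, 98.0, 143.2],
    "surveyed": [True, False, True],
})
print(sites.dtypes)

# Columns are addressed by label rather than by position.
mean_elevation = sites["elevation_m"].mean()
surveyed_names = sites.loc[sites["surveyed"], "name"].tolist()
print(mean_elevation, surveyed_names)
```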
7. NATIVE PYTHON GIS TOOLS
• Fiona: Tools for importing and exporting vector data in various formats, such as shapefiles.
• Rasterio: Tools for importing and exporting raster data in various formats.
• PyProj: Tools for defining and transforming the datum and projections of spatial data.
• Shapely: Tools for spatial analytics, like testing for intersections, measuring areas, etc. Note that
this is basically a tool for analyzing 2-dimensional Cartesian shapes; it has no facilities for
managing projections. Any projection work has to be done with PyProj before you start
manipulating shapes with Shapely.
• RTree: Spatial analytics (like intersections) can be computationally expensive and thus
slow. For example, if you want to do something like a spatial join of millions of points to a
shapefile of polygons, you want to use what’s called a “spatial index” tool like RTree. Basically, for
each point, RTree will very quickly identify a list of polygons with which that point might
intersect. This list will always include the polygon that the point intersects, but also some
others; in other words, it has no false negatives, but lots of false positives. You then use a slower
but exact tool like Shapely to check whether your point lies in each of these candidates and so
find the one true intersecting polygon.
• Cartopy and Descartes: Cartography tools for making pretty maps. Cartopy is basically the
successor to Basemap, which you may also read about on some forums.
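The two-stage "fast filter, then exact check" idea behind a spatial index can be sketched without any libraries. This is not how RTree or Shapely are implemented: the filter below is a plain linear scan over bounding boxes (a real R-tree arranges the boxes in a hierarchy so it does not have to visit every one), and all function names and shapes are hypothetical.

```python
def bounding_box(polygon):
    """Smallest axis-aligned rectangle containing the polygon."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def in_box(point, box):
    """Cheap test: can produce false positives, never false negatives."""
    x, y = point
    minx, miny, maxx, maxy = box
    return minx <= x <= maxx and miny <= y <= maxy


def point_in_polygon(point, polygon):
    """Exact test by ray casting (points lying exactly on an edge are not special-cased)."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):              # edge crosses the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


polygons = {
    "triangle": [(0, 0), (4, 0), (0, 4)],
    "square": [(5, 5), (7, 5), (7, 7), (5, 7)],
}
boxes = {name: bounding_box(poly) for name, poly in polygons.items()}


def locate(point):
    """Return (candidates from the box filter, polygons that truly contain the point)."""
    candidates = [name for name, box in boxes.items() if in_box(point, box)]
    matches = [name for name in candidates if point_in_polygon(point, polygons[name])]
    return candidates, matches


print(locate((1, 1)))   # inside the triangle
print(locate((3, 3)))   # inside the triangle's box, but outside the triangle
```

The second point shows the false positive: the box filter keeps the triangle as a candidate, and only the exact check rejects it.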
8. MACHINE LEARNING AND PYTHON
• The primary library for machine learning is scikit-learn.
• Scikit-learn is a free software machine learning library for the Python programming
language. It features various classification, regression and clustering algorithms,
including support vector machines, random forests, gradient boosting, k-means and
DBSCAN, and is designed to interoperate with the Python numerical and scientific
libraries NumPy and SciPy.
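As a small illustration of one of the clustering algorithms listed above, the sketch below runs k-means on six made-up coordinate pairs. The data and the choice of two clusters are assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans

# Two well-separated groups of made-up (x, y) coordinates.
points = np.array([
    [1.0, 1.1], [0.9, 1.0], [1.2, 0.8],
    [8.0, 8.2], [8.1, 7.9], [7.8, 8.0],
])

# scikit-learn works directly on NumPy arrays.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(points)

print(labels)                  # one cluster label per point
print(model.cluster_centers_)  # the two cluster centres
```

Which group receives label 0 and which receives label 1 is arbitrary; only the grouping itself is meaningful.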
9. PLOTTING IN PYTHON
• The low-level library for making figures in Python is called matplotlib. Because it is
low-level, most people instead use a higher-level library, either seaborn or ggplot (meant
to duplicate the syntax and functionality of ggplot in R).
• Seaborn is a popular library used for plotting. It’s built on top of matplotlib, and
basically allows you to make common statistical plots more easily.
10. NETWORK ANALYSIS IN PYTHON
• The most common library used in network analysis is igraph.
• It is written in C, but is also available in Python.
• NetworkX is another library used in network analysis, developed entirely in native
Python, but it is relatively slow compared with igraph.
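A typical network analysis task is finding the cheapest route between two nodes. The sketch below uses NetworkX on a made-up four-node network; the node names and edge weights are assumptions for illustration.

```python
import networkx as nx

# A small undirected network with made-up travel costs on each edge.
G = nx.Graph()
G.add_weighted_edges_from([
    ("A", "B", 4.0),
    ("B", "D", 5.0),
    ("A", "C", 2.0),
    ("C", "D", 3.0),
])

# Cheapest route from A to D, using the "weight" edge attribute as the cost.
path = nx.shortest_path(G, source="A", target="D", weight="weight")
length = nx.shortest_path_length(G, source="A", target="D", weight="weight")

print(path, length)
```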
11. BIG DATA IN PYTHON
• Dask is a new tool written for working with data that doesn’t fit into memory (and
parallelizing operations) for Python. It was written to basically work just like pandas.
• Dask is a flexible library for parallel computing in Python. It is composed of two parts:
• Dynamic task scheduling
• “Big Data” collections like parallel arrays, dataframes, and lists that extend
common interfaces like NumPy, Pandas, or Python iterators to larger-than-memory
or distributed environments. These parallel collections run on top of dynamic task
schedulers.
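Both parts can be seen in a short sketch, assuming Dask is installed. The array size, chunk size, and the `square` function are arbitrary choices made for the example.

```python
import dask.array as da
from dask import delayed

# Collection: an array split into 1000 x 1000 chunks. Nothing is computed yet.
x = da.ones((4000, 4000), chunks=(1000, 1000))
total = x.sum()                   # this only builds a task graph
total_value = total.compute()     # the scheduler now runs the chunks
print(total_value)


# Task scheduling: wrap ordinary functions so calls are recorded, not run.
@delayed
def square(n):
    return n * n


tasks = [square(n) for n in range(5)]
sum_of_squares = delayed(sum)(tasks).compute()
print(sum_of_squares)
```

The array here is small enough to fit in memory; the same code pattern applies when the chunks are read from disk and the full array would not fit.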
12. TYPES OF MODELS IN GIS
• Descriptive models – patterns
• Change models – before and after
• Impact models – what happens
• Explanatory models – process influence
• Predictive models – what it will be like
15. PYTHON AND ARCGIS
• Within ArcGIS, there are two options for working with and running Python scripts:
directly within ArcMap using the Python window, or within an Integrated
Development Environment (IDE) such as PythonWin.
• ArcPy is a site package that builds on the successful arcgisscripting module. Its
goal is to create the cornerstone for a useful and productive way to perform
geographic data analysis, data conversion, data management, and map automation
with Python.