Big size meteorological data processing and mobile displaying system using PostGIS and GeoServer

BJ Jang

Presented at FOSS4G 2014 in Portland

Big size meteorological data processing
and mobile displaying system
using PostGIS and GeoServer
BJ Jang, JW Geum, JH Kwun, HG Park

Objective
• The system was SLOW not because it used PostGIS but because it was NOT TUNED.
• PostGIS can definitely deliver good performance if the system is PROPERLY TUNED.
• Let's go over some tuning skills from the CASE OF the MOBILE weather chart service of KMA.

Background | Problems to solve | Import speed | Keeping data size | Indexing
Mobile Weather Chart Service Flow
[Diagram: Observation Data / Model → GRIB Data → (Vectorize) → Vector Chart → (Image) → Chart Service. The part of the flow whose performance was improved by tuning is highlighted.]
[Images: GRIB Data, Vector Chart, Weather Chart for service]
Korea Meteorological Administration
※ GRIB data: GRIdded Binary, or General Regularly-distributed Information in Binary form
- standardized by the World Meteorological Organization

Software Architecture
[Diagram: software architecture of the Vector Weather Chart service]

Characteristics of Weather Data
• Low Resolution
- Geographically low resolution
• Multiple Dimension
- Surface + Height (isobaric surface)
- Analysis model
- Data time + Forecasting time
• Frequent Production
- A few times to hundreds of times
• Realtime / Near Realtime
- Always need up-to-date data

Data usage per day
times generation (00, 06, 12, 18 UTC)
# of spatial table:
# of weather charts:
MB data
# of spatial data columns:

Problems of Existing System
• Slow data collection
• Difficult management of big size data
• Slow searching for weather charts

Why is the service slow?
Failed to understand the characteristics of the data

Improvement Goal
Problems
• 5 hr to insert data
• Data file grows 35 GB per day
• Tens of seconds to search a single weather chart
Goals
• Insert in less than 30 min.
• Keep the size of the data file fixed
• Search a weather chart within a few seconds

Improvement on importing speed for big size data using batch

General Data Processing Time
[Chart: the time required for each batch size. Source: http://novathin.kr/19]
• Run one by one
• Run once after gathering as many statements as the batch size
There is a big difference depending on the way the SQL is executed!

Import speed comparison
Test criteria: one weather chart KML file, inserting 3,000 rows

# of addBatch()   # of executions   Time (sec)
0                 3,000             109.0
100               30                8.9
500               6                 5.7
1,000             3                 3.4
3,000             1                 1.1

1 insert / 1 commit → KML file (3,000 inserts) / 1 commit
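
The batching in the slides is done through JDBC's addBatch() and executeBatch(). As a rough SQL-level sketch of the same idea (not the authors' code), the statements below contrast one commit per insert with one commit per file. The table `import_demo` and its columns are hypothetical, and the geometry is kept as WKT text so the sketch runs on plain PostgreSQL without PostGIS.

```sql
-- Hypothetical staging table, for illustration only.
CREATE TABLE import_demo (
    placemark_name text,
    val            double precision,
    wkt            text              -- geometry as WKT, to avoid a PostGIS dependency
);

-- Slow pattern: with autocommit on, every statement is its own transaction
-- (1 insert / 1 commit).
INSERT INTO import_demo VALUES ('pm_0001', 1000.0, 'LINESTRING(0 0, 1 1)');
INSERT INTO import_demo VALUES ('pm_0002', 1004.0, 'LINESTRING(1 1, 2 2)');

-- Batched pattern: many rows sent together and committed once
-- (whole file / 1 commit).
BEGIN;
INSERT INTO import_demo (placemark_name, val, wkt) VALUES
    ('pm_0003', 1008.0, 'LINESTRING(2 2, 3 3)'),
    ('pm_0004', 1012.0, 'LINESTRING(3 3, 4 4)'),
    ('pm_0005', 1016.0, 'LINESTRING(4 4, 5 5)');
COMMIT;
```

In JDBC terms, the second pattern corresponds to setAutoCommit(false), repeated addBatch() calls, one executeBatch() and one commit().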

Keeping data file size by managing table

Data Management of PostGIS
• PostGIS is write-once.
- Updated and deleted data is not physically removed
- The old record is marked, and the new data is recorded below it
• Pros
- Fast
- Can manage several versions of data
• Cons
- Data file size can increase enormously
- Performance drops as the file size increases
• Weather Chart DB file increases by 35 GB per day!!!
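
One way to observe this effect on a running database is PostgreSQL's statistics view `pg_stat_user_tables`, which reports live and dead row versions per table. A minimal sketch, assuming the weather chart tables live in the `public` schema (an assumption taken from the schema name used in this deck's row-count query); the counters are estimates, not exact counts.

```sql
-- Tables with the most dead (superseded or deleted) row versions first.
SELECT relname                                       AS table_name,
       n_live_tup                                    AS live_rows,
       n_dead_tup                                    AS dead_rows,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_stat_user_tables
WHERE schemaname = 'public'
ORDER BY n_dead_tup DESC;
```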

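Snapshot vs Write-once
[Diagram: how an update of record B to B' is stored in a table holding records A to E.
Oracle / MySQL (snapshot): the table holds the record after renewal (B'), and the record before renewal (B) is kept in a snapshot. The transaction owner sees the record after renewal; other users see the record before renewal.
PostgreSQL (write-once): the record before renewal (B) stays in the table and is marked (X), and the record after renewal (B') is appended after E. The transaction owner sees the record after renewal; other users see the record before renewal.
Label: after completing the transaction.]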

General VACUUM
[Diagram: three states of one table. Source: http://www.geocities.jp/sugachan1973/doc/funto60.html
1. Before VACUUM: records B, C and E are marked as no longer needed (X), and the new versions B' and C' have been appended.
2. VACUUM execution: the space of B, C and E is registered in the FSM (free space map) as "No need".
3. Data insert: the new record F reuses the space of B, so the FSM now lists only C and E.]
For KMA's weather charts on PostGIS, the general VACUUM function can't solve the problem that the data files increase continuously.

VACUUM FULL
[Diagram: VACUUM FULL arranges unused space for big size data management. Source: http://www.devmedia.com.br/otimizacao-uma-ferramenta-chamada-vacuum/1710]
• On PostGIS for KMA's weather charts, a full vacuum takes 15 hr.
• During VACUUM FULL, an exclusive LOCK is held.
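
For reference, the two commands discussed on these slides look like this in SQL. A minimal sketch on a hypothetical table `vacuum_demo` (not a table from the KMA system); the sizes printed will vary by server, so none are shown here.

```sql
-- Hypothetical table with enough rows to occupy many pages.
CREATE TABLE vacuum_demo (id integer, val double precision);
INSERT INTO vacuum_demo SELECT i, i FROM generate_series(1, 100000) AS i;

-- Deleting every second row leaves dead row versions spread over all pages.
DELETE FROM vacuum_demo WHERE id % 2 = 0;

-- Plain VACUUM: marks the dead space as reusable; the file generally keeps its size.
VACUUM (VERBOSE, ANALYZE) vacuum_demo;
SELECT pg_size_pretty(pg_relation_size('vacuum_demo')) AS size_after_vacuum;

-- VACUUM FULL: rewrites the table into a compact file, holding an exclusive lock meanwhile.
VACUUM FULL vacuum_demo;
SELECT pg_size_pretty(pg_relation_size('vacuum_demo')) AS size_after_vacuum_full;
```

Note that VACUUM cannot run inside a transaction block, so these statements need to be issued with autocommit on.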

Partitioning
• Partitioning?
- Managing one table by conceptually separating it into several tables
- Data size per table down → index size down and search speed up
[Diagram: the Weather Chart table is split into seven tables, Weather Chart_0 to Weather Chart_6. Each table takes the inserts of one day of the week (Insert on Sunday, Insert on Monday, Insert on Tuesday, ...) and is truncated on a fixed day of the week (Truncate on Sunday, Truncate on Monday, Truncate on Tuesday, ...).]
• Truncate takes only a few seconds, and the file size decreases without VACUUM
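
The weekly rotation could be sketched in SQL roughly as follows. This is an illustration under assumptions, not the KMA schema: the table name `weather_chart`, its columns, the mapping of suffix 0 to Sunday, and truncating a table just before its day's load are all hypothetical choices. It uses table inheritance, which was the usual way to partition in PostgreSQL at the time of the talk.

```sql
-- Hypothetical parent table; the column list is illustrative only.
CREATE TABLE weather_chart (
    forecast_time timestamp,
    create_time   timestamp,
    val           double precision
);

-- Seven child tables, one per day of the week (assumed: 0 = Sunday ... 6 = Saturday).
CREATE TABLE weather_chart_0 () INHERITS (weather_chart);
CREATE TABLE weather_chart_1 () INHERITS (weather_chart);
CREATE TABLE weather_chart_2 () INHERITS (weather_chart);
CREATE TABLE weather_chart_3 () INHERITS (weather_chart);
CREATE TABLE weather_chart_4 () INHERITS (weather_chart);
CREATE TABLE weather_chart_5 () INHERITS (weather_chart);
CREATE TABLE weather_chart_6 () INHERITS (weather_chart);

-- Name of the child table that matches today's day of the week.
SELECT 'weather_chart_' || EXTRACT(DOW FROM current_date)::int::text AS todays_table;

-- Assumed rotation step, shown for a Sunday: drop last week's rows, then load.
-- TRUNCATE releases the data file space immediately, with no VACUUM needed.
TRUNCATE TABLE weather_chart_0;
INSERT INTO weather_chart_0 (forecast_time, create_time, val)
VALUES (timestamp '2011-06-27 00:00', timestamp '2011-06-27 00:00', 1000.0);

-- A query on the parent table reads all seven child tables.
SELECT count(*) FROM weather_chart;
```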

Improvement on inquiry speed by resetting index

Improvement flow of inquiry speed
Data Condition Analysis → Query Finding → Query Plan Analysis → Index Improvement

Data Condition Analysis
• Understanding the # of rows by table
- select count(*) from table_name is foolish!
- The number of rows can be found from the statistics table
- Meaningful data is stored in the pg_class table
- Execution time within one minute
-- Estimated row count per table, read from the pg_class statistics
select relname as table_name,
       to_char(reltuples, '999,999,999') as row_count
from pg_class
where relnamespace = (select oid from pg_namespace where nspname = 'public')
  and relkind = 'r'          -- ordinary tables only (the slide used relam = 0, which no longer matches tables from PostgreSQL 12)
order by reltuples desc, 1;  -- sort by the number itself, not by the formatted text

GeoServer SQL VIEW
• Registers a SQL query as a layer
• Can be used when the data source is a geo DB
• Useful for
- Applying complex conditions to a layer
- Reprojection
- Joining multiple tables
- Turning normal attributes into spatial objects
When GeoServer shows a weather chart, its performance is affected by the searching speed of PostGIS

Query Finding
• Identifying executed SQL using the statistics table
- Use the pg_stat_activity table
• A necessary process for tuning
• Possible to check execution time
• Queries differ by PostGIS version

-- Queries currently running as user 'mobile', newest first.
-- current_query is the column name before PostgreSQL 9.2; from 9.2 on, the
-- column is named query and idle sessions are flagged in the state column.
select query_start, current_query
from pg_stat_activity
where usename = 'mobile'
and current_query not like '<IDLE>%'
order by query_start desc;

-- Example of a query found this way
SELECT "val", encode(ST_AsBinary(ST_Force_2D("geom")), 'base64') as "geom"
FROM (
    select mdl, mdl_var, placemark_name, val, lyrs_cd,
           forecast_time,
           create_time as anal_time,
           ST_Transform(the_geom, 7188) as geom
    from contour
    where mdl_var = 'TMP'
) as "vtable"
WHERE (((("mdl" = 'GDAPS' AND "lyrs_cd" = 'A925.0')
    AND "forecast_time" = '2011.06.27 00:00')
    AND "anal_time" = '2011.06.27 00:00')
    AND "geom" && ST_GeomFromText('POLYGON ((-1056768 -2105344, -1056768 -1040384, 8192 -1040384, 8192 -2105344, -1056768 -2105344))', 7188));

Query Plan Analysis
• PostGIS has a built-in query analysis function
• pgAdmin III > Query > Explain analyze
• Explain Analyze command: makes it easy to analyze a query
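
As a rough illustration of the Explain Analyze command, the sketch below builds a small hypothetical table `plan_demo` (column names are borrowed from the contour query in this deck, the data is synthetic) and asks PostgreSQL for the executed plan. On a table without a suitable index the plan is expected to show a sequential scan, which is the signal that index work may pay off; the exact plan text and timings depend on the server version and the data.

```sql
-- Hypothetical table with synthetic rows, for illustration only.
CREATE TABLE plan_demo (
    mdl           text,
    mdl_var       text,
    lyrs_cd       text,
    forecast_time timestamp,
    create_time   timestamp,
    val           double precision
);

INSERT INTO plan_demo
SELECT 'GDAPS', 'TMP', 'A925.0',
       timestamp '2011-06-27 00:00' + (i % 8) * interval '3 hours',
       timestamp '2011-06-27 00:00',
       i
FROM generate_series(1, 10000) AS i;

-- Refresh the planner statistics so the estimates are meaningful.
ANALYZE plan_demo;

-- EXPLAIN ANALYZE runs the query and reports the plan with actual times and row counts.
EXPLAIN ANALYZE
SELECT val
FROM plan_demo
WHERE mdl = 'GDAPS'
  AND lyrs_cd = 'A925.0'
  AND forecast_time = timestamp '2011-06-27 00:00';
```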

Index Improvement
• Principles for Index Improvement
- Create one index containing all columns in the WHERE clause
- The spatial column has a separate index
- Columns containing many kinds of values come first
- Where possible, columns compared with the equality (=) operator come first
- Unnecessary indexes should be removed because they slow down inserts
• Examples
-- contour_0
DROP INDEX index_createtime_contour_0;
DROP INDEX index_forecasttime_contour_0;
DROP INDEX index_lyrscd_contour_0;
DROP INDEX index_mdl_contour_0;
DROP INDEX index_mdlvar_contour_0;
CREATE INDEX index_contour_0_all
ON contour_0 (forecast_time ASC NULLS LAST, mdl_var ASC NULLS LAST, lyrs_cd ASC NULLS LAST,
              create_time DESC NULLS LAST, mdl ASC NULLS LAST);
• Result
- Dropping the individual indexes and creating one integrated index reduced data capacity by 20%
- 6 to 25 times speed improvement depending on the table (big tables show better improvement)
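
To find candidates for the "unnecessary index" rule, PostgreSQL's `pg_stat_user_indexes` view counts how often each index has been scanned. A minimal sketch, assuming the `public` schema as in this deck's row-count query; an index with `idx_scan = 0` since the last statistics reset is only a candidate for removal, not proof that it is unused.

```sql
-- Least-used indexes first; among equally used ones, the largest first.
SELECT relname                                      AS table_name,
       indexrelname                                 AS index_name,
       idx_scan                                     AS scans,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE schemaname = 'public'
ORDER BY idx_scan ASC, pg_relation_size(indexrelid) DESC;
```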

[Example weather charts: Under 300 isobaric / Temperature / Isokinetics; Ground / Wet-number / Temperature; 800 isobaric / Mixture ratio / Temperature]

Conclusion
• Execution using addBatch() and executeBatch() is important
- About 100 times performance improvement
• Mix partitioning and truncate
- Stably keeping N-1 accordance
• Appropriate index for the query
- 20 times inquiry time improvement

Conclusion
• PostGIS is a really great DBMS!
• Perfectly suited to GeoServer
• However, it needs tuning based on a thorough understanding of its features.

Q&A
Please ask BJ Jang via email!
bjjang@gaia3d.com


Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...

Medical / Health Care (+971588192166) Mifepristone and Misoprostol tablets 200mg

(Vivek)Call Us, 8448380779,Call girls in Delhi NCr – We Offer best in class call girls. escort Service At Affordable Price At low Rate with Space Night 8000 We Are One Of The Oldest Escort and Call girls Agencies in Delhi. You Will Find That Our Female Escorts Are Full Of Fun, Sexy And They Would Love Enjoy Your Company. We Have A Fantastic Selection Of Escort Ladies Available For In-Calls As Well As Out-Calls. Our Escorts Are Not Only Beautiful But All Have Great Personalities Making Them The Perfect Companion For Any Occasion. In-Call:- You Can Come At Our Place in Delhi Our place Which Is Very Clean Hygienic 100% safe Accommodation. Out-Call:- You have To Come Pick The Girl From My Place We Are Also Provide Door Step Services (Delhi Ncr, Noida, Gurgaon, Faridabad, Ghaziabad Note:- Pic Collectors Time Passers Bargainers Stay Away As We Respect The Value For Your Money Time And Expect The Same From You Hygienic:- Full Ac room And Clean Rooms Available In Hotel 24 * 7 Hourly In Delhi NCR More Details, With WhatsApp Number, +91-8448380779

call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️

Delhi Call girls

Define the academic and professional writing..pdf

PearlKirahMaeRagusta1

OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...

Shane Coughlan

In today's dynamic e-commerce landscape, the payment gateway emerges as a linchpin, ensuring smooth and secure transactions between buyers and sellers. In this discourse, we delve into the meticulous process of devising test cases tailored for scrutinizing payment gateways. Crafting precise test cases for payment gateways is a quintessential responsibility for testers operating within the service industry. This article meticulously explores pivotal scenarios integral to how to test payment gateways, coupled with essential guidelines for drafting effective test cases.

Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf

kalichargn70th171

Direct Style Effect Systems -The Print[A] Example- A Comprehension Aid

Philip Schwarz

Recently uploaded (20)

W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...

8257 interfacing 2 in microprocessor for btech students

AI & Machine Learning Presentation Template

%in Harare+277-882-255-28 abortion pills for sale in Harare

%in ivory park+277-882-255-28 abortion pills for sale in ivory park

Microsoft AI Transformation Partner Playbook.pdf

WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation

%in Midrand+277-882-255-28 abortion pills for sale in midrand

+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...

WSO2CON 2024 - Does Open Source Still Matter?

%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview

%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...

%in kempton park+277-882-255-28 abortion pills for sale in kempton park

Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...

Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...

call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️

Define the academic and professional writing..pdf

OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...

Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf

Direct Style Effect Systems -The Print[A] Example- A Comprehension Aid

Big size meteorological data processing and mobile displaying system using PostGIS and GeoServer

1. Big size meteorological data processing and mobile displaying system using PostGIS and GeoServer BJ Jang, JW Geum, JH Kwun, HG Park

2. Objective
• The system was SLOW not because it used PostGIS but because it was NOT TUNED.
• PostGIS can definitely deliver good performance if the system is PROPERLY TUNED.
• Let's go over some tuning techniques from the CASE OF the MOBILE weather chart service of KMA.

3. Background

4. Mobile Weather Chart Service Flow (Korea Meteorological Administration)
Observation Data -> Model -> GRIB Data -> (Vectorize) -> Vector Chart -> (Image) -> Chart Service
Diagram labels: GRIB Data, Vector Chart, Weather Chart for service; one part of the flow is marked "Improvement of performance part by tuning".
※ GRIB data: GRIdded Binary, or General Regularly-distributed Information in Binary form, standardized by the World Meteorological Organization

5. Software Architecture (Vector Weather Chart)

6. Characteristics of Weather Data
• Low Resolution
  - Geographically low resolution
• Multiple Dimension
  - Surface + Height (isobaric surface)
  - Analysis model
  - Data time + Forecasting time
• Frequent Production
  - A few times ~ hundreds of times
• Realtime / Near Realtime
  - Always need up-to-date data

7. Data usage per day (the figures were images on the slide; the values below are taken from the speaker notes)
• 4 times generation (00, 06, 12, 18 UTC)
• # of spatial tables: 6
• # of weather charts: 5,332
• Data size: about 35 GB
• # of spatial data rows: about 67,000,000

8. Problems to be solved

9. Problems of Existing System
• Slow data collection
• Difficult big size data management
• Slow searching for Weather Chart

10. Why is the service slow?
• Failed to understand characteristics of data

11. Improvement Goal
    PROBLEMS                                            GOALS
    5 hr to insert data                                 Insert in less than 30 min.
    Data file grows 35 GB per day                       Keep the size of the data file fixed
    Tens of seconds to search a single weather chart    Search a weather chart within a few seconds

12. Improvement on importing speed for big size data using batch

13. General Data Processing Time
• The time required for each batch size (source: http://novathin.kr/19)
• Run one by one vs. run once after gathering as many statements as the batch size
• There is a big difference depending on how the SQL is executed!

14. Import speed comparison
• Test criteria: one weather chart kml file, executing 3,000 row inserts
    # of addBatch()    # of executions    Time (sec)
    0                  3,000              109.0        (1 insert / 1 commit)
    100                30                 8.9
    500                6                  5.7
    1,000              3                  3.4
    3,000              1                  1.1          (kml file, 3,000 inserts / 1 commit)
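
The gain in the comparison above tracks the number of statement executions: 3,000 rows sent in batches of 100 need 30 executions, and one batch of 3,000 needs a single execution. As a rough illustration only, the sketch below reproduces those counts in Python with an in-memory SQLite database and executemany(), which stand in here for the JDBC addBatch()/executeBatch() calls and the PostGIS database used in the talk. The simplified two-column contour table and the helper names are assumptions made for the example, and the sketch does not attempt to reproduce the measured timings.

```python
import sqlite3
from itertools import islice


def batches(rows, batch_size):
    """Yield successive lists of at most batch_size rows."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, batch_size))
        if not chunk:
            return
        yield chunk


def insert_in_batches(conn, rows, batch_size):
    """Insert rows one batch at a time and commit once at the end.

    Returns the number of batch executions, the counterpart of the
    "# of executions" column in the comparison.
    """
    executions = 0
    for chunk in batches(rows, batch_size):
        # executemany() plays the role of addBatch() x N + executeBatch()
        conn.executemany(
            "INSERT INTO contour (placemark_name, val) VALUES (?, ?)", chunk
        )
        executions += 1
    conn.commit()  # a single commit for the whole file
    return executions


def demo(batch_size, n_rows=3000):
    """Load n_rows hypothetical rows and return (executions, rows stored)."""
    conn = sqlite3.connect(":memory:")
    # Simplified stand-in for the real contour table (assumption)
    conn.execute("CREATE TABLE contour (placemark_name TEXT, val REAL)")
    rows = [(f"pm{i}", float(i)) for i in range(n_rows)]
    executions = insert_in_batches(conn, rows, batch_size)
    stored = conn.execute("SELECT COUNT(*) FROM contour").fetchone()[0]
    conn.close()
    return executions, stored


if __name__ == "__main__":
    for size in (100, 500, 1000, 3000):
        print(size, demo(size))
```

The batch size is a tuning knob: larger batches mean fewer round trips, at the cost of holding more rows in memory before each execution.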

15. Keeping data file size by managing table

16. Data Management of PostGIS
• PostGIS is write-once.
  - Updated and deleted data is not physically removed
  - The old record is marked and the new data is appended after it
• Pros
  - Fast
  - Can manage several versions of data
• Cons
  - Data file size can grow enormously
  - Performance drops as the file size increases
• The weather chart DB files grow by 35 GB per day!

17. Snapshot vs Write-once
[Diagram: Oracle / MySQL (snapshot) vs PostgreSQL (write-once). With a snapshot, the transaction owner sees the record after renewal (B') while other users see the record before renewal (B) from the snapshot until the transaction completes. With write-once, the record before renewal (B) is marked and the record after renewal (B') is appended to the table.]

18. General VACUUM
[Diagram: a table holding A, B, C, D, E plus updated versions B' and C', with dead records marked X. After VACUUM runs, the space of the dead records is registered in the FSM; a later insert places the new record F in the slot formerly used by B.]
Source: http://www.geocities.jp/sugachan1973/doc/funto60.html
• For KMA's weather charts in PostGIS, general VACUUM cannot solve the problem of continuously growing data files.

19. VACUUM FULL: reclaiming unused space for big size data management
[Diagram: VACUUM FULL]
Source: http://www.devmedia.com.br/otimizacao-uma-ferramenta-chamada-vacuum/1710
• On PostGIS for KMA's weather charts, VACUUM FULL takes 15 hr.
• During VACUUM FULL, an exclusive LOCK is held.

20. Partitioning
• Partitioning?
  - Managing one table by conceptually splitting it into several tables
  - Smaller data size per table
  - Smaller indexes and faster searches
[Diagram: Weather Chart split into Weather Chart_0 through Weather Chart_6; each day (Sunday, Monday, Tuesday, ...) inserts into one table and truncates another.]
• TRUNCATE takes only a few seconds, and the file size decreases without VACUUM.
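
The weekday rotation can be sketched as follows. This is an illustrative Python sketch, not the authors' code: the contour_N table names follow the index example later in the deck, Sunday maps to table 0 as in the speaker notes, and the rule that the table three days behind is truncated is inferred from the single example given in the notes (on Sunday, insert into table 0 and clear table 4).

```python
from datetime import date

N_PARTITIONS = 7  # one table per day of the week
KEEP_DAYS = 3     # assumption: three tables hold data at any time


def rotation_for(day, base="contour"):
    """Return (table to insert into, TRUNCATE statement to run) for a day."""
    # isoweekday(): Monday=1 ... Sunday=7, so Sunday maps to partition 0
    insert_idx = day.isoweekday() % N_PARTITIONS
    # Clear the partition that is KEEP_DAYS behind the current one
    truncate_idx = (insert_idx - KEEP_DAYS) % N_PARTITIONS
    return f"{base}_{insert_idx}", f"TRUNCATE TABLE {base}_{truncate_idx};"


if __name__ == "__main__":
    # 2014-09-14 was a Sunday
    print(rotation_for(date(2014, 9, 14)))
```

Because each day only ever truncates a whole table, the stale rows never have to be deleted one by one, which is what lets the data files shrink without VACUUM.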

21. Improvement on inquiry speed by resetting index

22. Improvement flow of inquiry speed
Data Condition Analysis -> Query Finding -> Query Plan Analysis -> Index Improvement

23. Data Condition Analysis
• Understanding the number of rows per table
  - select count(*) from table_name is foolish!
  - The number of rows can be read from the statistics tables
  - Meaningful data is stored in the pg_class table
  - Execution time is within one minute
    select relname as table_name, to_char(reltuples, '999,999,999') as row_count
    from pg_class
    where relnamespace = (select oid from pg_namespace where nspname = 'public')
      and relam = 0
    order by 2 desc, 1;

24. GeoServer SQL VIEW
• Registers a SQL query as a layer
• Available when the data source is a geoDB
• Useful for
  - Applying complex conditions to a layer
  - Reprojection
  - Joining multiple tables
  - Turning normal attributes into spatial objects
• GeoServer renders the weather charts, so its performance is affected by the search speed of PostGIS

25. Query Finding
• Identifying the executed SQL from the statistics
  - Using the pg_stat_activity table
  - A necessary step for tuning
  - Execution time can be checked
  - Queries differ by PostGIS version
• Query listing the running statements:
    select query_start, current_query
    from pg_stat_activity
    where usename = 'mobile' and current_query not like '<IDLE>%'
    order by query_start desc;
• Sample of a statement actually executed for a SQL VIEW layer:
    SELECT "val", encode(ST_AsBinary(ST_Force_2D("geom")),'base64') as "geom"
    FROM (
        select mdl, mdl_var, placemark_name, val, lyrs_cd, forecast_time,
               create_time as anal_time, ST_Transform(the_geom, 7188) as geom
        from contour
        where mdl_var = 'TMP'
    ) as "vtable"
    WHERE (((("mdl" = 'GDAPS' AND "lyrs_cd" = 'A925.0')
        AND "forecast_time" = '2011.06.27 00:00')
        AND "anal_time" = '2011.06.27 00:00')
        AND "geom" && ST_GeomFromText('POLYGON ((-1056768 -2105344, -1056768 -1040384, 8192 -1040384, 8192 -2105344, -1056768 -2105344))', 7188));

26. Query Plan Analysis
• PostGIS has a built-in query analysis function
• pgAdmin III: Query menu, Explain Analyze function
• EXPLAIN ANALYZE command: easy to analyze a query

27. Index Improvement
• Principles for index improvement
  - Build the index from all columns in the WHERE clause
  - The spatial column gets a separate index
  - Columns with many distinct values come first
  - Where possible, columns compared with the equality operator come first
  - Remove unnecessary indexes, because they hurt insert performance
• Example
    -- contour_0
    DROP INDEX index_createtime_contour_0;
    DROP INDEX index_forecasttime_contour_0;
    DROP INDEX index_lyrscd_contour_0;
    DROP INDEX index_mdl_contour_0;
    DROP INDEX index_mdlvar_contour_0;
    CREATE INDEX index_contour_0_all ON contour_0
        (forecast_time ASC NULLS LAST, mdl_var ASC NULLS LAST, lyrs_cd ASC NULLS LAST,
         create_time DESC NULLS LAST, mdl ASC NULLS LAST);
• Result
  - Dropping the individual indexes and creating one combined index reduced data size by 20%
  - 6 ~ 25 times speed improvement per table (big tables show the larger gain)

28. Improvement Result

29. [Screenshots of weather charts] Under 300 isobaric / Temperature / Isokinetics; Ground / Wet-number / Temperature; 800 isobaric / Mixture ratio / Temperature

30. Conclusion
• Importance of execution using addBatch() and executeBatch(): about 100 times performance improvement
• Partitioning combined with TRUNCATE: stably keeping N-1 days of data
• Appropriate indexes for the queries: 20 times inquiry time improvement

31. Conclusion
• PostGIS is a really great DBMS!
• Perfectly suited to GeoServer
• However, it needs tuning based on a thorough understanding of its features.

32. Q&A
Please ask BJ Jang via email: bjjang@gaia3d.com

Editor's Notes

Hello, my name is Jawhyun Kwun, and I work at Gaia3D in South Korea. I am grateful for the chance to present at FOSS4G 2014. This work was carried out mainly by my colleague BJ Jang, an OSGeo charter member in South Korea, and I am presenting on his behalf. The system is currently in use at the Korea Meteorological Administration (KMA). The title is "Big size meteorological data processing and mobile displaying system using PostGIS and GeoServer".
Generally speaking, using PostGIS without proper tuning lowers performance and service quality, and tuning suited to the situation solves these problems effectively. In this talk I will use KMA's weather chart service as a case study and explain which tuning methods solved which problems.
Background
KMA's mobile weather chart service works as follows. First, observation data is collected and run through a weather model, which produces GRIB data, a raster format. Next, vectorization generates vector weather charts, and GeoServer renders them as images to provide the weather charts for the service.
In the service architecture, KML data is first inserted into PostGIS. PostGIS is then linked to GeoServer, and the mobile service is built with HTML5, OpenLayers, and JQueryMobile.
Most weather data is low-resolution imagery, such as 5 km by 5 km. It is also multidimensional: height is expressed as isobaric surfaces rather than a simple Z axis, there are various analysis models, and each dataset has both a data (observation) time and a forecast time. Furthermore, unlike general GIS data, it is produced very frequently, from a few times to several hundred times per day. Lastly, weather data must always be up to date.
The system loads data generated 4 times per day into 6 spatial tables. From this data, 5,332 weather charts are generated, amounting to about 35 GB and 67,000,000 rows of spatial data. A huge amount of data is processed and accumulated every day; the number of spatial data rows is larger than the population of South Korea.
Next, I will go over the problems we had to overcome in this service.
This volume of data caused three problems. First, data collection was slow. Second, the data was not properly maintained because so much of it accumulated every day. Third, data queries were slow. In short, there were problems across every database-related part of the system.
These problems arose from a lack of understanding of the characteristics and circumstances of weather data. In the early stage of development, the assumption was that it would be enough to put the data into PostGIS and serve it with GeoServer, and a generic service structure was applied without understanding the data's characteristics. /* We therefore split the tuning into three parts: improving insert speed, keeping the data file size fixed, and improving query speed. */
More specifically, it took five hours to insert the data, and the data grew by 35 GB every day. It also took tens of seconds to look up a single weather chart on a mobile device. After analyzing this situation, we set the following goals: insert all data within 30 minutes using addBatch() and executeBatch(); keep the data files from growing unnecessarily using partitioning and TRUNCATE; and look up a weather chart within a few seconds through index improvement.
Improvement on importing speed for big size data
Import speed varied greatly depending on how the data was imported. Importing the 67 million rows one at a time took more than 24 hours, but gathering rows and importing them in chunks was much faster. The graph on the right shows actual test results: gathering about 3,000 rows per insert cut the time by tens to hundreds of times.
One weather chart KML file contains about 3,000 rows, and we tested the actual insert speed with this data. Importing one row at a time took 109 seconds, but gathering rows with addBatch() and importing them at once was far faster. addBatch() and executeBatch() are available with any database that supports JDBC 2.0. Calling executeBatch() after every 100 addBatch() calls took 8.9 seconds, and calling it once after 3,000 addBatch() calls took only 1.1 seconds, roughly a hundredfold improvement. How you import makes a huge difference.
The second theme is how to keep the data file size stable when the data grows by 35 GB per day.
First, let's look at how PostGIS manages data. One criticism PostgreSQL, and therefore PostGIS, receives from users of other DBMSs is that it is write-once: on UPDATE or DELETE, the data is not actually removed from the database. Rows are only marked as deleted, which makes execution fast and makes versioning possible. However, the data files can grow enormously, and the resulting slowdown can bring the server down. Because the weather data grows by 35 GB per day, we had to solve this problem.
In more detail: in Oracle, when record B is updated, it appears as B'. The old B goes into a snapshot and disappears when the transaction completes. PostGIS, however, leaves the pre-update record in place and appends the post-update record, so the data grows with every transaction.
Because of this behavior, PostGIS provides a function called VACUUM. As the name suggests, it sweeps up unneeded data, and it runs automatically. In the figure, B and C have been updated to B' and C', and E has been deleted. B, C, and E stay where they are until VACUUM runs, after which their space is registered in the FSM (free space map). When new data is inserted afterwards, F goes into the slot where B used to be. However, when the database is very busy, the files kept growing rather than shrinking even with VACUUM running.
Unlike plain VACUUM, VACUUM FULL actually reduces the file size. As in the figure, it removes the unneeded space and compacts the remaining data into the freed space. Running VACUUM FULL on three days of KMA data took about 15 hours. Moreover, the database was locked during VACUUM FULL and nothing else could be done, so VACUUM FULL was not usable either.
So we turned to partitioning. Partitioning is supported by most databases and is used when the data volume is large. We split the weather chart table into 7 tables, one per day of the week, and defined for each day which table receives inserts and which table is cleared. On Sunday, for example, data is inserted into table 0 and the data in table 4 is deleted. As a result, only 3 of the 7 tables hold data at any time, and the same routine repeats every day. Clearing the data with TRUNCATE instead of relying on VACUUM takes less than one second; the data is gone in an instant. As a result, we could easily and quickly keep at most N-1 days of data.
Lastly, I will go over improving inquiry speed.
The process starts with understanding the distribution of the data: which data is large and which is slow. Next, we identify which queries are actually used and which query plans are generated and executed, and then improve the indexes.
For data condition analysis, it is important to know how many rows each table has. select count(*) is commonly used for this, but it takes a long time. Using the statistics in pg_class, the row counts (reltuples is an estimate) can be found quickly. Running the query shown at the upper right returns the result shown at the lower right within a few seconds.
Next, a topic slightly outside the database itself. We connected GeoServer to PostGIS and, rather than using plain layers, used the SQL VIEW feature to manage layers more easily. It is available when the data source is a geoDB such as Oracle Spatial, PostGIS, or ArcSDE, and it lets a specific SQL statement be managed as a single layer. In this system the data is spread across six tables, so the WHERE conditions are complex, but SQL VIEW made them convenient to manage. Work such as reprojection can also be done in the SQL VIEW; doing it on the database machine returns results much faster than doing it in GeoServer or on the client. The query at the upper right is managed as a layer with the attributes shown in the result at the lower right. The next slide shows how this query is actually used.
/* KMA's service uses SQL VIEW, linking GeoServer and PostGIS, when querying. A specific SQL statement can be managed as one layer, which makes complex conditions easy to manage, but the drawback is that the query that actually reaches PostGIS is more complex than the SQL statement registered in GeoServer. Therefore, */ to improve inquiry speed it is important to find the SQL statements that are actually executed. To do this, we used the statistics table pg_stat_activity to find the SQL running internally and to see which conditions appear in it and in what form. The first query lists the currently running statements with their start times, from which the slow ones can be identified. The second, more complex statement is a sample of how a query registered as a SQL VIEW is actually executed: the SQL VIEW registered on the previous slide plays the role of a table in the query.
Oracle's analysis features are part of its professional tooling and are not installed with a basic client. PostGIS, in contrast, supports analysis from the command line in its default installation, and also in pgAdmin III, its UI tool. In the query tab, clicking the Explain Analyze button shows graphically how the query is executed, and running the EXPLAIN ANALYZE command produces the corresponding text result. We analyze the query using these results.
When improving indexes, decide how to build the index based on the analysis above. First, build one index that combines all possible columns in the WHERE clause. The spatial column uses a different index type, so it gets a separate index. Next, put columns with many distinct values first, since filtering out more data early is more effective. Also put columns compared with equality operators first, which is more efficient than filtering first on less-than, greater-than, or LIKE conditions. Lastly, remove unnecessary indexes, because they hurt insert performance. As a result, replacing the individual indexes with a combined index reduced data capacity by 20%, and inquiry speed improved by 6 to 25 times depending on the table; the larger the table, the greater the effect.
Lastly, the improvement results!
This is the demo; I will show you our system. This is the mobile site. First, when I click the GDAPS button, the list of isobaric surfaces is shown. An isobaric surface is a level in the atmosphere where the pressure is the same. Under the isobaric list, the list of meteorological data types appears, and I can view a weather chart when I select an option. So we can
Tuning along the three themes I described gave the results shown here. For data insertion, we learned that it is important to find and apply the optimal conditions for the system, and this gave roughly a 100-fold performance improvement. Second, by combining partitioning and TRUNCATE we can stably keep the data file size at N-1 days' worth of data. Lastly, by finding and rebuilding indexes appropriate to the queries, we made lookups about 20 times faster.
In conclusion, PostgreSQL/PostGIS is a really great DBMS, and its performance does not fall behind other DBMSs. It also works very well with GeoServer, which makes it highly useful. However, it delivers good performance only after its characteristics are understood and it is tuned appropriately for the situation.
If you have any questions, please ask BJ Jang via email: bjjang@gaia3d.com. Thank you for listening.


Editor's Notes