Oak, the architecture of Apache Jackrabbit 3 - Jukka Zitting
Apache Jackrabbit is just about to reach the 3.0 milestone based on a new architecture called Oak. Based on concepts like eventual consistency and multi-version concurrency control, and borrowing ideas from distributed version control systems and cloud-scale databases, the Oak architecture is a major leap ahead for Jackrabbit. This presentation describes the Oak architecture and shows what it means for the scalability and performance of modern content applications. Changes to existing Jackrabbit functionality are described and the migration process is explained.
In the age of data science and machine learning, data scientists want access to data sets quickly, but organizations often need to protect private data, whether due to internal policy or government regulations.
In this talk we discuss how to leverage PostgreSQL for managing organization-wide data access while protecting privacy.
Topics include:
Purpose-based data access
Federating data
Foreign data wrappers
Masking
Differential Privacy
Auditing
The document discusses MySQL's buffer pool and buffer management. It describes how the buffer pool caches frequently accessed data in memory for faster access. The buffer pool contains several lists including a free list, LRU list, and flush list. It explains functions for reading pages from storage into the buffer pool, replacing pages using LRU, and flushing dirty pages to disk including single page flushes during buffer allocation.
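As a rough illustration of the three lists mentioned above, the following Python sketch models a toy buffer pool. All names (`BufferPool`, `read_page`, `write_page`, `flush_all`) are hypothetical, a plain dict stands in for disk storage, and the replacement policy is a simple LRU; the real MySQL implementation is considerably more elaborate.

```python
from collections import OrderedDict


class BufferPool:
    """Toy buffer pool with a free list, an LRU list, and a flush list."""

    def __init__(self, capacity, storage):
        self.storage = storage                     # dict page_id -> data, stands in for disk
        self.free_frames = list(range(capacity))   # free list: frames not yet in use
        self.lru = OrderedDict()                   # LRU list: page_id -> (frame, data), oldest first
        self.dirty = set()                         # flush list: pages modified in memory

    def _flush(self, page_id):
        # Write one dirty page back to storage and remove it from the flush list.
        _, data = self.lru[page_id]
        self.storage[page_id] = data
        self.dirty.discard(page_id)

    def _get_free_frame(self):
        if self.free_frames:
            return self.free_frames.pop()
        # No free frame left: evict the least recently used page.
        victim = next(iter(self.lru))
        if victim in self.dirty:
            self._flush(victim)                    # single-page flush during allocation
        frame, _ = self.lru.pop(victim)
        return frame

    def read_page(self, page_id):
        if page_id in self.lru:
            self.lru.move_to_end(page_id)          # cache hit: mark as most recently used
            return self.lru[page_id][1]
        frame = self._get_free_frame()
        data = self.storage[page_id]               # cache miss: read from storage
        self.lru[page_id] = (frame, data)
        return data

    def write_page(self, page_id, data):
        self.read_page(page_id)                    # make sure the page is cached
        frame, _ = self.lru[page_id]
        self.lru[page_id] = (frame, data)
        self.dirty.add(page_id)                    # remember that it must be flushed

    def flush_all(self):
        for page_id in list(self.dirty):
            self._flush(page_id)
```

With a capacity of two frames, writing page 1 and then reading pages 2 and 3 evicts page 1 and flushes it to storage first, which corresponds to the single-page flush during buffer allocation described above.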
Breaking Parser Logic: Take Your Path Normalization Off and Pop 0days Out! - Priyanka Aash
"We propose a new exploit technique that brings a whole-new attack surface to defeat path normalization, which is complicated in implementation due to many implicit properties and edge cases. This complication, being under-estimated or ignored by developers for a long time, has made our proposed attack vector possible, lethal, and general. Therefore, many 0days have been discovered via this approach in popular web frameworks written in trending programming languages, including Python, Ruby, Java, and JavaScript.
Being a very fundamental problem that exists in path normalization logic, sophisticated web frameworks can also suffer. For example, we've found various 0days on Java Spring Framework, Ruby on Rails, Next.js, and Python aiohttp, just to name a few. This general technique can also adapt to multi-layered web architecture, such as using Nginx or Apache as a proxy for Tomcat. In that case, reverse proxy protections can be bypassed. To make things worse, we're able to chain path normalization bugs to bypass authentication and achieve RCE in real world Bug Bounty Programs. Several scenarios will be demonstrated to illustrate how path normalization can be exploited to achieve sensitive information disclosure, SMB-Relay and RCE.
Understanding the basics of this technique, the audience won't be surprised to know that more than 10 vulnerabilities have been found in sophisticated frameworks and multi-layered web architectures aforementioned via this technique."
Solving the DB2 LUW Administration Dilemma - Randy Goering
As a DB2 LUW Database Administrator, you are probably reluctant to grant, or prohibited from granting, your users* these permissions, because doing so also lets them perform other DB2 administration tasks such as stopping the database. If your users are not allowed to do these tasks, then who is? Most likely you, as the DBA, will perform these and other administrative functions for your users. Would you like a way to eliminate these tasks from your daily to-do list? This presentation will discuss how to externalize specific administrative tasks with stored procedures, federated procedures, administrative SQL routines, and views.
This document summarizes recommender systems used in e-commerce and their benefits. It outlines examples of recommender systems from companies like Amazon, CDNOW, and eBay. It then categorizes recommender systems based on their interface (browsing, top-N lists, etc.) and recommendation technology (item-to-item correlation, people-to-people correlation, etc.). Finally, it discusses how recommender systems can enhance e-commerce by helping customers find products, suggesting additional purchases, and creating customer loyalty.
The document discusses tuning MySQL server settings for performance. Some key points covered include:
- Settings are workload-specific and depend on factors like storage engine, OS, hardware. Tuning involves getting a few settings right rather than maximizing all settings.
- Monitoring tools like SHOW STATUS, SHOW INNODB STATUS, and OS tools can help evaluate performance and identify tuning opportunities.
- Memory allocation and settings like innodb_buffer_pool_size, key_buffer_size, query_cache_size are important to configure based on the workload and available memory.
Chapter – 4 Normalization and Relational Algebra.pdf - TamiratDejene1
The document discusses normalization and relational algebra. It defines normalization as a process of structuring a database into tables to reduce data redundancy and inconsistencies. The document covers various normal forms including 1st normal form (1NF), 2nd normal form (2NF), and 3rd normal form (3NF). It defines functional dependencies and different types of dependencies and anomalies. Examples are provided to illustrate how to determine the normal forms of relations and decompose relations to higher normal forms by removing dependencies.
The document discusses InnoDB flushing and checkpoints. It provides an overview of InnoDB architecture and describes the page cleaner thread that handles background flushing. The page cleaner thread coordinates multiple worker threads to flush pages from the LRU and flush lists. Flushing involves writing dirty pages from the buffer pool to disk in the background to avoid needing synchronous I/O.
Recommender Systems represent one of the most widespread and impactful applications of predictive machine learning models.
Amazon, YouTube, Netflix, Facebook and many other companies generate an important fraction of their revenues thanks to their ability to model and accurately predict users' ratings and preferences.
In this presentation we cover the following points:
→ introduction to recommender systems
→ working with explicit vs implicit feedback
→ content-based vs collaborative filtering approaches
→ user-based and item-item methods
→ machine learning and deep learning models
→ pros & cons of the methods: scalability, accuracy, explainability
Beyond EXPLAIN: Query Optimization From Theory To Code - Yuto Hayamizu
EXPLAIN is too much explained. Let's go "beyond EXPLAIN".
This talk takes you on a backstage tour of the optimizer: from the theoretical background of state-of-the-art query optimization to a close look at the current implementation in PostgreSQL.
The document provides an overview of DB2 and discusses key concepts such as instances, databases, tablespaces, and recovery. It describes how to install and configure DB2, create instances and databases, load and move data between databases, and perform backups and recovery. Examples are given of commands used to create tablespaces and load data. The document also mentions tools for visualizing queries and monitoring performance.
Integrating Relational Databases with the Semantic Web: A Reflection - Juan Sequeda
This is a lecture given at the 2017 Reasoning Web Summer School
It has been clear from the beginning that the success of the Semantic Web hinges on integrating the vast amount of data stored in Relational Databases. In 2007, the W3C organized a workshop on RDF Access to Relational Databases. In 2012, two standards were ratified that map relational data to RDF: Direct Mapping and R2RML.
In this lecture, I will reflect on the last 10 years of research results and systems to integrate Relational Databases with the Semantic web. I will provide an answer to the following question: how and to what extent can Relational Databases be integrated with the Semantic Web? I will review how these standards and systems are being used in practice for data integration and discuss open challenges.
MongoDB is an open-source document database written in C++, and the leading NoSQL database.
MongoDB has official drivers for a variety of popular programming languages and development environments. There are also a large number of unofficial or community-supported drivers for other programming languages and frameworks.
Recommender systems support the decision making processes of customers with personalized suggestions. These widely used systems influence the daily life of almost everyone across domains like ecommerce, social media, and entertainment. However, the efficient generation of relevant recommendations in large-scale systems is a very complex task. In order to provide personalization, engines and algorithms need to capture users’ varying tastes and find mostly nonlinear dependencies between them and a multitude of items. Enormous data sparsity and ambitious real-time requirements further complicate this challenge. At the same time, deep learning has been proven to solve complex tasks like object or speech recognition where traditional machine learning failed or showed mediocre performance.
Join Marcel Kurovski to explore a use case for vehicle recommendations at mobile.de, Germany’s biggest online vehicle market. Marcel shares a novel regularization technique for the optimization criterion and evaluates it against various baselines. To achieve high scalability, he combines this method with strategies for efficient candidate generation based on user and item embeddings—providing a holistic solution for candidate generation and ranking.
The proposed approach outperforms collaborative filtering and hybrid collaborative-content-based filtering by 73% and 143% for MAP@5. It also scales well for millions of items and users returning recommendations in tens of milliseconds.
Event: O'Reilly Artificial Intelligence Conference, New York, 18.04.2019
Speaker: Marcel Kurovski, inovex GmbH
More tech talks: inovex.de/vortraege
More tech articles: inovex.de/blog
This document summarizes security concepts in PostgreSQL including authentication, roles, and row-level security. It begins with an introduction comparing PostgreSQL and MySQL. Authentication methods in PostgreSQL include password, peer, and LDAP authentication configured via pg_hba.conf. Roles in PostgreSQL define privileges and inheritance and include attributes like SUPERUSER, LOGIN, and INHERIT. Row-level security controls access at the row level and examples demonstrate how to configure policies and the default policy.
Introduction to Elastic Search
Elastic Search Terminology
Index, Type, Document, Field
Comparison with Relational Database
Understanding of Elastic architecture
Clusters, Nodes, Shards & Replicas
Search
How it works?
Inverted Index
Installation & Configuration
Setup & Run Elastic Server
Elastic in Action
Indexing, Querying & Deleting
PostgreSQL Tuning: The elephant faster than a leopard - elliando dias
The document provides tips on how to optimize the performance of a PostgreSQL database. It discusses common performance problems, poor configuration choices, hardware and software improvements, operating system and PostgreSQL parameters, performance testing tools, and scalability.
This document provides an overview of database management systems (DBMS). It defines key concepts like data, databases, and the basic functions of a DBMS, which include defining database structure, managing storage, manipulating data through queries, controlling access and usage, and monitoring performance. It also describes the roles of different people involved like designers, developers and administrators. The document outlines the different levels of data abstraction in a DBMS and key functionality around concurrency control, backup/recovery, redundancy management, access control, optimization and metadata.
In this presentation, Amit explains querying with MongoDB in detail including Querying on Embedded Documents, Geospatial indexing and Querying etc.
The tutorial includes a recap of MongoDB, the wrapped queries, queries which are using modifiers, Upsert (saving/ updating queries), updating multiple documents at once, etc. Moreover, it gives a brief explanation about specifying which keys to return, the AND/OR queries, querying on embedded documents, cursors and Geospatial indexing. The tutorial begins with a section about MongoDB which includes steps to install and start MongoDB, to show and select Database, to drop collection and database, steps to insert a document and get up to 20 matching documents. Furthermore, it also includes steps to store and use Javascript functions on the server side.
The next section after the MongoDB section is about wrapped queries and queries using modifiers which includes the types of wrapped queries which are used like LikeQuery, SortQuery, LimitQuery, SkipQuery. It also includes the types of queries using modifiers like NotEqualModifier, Greater/Lesser modifier, Increment Modifier, Set Modifier, Unset Modifier, Push Modifier etc. Then comes the section about Upsert (Save or update). There are steps mentioned for saving or updating queries in this section.
At the same time, there are steps to update multiple documents altogether. The next section which is called “specifying which keys to return” talks about ways to specify the keys the user wants. After this section comes OR/AND query. It informs us about the general steps to do an OR query. Also, it includes the general steps to do an AND query. After this section comes another section called “querying on embedded document” which tells the user about ways of querying for an embedded document.
One of the important sections of this tutorial is about cursors, uses of a cursor and also methods to chain additional options onto a query before it is performed. Following is a section about indexing which talks about indexing as a term and how indexing helps in improving the query’s speed. At the end is a section which gives a brief explanation on geospatial indexing which is another type of query that became common with the emergence of mobile devices. Also, it includes the ways geospatial queries can be performed.
The document summarizes a presentation on the internals of InnoDB file formats and source code structure. The presentation covers the goals of InnoDB being optimized for online transaction processing (OLTP) with performance, reliability, and scalability. It describes the InnoDB architecture, on-disk file formats including tablespaces, pages, rows, and indexes. It also discusses the source code structure.
New Features for Multitenant in Oracle Database 21c - Markus Flechtner
Oracle Database 21c introduces several new features for multitenant databases:
- PDBs can now be upgraded automatically when plugged into a 21c CDB or opened, replaying the upgrade process.
- Resource management is improved with options like mandatory user profiles, per-PDB database resident connection pooling, and Oracle DB Nest for isolating PDBs using Linux namespaces and cgroups.
- Multitenant enhancements for high availability include PDBs being managed as cluster resources and improved PDB-level recovery when using Active Data Guard.
The document provides an outline for a presentation on graph-based data models. It introduces some key concepts about graphs and how they are used to model real-world interconnected data. It discusses how early adopters of graph technologies grew by focusing on data relationships. The document also covers graph data structures, graph databases, and graph query languages like Cypher and Gremlin.
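5. Scope
• On the main homepage, attraction images are shown from the perspective of hosts and non-member users,
and attractions are recommended by displaying the initially built information for each theme as images.
• Sign-up distinguishes between guests and hosts; a user registering as a guest
is asked for region, gender, age group, travel companions, and travel themes.
• The travel themes are the types provided by the Tour API: tourist attractions, cultural facilities,
festivals/performances/events, leisure sports, shopping, and food.
• The guest's main homepage provides information based on the selected region and travel themes,
listed in the popularity order provided by the Tour API.
• Selecting an attraction image shows, by default, the photos and overview of that attraction
obtained through the Tour API, and additionally uses location-based tourism information to
recommend nearby attractions.
• Guests are additionally shown attractions recommended by people with similar preferences.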
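7. Workplan
• Analysis: deriving and designing the methodology, Tour API analysis
09/05 (Thu) ~ 10/09 (Wed)
• User interface design & database design
10/10 (Thu) ~ 10/17 (Thu)
• Program design
10/18 (Fri) ~ 10/23 (Wed)
• System implementation
10/24 (Thu) ~ 11/15 (Fri)
• Testing
11/16 (Sat) ~ 11/20 (Wed)
• Deployment
11/21 (Thu)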
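8. Cost/Benefit
• Raises the satisfaction of customers who want to experience Korean culture.
• Reduces wasted resources by providing only the information that suits the user.
• Builds knowledge of AI-like technologies in preparation for the Fourth Industrial Revolution.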
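9. Menu structure
• Because the project is developed independently of the Homestay Korea site, there is nothing to offer beyond the recommendation content, so no unnecessary information is provided.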
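10. Sign-up
• Sign-up is split into guest and host from the start.
• Host sign-up appears in the user interface, but because the main purpose of the project is recommending attractions that match a guest's tastes, it is not actually implemented.
• Guest sign-up collects ID, password, name (in English), gender, contact number, age group (10s, 20s, 30s, 40s, 50s), preferred travel themes (up to 3), travel region, and country code.
• Three preferred travel themes are entered, and each is initialized with a preference of 33%.
• Theme preference is calculated so that hybrid filtering can be used on the main homepage.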
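11. Login
• Users log in through the login page.
• An ID and password must be entered to log in.
• On success, the user is registered in the session as either a guest or a host.
• On success, the user is taken to the main screen.
• On failure, an error message says that the ID or password is incorrect.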
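12. Main homepage
• The initial main screen is the view for non-member users and hosts.
• Initially, attraction images from all of South Korea are shown for each travel theme.
• The themes are tourist attractions, cultural facilities, festivals/performances/events, leisure sports, shopping, and food.
• The regions are Seoul, Gyeonggi-do, Gangwon-do, Chungcheongnam-do, Chungcheongbuk-do, Jeollanam-do, Jeollabuk-do, Gyeongsangnam-do, Gyeongsangbuk-do, and Jeju-do.
• Attraction images are stored in the database in advance via Tour API and shown in order of popularity.
• Each time a guest clicks an attraction, its click count is incremented by 1 and the guest is taken to the attraction detail page; once enough data accumulates, attractions are sorted by popularity among the guests who visit this website.
• If Tour API provides no image for an attraction, a message saying that no image is available is shown.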
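The click counting and site-specific popularity order described above could be sketched roughly as follows. This is a minimal in-memory illustration, not the project's code: the class ClickPopularity and its methods are hypothetical, and a map stands in for the COUNT column that the real site keeps in its database.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the COUNT column of the attraction table.
class ClickPopularity {
    private final Map<String, Integer> clicks = new HashMap<>();

    // Called whenever a guest opens the detail page of an attraction.
    void recordClick(String contentId) {
        clicks.merge(contentId, 1, Integer::sum);
    }

    // Returns attraction IDs ordered by click count, most clicked first.
    // Ties are broken by ID so that the order is deterministic.
    List<String> byPopularity() {
        List<String> ids = new ArrayList<>(clicks.keySet());
        ids.sort((a, b) -> {
            int byCount = Integer.compare(clicks.get(b), clicks.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return ids;
    }
}
```

In a database-backed version the same order would come from sorting on the stored click count rather than from an in-memory map.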
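13. Main homepage
• To differentiate guests from non-member users, a hybrid filtering technique is used. When a guest logs in and opens the main homepage, the three preferred travel themes for the chosen region, taken from the user profile entered at sign-up, appear at the top and the other themes below. If the guest keeps clicking themes other than the preferred ones, the preferences gradually shift and the order of the displayed themes changes.
• Theme preference is calculated as a percentage from the table that stores preferences, and this determines the rank of each travel theme.
• Each time the user clicks an attraction and moves to its detail page, the value of the theme that attraction belongs to is increased.
• Rather than showing a fixed set of themes without hybrid filtering, the site tailors the information to each individual user.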
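One way to picture the percentage-based theme ranking is the sketch below. It is only an illustration under stated assumptions: the class ThemePreference is hypothetical, the theme keys reuse the column names of the preference table, and starting each preferred theme with one point (which yields roughly 33% each) is an assumed way of realizing the initial setting.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory version of one row of the theme preference table.
class ThemePreference {
    static final String[] THEMES = {
        "TOUR_ATTR", "CULT_FACIL", "EVENT", "SHOPPING", "LEPORTS", "DINING"
    };
    private final Map<String, Integer> score = new LinkedHashMap<>();

    // Assumption: each preferred theme starts with one point, so three
    // preferred themes start at about 33% each; all others start at zero.
    ThemePreference(List<String> preferred) {
        for (String theme : THEMES) {
            score.put(theme, preferred.contains(theme) ? 1 : 0);
        }
    }

    // Called when the guest opens an attraction that belongs to this theme.
    void recordClick(String theme) {
        score.merge(theme, 1, Integer::sum);
    }

    // Share of one theme, in percent of all points collected so far.
    double percent(String theme) {
        int total = 0;
        for (int v : score.values()) {
            total += v;
        }
        return total == 0 ? 0.0 : 100.0 * score.get(theme) / total;
    }

    // Themes ordered from most to least preferred; ties keep table order
    // because List.sort is stable.
    List<String> ranking() {
        List<String> themes = new ArrayList<>(score.keySet());
        themes.sort((a, b) -> Integer.compare(score.get(b), score.get(a)));
        return themes;
    }
}
```

With this scheme a guest who repeatedly opens leisure-sports attractions sees that theme move ahead of the three themes chosen at sign-up, which matches the gradual reordering described above.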
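14. Attraction detail page
• When the user clicks an attraction image on the main homepage, information about that attraction is shown.
• The information comes from Tour API: common information (photo, phone number, address), introduction (attraction details), and additional images.
• Some attractions have no additional images; in that case a message saying that no image is available is shown.
• At the bottom, the Tour API location-based tourism information service takes the coordinates of the clicked attraction and shows attractions within a radius of up to 20 km.
• When a guest clicks an attraction image on the main homepage, a section at the bottom recommends attractions viewed by people with similar tastes.
• These attractions are recommended using the hybrid filtering technique.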
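In the project the 20 km radius search is delegated to the Tour API location-based service. For readers who want to see what such a radius check amounts to, here is a hedged local equivalent using the haversine formula. The class NearbyFilter and its method names are hypothetical, and a mean Earth radius of 6371 km is assumed.

```java
// Hypothetical helper; the project itself leaves this query to Tour API.
class NearbyFilter {
    static final double EARTH_RADIUS_KM = 6371.0; // assumed mean Earth radius

    // Great-circle distance in kilometres between two WGS84 points (haversine).
    static double distanceKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    // True when the second point lies within radiusKm of the first.
    static boolean withinRadius(double lat1, double lon1,
                                double lat2, double lon2, double radiusKm) {
        return distanceKm(lat1, lon1, lat2, lon2) <= radiusKm;
    }
}
```

A call such as NearbyFilter.withinRadius(lat, lon, otherLat, otherLon, 20.0) would then decide whether another attraction belongs in the nearby list.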
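15. Attraction recommendation using hybrid filtering
• The user clicks an attraction and moves to its detail page.
• The user's data (ID, ID of the viewed attraction, timestamp) is accumulated in the user log table of the database.
• First, the users who viewed that attraction are looked up in the user log table.
• Second, those users are filtered down to the ones whose gender, age group, and travel companion match.
• If the second-stage query returns no users, the first-stage result is used instead.
• For the users that remain, similarity to the user who clicked the attraction is calculated with the Pearson correlation coefficient.
• The logs of the three users with the highest similarity are queried, and the attractions they clicked are recommended.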
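The similarity step could look roughly like the sketch below. It is not the project's RelationAnalyze class: SimilarUsers, pearson, and top3 are hypothetical names, and representing each user as a vector of click counts over a shared list of attractions is an assumption, since the slides do not say which values are correlated.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the Pearson-based selection of similar users.
class SimilarUsers {
    // Pearson correlation coefficient of two equally long vectors.
    // Returns 0 when either vector has no variance (coefficient undefined).
    static double pearson(double[] x, double[] y) {
        if (x.length != y.length || x.length == 0) {
            throw new IllegalArgumentException("vectors must be non-empty and equally long");
        }
        int n = x.length;
        double sumX = 0, sumY = 0;
        for (int i = 0; i < n; i++) {
            sumX += x[i];
            sumY += y[i];
        }
        double meanX = sumX / n, meanY = sumY / n;
        double cov = 0, varX = 0, varY = 0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX, dy = y[i] - meanY;
            cov += dx * dy;
            varX += dx * dx;
            varY += dy * dy;
        }
        double denom = Math.sqrt(varX * varY);
        return denom == 0 ? 0 : cov / denom;
    }

    // IDs of the (at most) three candidates most similar to the target user.
    // Assumption: each vector holds click counts over the same attractions.
    static List<String> top3(double[] target, Map<String, double[]> candidates) {
        List<String> ids = new ArrayList<>(candidates.keySet());
        ids.sort((a, b) -> Double.compare(
                pearson(target, candidates.get(b)),
                pearson(target, candidates.get(a))));
        return ids.subList(0, Math.min(3, ids.size()));
    }
}
```

The attractions found in the logs of the returned users would then form the recommendation list; falling back to the first-stage user set simply means passing a larger candidate map.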
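16. Tour API 3.0
• Korean-language tourism information service
(integrated and detailed search of tourism information, plus location-based and area-based queries; provides general information on domestic tourism in Korean)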
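17. Building the initial attraction data
• Using the Korean-language and English-language tourism information services of Tour API, each attraction's common information (photo, phone number, address), introduction (attraction details), and additional images are stored in the database.
• The database is populated in advance so that attractions can be sorted by this website's own popularity ranking rather than by the popularity order provided by Tour API.
• To store the data, the response XML is parsed and each tag name is saved together with its data.
• The initial data set holds 800 attractions.
• The Java Quartz Library is used to check every day at 02:00 for newly registered, modified, or deleted data and to bring the database up to date.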
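The "parse the response XML, then store tag names with their data" step might be sketched like this with the DOM parser from the Java standard library. The class ItemParser is hypothetical, and the sample layout (an item element with children such as contentid and title) is assumed for illustration; the real Tour API response should be checked against its documentation.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical parser: maps each child tag of the first <item> to its text.
class ItemParser {
    static Map<String, String> firstItem(String xml) {
        Map<String, String> fields = new LinkedHashMap<>();
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList items = doc.getElementsByTagName("item"); // assumed tag name
            if (items.getLength() == 0) {
                return fields; // no item in the response
            }
            NodeList children = items.item(0).getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    // Tag name becomes the key, trimmed text the value.
                    fields.put(child.getNodeName(), child.getTextContent().trim());
                }
            }
            return fields;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }
}
```

Each entry of the returned map corresponds to one (tag name, data) pair that could be written to the database.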
18.
19. Attribute Data Type PK NN FK Description
ID VARCHAR O ID
PW VARCHAR Password
NAME VARCHAR Name
GENDER VARCHAR Gender
AGE VARCHAR Age group
PHONE VARCHAR Phone number
COMPANION VARCHAR Travel companion
REGIONAL VARCHAR Travel region
NATION INT Country code
20. Attribute Data Type PK NN FK Description
ID INT O Country code
CNAME VARCHAR Country name
DIAL_CODE VARCHAR Dialing code
21. Attribute Data Type PK NN FK Description
ID VARCHAR O ID
TOUR_ATTR INT Tourist attractions
CULT_FACIL INT Cultural facilities
EVENT INT Festivals/performances/events
SHOPPING INT Shopping
LEPORTS INT Leisure sports
DINING INT Restaurants
22. Attribute Data Type PK NN FK Description
LOG_NO INT O Log number
ID VARCHAR O ID
CONTENTID VARCHAR Attraction ID
LOG_TIME TIMESTAMP Log timestamp
23. Attribute Data Type PK NN FK Description
CONTENTID VARCHAR O Attraction ID
CREATEDTIME VARCHAR Attraction registration date
MODIFIEDTIME VARCHAR Attraction modification date
THEME VARCHAR Travel theme
COUNT INT Attraction view count
LOCATION VARCHAR Region
MAPX DOUBLE Latitude
MAPY DOUBLE Longitude
24. Attribute Data Type PK NN FK Description
CONTENT_NO INT O Attraction information number
CONTENT_CATEGORY VARCHAR Travel theme
CONTENT TEXT Content
CONTENTID VARCHAR O Attraction ID
25. Attribute Data Type PK NN FK Description
IMAGENO INT O Image number
IMAGEURL VARCHAR Image path
CONTENTID VARCHAR O Attraction ID
27. Class Method Description
ApiExplorer ApiExplorer() Fetches information from Tour API
ApiExplorerLocationBased ApiExplorerLocationBased() Fetches location-based attraction information
ApiExplorerLocationBased getJArray() Uses the JSON format
TagInfo TagInfo() Sets the region according to the area code table
CommonController top() Returns bootstrap.jsp
CommonController top(HttpSession) Sign in and sign up buttons in the top bar
CommonController logout() Logs the member out
IndexController topview() Returns the page that shows only three travel themes
IndexController menu() Passes the location value
IndexController mainview() Passes parameters to the main homepage
IndexController subview() Passes parameters to the main homepage
IndexController index() Returns the main homepage
InfoController detailContent() Returns attraction images, information, and recommended attractions
JoinGuestController JoinGuest() Sign-up
28. Class Method Description
LocationBasedSightsController location() Takes an attraction ID and returns nearby attractions
LoginFormController memberLogin() Member login
LoginFormController checkInput() Login input validation
ICountryDAO CountryList() Queries country codes
IJoinPlaceTourImageDAO ReadWithContentid() Fetches images
IJoinPlaceTourImageDAO readWithThemeLocationStartEndOrderByParm() Fetches the theme and region specified by the user
IJoinPlaceTourImageDAO readWithThemeStartEndOrderByParm Information for the user-specified theme across all regions
IMemberDAO ReadWithId() Login check
IMemberDAO MemberInsert() Sign-up
IMemberDAO ReadRelationId() Queries user logs
IMemberLogDAO ReadWithId() Queries a member's log
IMemberLogDAO InsertMemberlog() Inserts member log data
IMemberLogDAO ReadContentIdWithIds() Queries attraction IDs from members' logs
IPlaceDAO ReadWithContentid() Queries attraction registration data
29. Class Method Description
IPlaceDAO ReadWithThemeLocationOrderByCount Queries the region selected by the user
IPlaceDAO ReadWithThemeOrderByCount Queries themes according to theme preference
IPlaceDAO InsertWithDTO() Inserts attraction registration data
IPlaceDAO DeletePlaceData() Deletes an attraction
IPlaceDetailDataDAO ReadWithPlaceDetailData() Queries attraction details
IPlaceDetailDataDAO readWithPlaceDetailDataContent_value Queries attraction names
IPlaceDetailDataDAO readTitles() Queries attraction names
IPlaceDetailDataDAO deleteDetailData() Deletes attraction details
IPlaceDetailDataDAO Insert() Inserts attraction details
ITempContentidDAO InsertTempContentid() Inserts data into the temporary table
ITempContentidDAO DeleteAllTempContentid() Deletes data from the temporary table
ITempContentidDAO GetDeletedContentid() Checks whether data exists
IThemePreferDAO ReadWithId() Queries a member's preferred themes
30. Class Method Description
IThemePreferDAO ReadList() Queries all preferred themes
IThemePreferDAO PreferInsert() Inserts the themes entered at sign-up
ITourImageDAO ReadWithContentid() Queries the images of the selected attraction
ITourImageDAO ReadWithContentIds() Queries 10 random images
ITourImageDAO ReadWithThemeLimit() Queries images
ITourImageDAO ReadWithPlaceDetailDataImage() Queries attraction image data
ITourImageDAO InsertSingleTourImageRecord() Inserts attraction image data
ITourImageDAO DeleteImageWithContentid() Deletes attraction image data
Api Api() Connects to Tour API
ApiUrl MakeAPIUri() Builds and returns the URL for an API call
AreaBasedData StoreSettedData() Parses common information
CalendarUtil TransforCalendar() Converts a date into regular-expression form
CalendarUtil extractDay() Extracts the day string from a date
CommonDetailData storeSettedData() Stores processed attraction details in the place_detailData table
31. Class Method Description
DataScheduler insertNewTourData() During the refresh, inserts any newly registered data
DataScheduler getBeforeday() Returns beforeDay relative to today's date
DataScheduler updateModifiedTourData() During the refresh, updates data whose information was modified
DataScheduler compareModifiedTime() During the refresh, compares the modification date with the database
DataScheduler deleteDeletedTourData() During the refresh, deletes data that was removed
DetailData storeSettedData() Inserts attraction detail data
RelationAnalyze getMatchIds() Selects only three of the members found through the correlation coefficient
RelationAnalyze Pearson() Uses the correlation coefficient algorithm to find members whose tastes resemble the user's
TourImageData storeSettedData() Inserts additional attraction image data
XmlString extractNumOfRows() Counts tags in the XML string to get the number of results of an API call
XmlString extractXmlValue() Returns the value of the given tag
XmlString extractXmlValue() Returns the value of the given tag within a specific string
XmlString extractXmlNodeAndValue() Takes an API URL, calls the API, and returns each tag's value in a Map
XmlString extractSpecificXmlStr() Extracts the string between two tags
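The DataScheduler methods listed above run as part of the daily 02:00 refresh, which the project schedules with Quartz. As a rough standard-library illustration of the timing only, the sketch below computes how long to wait until the next run. DailyRun and untilNext are hypothetical names, and this swaps Quartz for plain java.time arithmetic.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.LocalTime;

// Hypothetical helper: how long to wait until the next daily run.
class DailyRun {
    // Time from 'now' until the next occurrence of 'runAt'.
    // If 'now' is exactly at or past 'runAt', the next run is tomorrow.
    static Duration untilNext(LocalDateTime now, LocalTime runAt) {
        LocalDateTime next = now.toLocalDate().atTime(runAt);
        if (!next.isAfter(now)) {
            next = next.plusDays(1);
        }
        return Duration.between(now, next);
    }
}
```

The resulting delay could serve as the initial delay of a ScheduledExecutorService task with a 24-hour period. In Quartz itself, a daily 02:00 trigger is normally written as the cron expression 0 0 2 * * ?.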
32. Folder JSP file Description
homestay detailContent.jsp Attraction detail page
homestay/common bootstrap.jsp Import page for using Bootstrap
homestay/common top_login.jsp Top menu bar shown to members
homestay/common top_no_login.jsp Top menu bar shown to non-members
member joinGuest.jsp Sign-up page
member loginForm.jsp Login page