The aim of this paper is to design and implement an approach that provides high availability and load balancing for PostgreSQL databases. A shared-nothing architecture and a replicated centralized middleware architecture were selected to achieve data replication and load balancing among database servers. Pgpool-II was used to implement these architectures. In addition, using pgpool-II as
a framework, several Bash scripts were developed for the restoration of corrupted databases. To avoid a single point of failure, Linux-HA (Heartbeat and Pacemaker) was used to monitor all processes involved in the solution. As a result, applications can operate against a more reliable PostgreSQL database server; the approach is best suited to applications with a higher load of read statements (typically SELECT) than write statements (typically UPDATE). The approach presented is intended only for Linux-based operating systems.
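As a rough illustration of how these pieces fit together, the following is a minimal Bash sketch of the kind of failover hook pgpool-II can invoke through its failover_command setting; the host names, paths, and argument wiring are assumptions for illustration, not the scripts actually developed in the paper.

#!/bin/bash
# failover.sh - sketch of a pgpool-II failover hook (hypothetical paths and hosts).
# pgpool-II substitutes placeholders such as %d (failed node id) and %H (new
# primary host) when the hook is registered in pgpool.conf, e.g.:
#   failover_command = '/etc/pgpool-II/failover.sh %d %H'
FAILED_NODE_ID=$1
NEW_PRIMARY_HOST=$2
PGDATA=/var/lib/postgresql/data   # assumed data directory

# Promote the surviving standby so clients keep a writable primary.
ssh postgres@"$NEW_PRIMARY_HOST" "pg_ctl promote -D $PGDATA"
logger "pgpool failover: node $FAILED_NODE_ID down, promoted $NEW_PRIMARY_HOST"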
Active/active architectures improve availability by distributing processing across multiple independent nodes with access to a common replicated database. If one node fails, its users are quickly switched to surviving nodes, often in under a second, preventing perceived downtime. This is achieved through techniques like network transactions or asynchronous/synchronous data replication to keep database copies in synchronization. While active/active eliminates single points of failure and planned downtime, it requires addressing issues like automatic user switching, data collision handling, and performance impacts of maintaining highly available systems.
Scheduling in distributed systems - Andrii Vozniuk
My EPFL candidacy exam presentation: http://wiki.epfl.ch/edicpublic/documents/Candidacy%20exam/vozniuk_andrii_candidacy_writeup.pdf
Here I present how schedulers work in three distributed data processing systems and their possible optimizations. I consider Gamma - a parallel database, MapReduce - a data-intensive system and Condor - a compute-intensive system.
This talk is based on the following papers:
1) Batch Scheduling in Parallel Database Systems by Manish Mehta, Valery Soloviev and David J. DeWitt
2) Improving MapReduce performance in heterogeneous environments by Matei Zaharia, Andy Konwinski, Anthony D. Joseph, Randy Katz and Ion Stoica
3) Batch Scheduling in Parallel Database Systems by Manish Mehta, Valery Soloviev and David J. DeWitt
This document discusses load balancing in distributed systems. It provides definitions of static and dynamic load balancing, compares their approaches, and describes several dynamic load balancing algorithms. Static load balancing assigns tasks at compile time without migration, while dynamic approaches migrate tasks at runtime based on current system state. Dynamic approaches have overhead from migration but better utilize resources. Specific dynamic algorithms discussed include nearest neighbor, random, adaptive contracting with neighbor, and centralized information approaches.
This document provides an overview of the Red Hat Cluster Suite, which delivers high availability solutions. It discusses the Cluster Manager technology, which provides application failover capability to make applications highly available. Cluster Manager uses shared storage, service monitoring, and communication between servers to detect failures and restart applications on healthy nodes. It ensures data integrity through techniques like I/O barriers, quorum partitions, and active/passive or active/active application configurations across nodes.
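As a small, hedged example of the administrator-facing side of this, the rgmanager tools shipped with the suite can inspect and relocate services; the service and node names below are placeholders, not taken from the document.

#!/bin/bash
# Sketch: inspecting and relocating a cluster service with Red Hat Cluster
# Suite's rgmanager tools (service 'webdb' and node 'node2' are made up).
clustat                       # show cluster membership and service states
clusvcadm -r webdb -m node2   # relocate service 'webdb' to member 'node2'
clusvcadm -e webdb            # (re)enable the service if it is stopped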
This is the complete information about data replication you need; I am focused on these topics:
What is replication?
Who uses it?
Types?
Implementation methods?
DISTRIBUTED DATABASE WITH RECOVERY TECHNIQUES - Aakanksha Jain
A distributed database design consists of multiple, logically related database systems, physically distributed over several sites using a computer network, usually under centralized site control.
Distributed database design refers to the following problem: given a database and its workload, how should the database be split and allocated to sites so as to optimize a certain objective function?
There are two issues:
(i) Data fragmentation, which determines how the data should be fragmented.
(ii) Data allocation, which determines how the fragments should be allocated.
EOUG95 - Client Server Very Large Databases - Paper - David Walker
The document discusses building large scalable client/server solutions. It describes breaking the solution into four server components: database server, application server, batch server, and print server. It focuses on the database server, discussing how to make it resilient through clustering and scalable by partitioning applications and using parallel query options. It also covers backup and recovery strategies.
The document proposes an architecture for a mashup container to execute mashups in virtualized environments according to the Platform as a Service (PaaS) model. It describes a monolithic solution where a single virtual machine runs a complete service execution platform (SEP) to execute mashup sessions with negligible performance degradation. Fault tolerance is provided by replicating the SEP virtual machine using the hypervisor-based fault tolerance mechanism of virtualized environments. Testing showed this approach enabled zero-downtime failover with only a small increase in latency and resource usage.
Oracle Enterprise Manager allows administrators to monitor the performance of Oracle databases from any location with web access. Key metrics such as wait times, load levels, and cache usage can be viewed alongside alerts that notify administrators when thresholds are exceeded. Both collection-based historical data and real-time data are accessible to help identify and address potential performance issues.
TPT connection Implementation in Informatica - Yagya Sharma
The document discusses implementing a Teradata Parallel Transporter (TPT) connection in Informatica to improve ETL performance. An existing Informatica mapping that loads 7 million records from a file to a stage table using a fast loader connection was taking over 6 hours. By using a TPT connection instead, which performs parallel loading, the load time was reduced to just over 1 hour. The steps shown create a new TPT connection in Informatica and use it in the desired mapping session to gain the performance benefits of parallel loading.
1) The document discusses how RMAN reporting provides an effective way to determine the best database backup strategy and ensure a successful recovery by understanding what has been backed up and what backups are needed.
2) It recommends running regular RMAN reports to identify files needing backup, which backups would be required for recovery, and historical information about past RMAN jobs.
3) The document provides examples of RMAN commands like LIST, REPORT, and RESTORE PREVIEW that can be used to generate useful information about current and past backups from the RMAN repository.
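A report of that kind is easy to schedule; the Bash sketch below (instance name and local connection assumed) batches the commands just mentioned through the rman client.

#!/bin/bash
# rman_report.sh - sketch of a scheduled RMAN reporting job (assumed environment).
export ORACLE_SID=orcl   # assumed instance name
rman target / <<'EOF'
# files needing backup under the configured retention policy
REPORT NEED BACKUP;
# history of existing backups
LIST BACKUP SUMMARY;
# which backups a restore of the whole database would use
RESTORE DATABASE PREVIEW SUMMARY;
EOF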
The document discusses distributed database systems, including homogeneous and heterogeneous distributed databases, distributed data storage using replication and fragmentation, distributed transactions, commit protocols like two-phase commit, and handling failures in distributed systems. Key topics covered are replication allowing high availability but increasing complexity, fragmentation allowing parallelism but requiring joins, and two-phase commit coordinating atomic commits across multiple sites through a prepare and commit phase.
Detailed comparison of two common legacy databases - HP's SQL/MP running on the NonStop Guardian environment and IBM's DB2 running on its z/OS platform, comparing a range of functionalities.
Fast-start failover enables automatic failovers in Oracle Data Guard environments without data loss or DBA intervention. When enabled, the Data Guard broker determines if a failover is needed and automatically initiates the failover to a pre-specified standby database. Expanded Enterprise Manager monitoring features allow monitoring of cluster interconnects, improving scalability, viewing backup reports for groups of databases, and identifying cache performance trends. The Server Control Utility has been enhanced to support management of services and Automatic Storage Management instances within Oracle Real Application Clusters environments.
The document defines several key terms related to Oracle Real Application Clusters (RAC) and high availability:
- Network Time Protocol (NTP) ensures accurate synchronization of computer clock times in a network.
- A node is a machine where an Oracle RAC instance resides.
- Oracle Clusterware manages cluster database processing including node membership and resource management.
- The Oracle Cluster Registry (OCR) manages configuration information for the RAC cluster.
This document discusses a proposed cost effective failover clustering system that allows organizations to use different database platforms together in a high availability cluster. The system would use a common interfacing technique to connect two non-similar database platforms. This would allow one database to be licensed while the other could be freeware. A central service would monitor the databases, trace queries to the main database, and redirect clients to the backup database if the main database fails, minimizing downtime. The service would also update the main database with any changes from the backup database after the main database is restored. This approach could help organizations save costs by allowing the use of a licensed and freeware database in the same failover cluster.
Abstract: Every company or organization has to use a database for storing information. But what if this database fails? It then behooves the company to use another database for backup purposes; this is called failover clustering. To make this clustering manageable and lucid, corporations spend more money buying a licensed copy of both the core database and the redundant database. This overhead can be avoided by using databases on non-similar platforms, where one database is licensed and the other may be freeware. If the platforms are similar, the transactions between those databases become simpler, but that is not the case with non-similar platforms. Hence, to overcome this problem in both cases, cost-effective failover clustering is proposed. To design such a system, a common interfacing technique between two non-similar database platforms has to be developed. This makes provision for using any two database platforms for failover clustering. Keywords: CEFC (Cost Effective Failover Clustering), DAL (Data Access Layer), DB (Database), HA (High Availability)
The document discusses various topics related to distributed database management systems (DDBMS), including:
1) Transparency in DDBMS, which aims to hide the complexities of distribution from users. There are four main categories: distribution, transaction, performance, and DBMS transparency.
2) Distributed transaction management, which must ensure the ACID properties of transactions that access data across multiple sites. Objectives include improving utilization, minimizing response time and communication cost.
3) Concurrency control in DDBMS, which coordinates concurrent access to distributed data. Locking-based protocols are commonly used, where transactions acquire read or write locks before accessing data items. The two-phase locking protocol is also discussed.
The document discusses new features in IBM Information Server/DataStage 11.3. Key points include:
- The Hierarchical Data stage was renamed and can now process JSON and includes new REST, JSON parsing, and composition steps.
- The Big Data File stage supports more Hadoop distributions and Greenplum and Master Data Management connector stages were added.
- The Amazon S3 and Microsoft Excel connectors were enhanced.
- Sorting and record delimiting were optimized and Operations Console/Workload Manager are now default features.
Datastage parallel jobs vs Datastage server jobs - shanker_uma
The document compares Datastage parallel jobs and server jobs. Parallel jobs can take advantage of parallelism through features like partitioning and pipelining to enhance speed and performance when loading large amounts of data. Parallel jobs run on a multiprocessor system allowing both pipeline parallelism, where data is exchanged between stages as soon as it is available, and partitioning parallelism, where records are divided among nodes. In contrast, server jobs do not have built-in mechanisms for parallelism between stages.
This document provides instructions for deleting database instances and nodes from existing Real Application Clusters databases on Windows-based systems. It discusses:
1. Using DBCA in interactive or silent mode to delete database instances from existing nodes.
2. Procedures for deleting nodes from RAC databases, which includes stopping instances, removing listeners, detaching Oracle homes, and removing node configurations.
3. Clean-up procedures for the ASM instance when deleting a node, such as removing the ASM configuration and Oracle home directories.
This document provides an overview and comparison of several high availability solutions in SQL Server 2012 including database mirroring, failover clustering, transactional replication, log shipping, and AlwaysOn availability groups. Database mirroring provides redundancy using log transfer but requires separate instances, while failover clustering uses a virtual service name across nodes but requires shared storage. Transactional replication supports load balancing across multiple subscribers. Log shipping and database mirroring both rely on log backups and restores but log shipping allows read access during restore. AlwaysOn maximizes availability across databases in an availability group using Windows clustering without shared disks.
This document provides an overview of setting up MySQL replication between a master database server and one or more slave servers. It discusses creating a user for replication, configuring the master with binary logging and a server ID, configuring slaves with unique server IDs, obtaining the replication information from the master, and using mysqldump or raw data files to initialize slaves with data from the master. It also covers starting new replication environments and adding additional slaves.
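A compressed sketch of the slave-side steps might look like the following; the host, user, password, and log coordinates are placeholders, with the coordinates taken from SHOW MASTER STATUS on the master.

#!/bin/bash
# init_slave.sh - sketch of classic MySQL master/slave setup (placeholder values).
# The master's my.cnf needs log-bin enabled and server-id=1; each slave needs a
# unique server-id before running this.
mysql -u root -p <<'EOF'
CHANGE MASTER TO
  MASTER_HOST='master.example.com',
  MASTER_USER='repl',
  MASTER_PASSWORD='secret',
  MASTER_LOG_FILE='mysql-bin.000001',
  MASTER_LOG_POS=4;
START SLAVE;
SHOW SLAVE STATUS\G
EOF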
WHAT ARE THE ALTERNATIVE FUNCTIONS AND BENEFITS OF CELL PHONES FOR STUDENTS? - IJITE
Taiwanese college students bring their own cell phones into the English classroom, and teachers may become overwhelmed by these technology trends. This study aims to provide a realistic perception of the hidden meanings of the use of mobile devices in English class settings and the benefits they can bring to students. For this purpose, two conventional classes of the fourth-year license degree in the Department of Travel Management were the respondents. The students' schooling experiences were clarified with a student satisfaction questionnaire, their values highlighted with an interview, and their social interactions explained with observations of the two classes. The results of this study show that, even though they were not used to working collaboratively in small team-work groups, Taiwanese students were highly likely to develop a collaborative learning style that utilizes emails and internet connections, matching their learning needs and motivations and optimizing their academic success.
DYNAMIC CLASSIFICATION OF SENSITIVITY LEVELS OF DATAWAREHOUSE BASED ON USER P... - ijdms
A data warehouse stores secret data about the privacy of individuals and important business activities. This makes access to this source a risk of disclosure of sensitive data; hence the importance of implementing security measures that guarantee data confidentiality by establishing an access control policy. In this direction, several propositions have been made, but none is considered a standard for access management to data warehouses. In this article, we present our approach, which first exploits the permissions defined in the data sources to help the administrator define access permissions to the data warehouse; our system then automatically generates the sensitivity level of each data warehouse element according to the permissions granted to an object in the data warehouse.
INTEGRATED FRAMEWORK TO MODEL DATA WITH BUSINESS PROCESS AND BUSINESS RULES - ijdms
Data modeling is an approach to modeling data by mapping operational tasks iteratively, while associated guidelines are either partly mapped in the data model or expressed through software applications. Since an organization is a collection of business processes, it is essential that data models utilize such processes to facilitate data modeling. Data models should also incorporate guidelines for completing operational tasks through the concept of business rules. This paper outlines a unified framework for database modeling and design based on business process concepts that also incorporates business rules impacting business operations. The paper focuses on the relational database and its primary mode of conceptual modeling in the form of an entity-relationship model. Concepts are illustrated through Oracle's database language PL/SQL and its web variant, PL/SQL Server Pages.
Paper Map Industry Trends
Ways to Collaborate
License maps from other publishers
Product development for other publishers
Re-brand product for other publishers
License data from content suppliers
Buy/sell cartography from/to others
This document describes the areas of intervention of tutoring for secondary-level students, including academic, vocational, health, social, and cultural support, as well as the establishment of democratic relationships. Tutoring seeks to improve students' school performance, guide them in their vocational development and healthy lifestyles, and promote their participation in the community.
The document discusses various topics related to digital security presented at different events, including a keynote on issues with encryption for IoT devices, a panel discussion on authentication technology at the BankTech Asia conference, and presentations on blockchain, IoT, and quantum attacks at the PrimeKey PKI Tech Days. It also describes a solution implemented by SecureMetric using multi-factor authentication with RADIUS and one-time passwords to securely access the SWIFT application.
THE EFFECTS OF COMMUNICATION NETWORKS ON STUDENTS’ ACADEMIC PERFORMANCE: THE ... - IJITE
Social networks, as the most important communication tools, have had a profound impact on the social aspects of community user interactions, and they are widely used in various fields such as education. Student interaction through different communication networks can affect individual learning and lead to improved academic performance. In this study, a combined approach of social network analysis and educational data mining (the decision tree method) was used to study the impact of communication networks, behavior networks, and the combination of these two networks on students' academic performance, considering the role of factors such as computer self-efficacy, age, gender, and university. The results of this study, which included 139 students, indicate that gender is highly prioritized in all three models. Moreover, all three models had a sufficient confidence level; among them, communication networks, with higher confidence, accuracy, and precision, had significant impacts on the prediction of academic performance.
ON THE USAGE OF DATABASES OF EDUCATIONAL MATERIALS IN MACEDONIAN EDUCATION - IJITE
Technologies have become an important part of our lives. The steps for introducing ICTs in education vary from country to country. The Republic of Macedonia has invested a great deal in the installation of hardware and software in education and in teacher training. This research aimed to determine the state of usage of databases of digital educational materials and to define recommendations for future improvements. Teachers from urban schools were interviewed with a questionnaire. The findings are several: only some of the interviewed teachers had experience with databases of educational materials; all teachers still need capacity-building activities focusing on the use and benefits of databases of educational materials; capacity-building materials should preferably be in the Macedonian language; and technical support and upgrading of software and materials should be performed on a regular basis. Most of the findings can be applied at both the national and international level; with all this implemented, the application of ICT in education will have a much bigger positive impact.
Traditional end-of-year festivities (cultura) - Theo Banda
In Latin America there are various traditions for celebrating the New Year, such as eating 12 grapes at midnight, burning effigies that represent the old year, wearing underwear of particular colors, sweeping the house, or throwing objects away to leave behind the bad luck of the past year. Each country has its own customs inherited from pre-Columbian cultures and from Spanish colonization.
The document describes the history and basic components of computers. It explains that computers revolutionized office systems by allowing instant communication and exchange of information. It then summarizes the key milestones in the development of computers, from the abacus to the invention of the microprocessor and the birth of personal computers. Finally, it describes the main components of a computer, such as the central processing unit, memory, and input and output devices.
EFFECT OF POSTURAL CONTROL BIOMECHANICAL GAIN ON PSYCHOPHYSICAL DETECTION THR... - ijbbjournal
A Sliding Linear Investigative Platform for Assessing Lower Limb Stability (SLIP-FALLS) was employed to study the biomechanical reaction of postural control to short (≤16 mm) external postural perturbations. Head acceleration was evaluated while blindfolded subjects stood on a platform that was given a short anterior perturbation presented in one of two sequential 4 s intervals (2-Alternative-Forced-Choice) for a set of 30 trials. Anterior-posterior head acceleration (Head Accl AP) was investigated in the movement and non-movement intervals for healthy adults. A strong ringing signal was observed in the Head Accl AP movement interval that was absent in the non-movement interval. A positive power-law trading relationship was found between Head Accl AP gain and move length for standing blindfolded subjects. This could explain the negative power-law relationship observed between translation length and peak acceleration threshold in previous psychophysical detection threshold studies.
LOCALITY SIM: CLOUD SIMULATOR WITH DATA LOCALITY - ijccsa
Cloud Computing (CC) is a model for enabling on-demand access to a shared pool of configurable computing resources. Testing and evaluating the performance of the cloud environment with respect to allocation, provisioning, scheduling, and data allocation policy has received great attention. Using a cloud simulator saves time and money and provides a flexible environment for evaluating new research work. Unfortunately, current simulators (e.g., CloudSim, NetworkCloudSim, GreenCloud) treat data by size only, without any consideration of data allocation policy and locality. The NetworkCloudSim simulator is considered one of the most commonly used simulators because it includes different modules that support the functions needed in a simulated cloud environment, and it can be extended with new modules. In this paper, the NetworkCloudSim simulator has been extended and modified to support data locality; the modified simulator is called LocalitySim. The accuracy of the proposed LocalitySim simulator has been demonstrated by building a mathematical model. The proposed simulator has also been used to test the performance of a three-tier data center as a case study, taking the data locality feature into account.
Dr. Amit Vyas is seeking a sales, marketing, or business development role. He has over 25 years of experience in these areas. Currently he is a Senior Zonal Manager at ONGC MRPL, where his responsibilities include developing consumer programs, identifying strategic alliances, analyzing customer data, and more. Previously he held roles at SHV Energy India Limited, GSL India Limited, Oswal Chemicals & Fertilizers Ltd, and Bhagwati Gases Ltd. He has an MBA and a PhD, and has published several research papers.
This document discusses configuring graphics and visual elements on web pages. It covers topics like borders, padding, image optimization, and using HTML5 elements like <figure> and <progress>. CSS properties like background-image and multiple backgrounds are demonstrated. The document provides code samples and screenshots to illustrate concepts like applying different graphic file types (.gif, .jpg, .png), styling image links, and optimizing images for web display.
This chapter introduces HTML and covers key concepts for coding web pages. It describes HTML, XHTML, and HTML5 and their elements. It covers how to structure a web page using the html, head, body, title, and meta elements. It also describes how to configure text elements, links, lists, and special characters in the body of a web page. Finally, it discusses validating HTML code and ensuring web pages are structured properly.
This document discusses an introductory chapter on cascading style sheets (CSS) that covers key concepts such as:
- The evolution of style sheets from print to web and advantages of using CSS
- Types of CSS like inline, embedded, and external styles
- Common CSS properties for configuring color, text, and formatting
- Using CSS selectors like classes, IDs, and contextual selectors to target specific elements
- Validating CSS styles and ensuring sufficient color contrast for accessibility
This chapter introduces key concepts about the evolution of the internet and web. It describes how the internet grew in the 1990s due to the removal of commercial restrictions and the development of technologies like the World Wide Web, Mosaic browser, and affordable personal computers. Standards bodies like IETF, W3C, and organizations like ICANN were formed to coordinate and develop protocols and standards for the internet. The chapter discusses internet infrastructure, client-server model, protocols like HTTP, TCP/IP, and standards for web accessibility. It also defines common terms like IP addresses, domain names, and URIs.
Presentation on Large Scale Data Management - Chris Bunch
The document summarizes recent research on MapReduce and virtual machine migration. It discusses papers that compare MapReduce to parallel databases, describe techniques for live migration of virtual machines with low downtime, and propose using system call logging and replay to further reduce migration times and overhead. The document provides context on debates around MapReduce and outlines key approaches and findings from several papers on virtual machine migration.
The Concept of Load Balancing Server in Secured and Intelligent Network - IJAEMSJORNAL
Computer networks are complex systems that route hundreds of thousands of data packets every second, so data should be routed efficiently to handle large amounts of traffic. Load balancing is a core networking solution responsible for distributing incoming traffic among servers hosting the same content. For example, if there are ten servers within a network and two of them are doing 95% of the work, the network is not running very efficiently; if each server handled about 10% of the traffic, the network would run much faster. Networks become more efficient with the help of load balancing: traffic is evenly distributed across the network, making sure no single device is overwhelmed. Balancing requests across multiple servers prevents any server from becoming a single point of failure and improves overall availability and responsiveness. Web servers often use load balancing to split the traffic load evenly among several different servers. Load balancing requires hardware or software that divides incoming traffic among the available servers, whether on a local network or a large web service. A network that receives a high volume of traffic may dedicate one server to balancing the load among the other servers and devices in the network; this server is often known as a load balancer. Load balancing is also used by clusters, or multiple computers that work together, to spread processing jobs among the available systems.
The document discusses distributed algorithms and techniques used in NoSQL databases related to data consistency, data placement, and system coordination. It covers topics such as replication, failure detection, data partitioning, leader election, and consistency models. Specifically, it analyzes various techniques for data replication that provide different tradeoffs between consistency, availability, scalability, and latency such as anti-entropy protocols, master-slave replication, quorum-based replication, and using vector clocks to detect concurrent writes.
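One worked instance of the quorum technique mentioned here: with N replicas, a write quorum W and a read quorum R are guaranteed to overlap whenever R + W > N. For N = 3, choosing W = 2 and R = 2 gives R + W = 4 > 3, so every read intersects the latest successful write while still tolerating one unavailable replica.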
The Grouping of Files in Allocation of Job Using Server Scheduling In Load Ba... - iosrjce
IOSR Journal of Computer Engineering (IOSR-JCE) is a double blind peer reviewed International Journal that provides rapid publication (within a month) of articles in all areas of computer engineering and its applications. The journal welcomes publications of high quality papers on theoretical developments and practical applications in computer technology. Original research papers, state-of-the-art reviews, and high quality technical notes are invited for publications.
The document proposes a method for grouping files and allocating jobs using server scheduling to balance load. It involves splitting a server into multiple sub-servers. The performance of client machines is analyzed based on factors like processing speed, bandwidth, and memory usage. Jobs are then assigned to sub-servers based on these performance analyses, with the goal of completing all tasks quickly. Once tasks are complete, files are distributed to the respective client machines. The proposed method aims to reduce the workload on servers and improve response times compared to the existing system.
Talon systems - Distributed multi master replication strategy - Saptarshi Chatterjee
This document proposes a new approach to multi-master data replication called TalonStore. It describes existing replication strategies and identifies limitations. TalonStore uses an event-driven architecture where writes are published to a queue and all nodes subscribe independently. For reads, nodes constitute a quorum to enforce consistency if the majority agree. This allows parallel writes without locking, and eliminates single points of failure compared to traditional synchronous replication. The goal is to improve performance and availability for distributed databases while maintaining consistency.
The document discusses Oracle Golden Gate software. It provides real-time data integration across heterogeneous database systems with low overhead. It captures transactional data from source databases using log-based extraction and delivers the data to target systems with transactional integrity. Key features include high performance, flexibility to integrate various database types, and reliability during outages.
This slide deck provides an introduction to the fundamental concepts and principles of system design. Aimed at aspiring software engineers and professionals looking to strengthen their understanding, the presentation covers key topics such as scalability, reliability, load balancing, caching, CDN, partitioning, indexes, replication etc.
Veritas Database Edition/Advanced Cluster for Oracle9i RAC provides several benefits for managing Oracle databases in a clustered environment including simplified storage layout, simplified management of tablespaces with file systems, shared Oracle home and application directories, and improved storage utilization and faster recovery. It uses Cluster Volume Manager and Cluster File System to manage raw partitions and physical devices more easily while increasing performance and availability.
Software architecture case study - why and why not sql server replication - Shahzad
This document discusses using replication to consolidate check list data from multiple ships into a central database server. Replication is suitable because the data from different ships does not require real-time consistency and each ship's records will be updated independently. However, the document notes that an alternative like periodic file transfers may work as well depending on the volume and frequency of data changes. The key factors in choosing replication or another method are how critical up-to-date data consolidation is and how much data needs to be transferred.
HMR LOG ANALYZER: ANALYZE WEB APPLICATION LOGS OVER HADOOP MAPREDUCE - ijujournal
In today's Internet world, log file analysis is becoming a necessary task for analyzing customer behavior in order to improve advertising and sales, and for datasets in domains such as the environment, medicine, and banking it is important to analyze log data to extract the required knowledge. Web mining is the process of discovering knowledge from web data. Log files are generated very fast, at a rate of 1-10 MB/s per machine; a single data center can generate tens of terabytes of log data in a day. These datasets are huge. To analyze such large datasets we need a parallel processing system and a reliable data storage mechanism. A virtual database system is an effective solution for integrating data, but it becomes inefficient for large datasets. The Hadoop framework provides reliable data storage through the Hadoop Distributed File System and the MapReduce programming model, a parallel processing system for large datasets. The Hadoop Distributed File System breaks up input data and sends fractions of the original data to several machines in the Hadoop cluster to hold blocks of data. This mechanism helps to process log data in parallel using all the machines in the Hadoop cluster and to compute results efficiently. The dominant approach provided by Hadoop, "store first, query later", loads the data into the Hadoop Distributed File System and then executes queries written in Pig Latin. This approach reduces the response time as well as the load on the end system. This paper proposes a log analysis system using Hadoop MapReduce that provides accurate results in minimum response time.
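To give a sense of how compact such a job can be, a per-URL hit count over access logs can be expressed with Hadoop Streaming in a few lines of shell; the jar path, HDFS directories, and log format (request URL in field 7, as in the common Apache log format) are assumptions, not details from the paper.

#!/bin/bash
# pageviews.sh - sketch of a Hadoop Streaming job counting hits per URL
# (jar and HDFS paths are assumed; field 7 holds the URL in common log format).
hadoop jar "$HADOOP_HOME"/share/hadoop/tools/lib/hadoop-streaming-*.jar \
  -input  /logs/access \
  -output /reports/pageviews \
  -mapper "awk '{print \$7}'" \
  -reducer "uniq -c"
# Streaming sorts mapper output before the reduce phase, so 'uniq -c'
# emits one count per distinct URL.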
HMR Log Analyzer: Analyze Web Application Logs Over Hadoop MapReduce - ijujournal
This document proposes a log analysis system called HMR Log Analyzer that uses Hadoop MapReduce to analyze large volumes of web application log files in parallel. It discusses how Hadoop Distributed File System stores and distributes log files across nodes for fault tolerance. The system first pre-processes logs to clean and organize the data before applying the MapReduce algorithm. MapReduce jobs break the analysis into map and reduce phases to efficiently process logs in parallel and generate summarized results like page view counts. The system provides an interface for users to query and visualize results.
This paper presents FACADE, a compiler framework that can generate highly efficient data manipulation code for big data applications written in managed languages like Java. FACADE transforms applications by separating data storage from data manipulation. Data is stored in native memory rather than Java heap objects, bounding the number of heap objects. This significantly reduces memory management overhead and improves scalability. The compiler locally transforms methods to insert data conversion functions. Experiments show the generated code runs faster, uses less memory, and scales to larger datasets than the original code for several real-world big data applications and frameworks.
An asynchronous replication model to improve data available into a heterogene... - Alexander Decker
This document summarizes a research paper that proposes an asynchronous replication model to improve data availability in heterogeneous systems. The proposed model uses a loosely coupled architecture between main and replication servers to reduce dependencies. It also supports heterogeneous systems, allowing different parts of an application to run on different systems for better performance. This makes it a cost-effective solution for data replication across different system types.
Survey on Load Rebalancing for Distributed File System in Cloud - AM Publications
1. The document discusses load rebalancing algorithms for distributed file systems in cloud computing. It aims to balance the load across storage nodes to improve performance and resource utilization.
2. A large file is divided into chunks which are distributed across multiple storage nodes. If some nodes become overloaded (heavy nodes) while others are underloaded (light nodes), chunks can be migrated from heavy to light nodes using load rebalancing algorithms.
3. The algorithms structure storage nodes in a distributed hash table to allow efficient lookup and migration of chunks between nodes. Nodes independently calculate their load and migrate chunks to balance load without global knowledge of all nodes' loads.
Load balancing is an approach to distributing work units across multiple servers in a distributed system. The load balancer acts as a reverse proxy to distribute network or application traffic evenly among servers. It allocates the first task to the first server, second task to the second server, and so on, to balance loads. Load balancing provides security, protects applications from threats using a web application firewall, authenticates user access, protects against DDoS attacks, and improves performance by reducing load on servers and optimizing traffic.
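In its simplest form, the round-robin allocation described above is just a rotating index over the server pool, as in this toy Bash sketch (server and job names are placeholders; a real balancer would proxy the requests rather than echo them).

#!/bin/bash
# round_robin.sh - toy round-robin dispatcher over a fixed pool of backends.
SERVERS=(srv1.example.com srv2.example.com srv3.example.com)
i=0
for task in job-a job-b job-c job-d job-e; do
  target=${SERVERS[$(( i % ${#SERVERS[@]} ))]}
  echo "dispatch $task -> $target"
  i=$(( i + 1 ))
done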
The document discusses different database system architectures including centralized, client-server, server-based transaction processing, data servers, parallel, and distributed systems. It covers key aspects of each architecture such as hardware components, process structure, advantages and limitations. The main types are centralized systems with one computer, client-server with backend database servers and frontend tools, parallel systems using multiple processors for improved performance, and distributed systems with data and users spread across a network.
Data Distribution Handling on Cloud for Deployment of Big Data - neirew J
This document summarizes a research paper that proposes an algorithm to reduce data distribution and processing time in cloud computing for big data deployment. The paper discusses different data distribution techniques for virtual machines (VMs) in cloud computing, such as centralized, semi-centralized, hierarchical, and peer-to-peer approaches. It also reviews related work on MapReduce frameworks and load balancing algorithms. The authors implemented their proposed peer-to-peer distribution technique and Round Robin and Throttled load balancing algorithms in CloudSim. Experimental results showed the Throttled algorithm achieved significantly lower average response times than Round Robin.
Data Distribution Handling on Cloud for Deployment of Big Data - ijccsa
Cloud computing is a new emerging model in the field of computer science. For varying workloads, cloud computing presents a large-scale, on-demand infrastructure. The primary use of clouds in practice is to process massive amounts of data. Processing large datasets has become crucial in research and business environments, and the big challenge associated with processing large datasets is the vast infrastructure required. Cloud computing provides vast infrastructure to store and process big data. VMs can be provisioned on demand in the cloud to process the data by forming a cluster of VMs. The MapReduce paradigm can be used to process the data, wherein the mapper assigns parts of the task to particular VMs in the cluster and the reducer combines the individual outputs from each VM to produce the final result. We have proposed an algorithm to reduce the overall data distribution and processing time. We tested our solution in the Cloud Analyst simulation environment and found that our proposed algorithm significantly reduces the overall data processing time in the cloud.
This document summarizes a talk on designing distributed systems. It discusses distributed computing paradigms like microservices and cloud computing. It also covers reactive system principles from the Reactive Manifesto like being responsive, resilient and message-driven. It then explains the actor model used in frameworks like Akka, where actors communicate asynchronously through message passing. It provides examples of using actors for concurrency and fault tolerance.
Similar to HIGH AVAILABILITY AND LOAD BALANCING FOR POSTGRESQL DATABASES: DESIGNING AND IMPLEMENTING. (20)
Main Java[All of the Base Concepts}.docxadhitya5119
This is part 1 of my Java Learning Journey. This Contains Custom methods, classes, constructors, packages, multithreading , try- catch block, finally block and more.
How to Setup Warehouse & Location in Odoo 17 InventoryCeline George
In this slide, we'll explore how to set up warehouses and locations in Odoo 17 Inventory. This will help us manage our stock effectively, track inventory levels, and streamline warehouse operations.
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UPRAHUL
This Dissertation explores the particular circumstances of Mirzapur, a region located in the
core of India. Mirzapur, with its varied terrains and abundant biodiversity, offers an optimal
environment for investigating the changes in vegetation cover dynamics. Our study utilizes
advanced technologies such as GIS (Geographic Information Systems) and Remote sensing to
analyze the transformations that have taken place over the course of a decade.
The complex relationship between human activities and the environment has been the focus
of extensive research and worry. As the global community grapples with swift urbanization,
population expansion, and economic progress, the effects on natural ecosystems are becoming
more evident. A crucial element of this impact is the alteration of vegetation cover, which plays a
significant role in maintaining the ecological equilibrium of our planet.Land serves as the foundation for all human activities and provides the necessary materials for
these activities. As the most crucial natural resource, its utilization by humans results in different
'Land uses,' which are determined by both human activities and the physical characteristics of the
land.
The utilization of land is impacted by human needs and environmental factors. In countries
like India, rapid population growth and the emphasis on extensive resource exploitation can lead
to significant land degradation, adversely affecting the region's land cover.
Therefore, human intervention has significantly influenced land use patterns over many
centuries, evolving its structure over time and space. In the present era, these changes have
accelerated due to factors such as agriculture and urbanization. Information regarding land use and
cover is essential for various planning and management tasks related to the Earth's surface,
providing crucial environmental data for scientific, resource management, policy purposes, and
diverse human activities.
Accurate understanding of land use and cover is imperative for the development planning
of any area. Consequently, a wide range of professionals, including earth system scientists, land
and water managers, and urban planners, are interested in obtaining data on land use and cover
changes, conversion trends, and other related patterns. The spatial dimensions of land use and
cover support policymakers and scientists in making well-informed decisions, as alterations in
these patterns indicate shifts in economic and social conditions. Monitoring such changes with the
help of Advanced technologies like Remote Sensing and Geographic Information Systems is
crucial for coordinated efforts across different administrative levels. Advanced technologies like
Remote Sensing and Geographic Information Systems
9
Changes in vegetation cover refer to variations in the distribution, composition, and overall
structure of plant communities across different temporal and spatial scales. These changes can
occur natural.
Walmart Business+ and Spark Good for Nonprofits.pdfTechSoup
"Learn about all the ways Walmart supports nonprofit organizations.
You will hear from Liz Willett, the Head of Nonprofits, and hear about what Walmart is doing to help nonprofits, including Walmart Business and Spark Good. Walmart Business+ is a new offer for nonprofits that offers discounts and also streamlines nonprofits order and expense tracking, saving time and money.
The webinar may also give some examples on how nonprofits can best leverage Walmart Business+.
The event will cover the following::
Walmart Business + (https://business.walmart.com/plus) is a new shopping experience for nonprofits, schools, and local business customers that connects an exclusive online shopping experience to stores. Benefits include free delivery and shipping, a 'Spend Analytics” feature, special discounts, deals and tax-exempt shopping.
Special TechSoup offer for a free 180 days membership, and up to $150 in discounts on eligible orders.
Spark Good (walmart.com/sparkgood) is a charitable platform that enables nonprofits to receive donations directly from customers and associates.
Answers about how you can do more with Walmart!"
A review of the growth of the Israel Genealogy Research Association Database Collection for the last 12 months. Our collection is now passed the 3 million mark and still growing. See which archives have contributed the most. See the different types of records we have, and which years have had records added. You can also see what we have for the future.
HIGH AVAILABILITY AND LOAD BALANCING FOR POSTGRESQL DATABASES: DESIGNING AND IMPLEMENTING.
International Journal of Database Management Systems (IJDMS) Vol.8, No.6, December 2016
DOI: 10.5121/ijdms.2016.8603
HIGH AVAILABILITY AND LOAD BALANCING FOR POSTGRESQL DATABASES: DESIGNING AND IMPLEMENTING
Pablo Bárbaro Martinez Pedroso
Universidad de las Ciencias Informáticas, Cuba
ABSTRACT
The aim of this paper is to design and implement an approach that provides high availability and load balancing for PostgreSQL databases. A shared-nothing architecture and a replicated centralized middleware architecture were selected in order to achieve data replication and load balancing among database servers. Pgpool-II was used to implement these architectures. Besides, taking advantage of pgpool-II as a framework, several scripts were developed in Bash for the restoration of corrupted databases. In order to avoid a single point of failure, Linux HA (Heartbeat and Pacemaker) was used, whose responsibility is monitoring all processes involved in the whole solution. As a result, applications can operate with a more reliable PostgreSQL database server; the most suitable situation is for applications with a heavier load of read statements (typically SELECT) than write statements (typically UPDATE). The approach presented is only intended for Linux-based operating systems.
KEYWORDS
Distributed Database, Parallel Database.
1. INTRODUCTION
With the fast development of connectivity, web applications have grown enormously, partly because of the growing number of clients and of the requests they issue. These applications work with an underlying database, and interruptions of the database may cause unexpected behavior. For small applications this could be permissible; however, for critical applications constant database interruptions are unacceptable. For instance, airport applications that monitor and control air traffic should be equipped with a database capable of serving the data every time it is requested: if this database goes down, the airport may suffer a serious catastrophe. There are many phenomena that can bring down the services provided by databases: disconnections or failures of the devices responsible for guaranteeing connectivity, such as cables, switches and routers; operating system failures; failures of main memory, processors and storage devices; short circuits and unexpected outages; software malfunctions of the database management system; cyber-attacks; undetected and unsolved programming bugs; and even natural phenomena such as cyclones, tornadoes, earthquakes, lightning and fires. Any of these situations could stop database servers from continuing to offer services. For this reason it is necessary to design and implement a solution that provides high availability and load balancing to databases.
High availability for databases can be accomplished through replication of data, which is the storing and maintaining of one or more copies (replicas) of databases in multiple sites (servers, nodes) (Raghu Ramakrishnan 2002). This is useful to support failover, a technique that enables automatic redirection of transactions from a failed site¹ to another site that stores a copy of the data (M. Tamer Özsu 2011).

¹ A database server that for some reason (software malfunction, network failure, an inconsistent state reached) is not able to provide services.
A replication solution is based mainly on two decisions: where updates are performed (write statements) and how the updates are propagated to the copies (M. Tamer Özsu 2011). Two alternatives for where updates can be performed are defined in (Jim Gray 1996). The first, called group (master-master, update anywhere), states that any site with a copy of a data item can update it. The second, called master (master-slave), proposes that each object (data item) belongs to a master site; only the master can update the primary copy of the object, while all other replicas are read-only and wait for the propagation of the updates carried out at the master site.
Gray et al. (1996) also define two ways of propagating updates: eager (synchronous) and lazy (asynchronous) replication. Eager replication applies every update to all copies as a global transaction; as a result, either all sites commit the updates or none does. This ensures that all sites have a full copy of the data all the time. On the other hand, lazy replication first commits updates at the master server (the site owning the primary copy of the object) and later propagates them to the other replicas as a separate transaction for every server; as a result, part of the time not all sites hold up-to-date data.
Different combinations can be obtained, as the table below shows: the four schemes result from crossing the two decisions.

Table 1: Combinations of where and how updates can be performed.

                            Group (update anywhere)     Master (primary copy)
    Eager (synchronous)     eager group replication     eager master replication
    Lazy (asynchronous)     lazy group replication      lazy master replication
Each of these solutions has its advantages and drawbacks depending on its use. Therefore the configuration must be selected, according to its characteristics, for the most suitable environment or situation. The conceptual basis of database replication is available in (Bettina Kemme 2010; M. Tamer Özsu 2011), while (Salman Abdul Moiz 2011) offers a survey of database replication tools.
2. MATERIAL AND METHODS
The schema selected is synchronous multi-master replication. As previously stated, in synchronous replication every site has a full copy of the data; as a consequence, only one site is needed to process a read statement. Therefore load balancing can be achieved, since read transactions can be distributed and processed in parallel among the servers. This avoids service breakdowns caused by overload, and it also decreases the response time to client applications due to the parallel processing of read statements. The main drawback of this schema is that the global transactions involving all sites slow down the connectivity between them. Therefore this approach is recommended for systems with a large volume of read statements rather than large numbers of update transactions. Another important feature can be observed in the failover mechanism. If the master node fails (in the master-slave situation), the slave promoted as the new master does not require additional operations to recover the last consistent state of the failed master; time is only spent detecting that the master node has failed and promoting a slave as the new master. In the multi-master case the behavior is the same. In the critical scenario where n − 1 nodes have failed, the last one remaining online and consistent can by itself guarantee the services until the failed nodes come back online.
Synchronous replication in a multi-master scheme uses a protocol named ROWA² (Read One Write All). This means transactions reach the commit step if and only if all sites apply the changes; as can be seen, if any of the sites is down none of the updates can be completed (David Taniar 2008). This protocol favors consistency among nodes over availability. Although high consistency among nodes is an important feature, it is possible to continue accepting updates and later restore the down sites, so that high availability is maintained along the process. In order to overcome the previous limitation the ROWA-A protocol (Read One Write All Available) was designed, which indicates that updates can be processed by all sites that are still online (David Taniar 2008). Although this approach may compromise consistency on the servers that were not part of a transaction, the down sites can be identified and restored when they come back online, so the inconsistency will be solved.

² In the ROWA protocol, with at least one site down and one site online, data can still be served but it cannot be updated, and from then on the databases are considered unavailable.
2.1. DISTRIBUTED DATABASE ARCHITECTURE
Physical design is needed to place data across the sites (Raghu Ramakrishnan 2002). The architecture of the distributed database system is based on the shared-nothing architecture: each node has its own processor unit (P), main memory (M), storage device (SD) and database management system, all running on independent operating systems interconnected through a network. Since corrupted servers can be detached instantaneously and the remaining servers can resume operations, high availability can be achieved; besides, new nodes can be added easily, which leads to a scalable solution (M. Tamer Özsu 2011; Raghu Ramakrishnan 2002). The image below shows the architecture adopted.
Figure 1: Shared Nothing Architecture. Bibliographic source: (Raghu Ramakrishnan 2002).
2.2. DATABASE REPLICATION ARCHITECTURE
The replication architecture is based on middleware, specifically the replicated centralized middleware architecture (Bettina Kemme 2010). This architecture removes the single point of failure, since at least one standby middleware is kept waiting in case the primary fails; in that case the standby middleware becomes the primary and resumes operations. The middleware is placed between clients and servers: it receives all client requests and forwards them to the relevant replicas, and the resulting answers from the database servers are relayed by the middleware to the corresponding clients. Write statements (typically INSERT, UPDATE, DELETE) are sent to all sites, which yields a fully replicated system. Because of this, a read statement (typically SELECT) is processed by only one site. Thus load balancing can be achieved, as well as high availability.
Figure 2: Replicated Centralized Middleware Architecture. Bibliographic source: (Bettina Kemme 2010).
2.3. TOOLS
Taking these architectures into account, some tools are needed to implement them; this section discusses that topic. Pgpool-II is used as the PostgreSQL database middleware. It provides connection pooling³, database replication and load balancing. Besides, pgpool-II implements the ROWA-A protocol, thus when a failover occurs the online database servers resume the operations of the applications. This is an important feature because it assures the availability of the database services even when at most n − 1 servers have failed, with n the total number of database servers. As can be seen, the more database servers are used, the higher the level of availability of the database services. At the same time, the more middlewares (instances of pgpool-II running on different sites) are used, the higher the certainty that services will still be running after the primary (or any standby promoted as primary) fails. In order to provide high availability to the pgpool-II and PostgreSQL processes and to the network services, Linux HA (High Availability) was used. Linux HA is the origin of two projects, Heartbeat and Pacemaker: Heartbeat provides the messaging layer between all the nodes involved, while Pacemaker is a CRM (cluster resource manager) whose purpose is to perform complex configurable operations to provide high availability to resources (processes, daemons, services) (Team 2015). This involves restarting a stopped resource and maintaining a set of rules or restrictions between resources.

³ "Pgpool-II maintains established connections to the PostgreSQL servers, and reuses them whenever a new connection with the same properties (i.e. user name, database, protocol version) comes in. It reduces the connection overhead, and improves system's overall throughput." Bibliographic source: (Group 2014).
For a detailed description of how to install and configure these tools, consult (Group 2014) in the case of pgpool-II, (Project 2014) for Heartbeat and (Team 2015) for Pacemaker.
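As an illustration, the write-to-all/read-from-one behavior described in Section 2.2 is driven by a few pgpool.conf settings. The fragment below is a minimal sketch only; the host names, ports and weights are hypothetical examples, not the paper's actual configuration:

    # pgpool.conf (fragment) -- replication and load balancing
    replication_mode = on           # write statements go to all backends
    load_balance_mode = on          # read statements are distributed among backends

    # One block per PostgreSQL server in the shared-nothing cluster:
    backend_hostname0 = 'db1.example.local'
    backend_port0 = 5432
    backend_weight0 = 1             # share of read statements sent to this node
    backend_hostname1 = 'db2.example.local'
    backend_port1 = 5432
    backend_weight1 = 1

Equal weights split SELECT traffic evenly; writes ignore the weights, since they are sent to every backend.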
2.4. ARCHITECTURE OF THE APPROACH
After the previous decisions were made, a more general architecture is presented (Figure 3). As can be seen, the replicated centralized middleware is encapsulated at the middleware level whereas the shared-nothing architecture is at the database servers level. At a higher level, monitoring of processes is carried out both at the middleware level and at the database servers level through Linux HA.
Connections are represented by arrows, and should be fast enough to avoid a bottleneck in the messaging and communication process.
Figure 3: General architecture.
An interesting feature of this architecture is that it is not restricted to the tools used in this paper. Different tools can therefore implement this architecture, which also makes it operating-system independent.
3. RELATED WORK
Previous works have implemented the replicated centralized middleware architecture (Felix Noel Abelardo Santana and Méndez Roldán; Partio 2007; Pupo 2013; Rodríguez 2011; Santana 2010) through pgpool-II. Some of them use an asynchronous master-slave scheme (Felix Noel Abelardo Santana and Méndez Roldán; Partio 2007), while others are based on a synchronous multi-master replication scheme (Pupo 2013; Rodríguez 2011; Santana 2010).
However, none of these solutions plans for the restoration of a corrupted database. This is known as online recovery: the capability of restoring a crashed database while the online databases are still servicing clients. Neither had the monitoring of PostgreSQL instances been designed.
Pgpool-II provides a two-phase mechanism to achieve database restoration. The first phase is triggered when a database is detected as failed, executing a script that performs a PostgreSQL physical backup from an online master to the failed site. During this phase the available sites process clients' requests, saving the changes into WAL (Write-Ahead Log) files. Once the physical backup is done, the first stage is finished. At the start of the second stage clients are disconnected and a second script is executed, which synchronizes the WAL files generated during the first phase to the failed site, assuring that this site is fully recovered. After this the node is attached to the system and database operations are resumed. Taking advantage of Linux, the scripts triggered during online recovery were developed in Bash, which is a native interpreter of Linux-based operating systems, so no extra software is needed. The restoration of databases is based on the Point-in-Time Recovery feature of PostgreSQL, on pgpool-II as the event-trigger mechanism and on Bash as the executor of the predefined routines encapsulated in multiple scripts. The recipe adopted for implementing this process was extracted from (Simon Riggs 2010) and follows the steps described in Algorithm 1.
In order to detail the whole recovery process, Algorithms 2 and 3 present the procedures implemented in the scripts.
Let failed node be the crashed database server targeted to be recovered; master node the database server from which the recovery process is carried out and therefore the scripts executed; PGDATA the PostgreSQL database directory; and Backup-mm-dd-yyyy.tar.gz a compressed file containing the PGDATA PostgreSQL database directory, with the date format mm-dd-yyyy standing for month, day and year respectively.
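Since Algorithm 2 appears in the original only as a figure, the following is a minimal Bash sketch of such a first-stage script under the definitions above; the psql user and temporary paths are assumptions, not the paper's exact code:

    #!/bin/bash
    # copy_base_backup -- 1st stage of online recovery (cf. Algorithm 2).
    # pgpool-II invokes recovery_1st_stage_command on the master node with:
    #   $1 = master's PGDATA, $2 = failed node host, $3 = failed node's PGDATA
    PGDATA=$1; FAILED_HOST=$2; FAILED_PGDATA=$3
    BACKUP="Backup-$(date +%m-%d-%Y).tar.gz"    # naming scheme defined above

    # Enter backup mode so PGDATA can be copied while the master keeps serving.
    psql -U postgres -c "SELECT pg_start_backup('online-recovery', true)"

    # Compress PGDATA, ship it to the failed node and unpack it there.
    tar -C "$PGDATA" -czf "/tmp/$BACKUP" .
    scp "/tmp/$BACKUP" "$FAILED_HOST:/tmp/$BACKUP"
    ssh "$FAILED_HOST" "rm -rf $FAILED_PGDATA/* && tar -C $FAILED_PGDATA -xzf /tmp/$BACKUP"

    # Leave backup mode; WAL written meanwhile is synchronized in the 2nd stage.
    psql -U postgres -c "SELECT pg_stop_backup()"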
Once the 1st stage is accomplished and there are no active client connections, the routine programmed in Algorithm 3 is executed as the 2nd stage of the recovery process. Its responsibility is to synchronize the changes made during the 1st stage (i.e. during Algorithm 2). At the end of Algorithm 3 the pgpool_remote_start script is triggered; its function is the initialization of the postgresql resource (i.e. the PostgreSQL process) so that Pacemaker can monitor that process. After this, the recovery process is accomplished: the failed node is fully recovered and clients can resume connections to the middleware level.
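Again as a hedged reconstruction (Algorithm 3 is a figure in the original), the second stage essentially forces the WAL accumulated during the first stage to be archived so the failed site can replay it, after which pgpool_remote_start brings the node's PostgreSQL back under Pacemaker's control; the exact commands are assumptions:

    #!/bin/bash
    # pgpool_recovery_pitr -- 2nd stage: flush/archive the WAL produced during
    # the 1st stage so the failed site can replay it (Point-in-Time Recovery).
    psql -U postgres -c "SELECT pg_switch_xlog()"   # renamed pg_switch_wal() in PostgreSQL 10+

    #!/bin/bash
    # pgpool_remote_start -- start PostgreSQL on the recovered node as a
    # Pacemaker-managed resource so the CRM monitors it from now on.
    # $1 = recovered host (pgpool-II passes the host and its PGDATA as arguments).
    ssh "$1" "crm resource start postgresql"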
The PostgreSQL servers are where the Bash scripts are executed, through stored procedures invoked from pgpool-II. Because of this, pgpool-II needs to be set up with recovery_user and recovery_password in the pgpool.conf configuration file in order to connect to the databases and trigger the recovery stored procedure. It must also be given, as arguments, the paths of the Bash scripts relative to the PGDATA directory (i.e. the location where the Bash scripts are placed on the PostgreSQL database servers). These are recovery_1st_stage_command and recovery_2nd_stage_command, for copy_base_backup and pgpool_recovery_pitr respectively. The script pgpool_remote_start does not need to be set, only to be located in the PGDATA directory. For creating the recovery stored procedures please consult (Group 2014).
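A minimal pgpool.conf fragment for these settings might look as follows; the user name and password are placeholders:

    # pgpool.conf (fragment) -- online recovery settings
    recovery_user = 'postgres'                   # user that triggers the recovery stored procedure
    recovery_password = 'secret'                 # placeholder password
    recovery_1st_stage_command = 'copy_base_backup'       # script located in PGDATA
    recovery_2nd_stage_command = 'pgpool_recovery_pitr'   # script located in PGDATA
    # pgpool_remote_start needs no setting; it only has to be present in PGDATA.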
Online recovery can be performed manually, allowing not only the restoration of a database server but also the addition of new servers, since it carries out a full recovery. Moreover, online recovery is designed to be executed automatically in order to avoid database administrator intervention.
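For reference, a manual recovery can be triggered with pgpool-II's pcp_recovery_node utility; in the positional syntax of the pgpool-II 3.x series this could look like the following, where the timeout, PCP host, port, credentials and node id are all example values:

    # Recover/attach backend node 1 through the PCP interface (timeout 20 s).
    pcp_recovery_node 20 localhost 9898 admin adminpassword 1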
Moreover, none of the surveyed works that use Heartbeat makes use of Pacemaker (Felix Noel Abelardo Santana and Méndez Roldán; Pupo 2013; Rodríguez 2011; Santana 2010). Although Heartbeat contains a CRM capable of providing high availability, it is a primitive version of Pacemaker which is incapable of complex operations such as location and colocation restrictions on resources. Without these it is not possible to guarantee high availability of the PostgreSQL processes and network services on all sites. For instance, pgpool-II could be running and expecting client connections on one server while the database services' IP address is published on another server, which could lead to unavailability of the database services. To overcome this limitation a colocation restriction through Pacemaker can be used.
As stated previously, process monitoring is carried out across the system through Heartbeat/Pacemaker, so they need to be installed on every server used. The LSB-class init scripts pgpool2, postgresql and networking must be added as resources in order to monitor the pgpool-II and PostgreSQL processes and the networking services. The OCF-class script IPaddr2 must also be added as a resource, with the IP address (or subnet address) set as its argument. Once IPaddr2 is running it will publish this address so that clients can connect to the middleware level.
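Assuming crmsh as the Pacemaker front end, adding these resources could be sketched as follows; the resource names and the IP address are illustrative, not the paper's actual configuration:

    # LSB init scripts added as monitored cluster resources
    crm configure primitive pgpool2 lsb:pgpool2 op monitor interval=30s
    crm configure primitive postgresql lsb:postgresql op monitor interval=30s
    crm configure primitive networking lsb:networking op monitor interval=30s
    # Virtual IP of the database services, published by IPaddr2
    crm configure primitive db_ip ocf:heartbeat:IPaddr2 \
        params ip=192.168.1.100 cidr_netmask=24 op monitor interval=30s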
As can be seen, some constraints must be defined in order to control the behavior of the resources. pgpool2 and IPaddr2 can only be executed at the middleware level, whereas postgresql only at the database server level; these are known as location constraints. On the other hand, pgpool2 and IPaddr2 should run on the same middleware (i.e. the active one), otherwise pgpool-II could be expecting client connections on one middleware while IPaddr2 publishes the database services' IP address on another, which could lead to unavailability of the database services; this is a colocation constraint.
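Under the same crmsh assumption, the constraints just described could be expressed as below; the node names mw1 and db1 are hypothetical, and real deployments would add a rule per eligible node:

    # Location constraints: tie each resource to its level
    crm configure location pgpool2-on-middleware pgpool2 inf: mw1
    crm configure location postgresql-on-dbserver postgresql inf: db1
    # Colocation constraint: the service IP must run where pgpool-II runs
    crm configure colocation ip-with-pgpool2 inf: db_ip pgpool2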
3.1. RESULTS AND BRIEF DISCUSSION
Shared nothing was selected as the distributed database architecture, which provides high availability and scalability, whereas the replicated centralized middleware is used with synchronous multi-master replication in order to keep the database servers fully synchronized. Some tools were used to accomplish all this. Pgpool-II was used as the middleware responsible for data replication and load balancing, and as the framework for online recovery. On the other hand, Heartbeat and Pacemaker were used for monitoring all the processes involved in the solution, such as pgpool-II, the PostgreSQL instances and IPaddr2 (i.e. the database services' IP address), and for the restrictions between all these processes. Multiple scripts were developed in Bash in order to achieve the restoration of corrupted databases. Such restoration can be done manually and/or automatically, depending on the requirements specified.
4. CONCLUSIONS
1. As a result of this investigation, a solution was designed and implemented to provide high availability and load balancing to PostgreSQL databases. Therefore applications can count on a more reliable PostgreSQL database service.
2. Having multiple sites with a full copy of the data allows not only a failover mechanism but also load balancing among the servers. Since every server has an entire and consistent copy of the data, each of them can process any kind of read statement. This avoids a breakdown of the services caused by overload.
3. Using more than one middleware (one active, at least one standby) removes the single point of failure, thus high availability can be obtained at the middleware level.
4. Since the distributed database architecture is shared nothing, servers can be added or removed in a few configuration steps; because of this, the approach presented is flexible and scalable.
5. All the tools selected are open source and developed for Linux, thus the solution of this investigation is intended only for Linux-based operating systems.
6. Since global transactions (of write statements) slow down the connectivity between sites, this approach is more suitable and recommended for workloads with large amounts of read statements rather than write statements.
5. FUTURE WORK
1. Designing and carrying out several experiments in order to measure the performance of heavy read workloads. Moreover, obtaining a mathematical model which describes how performance depends on the number of database servers; this model could predict performance so that more precise decisions can be made for scalable solutions.
2. Designing and implementing a solution intended for large amounts of write statements. Due to the fact that no solution fits all kinds of problems, heavy write workloads should also be explored in order to produce an approach in that direction.
3. The approach presented in this paper suffers from the fact that it is only intended for Linux-based operating systems; thus, it is necessary to extend this solution to other operating systems.
REFERENCES
[1] Bettina Kemme, R. Jiménez-Peris, M. Patiño-Martínez. Database Replication. Edited by M.T. Özsu. Morgan & Claypool, 2010. ISBN 9781608453825.
[2] David Taniar, C. H. C. Leung, Wenny Rahayu, Sushant Goel. High-Performance Parallel Database Processing and Grid Databases. Edited by A. Zomaya. New Jersey: Wiley, 2008. 574 p.
[3] Felix Noel Abelardo Santana, P. Y. P. P., Ernesto Ahmed Mederos, Javier Menéndez Rizo, Isuel H. P. P. Méndez Roldán. Databases Cluster PostgreSQL 9.0 High Performance. 7 p.
[4] Group, P. G. D. Pgpool-II Manual Page. 2014. http://www.pgpool.net/docs/latest/pgpool-en.html
[5] Jim Gray, P. Helland, Patrick O'Neil, Dennis Shasha. The Dangers of Replication and a Solution. In Proceedings of the ACM SIGMOD Int. Conference on Management of Data. Montreal, Canada: ACM, 1996, p. 10.
[6] M. Tamer Özsu, P. Valduriez. Principles of Distributed Database Systems. Springer, 2011. 866 p. ISBN 978-1-4419-8833-1.
[7] Partio, M. Evaluation of PostgreSQL Replication and Load Balancing Implementations. 2007.
[8] Project, L. H. Linux-HA User's Guide. 2014. http://www.linux-ha.org/doc/users-guide/users-guide.html
[9] Pupo, I. R. Base de Datos de Alta Disponibilidad para la Plataforma VideoWeb 2.0. University of Informatics Science, 2013.
[10] Raghu Ramakrishnan, J. Gehrke. Database Management Systems. McGraw-Hill, 2002. 931 p. ISBN 0-07-246535-2.
[11] Rodríguez, M. R. R. Solución de Clúster de Bases de Datos para Sistemas que Gestionan Grandes Volúmenes de Datos utilizando PostgreSQL. Universidad de las Ciencias Informáticas, 2011.
[12] Salman Abdul Moiz, S. P., Venkataswamy G., Supriya N. Pal. Database Replication: A Survey of Open Source and Commercial Tools. International Journal of Computer Applications, 2011, 13(6), 8.
[13] Santana, Y. O. Clúster de Altas Prestaciones para Medianas y Pequeñas Bases de Datos que Utilizan a PostgreSQL como Sistema de Gestión de Bases de Datos. Universidad de las Ciencias Informáticas, 2010.
[14] Simon Riggs, H. Krosing. PostgreSQL 9 Administration Cookbook. Birmingham, UK: Packt Publishing Ltd., 2010. ISBN 978-1-849510-28-8.
[15] Team, P. D. Pacemaker Main Page. 2015. http://clusterlabs.org/wiki/Main_Page