MPP Database Systems
University Colegio Mayor de Cundinamarca
Prepared by: Andrés Alirio Barragán Idarraga
Jhonatan Yordano Lázaro Mejía
MPP database systems
Multiple sets of CPU and disk space
MPP (massively parallel processing) is a mature
and proven mechanism for storing and analyzing
large amounts of data.
It is equivalent to tens or hundreds of personal
computers, each of which holds a small piece
of a large data set. This allows a massive query
to execute quickly, since many smaller
independent queries run simultaneously
instead of a single large query.
An MPP database distributes the data across
independent nodes, each managed by its own processing
and central processing unit (CPU) resources.
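
A minimal sketch of this idea in Python, under stated assumptions: each "node" is a separate in-memory SQLite database standing in for an independent machine, rows are assigned to nodes by a simple modulo rule, and the `sales` table, `NUM_NODES`, and the function names are hypothetical. The per-node queries run one after another here; a real MPP system would run them at the same time.

```python
import sqlite3

NUM_NODES = 4  # hypothetical cluster size


def make_node():
    # Each node has its own private storage (here: an in-memory database).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (order_id INTEGER, amount REAL)")
    return conn


def distribute(rows, nodes):
    # Assign every row to exactly one node using a simple modulo rule.
    for order_id, amount in rows:
        node = nodes[order_id % len(nodes)]
        node.execute("INSERT INTO sales VALUES (?, ?)", (order_id, amount))


def total_sales(nodes):
    # Send the same small query to every node, then merge the partial results.
    partials = [
        node.execute(
            "SELECT COALESCE(SUM(amount), 0), COUNT(*) FROM sales"
        ).fetchone()
        for node in nodes
    ]
    total = sum(p[0] for p in partials)
    count = sum(p[1] for p in partials)
    return total, count


nodes = [make_node() for _ in range(NUM_NODES)]
rows = [(i, float(i)) for i in range(1, 101)]
distribute(rows, nodes)
total, count = total_sales(nodes)
print(total, count)
```

Each node only ever scans its own quarter of the rows, and the coordinator combines four small answers into the final one.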
MPP
Conceptually, it is the same as having
pieces of data loaded onto multiple
personal computers connected by a
network at the same site.
MPP is the coordinated
processing of a program by multiple
processors that work on different parts of
the program, with each of the
processors using its own operating
system and memory. Normally, MPP
processors communicate using some
messaging interface. In some
implementations, hundreds of
processors can work on the same
application.
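
The message-passing style described above can be sketched as follows. This is only an illustration: Python threads and `queue.Queue` stand in for separate processors and their messaging interface, whereas real MPP processors run on separate machines with their own operating system and memory. The message kinds (`"load"`, `"sum"`, `"stop"`) and the function names are hypothetical.

```python
import queue
import threading


def worker(inbox, outbox):
    local_rows = []  # private state: only this worker touches it
    while True:
        kind, payload = inbox.get()
        if kind == "load":
            local_rows.extend(payload)
        elif kind == "sum":
            outbox.put(sum(local_rows))
        elif kind == "stop":
            break


def run_cluster(data, num_workers=3):
    outbox = queue.Queue()
    inboxes = [queue.Queue() for _ in range(num_workers)]
    threads = [
        threading.Thread(target=worker, args=(inbox, outbox))
        for inbox in inboxes
    ]
    for t in threads:
        t.start()
    # The coordinator talks to the workers only through messages.
    for i, inbox in enumerate(inboxes):
        inbox.put(("load", data[i::num_workers]))
        inbox.put(("sum", None))
    partials = [outbox.get() for _ in range(num_workers)]
    for inbox in inboxes:
        inbox.put(("stop", None))
    for t in threads:
        t.join()
    return sum(partials)


result = run_cluster(list(range(1, 11)))
print(result)
```

No worker reads another worker's data directly; everything it knows arrives as a message, and everything it reports leaves as one.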
In essence, parallel processing is
becoming very important in the world of
database computing.
It involves taking a large task,
dividing it into smaller tasks, and then
working on each of these smaller
tasks simultaneously (Mahapatra,
2010). The objective of this "divide and
conquer" approach is to complete the
large, complex task in less time
than it would have taken to
complete it as a single unit.
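
As a rough illustration of "divide and conquer", the sketch below splits a list into chunks, counts matches in each chunk, and adds up the partial counts. The function names and the "multiple of 3" condition are made up for the example. It uses a thread pool only to show the structure; in standard CPython, threads generally do not speed up CPU-bound work like this, so no performance gain should be read into it.

```python
from concurrent.futures import ThreadPoolExecutor


def split(items, parts):
    # Ceiling division so that at most `parts` chunks are produced.
    size = max(1, -(-len(items) // parts))
    return [items[i:i + size] for i in range(0, len(items), size)]


def count_matches(chunk):
    # The small task: count the multiples of 3 in one chunk.
    return sum(1 for x in chunk if x % 3 == 0)


def parallel_count(items, parts=4):
    chunks = split(items, parts)
    with ThreadPoolExecutor(max_workers=parts) as pool:
        # Run the small tasks side by side, then combine their results.
        return sum(pool.map(count_matches, chunks))


numbers = list(range(1, 1001))
matches = parallel_count(numbers)
print(matches)
```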
Three reasons justify the use of parallel database
processing:
1 Need for increased speed and performance. Database sizes are
increasing, queries are becoming more complex (especially in data
warehouse systems), and database software must somehow cope
with the growing demands that derive from this complexity.
2 Need for scalability. Databases grow rapidly, and companies
need a way to scale (expand their services) easily and at
minimum cost.
3 Need for high availability. This refers to the need to keep a
database running with minimal or no downtime. Companies
need to serve users 24 hours a day, given the growing use of
fixed and mobile internet.
One known application of MPP is Kognitio WX2,
which uses the full power of commodity hardware to
provide fast access to massive volumes
of data without the need for any indexing technique.
Other typical applications of MPP are Oracle
Parallel Server and the Greenplum database.
MPP
MPP databases typically provide an SQL interface and a
relational database management system (RDBMS) that runs on
a cluster of servers connected to each other by a high-speed
interconnect. The figure shows the components of an
RDBMS that are normally included in SQL-on-Hadoop
solutions.
