DATABASE MANAGEMENT
MYTHS & REALITY
For The Future
By
Name: A B M Moniruzzaman, ID:121-25-238
Name: ID:
Name: ID:
High rate of data input
Big Data Challenges
This “Big Data” usually has one
or more of the following
characteristics:
Very Large Data Volumes –
Measured in terabytes or
petabytes
Variety – Structured and
Unstructured Data
High Velocity – Rapidly
Changing Data
Abstract.
The challenge of building consistent, available, and
scalable data management systems capable of serving
petabytes of data for millions of users as large internet
enterprises.
Though scalable data management has been a vision for
more than three decades and much research has focussed
on large scale data management in traditional enterprise
setting, cloud computing brings its own set of novel
challenges that must be addressed to ensure the success
of data management solutions in the cloud environment
in future.
Scalable Data Management
• Scalable database management systems
(DBMS)—both
• For update intensive application workloads as
well as decision support systems
• For descriptive and deep analytics—are a critical
part of the cloud infrastructure
• and play an important role in ensuring the
smooth transition of applications from the
traditional enterprise infrastructures to next
generation cloud infrastructures.
Different class of scalable
data management
Google’s
Bigtable [5], PNUTS [6] from
Yahoo!, Amazon’s Dynamo [7]
and other similar but
undocumented systems. All of
these systems deal with
petabytes of data, serve online
requests with stringent latency
and availability requirements,
accommodate erratic
workloads, and run on cluster
computing architectures; staking
claims to the territories
used to be occupied by database
systems.
Cloud Computing
Cloud Computing
• Cloud computing is an extremely successful
paradigm of service oriented computing, and has
revolutionized the way computing infrastructure
is abstracted and used. Three most popular cloud
paradigms include:
• Infrastructure as a Service (IaaS),
• Platform as a Service (PaaS), and
• Software as a Service (SaaS).
• The concept however can also be extended to
Database as a Service or Storage as a Service.
Cloud Computing Services
Cloud Database
Cloud Database System
• a system in the cloud must posses some features to
be able to effectively utilize the cloud economies.
These cloud features include:
• scalability
• elasticity
• fault-tolerance
• self-manageability
• ability to run on commodity hardware.
Most traditional relational database systems were designed
for enterprise infrastructures and hence were not designed to
meet all these goals. This calls for novel data management
systems for cloud infrastructures.
Scalable database system to supports large application with
lots of data and supporting hundreds of thousands of clients
Large data analysis tools
• Hadoop
• MapReduce
Open source tools like Hadoop and MapReduce
provide tremendous data processing power to big
data.
How Hadoop Map/Reduce works
Large multitenant system
• Another important domain for data
management in the cloud is the need to
support large number of applications. This is
referred to as a large multitenant system.
Multi-tenant Cloud Storage
Data Management for Large
Applications
REFERENCES
•
• [1] An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads
• http://www.cs.uwaterloo.ca/~kmsalem/courses/CS848W10/presentations/Chalamalla-HadoopDB.pdf
•
• [2] Divyakant Agrawal Sudipto Das Amr El Abbadi
• Department of Computer Science
• University of California, Santa Barbara
• Santa Barbara, CA 93106-5110, USA
• http://www.edbt.org/Proceedings/2011-Uppsala/papers/edbt/a50-agrawal.pdf
•
• [3] Big Data Challenge: Future is Right Here!
• http://www.e-zest.net/blog/big-data-challenge-future-is-right-here/
•
• [4] Hadoop and Big Data challenge
• http://www.apexcloud.com/blog/category/big-data
•
• [5] Big data and cloud computing: New wine or just new bottles?
• http://www.vldb2010.org/proceedings/files/papers/T02.pdf
•
• [6] C. Curino, E. Jones, Y. Zhang, E. Wu, and S. Madden. Relational
• Cloud: The Case for a Database Service. Technical Report 2010-14,
• CSAIL, MIT, 2010. http://hdl.handle.net/1721.1/52606.
•
• [7] D. Agrawal, A. El Abbadi, S. Antony, and S. Das. Data Management
• Challenges in Cloud Computing Infrastructures.
• http://www.just.edu.jo/~amerb/teaching/2-11-12/cs728/paper1.pdf
•
• [8] Bigtable: A Distributed Storage System for Structured Data.
• http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en//archive/bigtable-osdi06.pdf
•
• [9] J. Cohen, B. Dolan, M. Dunlap, J. M. Hellerstein, and C. Welton.
• Mad skills: New analysis practices for big data. PVLDB,
• http://db.cs.berkeley.edu/papers/vldb09-madskills.pdf
•
• [10] Divyakant Agrawal Sudipto Das Amr El Abbadi
• Department of Computer Science
• University of California, Santa Barbara
• Santa Barbara, CA 93106-5110, USA
• {agrawal, sudipto, amr}@cs.ucsb.edu
REFERENCES
• [11] Scalable Data Management in the Cloud: Research Challenges
• http://cse.vnit.ac.in/comad2010/pdf/Keynotes/Key%20Note%202.pdf
•
•
• [12] C. Curino, E. Jones, Y. Zhang, E. Wu, and S. Madden. Relational
• Cloud: The Case for a Database Service. Technical Report 2010-14,
• CSAIL, MIT, 2010. http://hdl.handle.net/1721.1/52606.
•
• [13] S. Das, D. Agrawal, and A. El Abbadi. ElasTraS: An Elastic
• Transactional Data Store in the Cloud. In USENIX HotCloud, 2009.
• http://www.cs.ucsb.edu/~sudipto/diss/Dissertation_Single.pdf
•
• [14] S. Das, D. Agrawal, and A. El Abbadi. G-Store: A Scalable Data
• Store for Transactional Multi key Access in the Cloud. In ACM
• SOCC, 2010.
•
• [15] D. Jacobs and S. Aulbach. Ruminations on multi-tenant databases. In
• BTW, pages 514–521, 2007.
• http://static.usenix.org/event/osdi04/tech/full_papers/dean/dean.pdf
•
• [16] T. Kraska, M. Hentschel, G. Alonso, and D. Kossmann. Consistency
• Rationing in the Cloud: Pay only when it matters. PVLDB,
• http://www.vldb.org/pvldb/2/vldb09-759.pdf
•
• [17] C. D. Weissman and S. Bobrowski. The design of the force.com
• multitenant internet application development platform. In SIGMOD,
• http://cloud.pubs.dbs.uni-leipzig.de/sites/cloud.pubs.dbs.uni-leipzig.de/files/p889-weissman-1.pdf
•
• [18] F. Yang, J. Shanmugasundaram, and R. Yerneni. A scalable data
• platform for a large number of small applications. In CIDR, 2009.
• https://database.cs.wisc.edu/cidr/cidr2009/Paper_17.pdf
•
• [19] S. Das, S. Agarwal, D. Agrawal, and A. El Abbadi. ElasTraS: An
• Elastic, Scalable, and Self Managing Transactional Database for the
• Cloud. Technical Report 2010-04, CS, UCSB, 2010.
•
• [20] S. Das, S. Nishimura, D. Agrawal, and A. El Abbadi. Live Database
• Migration for Elasticity in a Multitenant Database for Cloud
• Platforms. Technical Report 2010-09, CS, UCSB, 2010.
•

Database Management Myths & Reality for the future

  • 1.
    DATABASE MANAGEMENT MYTHS &REALITY For The Future By Name: A B M Moniruzzaman, ID:121-25-238 Name: ID: Name: ID:
  • 2.
    High rate ofdata input
  • 3.
    Big Data Challenges This“Big Data” usually has one or more of the following characteristics: Very Large Data Volumes – Measured in terabytes or petabytes Variety – Structured and Unstructured Data High Velocity – Rapidly Changing Data
  • 4.
    Abstract. The challenge ofbuilding consistent, available, and scalable data management systems capable of serving petabytes of data for millions of users as large internet enterprises. Though scalable data management has been a vision for more than three decades and much research has focussed on large scale data management in traditional enterprise setting, cloud computing brings its own set of novel challenges that must be addressed to ensure the success of data management solutions in the cloud environment in future.
  • 5.
    Scalable Data Management •Scalable database management systems (DBMS)—both • For update intensive application workloads as well as decision support systems • For descriptive and deep analytics—are a critical part of the cloud infrastructure • and play an important role in ensuring the smooth transition of applications from the traditional enterprise infrastructures to next generation cloud infrastructures.
  • 6.
    Different class ofscalable data management Google’s Bigtable [5], PNUTS [6] from Yahoo!, Amazon’s Dynamo [7] and other similar but undocumented systems. All of these systems deal with petabytes of data, serve online requests with stringent latency and availability requirements, accommodate erratic workloads, and run on cluster computing architectures; staking claims to the territories used to be occupied by database systems.
  • 7.
  • 8.
    Cloud Computing • Cloudcomputing is an extremely successful paradigm of service oriented computing, and has revolutionized the way computing infrastructure is abstracted and used. Three most popular cloud paradigms include: • Infrastructure as a Service (IaaS), • Platform as a Service (PaaS), and • Software as a Service (SaaS). • The concept however can also be extended to Database as a Service or Storage as a Service.
  • 9.
  • 10.
  • 11.
    Cloud Database System •a system in the cloud must posses some features to be able to effectively utilize the cloud economies. These cloud features include: • scalability • elasticity • fault-tolerance • self-manageability • ability to run on commodity hardware. Most traditional relational database systems were designed for enterprise infrastructures and hence were not designed to meet all these goals. This calls for novel data management systems for cloud infrastructures.
  • 12.
    Scalable database systemto supports large application with lots of data and supporting hundreds of thousands of clients
  • 13.
    Large data analysistools • Hadoop • MapReduce Open source tools like Hadoop and MapReduce provide tremendous data processing power to big data.
  • 14.
  • 15.
    Large multitenant system •Another important domain for data management in the cloud is the need to support large number of applications. This is referred to as a large multitenant system.
  • 16.
  • 17.
    Data Management forLarge Applications
  • 18.
    REFERENCES • • [1] AnArchitectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads • http://www.cs.uwaterloo.ca/~kmsalem/courses/CS848W10/presentations/Chalamalla-HadoopDB.pdf • • [2] Divyakant Agrawal Sudipto Das Amr El Abbadi • Department of Computer Science • University of California, Santa Barbara • Santa Barbara, CA 93106-5110, USA • http://www.edbt.org/Proceedings/2011-Uppsala/papers/edbt/a50-agrawal.pdf • • [3] Big Data Challenge: Future is Right Here! • http://www.e-zest.net/blog/big-data-challenge-future-is-right-here/ • • [4] Hadoop and Big Data challenge • http://www.apexcloud.com/blog/category/big-data • • [5] Big data and cloud computing: New wine or just new bottles? • http://www.vldb2010.org/proceedings/files/papers/T02.pdf • • [6] C. Curino, E. Jones, Y. Zhang, E. Wu, and S. Madden. Relational • Cloud: The Case for a Database Service. Technical Report 2010-14, • CSAIL, MIT, 2010. http://hdl.handle.net/1721.1/52606. • • [7] D. Agrawal, A. El Abbadi, S. Antony, and S. Das. Data Management • Challenges in Cloud Computing Infrastructures. • http://www.just.edu.jo/~amerb/teaching/2-11-12/cs728/paper1.pdf • • [8] Bigtable: A Distributed Storage System for Structured Data. • http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en//archive/bigtable-osdi06.pdf • • [9] J. Cohen, B. Dolan, M. Dunlap, J. M. Hellerstein, and C. Welton. • Mad skills: New analysis practices for big data. PVLDB, • http://db.cs.berkeley.edu/papers/vldb09-madskills.pdf • • [10] Divyakant Agrawal Sudipto Das Amr El Abbadi • Department of Computer Science • University of California, Santa Barbara • Santa Barbara, CA 93106-5110, USA • {agrawal, sudipto, amr}@cs.ucsb.edu
  • 19.
    REFERENCES • [11] ScalableData Management in the Cloud: Research Challenges • http://cse.vnit.ac.in/comad2010/pdf/Keynotes/Key%20Note%202.pdf • • • [12] C. Curino, E. Jones, Y. Zhang, E. Wu, and S. Madden. Relational • Cloud: The Case for a Database Service. Technical Report 2010-14, • CSAIL, MIT, 2010. http://hdl.handle.net/1721.1/52606. • • [13] S. Das, D. Agrawal, and A. El Abbadi. ElasTraS: An Elastic • Transactional Data Store in the Cloud. In USENIX HotCloud, 2009. • http://www.cs.ucsb.edu/~sudipto/diss/Dissertation_Single.pdf • • [14] S. Das, D. Agrawal, and A. El Abbadi. G-Store: A Scalable Data • Store for Transactional Multi key Access in the Cloud. In ACM • SOCC, 2010. • • [15] D. Jacobs and S. Aulbach. Ruminations on multi-tenant databases. In • BTW, pages 514–521, 2007. • http://static.usenix.org/event/osdi04/tech/full_papers/dean/dean.pdf • • [16] T. Kraska, M. Hentschel, G. Alonso, and D. Kossmann. Consistency • Rationing in the Cloud: Pay only when it matters. PVLDB, • http://www.vldb.org/pvldb/2/vldb09-759.pdf • • [17] C. D. Weissman and S. Bobrowski. The design of the force.com • multitenant internet application development platform. In SIGMOD, • http://cloud.pubs.dbs.uni-leipzig.de/sites/cloud.pubs.dbs.uni-leipzig.de/files/p889-weissman-1.pdf • • [18] F. Yang, J. Shanmugasundaram, and R. Yerneni. A scalable data • platform for a large number of small applications. In CIDR, 2009. • https://database.cs.wisc.edu/cidr/cidr2009/Paper_17.pdf • • [19] S. Das, S. Agarwal, D. Agrawal, and A. El Abbadi. ElasTraS: An • Elastic, Scalable, and Self Managing Transactional Database for the • Cloud. Technical Report 2010-04, CS, UCSB, 2010. • • [20] S. Das, S. Nishimura, D. Agrawal, and A. El Abbadi. Live Database • Migration for Elasticity in a Multitenant Database for Cloud • Platforms. Technical Report 2010-09, CS, UCSB, 2010. •