INFSCI 2711
Advanced Topics in
Database Management
Instructor:
Evgeny Karataev
The instructor
❖ Evgeny Karataev
❖ Where/how to find:
❖ In class: Tuesdays, 12:00 noon - 2:50 pm, IS 403
❖ Office Hours: online and by appointment
❖ email: epk8@pitt.edu
What this class is about…
❖ Prerequisites:
❖ You know what a relational database and a database
management system are (INFSCI 2710)
❖ Do you think that [one] [R]DBMS on a [single] machine is
enough to handle the volume, velocity and variety of
today’s data?
What this class is about…
❖ Topics that will be covered in this class:
❖ Data Integration (OLAP and Data Warehousing,
Virtual Data Integration)
❖ Distributed and Parallel Databases (including
distributed transactions and query execution)
❖ NoSQL databases, NewSQL databases (Main Memory
Databases)
❖ Cluster Computing (Hadoop and other animals, Spark)
The textbooks
❖ Too many to list here
❖ there is no single book that covers all topics
❖ so I will post selected chapters of books available
online, blog posts and research papers before or after
each lecture
Class components
❖ Lectures/Demos/Labs
❖ Homework Assignments
❖ Student DB tool overview presentations
❖ Term Research & Development Project
❖ Midterm exam
❖ Final exam
Lectures/Demos/Labs
❖ Lecture slides will usually be available online a day
before class.
❖ Sometimes you might be asked to bring your laptop to
class for lab work.
❖ Sometimes I will demo DB systems related to
the class material.
Homework Assignments
❖ So far I have planned 4 assignments, spread fairly evenly over
the semester. However, this might change to 3, 5 or 6.
❖ Assignments are usually very practical and are based on
the material covered in class. You might need to do
some programming.
❖ All assignments need to be submitted ONLINE
(assignments will have submission instructions)
❖ All assignments are group based (2 or 3 people per group)
Student DB tool overview presentations
❖ Each student (or possibly each group of 2 or 3) will give a
10-minute presentation about a database system sometime during
the term (I will provide the list of database systems). The
presentation must include:
❖ Architecture/Main idea/Main approach.
❖ Advantages and Disadvantages.
❖ How it differs from other systems.
❖ When it is applicable and when not.
❖ Where to learn more about it.
Term Research Project
❖ An original R&D project in groups of 5-6 people
❖ Most probably a lot of learning and programming
❖ In-class project progress reports/demos every other week (15
minutes max)
❖ One final written report
❖ One final demo
❖ Project ideas will be provided by me, but you are welcome to
propose yours
❖ Project development will be managed via GitHub
Exams
❖ Both the Midterm and Final exams are open notes, but no
computers or phones are allowed.
❖ Final exam is cumulative.
❖ No sample exam questions will be posted or
distributed.
Late Policy
❖ Homework and project reports are due at the beginning
of class on the due date. They can be turned in at the
following class for a 25% penalty.
Nothing will be accepted after that.
Grading
❖ This course is being offered for three credits. The
grading is as follows:
❖ Homework Assignments: 20%
❖ DB tool presentation: 10%
❖ Midterm exam: 20%
❖ Project: 25%
❖ Final exam: 25%
Class Q&A (and more) Management System
❖ This term we will be using Piazza for class discussion.
The system is designed to get you help quickly and
efficiently from classmates and from me. Rather than
emailing questions to me, I encourage you to post your
questions on Piazza. If you have any problems or
feedback for the developers, email team@piazza.com.
❖ Find our class page at:
https://piazza.com/pitt/spring2015/infsci2711/home
Piazza Demo
Extra credit toward your grade
❖ The top 5 most active users on Piazza will get 5 extra points
❖ Active users are those who:
❖ ask many GOOD questions
❖ answer questions posted by others (preferably
before I answer)
