Lineage-based database
There is some code written already. I need it completed and debugged. Although the topic is
databases, it requires you to write in Python. Instructions are attached below in a doc file. The
professor is pretty open with how everything is implemented.
Info about L-Store
L-Store is a relational database. Simply put, data is stored in a table form consisting of rows
and columns. Each row of a table is a record (also called a tuple), and the columns hold the
attributes of each record. Each record is identified by a unique primary key that can be
referenced by other records to form relationships through foreign keys.
(S1) Data Model: The key idea of L-Store is to separate the original version of a record
inserted into the database (a base record) and the subsequent updates to it (tail records).
Records are stored in physical pages, where a page is a fixed-size contiguous
memory chunk, say 4 KB (you may experiment with larger page sizes and observe their effect
on performance). The base records are stored in read-only pages called base pages. Each
base page is associated with a set of append-only pages called tail pages that hold the
corresponding tail records; that is, any update to a record is appended to the tail pages
and maintained as a tail record. We will generalize how we associate base and tail
pages when we discuss page ranges.
Data storage in L-Store is columnar, meaning that instead of storing all fields of a record
contiguously, data from different records for the same column are stored together. Each
page is dedicated to a certain column. The idea behind this layout is that most update and
read operations will only affect a small set of columns; hence, separating the storage for
each column would reduce the amount of contention and I/O. Also, the data in each column
tends to be homogeneous, and the data on each page can be compressed more effectively.
As a result, the base page (or tail page) is a logical concept because, physically, each base
page (or tail page) consists of a set of physical pages (4 KB each, for example), one for each
column.
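The columnar layout can be sketched as follows. This is only an illustration: the names `LogicalPage` and `SLOTS_PER_PAGE` are made up here, and a plain Python list stands in for the bytes of each physical page.

```python
SLOTS_PER_PAGE = 512  # assumption: a 4 KB physical page holding 8-byte values


class LogicalPage:
    """Hypothetical logical base (or tail) page: one physical page per column."""

    def __init__(self, num_columns):
        # A plain list stands in for each physical page's bytes.
        self.physical_pages = [[] for _ in range(num_columns)]

    def num_records(self):
        return len(self.physical_pages[0])

    def has_capacity(self):
        return self.num_records() < SLOTS_PER_PAGE

    def append(self, values):
        """Store one record by writing each field to its own column page."""
        if len(values) != len(self.physical_pages):
            raise ValueError("expected exactly one value per column")
        if not self.has_capacity():
            raise OverflowError("logical page is full")
        slot = self.num_records()
        for page, value in zip(self.physical_pages, values):
            page.append(value)
        return slot

    def read_column(self, column):
        """Scan one column without touching the other columns' pages."""
        return list(self.physical_pages[column])

    def read_record(self, slot):
        """Reassemble a full record by visiting every column page."""
        return [page[slot] for page in self.physical_pages]
```

Reading a single column touches one physical page, while reassembling a whole record touches all of them, which is the trade-off the layout is built around.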
In the database, each record is assigned a unique identifier called a RID, which is often the
physical location of where the record is actually stored. In L-Store, this identifier will never
change during a record’s lifecycle. Each record also includes an indirection column that
points to the latest tail record holding the latest update to the record. When updating a
record, a new tail record is inserted in the corresponding tail pages, and the indirection
column of the base record is set to point to the RID of this new tail record. The tail record’s
own indirection is set to point to the RID of the previous tail record (the
previous update) for the same base record, if one exists.
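A minimal sketch of the indirection chain described above. Everything here is hypothetical: `TinyStore`, `Record`, and the `NULL_RID` sentinel are illustration names, and a single dictionary replaces the separate base and tail pages.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

NULL_RID = 0  # assumed sentinel meaning "points at nothing"


@dataclass
class Record:
    rid: int
    columns: List[Optional[int]]
    indirection: int = NULL_RID


class TinyStore:
    """Hypothetical store keeping base and tail records in one dict."""

    def __init__(self):
        self.records: Dict[int, Record] = {}
        self._next_rid = 1

    def _new_rid(self):
        rid = self._next_rid
        self._next_rid += 1
        return rid

    def insert(self, columns):
        rid = self._new_rid()
        self.records[rid] = Record(rid, list(columns))
        return rid

    def update(self, base_rid, columns):
        base = self.records[base_rid]
        tail_rid = self._new_rid()
        # The new tail record points back at the previous update (if any) ...
        self.records[tail_rid] = Record(tail_rid, list(columns), base.indirection)
        # ... and the base record now points at the newest tail record.
        base.indirection = tail_rid
        return tail_rid

    def lineage(self, base_rid):
        """Return the tail RIDs of a base record, newest first."""
        chain = []
        rid = self.records[base_rid].indirection
        while rid != NULL_RID:
            chain.append(rid)
            rid = self.records[rid].indirection
        return chain
```

The base RID returned by `insert` never changes; only the indirection values move as updates arrive.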
Tail records can be either cumulative or non-cumulative. A cumulative tail record will
contain the latest updated values for all columns while a non-cumulative one only includes
values of the updated columns and sets the values of other columns to a special NULL value.
The choice between cumulative or non-cumulative updates offers a trade-off between
update and read performance. For non-cumulative updates, the whole lineage (past
updates) needs to be traversed to get the latest values for all columns. This design might seem
inefficient in the sense that it needs to read multiple records to
yield all columns; however, in practice, the entire lineage is rarely traversed because most
queries need only specific columns. In your implementation, you may choose either option, or
you may experiment with both options and quantify the difference. In order to see a
difference, you need to insert/update many records, perhaps up to a few million records.
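To make the trade-off concrete, here is a small sketch of both options. It assumes `None` plays the role of the special NULL value and that tail records are handed over newest first; the function names are hypothetical.

```python
def latest_values(base_values, tails_newest_first):
    """Resolve current values by walking non-cumulative tail records.

    Each tail record is a list with None (standing in for NULL) in the
    columns it did not update. The walk stops early once every column
    has been resolved.
    """
    result = list(base_values)
    resolved = [False] * len(result)
    for tail in tails_newest_first:
        for i, value in enumerate(tail):
            if value is not None and not resolved[i]:
                result[i] = value
                resolved[i] = True
        if all(resolved):
            break
    return result


def make_cumulative(previous_full, updated):
    """Build a cumulative tail record by carrying forward untouched columns."""
    return [old if new is None else new for old, new in zip(previous_full, updated)]
```

With cumulative records the newest tail record alone answers a read, at the cost of copying every column on each update; with non-cumulative records the update is cheap and the read pays for the walk.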
Each base record also contains a schema encoding column. This is a bit vector with one bit
per column that stores information about the updated state of each column. In the base
records, the schema encoding will include a 0 bit for all the columns that have not yet been
updated and a 1 bit for those that have been updated. This helps optimize queries by
determining whether we need to follow the lineage or not. In non-cumulative tail records,
the schema encoding serves to distinguish between columns with updated values and those
with NULL values.
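One way to hold the schema encoding is a plain integer used as a bit vector. The sketch below assumes bit i corresponds to column i; the helper names are hypothetical.

```python
def schema_encoding(updated_columns, num_columns):
    """Return an int whose bit i is 1 when column i has been updated."""
    bits = 0
    for column in updated_columns:
        if not 0 <= column < num_columns:
            raise IndexError(f"column {column} out of range")
        bits |= 1 << column
    return bits


def is_updated(encoding, column):
    """True when the given column's bit is set."""
    return bool((encoding >> column) & 1)


def encoding_string(encoding, num_columns):
    """Render the bit vector with column 0 first."""
    return "".join(
        "1" if is_updated(encoding, c) else "0" for c in range(num_columns)
    )
```

A read that needs only columns whose bits are 0 in the base record can be served from the base page without following the indirection at all.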
L-Store also supports deleting records. When a record is deleted, the base record is
invalidated by setting its own RID and the RIDs of all its tail records to a special value. These
invalidated records will be removed during the next merge cycle for the corresponding page
range. The invalidation part needs to be implemented during this milestone. The removal of
invalidated records will be implemented in the merge routine of the next milestone.
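The invalidation step might look like the sketch below. The sentinels `INVALID_RID` and `NULL_RID` and the dictionary-based record layout are assumptions made for the example, not part of the provided skeleton.

```python
INVALID_RID = -1  # assumed special value marking a deleted record
NULL_RID = 0      # assumed sentinel for "no earlier tail record"


def invalidate(records, base_rid):
    """Mark a base record and every tail record in its lineage as deleted.

    `records` maps an original RID to a dict with "rid" and "indirection"
    fields. Nothing is physically removed here; that is left to the merge.
    Returns the number of records invalidated.
    """
    base = records[base_rid]
    count = 0
    rid = base["indirection"]
    while rid != NULL_RID:
        tail = records[rid]
        rid = tail["indirection"]  # read the link before moving on
        tail["rid"] = INVALID_RID
        count += 1
    base["rid"] = INVALID_RID
    return count + 1
```

Because the indirection values are left untouched, a later merge can still walk the chain to find everything that belongs to the deleted record.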
(S2) Bufferpool Management:
In this milestone, we have a simplified bufferpool because data resides only in memory and
is not backed by disk. To keep track of data, whether in memory or on disk, we require
a page directory that maps RIDs to pages in memory (or on disk) to allow fast retrieval of
records. Recall that records are stored in pages, and records are partitioned across page ranges.
Given a RID, the page directory returns the location of the record inside the page
within the page range. The efficiency of this data structure is a key factor in performance.
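A page directory can be as simple as a dictionary from RID to a location triple. The sketch below is one possible shape: the `Location` fields, the class name, and the arithmetic alternative with its default sizes are all assumptions.

```python
from typing import Dict, NamedTuple


class Location(NamedTuple):
    page_range: int  # which page range holds the record
    page: int        # which page inside that range
    slot: int        # which slot inside that page


class PageDirectory:
    """Hypothetical in-memory page directory: RID -> Location."""

    def __init__(self):
        self._entries: Dict[int, Location] = {}

    def register(self, rid, location):
        self._entries[rid] = location

    def locate(self, rid):
        """Constant-time lookup; raises KeyError for an unknown RID."""
        return self._entries[rid]


def location_from_rid(rid, slots_per_page=512, pages_per_range=16):
    """Alternative: derive the location arithmetically.

    Assumes RIDs start at 1 and are handed out densely in insertion order.
    """
    index = rid - 1
    page_number, slot = divmod(index, slots_per_page)
    page_range, page = divmod(page_number, pages_per_range)
    return Location(page_range, page, slot)
```

The dictionary costs memory per record but places no constraint on how RIDs are assigned; the arithmetic form needs no storage but only works when RIDs are dense and ordered.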
(S3) Query Interface:
We will require simple query capabilities in this milestone that provide standard SQL-like
functionalities, which are also similar to Key-Value Stores (NoSQL). For this milestone, you
need to provide select, insert, update, and delete of a single key along with a simple
aggregation query, namely, to return the summation of a single column for a range of keys.
#Implementation
You will find three main classes in the provided skeleton. Some of the needed methods in
each class are provided as stubs. But you must implement the APIs listed in db.py, query.py,
table.py, and index.py; you also need to ensure that you can run main.py to allow
auto-grading as well.
The Database class is a general interface to the database and handles high-level operations
such as starting and shutting down the database instance and loading the database from
stored disk files. This class also handles the creation and deletion of tables via the create
and drop functions. The create function will create a new table in the database. The Table
constructor takes as input the name of the table, the number of columns, and the index of
the key column. The drop function drops the specified table.
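As a rough sketch of the create and drop behaviour described above: the method names `create_table`, `drop_table`, and `get_table` are assumptions (the skeleton's db.py defines the real API), and `Table` here is only a placeholder.

```python
class Table:
    """Placeholder table that only remembers its shape."""

    def __init__(self, name, num_columns, key_index):
        if not 0 <= key_index < num_columns:
            raise ValueError("key_index must refer to an existing column")
        self.name = name
        self.num_columns = num_columns
        self.key_index = key_index


class Database:
    """Hypothetical in-memory database holding tables by name."""

    def __init__(self):
        self.tables = {}

    def create_table(self, name, num_columns, key_index):
        if name in self.tables:
            raise ValueError(f"table {name!r} already exists")
        table = Table(name, num_columns, key_index)
        self.tables[name] = table
        return table

    def drop_table(self, name):
        if name not in self.tables:
            raise KeyError(name)
        del self.tables[name]

    def get_table(self, name):
        return self.tables[name]
```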
The Query class provides standard SQL operations such as insert, select, update, delete, and
sum. The select function returns matching records based on the search key and projected
columns. The insert function inserts a new record in the table. The update function updates
values for specified columns. The delete function deletes a record with the specified key.
The sum function calculates the sum of selected column values for a range of records. Tables
are queried using direct function calls.
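The five operations can be illustrated with a toy that ignores pages entirely and keeps rows in a dictionary. The signatures below are assumptions made for the example; the required ones are those listed in query.py.

```python
class ToyTable:
    """Rows keyed by primary key; stands in for the real Table class."""

    def __init__(self, num_columns, key_index):
        self.num_columns = num_columns
        self.key_index = key_index
        self.rows = {}  # key -> list of column values


class ToyQuery:
    def __init__(self, table):
        self.table = table

    def insert(self, *columns):
        if len(columns) != self.table.num_columns:
            return False
        key = columns[self.table.key_index]
        if key in self.table.rows:
            return False  # primary keys must be unique
        self.table.rows[key] = list(columns)
        return True

    def select(self, key, projected):
        """Return matching records, keeping columns whose flag is truthy."""
        row = self.table.rows.get(key)
        if row is None:
            return []
        return [[v for v, keep in zip(row, projected) if keep]]

    def update(self, key, *columns):
        """None means "leave this column unchanged"."""
        row = self.table.rows.get(key)
        if row is None or len(columns) != self.table.num_columns:
            return False
        new_key = columns[self.table.key_index]
        if new_key is not None and new_key != key:
            return False  # this toy does not support changing the key
        for i, value in enumerate(columns):
            if value is not None:
                row[i] = value
        return True

    def delete(self, key):
        return self.table.rows.pop(key, None) is not None

    def sum(self, start, end, column):
        """Sum one column over all keys in the inclusive range [start, end]."""
        return sum(
            row[column] for k, row in self.table.rows.items() if start <= k <= end
        )
```

A real implementation would route each of these calls through the index and page directory instead of a dictionary of rows.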
The Table class provides the core relational storage functionality. All columns are 64-bit
integers. Tables manage storage and retrieval of data and are responsible for managing
pages and merging page ranges.
The Index class provides a data structure for fast processing of queries. The key column of
all tables is indexed by default. The API exposes create_index and drop_index functions
(optional).
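For the key column, one simple option is a sorted key list next to a dictionary, which supports both point lookups and the range scans that the sum query needs. The class and method names below are hypothetical; only the standard `bisect` module is used.

```python
import bisect


class KeyIndex:
    """Hypothetical index on the key column: key -> RID, with range scans."""

    def __init__(self):
        self._keys = []  # kept sorted for range queries
        self._rids = {}  # key -> RID for point lookups

    def insert(self, key, rid):
        if key in self._rids:
            raise KeyError(f"duplicate key {key}")
        bisect.insort(self._keys, key)
        self._rids[key] = rid

    def locate(self, key):
        """Return the RID for a key, or None if the key is absent."""
        return self._rids.get(key)

    def locate_range(self, begin, end):
        """Return RIDs for all keys in the inclusive range [begin, end]."""
        lo = bisect.bisect_left(self._keys, begin)
        hi = bisect.bisect_right(self._keys, end)
        return [self._rids[k] for k in self._keys[lo:hi]]

    def remove(self, key):
        rid = self._rids.pop(key, None)
        if rid is not None:
            del self._keys[bisect.bisect_left(self._keys, key)]
        return rid
```

Insertion into the sorted list is linear in the worst case, so a tree-based structure may be preferable once tables reach millions of records.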
The Page class provides low-level physical storage capabilities. Each page has a fixed size of
4096 bytes (4 KB) for optimal performance. This class is mostly used internally by the Table class for
storing and retrieving records.
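A byte-level page might be sketched as below, assuming a 4096-byte page (the 4 KB figure from the data model) and little-endian signed 64-bit values. The method names are assumptions rather than the skeleton's actual stubs.

```python
import struct

PAGE_SIZE = 4096                            # bytes, assumed page size
VALUE_FORMAT = "<q"                         # little-endian signed 64-bit integer
VALUE_SIZE = struct.calcsize(VALUE_FORMAT)  # 8 bytes per value


class Page:
    """Hypothetical fixed-size physical page holding 64-bit integers."""

    def __init__(self):
        self.num_records = 0
        self.data = bytearray(PAGE_SIZE)

    def has_capacity(self):
        return (self.num_records + 1) * VALUE_SIZE <= PAGE_SIZE

    def write(self, value):
        """Append one value and return the slot it was written to."""
        if not self.has_capacity():
            raise OverflowError("page is full")
        slot = self.num_records
        struct.pack_into(VALUE_FORMAT, self.data, slot * VALUE_SIZE, value)
        self.num_records += 1
        return slot

    def read(self, slot):
        if not 0 <= slot < self.num_records:
            raise IndexError("slot has not been written")
        return struct.unpack_from(VALUE_FORMAT, self.data, slot * VALUE_SIZE)[0]
```

With these sizes a page holds 512 values, so a table with millions of records needs thousands of pages per column.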
The config.py file stores all configuration options and constant values used in the code.

  • 1. Lineage-based database There is some code written already. I need it completed and debugged. Although the topic is database it requires you to write in Python. Instructions are attached below in doc file. The professor is pretty open with how everything is implemented. Info about L-store L-Store is a relational database. Simply put, data is stored in a table form consisting of rows and columns. Each row of a table is a record (also called a tuple), and the columns hold the attributes of each record. Each record is identified by a unique primary key that can be referenced by other records to form relationships through foreign keys. (S1) Data Model: The key idea of L-Store is to separate the original version of a record inserted into the database (a base record) and the subsequent updates to it (tail records). Records are stored in physical pages where a page is basically a fixed-size contiguous memory chunk, say 4 KB (you may experiment with larger page sizes and observe its effects on the performance). The base records are stored in read-only pages called base pages. Each base page is associated with a set of append-only pages called tail pages that will hold the corresponding tail records, namely, any updates to a record will be added to the tail pages and will be maintained as a tail record. We will generalize how we associate base and tail pages when we discuss page ranges. Data storage in L-Store is columnar, meaning that instead of storing all fields of a record contiguously, data from different records for the same column are stored together. Each page is dedicated to a certain column. The idea behind this layout is that most update and read operations will only affect a small set of columns; hence, separating the storage for each column would reduce the amount of contention and I/O. Also, the data in each column tends to be homogeneous, and the data on each page can be compressed more effectively. 
As a result, the base page (or tail page) is a logical concept because, physically, each base page (or tail page) consists of a set of physical pages (4 KB each, for example), one for each column.

In the database, each record is assigned a unique identifier called a RID, which is often the physical location where the record is actually stored. In L-Store, this identifier never changes during a record's lifecycle. Each record also includes an indirection column that points to the latest tail record holding the latest update to the record. When updating a record, a new tail record is inserted in the corresponding tail pages, and the indirection column of the base record is set to point to the RID of this new tail record. The tail record's own indirection is set to point to the RID of the previous tail record (the previous update) for the same base record, if available.

Tail records can be either cumulative or non-cumulative. A cumulative tail record contains the latest updated values for all columns, while a non-cumulative one only includes values of the updated columns and sets the values of other columns to a special NULL value. The choice between cumulative and non-cumulative updates offers a trade-off between update and read performance. For non-cumulative updates, the whole lineage (the past updates) needs to be traversed to get the latest values for all columns. This design might seem inefficient in the sense that it needs to read multiple records to yield all columns; however, in practice, the entire lineage may rarely be traversed, as most queries need only specific columns. In your implementation, you may choose either option, or you may experiment with both options and quantify the difference. In order to see a difference, you need to insert/update many records, perhaps up to a few million records.

Each base record also contains a schema encoding column. This is a bit vector with one bit per column that stores information about the updated state of each column. In the base records, the schema encoding will include a 0 bit for all the columns that have not yet been updated and a 1 bit for those that have been updated. This helps optimize queries by determining whether we need to follow the lineage or not. In non-cumulative tail records, the schema encoding serves to distinguish between columns with updated values and those with NULL values.

L-Store also supports deleting records.
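The indirection chain and schema encoding described above can be illustrated with a minimal sketch of non-cumulative updates. This is not the provided skeleton: the names `Record`, `LineageTable`, `update`, and `read_latest` are hypothetical, a plain dict stands in for base and tail pages, `None` plays the role of the special NULL value, and bit `i` of the schema encoding is assumed to correspond to column `i`.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

NULL = None  # assumption: stand-in for the special NULL value


@dataclass
class Record:
    rid: int
    indirection: Optional[int]      # RID of the next-older version, None if there is none
    schema_encoding: int            # bit i set => column i has an updated value
    columns: List[Optional[int]]


class LineageTable:
    """Toy in-memory table; a dict stands in for base and tail pages."""

    def __init__(self, num_columns: int) -> None:
        self.num_columns = num_columns
        self.records: Dict[int, Record] = {}
        self.next_rid = 0

    def _new_rid(self) -> int:
        rid = self.next_rid
        self.next_rid += 1
        return rid

    def insert(self, values: List[int]) -> int:
        """Create a base record with no updates yet."""
        rid = self._new_rid()
        self.records[rid] = Record(rid, None, 0, list(values))
        return rid

    def update(self, base_rid: int, new_values: List[Optional[int]]) -> int:
        """Append a non-cumulative tail record; NULL means 'column unchanged'."""
        base = self.records[base_rid]
        encoding = 0
        for i, value in enumerate(new_values):
            if value is not NULL:
                encoding |= 1 << i
        tail_rid = self._new_rid()
        # The new tail points at the previous tail (None on the first update).
        self.records[tail_rid] = Record(tail_rid, base.indirection, encoding,
                                        list(new_values))
        base.indirection = tail_rid        # base now points at the newest tail
        base.schema_encoding |= encoding   # remember which columns were ever updated
        return tail_rid

    def read_latest(self, base_rid: int) -> List[Optional[int]]:
        """Walk the lineage newest-first until every updated column is resolved."""
        base = self.records[base_rid]
        result = list(base.columns)
        pending = base.schema_encoding     # columns whose latest value lives in a tail
        rid = base.indirection
        while pending and rid is not None:
            tail = self.records[rid]
            hits = tail.schema_encoding & pending
            for i in range(self.num_columns):
                if hits & (1 << i):
                    result[i] = tail.columns[i]
            pending &= ~hits
            rid = tail.indirection
        return result
```

Because the base record's schema encoding is 0 for columns that were never updated, `read_latest` can skip the lineage entirely for untouched records and stop early once all updated columns have been found. A cumulative variant would copy the previous tail's values into each new tail, so a read would need only the newest tail record.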
When a record is deleted, the base record will be invalidated by setting the RID of itself and all its tail records to a special value. These invalidated records will be removed during the next merge cycle for the corresponding page range. The invalidation part needs to be implemented during this milestone. The removal of invalidated records will be implemented in the merge routine of the next milestone.

(S2) Bufferpool Management: In this milestone, we have a simplified bufferpool because data resides only in memory and is not backed by disk. To keep track of data, whether in memory or on disk, we need a page directory that maps RIDs to pages in memory (or on disk) to allow fast retrieval of records. Recall that records are stored in pages, and records are partitioned across page ranges. Given a RID, the page directory returns the location of a certain record inside the page within the page range. The efficiency of this data structure is a key factor in performance.

(S3) Query Interface: We will require simple query capabilities in this milestone that provide standard SQL-like functionalities, which are also similar to Key-Value Stores (NoSQL). For this milestone, you need to provide select, insert, update, and delete of a single key along with a simple aggregation query, namely, to return the summation of a single column for a range of keys.

#Implementation

You will find three main classes in the provided skeleton. Some of the needed methods in each class are provided as stubs. But you must implement the APIs listed in db.py, query.py, table.py, and index.py; you also need to ensure that you can run main.py to allow auto-grading as well.

The Database class is a general interface to the database and handles high-level operations such as starting and shutting down the database instance and loading the database from stored disk files. This class also handles the creation and deletion of tables via the create and drop functions. The create function will create a new table in the database. The Table constructor takes as input the name of the table, the number of columns, and the index of the key column. The drop function drops the specified table.

The Query class provides standard SQL operations such as insert, select, update, delete, and sum. The select function returns matching records based on the search key and projected columns. The insert function inserts a new record in the table. The update function updates values for specified columns. The delete function deletes a record with the specified key. The sum function calculates the sum of selected column values for a range of records. Tables are queried using direct function calls.

The Table class provides the core relational storage functionality. All columns are 64-bit integers. Tables manage storage and retrieval of data and are responsible for managing pages and merging page ranges.

The Index class provides a data structure for fast processing of queries. The key column of all tables is indexed by default. The API exposes create_index and drop_index functions (optional).

The Page class provides low-level physical storage capabilities. Each page has a fixed size of 4096 bytes (4 KB) for optimal performance. This class is mostly used internally by the Table class for storing and retrieving records.

The config.py file stores all configuration options and constant values used in the code.
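To make the Page class and the page directory more concrete, here is a minimal sketch, assuming a 4096-byte page that holds 64-bit integers (so 512 values per page) and a page directory that maps a RID to a (page index, slot) pair. The names `ColumnStore`, `has_capacity`, `write`, and `read` are illustrative assumptions and may not match the stubs in the provided skeleton; page ranges and tail pages are left out.

```python
PAGE_SIZE = 4096                           # bytes; assumed to match the 4 KB page in the text
VALUE_SIZE = 8                             # bytes per 64-bit integer
SLOTS_PER_PAGE = PAGE_SIZE // VALUE_SIZE   # 512 values per physical page


class Page:
    """A fixed-size physical page holding values of a single column."""

    def __init__(self):
        self.num_records = 0
        self.data = bytearray(PAGE_SIZE)

    def has_capacity(self):
        return self.num_records < SLOTS_PER_PAGE

    def write(self, value):
        """Append one 64-bit signed integer and return its slot number."""
        if not self.has_capacity():
            raise OverflowError("page is full")
        slot = self.num_records
        offset = slot * VALUE_SIZE
        self.data[offset:offset + VALUE_SIZE] = value.to_bytes(
            VALUE_SIZE, "little", signed=True)
        self.num_records += 1
        return slot

    def read(self, slot):
        if not 0 <= slot < self.num_records:
            raise IndexError("slot out of range")
        offset = slot * VALUE_SIZE
        return int.from_bytes(self.data[offset:offset + VALUE_SIZE],
                              "little", signed=True)


class ColumnStore:
    """One list of physical pages per column, plus a page directory."""

    def __init__(self, num_columns):
        self.columns = [[Page()] for _ in range(num_columns)]
        self.page_directory = {}           # RID -> (page index, slot)
        self.next_rid = 0

    def insert(self, values):
        if len(values) != len(self.columns):
            raise ValueError("wrong number of columns")
        # All columns fill in lockstep, so checking the first one is enough.
        if not self.columns[0][-1].has_capacity():
            for pages in self.columns:
                pages.append(Page())
        page_index = len(self.columns[0]) - 1
        slot = None
        for pages, value in zip(self.columns, values):
            slot = pages[-1].write(value)  # same slot in every column's page
        rid = self.next_rid
        self.next_rid += 1
        self.page_directory[rid] = (page_index, slot)
        return rid

    def read(self, rid, column):
        """Fetch a single column of a record without touching the others."""
        page_index, slot = self.page_directory[rid]
        return self.columns[column][page_index].read(slot)
```

A dict keeps the directory lookup constant-time on average, which matters because every select, update, and delete goes through it. Reading one column touches only that column's page, which is the benefit the columnar layout is meant to provide.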