Gray & Reuter Log
10a: 1
Log Manager
Jim Gray
Microsoft, Gray@Microsoft.com
Andreas Reuter
International University, Andreas.Reuter@i-u.de
        Mon        Tue           Wed          Thur             Fri
9:00    Overview   TP mons       Log          Files & Buffers  B-tree
11:00   Faults     Lock Theory   ResMgr       COM+             Access Paths
1:30    Tolerance  Lock Techniq  CICS & Inet  Corba            Groupware
3:30    T Models   Queues        Adv TM       Replication      Benchmark
7:00    Party      Workflow      Cyberbrick   Party
Log Concept
• Log is a history of all changes to the state.
• Log + old state gives new state
• Log + new state gives old state (not in this picture)
• Log is a sequential file.
• Complete log is the complete history
• Current state is just a "cache" of the log records.
[Figure: nightly batch runs. Sunday Master + Monday Transactions -> Monday Night Batch Run -> Monday Master; Monday Master + Tuesday Transactions -> Tuesday Night Batch Run -> Tuesday Master; Tuesday Master + Wednesday Transactions -> Wednesday Night Batch Run -> Wednesday Master. An Archive box is also shown.]
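The two "log + state" bullets can be illustrated with a small sketch. It assumes a hypothetical record layout (`change_rec`, holding both the old and the new value of one slot); the slides do not prescribe this layout, and a real log body is opaque to everyone but the resource manager that wrote it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical log record: slot `key` changed from old_val to new_val. */
typedef struct {
    int  key;
    long old_val;   /* value before the change (used for undo) */
    long new_val;   /* value after the change (used for redo)  */
} change_rec;

/* Redo: old state + log = new state. Records are applied oldest first. */
void redo_all(long state[], const change_rec log[], size_t n)
{
    assert(state != NULL);
    for (size_t i = 0; i < n; i++)
        state[log[i].key] = log[i].new_val;
}

/* Undo: new state + log = old state. Records are applied newest first. */
void undo_all(long state[], const change_rec log[], size_t n)
{
    assert(state != NULL);
    for (size_t i = n; i > 0; i--)
        state[log[i - 1].key] = log[i - 1].old_val;
}
```

Undo walks the log backward so that, when one slot was changed several times, its oldest value is the one restored last.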
How Log is Used
• Recovery from faults
A redundant copy of the state and transitions
• Security audits:
Who did what to whom.
Often too low-level for this.
• Performance Monitor & Accounting:
But only records changes (not reads).
• ISSUES: Who should be allowed to read the log?
It is a security hole.
Must authorize access on a per-record basis.
The Log Manager in the Scheme of Things
Interesting thing is the cycle:
Need log to recover archive to recover log.
Break the cycle with a bootstrap file.
[Figure: the log manager among its neighbors: SQL & other resource managers, transaction manager, lock manager, buffer manager, file manager, media manager, archive manager, and the operating system file system.]
Log Is a Sequential File
Encapsulation of the log: it is a shared resource.
Startup: Log manager holds startup info for all others.
Careful writes: Log manager provides a sequential file that is
• High performance
• Very reliable
• Semi-infinite
• Archived
Some RMs keep private logs anyway.
(Notably PORTABLE DB systems.)
Then user or system has to manage multiple logs
The Log Table
Log table is a sequential set (relation).
Log Records have standard part and then a log body.
Often want to query the table via one attribute or another: RMID, TRID, timestamp, ...
create domain LSN  unsigned integer(64); -- log sequence number (file #, rba)
create domain RMID unsigned integer;     -- resource manager identifier
create domain TRID char(12);             -- transaction identifier
create table log_table (
    lsn              LSN,        -- the record's log sequence number
    prev_lsn         LSN,        -- the lsn of the previous record in log
    timestamp        TIMESTAMP,  -- time log record was created
    resource_manager RMID,       -- resource mgr that wrote this record
    trid             TRID,       -- id of transaction that wrote this record
    tran_prev_lsn    LSN,        -- prev log record of this transaction (or 0)
    body             varchar,    -- log data: rm understands it
    primary key (lsn),           -- lsn is primary key
    foreign key (prev_lsn)       -- previous log record in this table
        references log_table(lsn),
    foreign key (tran_prev_lsn)  -- transaction's prev log rec also in table
        references log_table(lsn)
) entry sequenced;               -- inserts go at end of file
Log is complete history
Log anchor points at chain of each transaction.
May maintain other chains.
Log records map to sequence of N-plexed files
Old files are archived.
Eventually, archive files are discarded (weeks, months, never)
[Figure: the log anchor (trid, max_lsn, min_lsn...) points into the log table, whose records (lsn, prev_lsn, resource_mgr, trid, tran_prev_lsn, body) map onto the A files and B files; an Archive is shown alongside.]
The Log LSN
Each log record has a logical sequence number.
This number (LSN for Log Sequence Number) plays a
key role in many algorithms.
Key property MONOTONICITY:
If action A happened after action B then
LSN(A) > LSN(B).
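A minimal sketch of why a (file #, rba) pair gives monotonic LSNs: comparing the file number first and the byte address second orders records exactly as they were appended. The field names `file` and `rba` and the 32-bit widths are assumptions for illustration; the slides only say that an LSN is a 64-bit (file #, rba) value.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t file;  /* log file number (assumed field name)   */
    uint32_t rba;   /* relative byte address within that file */
} LSN;

/* Returns <0, 0, >0 as a is older than, equal to, or newer than b. */
int lsn_compare(LSN a, LSN b)
{
    if (a.file != b.file) return a.file < b.file ? -1 : 1;
    if (a.rba  != b.rba)  return a.rba  < b.rba  ? -1 : 1;
    return 0;
}

/* Pack into one 64-bit integer whose numeric order equals log order. */
uint64_t lsn_pack(LSN a)
{
    return ((uint64_t)a.file << 32) | a.rba;
}

/* LSN of the record that follows a record of rec_len bytes at cur
   (same file). It is always greater than cur. */
LSN lsn_advance(LSN cur, uint32_t rec_len)
{
    assert(rec_len > 0 && cur.rba <= UINT32_MAX - rec_len); /* no wraparound */
    cur.rba += rec_len;
    return cur;
}
```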
Reading The Log
long log_read_lsn( LSN lsn,                    /* lsn of record to be read              */
                   log_record_header * header, /* receives header fields of the record  */
                   long offset,                /* offset into body to start read        */
                   pointer buffer,             /* buffer to receive log data            */
                   long n);                    /* length of buffer                      */
LSN log_max_lsn(void);       /* returns the current maximum lsn of the log table. */
Read with C (see next slide) or SQL:
long sql_count( RMID rmid)                 /* count log records written by this rmid  */
{ long rec_count;                          /* count of records                        */
  exec sql SELECT count (*)                /* ask sql to scan log counting records    */
           INTO :rec_count                 /* written by the calling resource mgr and */
           FROM log_table                  /* place count in the rec_count            */
           WHERE resource_manager = :rmid; /*                                         */
  return rec_count;                        /* return the answer.                      */
}

Reading the Log: SQL is easier than C
long c_count( RMID rmid)                   /* count log records written by this rmid  */
{ log_record_header header;                /* structure to receive log record header  */
  LSN  lsn;                                /* log sequence number of next log rec     */
  char buffer[1];                          /* null buffer to receive log record body  */
  long rec_count = 0;                      /* count of records                        */
  long n = 1;                              /* size of log body returned               */
  if (!log_open(READ)) panic();            /* open the log (authorization check)      */
  lsn = log_max_lsn( );                    /* get most recent lsn                     */
  while (lsn != NullLSN)                   /* scan backward through the log           */
  { n = log_read_lsn( lsn,                 /* lsn of record to be read                */
                      &header,             /* log record header fields                */
                      0L, buffer, 1L );    /* log rec body ignored.                   */
    if (header.rmid == rmid)               /* if record written by this RMID then     */
      rec_count = rec_count + 1;           /* increment count                         */
    lsn = header.prev_lsn;                 /* go to previous LSN.                     */
  }                                        /* loop over LSNs                          */
  log_close( );                            /* close the log                           */
  return rec_count;                        /* return the answer.                      */
}
Writing The Log
Add a log record; the log manager fills in the header.
LSN log_insert( char * buffer, long n);
/* log body is buffer[0..n-1] */
Force log up to a certain LSN to persistent storage:
LSN log_flush( LSN lsn, Boolean lazy);
(lazy: wait for a batch write or a timeout, i.e. boxcar the request)
Note: many real interfaces allow some of:
empty buffer: to allow RM to fill it in (avoids data copies)
incremental copy: build the "buffer" in steps.
gather: take log data from many buffers.
Few offer SQL access to the log.
Summary Of Log Structure And Verbs
Operations: Open/Close
Read(LSN),
Insert(body),
Flush(LSN)
SQL read operations.
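[Figure: the log table (header, body) maps onto log pages in the A file and B file; each log page has a log page header. Pages up to the end of the durable log are in durable storage. Pages between there and the current end of log sit in the buffer pool and are written in the next write. An empty page waits in the buffer pool.]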
Log Anchor Logging and Locking
Log records never updated: only inserted and read.
So no locks needed on log.
Semaphore (or something) needed on "end" of log
to manage space/growth/LSN for inserts
typedef struct {
    filename   tablename;         /* name of log table                       */
    struct log_files files;       /* A & B file prefix names & active file # */
    xsemaphore lock;              /* semaphore regulates log write           */
    LSN        prev_lsn;          /* LSN of most recent write                */
    LSN        lsn;               /* LSN of next record                      */
    LSN        durable_lsn;       /* max lsn in durable storage              */
    LSN        TM_anchor_lsn;     /* lsn of trans mgr's last ckpt            */
    struct {                      /* array of open log parts                 */
        long partno;              /* partition number                        */
        int  os_fnum;             /* operating system file #                 */
    } part [MAXOPENS];
} log_anchor;
Making Optimistic Log Reads Work
Log is duplexed.
Log manager reads only one copy of the page.
What if the "other" copy has more data?
Trick:
read BOTH copies of FIRST and LAST page in log.
Other pages have "full" flag and a timestamp.
IF not full or timestamp < prev_timestamp THEN
read other page and take highest timestamp
Torn log pages
Log page consists of disk sectors (512B).
Write may only write some sectors.
How to detect missing fragments?
1. Checksum?
2. Byte stuffing: stuff a “parity” byte on each page
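Option 1 can be sketched as follows, under loud assumptions: a page of 8 sectors, a `log_page` layout invented for this example, and a toy additive checksum standing in for the CRC a real log would use. `torn_write` simulates a write in which only the first few sectors reach the disc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LOG_SECTOR_SIZE 512                   /* sector size from the slide  */
#define LOG_PAGE_SIZE   (8 * LOG_SECTOR_SIZE) /* assumed: 8 sectors per page */

typedef struct {
    uint32_t checksum;                                  /* covers payload only */
    uint8_t  payload[LOG_PAGE_SIZE - sizeof(uint32_t)];
} log_page;

/* Toy checksum: sum of the payload bytes. */
static uint32_t page_sum(const log_page *pg)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < sizeof pg->payload; i++)
        sum += pg->payload[i];
    return sum;
}

/* Stamp the page just before it is written. */
void page_seal(log_page *pg)
{
    assert(pg != NULL);
    pg->checksum = page_sum(pg);
}

/* After reading: nonzero if the stored checksum matches the payload. */
int page_is_intact(const log_page *pg)
{
    assert(pg != NULL);
    return pg->checksum == page_sum(pg);
}

/* Simulate a write that stops after `sectors` sectors reach the disc. */
void torn_write(log_page *disk, const log_page *fresh, size_t sectors)
{
    assert(sectors <= LOG_PAGE_SIZE / LOG_SECTOR_SIZE);
    memcpy(disk, fresh, sectors * LOG_SECTOR_SIZE);
}
```

A page that mixes sectors from the old and new versions fails the check, which tells the reader to fall back to the other copy of the duplexed log.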
Log Insert
Log semaphore covers
Incrementing LSN
Finding the log end
filling in the page(s)
allocating space on a page, perhaps allocating new pages.
LSN log_insert( char * buffer, long n)   /* insert a log record with body buffer[0..n-1] */
{ LSN  lsn;                               /* lsn of the new record                        */
  long rec_len = sizeof(log_record_header) + n; /* assumed: record is header plus body   */
  /* Acquire the log lock (an exclusive semaphore on the log) */
  Xsem_get(&log_anchor.lock);             /* lock the log end in exclusive mode           */
  lsn = log_anchor.lsn;                   /* make a copy of the record's lsn.             */
  /* find page and allocate space in it. */
  /* fill in log record header & body */
  /* update the anchors */
  log_anchor.prev_lsn = lsn;              /* prev_lsn now names the record just inserted  */
  log_anchor.lsn.rba = log_anchor.lsn.rba + rec_len; /* anchor lsn points past this record */
  Xsem_give(&log_anchor.lock);            /* unlock the log end                           */
  return lsn;                             /* return lsn of record just inserted           */
}
Log Write Demon
Log Semaphore can be a hotspot so: No IO under semaphore
Allocation (OS requests), and Archiving is done in advance.
Flush to persistent storage (disc) is done asynchronously.
Demons driven by timers and by events (requests)
Demons need not touch end-of-log semaphore
[Figure: application programs and resource managers call the log code, which works on log data in shared memory and on disc. One log daemon flushes (carefully writes) log pages as needed; another log daemon allocates new log files as needed.]
Careful Writes
If partial pages may be written then
subsequent write may invalidate previous write.
Standard technique:
Serial Writes: write one page then write the second page.
Problem: ~ 1/2 disc bandwidth, 2x delay.
Ping-Pong technique:
Never overwrite good page: Ping-Pong between I and I+1
When complete, assure that page I has final data
Never worse than serial write, generally 2x better.
Also note the careful techniques for optimistic reads and torn pages.
[Figure: parallel ping-pong writes of the new log data to disc pages i and i+1.]
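One way to read the ping-pong rule is sketched below. The in-memory `disk_slot` array, the use of a write counter as the timestamp, and the odd/even slot choice are all assumptions made for illustration. The point shown is that successive writes of the partially filled page i alternate between slots i and i+1, so the newest good copy is never the one being overwritten, and that a full page is forced into slot i.

```c
#include <assert.h>
#include <stdint.h>

#define NSLOTS 8   /* size of the toy disc */

typedef struct {
    uint32_t page_no;  /* logical log page stored in this slot          */
    uint32_t version;  /* write counter used as timestamp; 0 = unwritten */
} disk_slot;

/* Write version `version` (1, 2, 3, ...) of logical page `page_no`.
   Odd versions go to slot page_no, even versions to slot page_no + 1.
   Returns the number of physical writes issued. */
int pingpong_write(disk_slot disk[], uint32_t page_no,
                   uint32_t version, int page_full)
{
    assert(version > 0 && page_no + 1 < NSLOTS);
    uint32_t slot = page_no + ((version % 2 == 0) ? 1u : 0u);
    int writes = 1;
    disk[slot].page_no = page_no;
    disk[slot].version = version;
    if (page_full && slot != page_no) {  /* final data must end up in slot i */
        disk[page_no].page_no = page_no;
        disk[page_no].version = version;
        writes++;
    }
    return writes;
}

/* Reader: newest version of page_no among its two candidate slots,
   or 0 if neither slot holds that page. */
uint32_t pingpong_latest(const disk_slot disk[], uint32_t page_no)
{
    uint32_t best = 0;
    for (uint32_t s = page_no; s <= page_no + 1 && s < NSLOTS; s++)
        if (disk[s].page_no == page_no && disk[s].version > best)
            best = disk[s].version;
    return best;
}
```

Once page i is complete in slot i, slot i+1 only holds a stale copy, so the first write of page i+1 can safely reuse it.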
Group Commit (Boxcaring)
Batch processing of log writes.
If receive 1,000 log force requests/second,
why not execute just 50 writes/second, each covering 20 requests?
Response time will be the same (~20ms).
IOs will be 20x fewer
CPU will be ~ 10x smaller (10x fewer dispatches, 20x fewer OS IO).
Without it, systems are limited to about
50tps no ping-pong
100tps ping-pong.
With it, systems are limited to disc bandwidth >>10ktps.
Group commit threshold can be set automatically.
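The boxcar idea can be sketched as a small bookkeeping structure. The `group_commit` struct and both function names are hypothetical, LSNs are simplified to `long`, and the physical write is represented by a counter. Force requests only join the boxcar; one write then satisfies all of them.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    long durable_lsn;  /* highest LSN known to be on disc              */
    long pending_max;  /* highest LSN requested since the last write   */
    int  pending;      /* number of force requests waiting in the boxcar */
    long io_count;     /* physical log writes issued so far            */
} group_commit;

/* A transaction asks for the log to be forced up to `lsn`.
   The request is queued, no IO happens here. */
void gc_request(group_commit *g, long lsn)
{
    assert(g != NULL);
    if (lsn <= g->durable_lsn) return;   /* already durable */
    if (lsn > g->pending_max) g->pending_max = lsn;
    g->pending++;
}

/* Timer tick or threshold reached: one write covers every waiting
   request. Returns how many requests it satisfied. */
int gc_flush(group_commit *g)
{
    assert(g != NULL);
    if (g->pending == 0) return 0;       /* nothing to write */
    g->io_count++;                       /* stands in for one physical write */
    g->durable_lsn = g->pending_max;
    int served = g->pending;
    g->pending = 0;
    return served;
}
```

Calling `gc_flush` once per 20 requests turns the slide's 1,000 force requests into 50 writes.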
WADS- Giving the Log Disc Zero Latency
Log disc is dedicated, so only has rotational latency.
Reserve some cylinders on the disc as scratch.
For each write:
Write at current position on next track (zero latency).
When have a full-track (or two) of log data
consolidate the write in ram
do a single LARGE write (100KB = 1 rotation) to the log.
cost of this is seek + rotation ~ 20ms.
This reserved area is called the Write Ahead Data Set (WADS).
At restart:
read cylinders
gather recent log data
rewrite end of log.
RAID Write Cache makes this obsolete (if it works).
Log: Normal Use
Transaction UNDO During Normal Operation
Transaction log anchor: needed during normal operation
Points to most recent log rec of that transaction.
Follow the transaction prev_lsn chain.
EASY!
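A minimal sketch of following the chain, assuming a toy in-memory log where the record with LSN k sits at index k, LSNs and transaction ids simplified to `long`, and 0 as the end-of-chain marker (the log table definition says "or 0"). A real implementation would hand each record to its resource manager for undo instead of collecting LSNs.

```c
#include <assert.h>
#include <stddef.h>

#define NULL_LSN 0   /* marks the start of a transaction's chain */

typedef struct {
    long lsn;            /* this record's LSN                        */
    long trid;           /* owning transaction (simplified to a number) */
    long tran_prev_lsn;  /* previous record of the same transaction  */
} log_rec;

/* Walk one transaction's records from its anchor back to its first
   record. Visited LSNs are stored in out_lsns, newest first.
   Returns the number of records visited. */
size_t undo_scan(const log_rec log[], size_t log_len, long anchor_lsn,
                 long out_lsns[], size_t out_cap)
{
    size_t n = 0;
    long lsn = anchor_lsn;
    while (lsn != NULL_LSN && n < out_cap) {
        assert(lsn > 0 && (size_t)lsn < log_len);
        out_lsns[n++] = lsn;                   /* undo of this record goes here */
        assert(log[lsn].tran_prev_lsn < lsn);  /* chain only moves backward     */
        lsn = log[lsn].tran_prev_lsn;
    }
    return n;
}
```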
The Log Anchor: Where It All Starts
REDO/UNDO at System / RM Restart.
Need to bootstrap the most recent log state.
Log manager is the first to restart
Helps Transaction Manager recover
Transaction manager helps Resource managers recover.
Alternate design (each RM has its own log).
All this depends on rebuilding the log anchor.
[Figure: the log anchor points into the log at the transaction manager checkpoint record, which is linked to the resource manager checkpoint records and to the previous transaction manager checkpoint record.]
Preparing For Restart:
Careful Write of Log Anchor
Use the "standard" careful write techniques:
Put the anchor in a special well-known place(s)
Ping-Pong to 2 or more copies
Timestamp each copy
N-plex the copies on devices with independent failures.
Align copies so that writes are "atomic"
Accept most recent copy on pessimistic reads.
Now TM and RMs can bootstrap:
their anchors are in the log.
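The "accept most recent copy" step might look like the sketch below. The `anchor_copy` layout, the XOR validity stamp, and the `ANCHOR_MAGIC` constant are assumptions chosen for brevity; a real system would use a proper checksum. A pessimistic read examines every copy, discards those that fail validation (for example a copy torn by a crash in mid-write), and keeps the valid copy with the highest timestamp.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ANCHOR_MAGIC 0x4C4F47414E43484FULL  /* arbitrary constant */

typedef struct {
    uint64_t timestamp;   /* stamped on every careful write               */
    uint64_t anchor_lsn;  /* payload, e.g. LSN of the TM's last checkpoint */
    uint64_t check;       /* timestamp ^ anchor_lsn ^ ANCHOR_MAGIC         */
} anchor_copy;

/* Stamp a copy just before writing it. */
void anchor_seal(anchor_copy *a)
{
    assert(a != NULL);
    a->check = a->timestamp ^ a->anchor_lsn ^ ANCHOR_MAGIC;
}

static int anchor_valid(const anchor_copy *a)
{
    return a->check == (a->timestamp ^ a->anchor_lsn ^ ANCHOR_MAGIC);
}

/* Pessimistic read: look at all n copies, return the most recent
   valid one, or NULL if no copy is valid. */
const anchor_copy *anchor_pick(const anchor_copy copies[], size_t n)
{
    const anchor_copy *best = NULL;
    for (size_t i = 0; i < n; i++)
        if (anchor_valid(&copies[i]) &&
            (best == NULL || copies[i].timestamp > best->timestamp))
            best = &copies[i];
    return best;
}
```

Because the copies are written ping-pong, at most one of them can be damaged by a single interrupted write, so a valid older copy remains available.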
Finding the End of the Log
Find the anchor
If using WADS, go to the WADS area and write log end.
else scan forward from the most recent log-anchor lsn
Read optimistic all full pages.
At 1/2 full page or bad page read pessimistic.
Now have end-of-log.
Finish 1/2 finished record at end of log and give to TM
[Figure: log pages near the end of log, showing a half-finished record and an invalid page, and the same pages with the end of log re-established.]
Archiving The Log And "Old" Transactions
What if transaction/RM low water mark is 1-month old?
Abort?
Copy aside:
copy the undo/redo log records to a side file
Copy forward:
copy the undo/redo log records forward in the file.
Dynamic log:
copy undo records aside (so can online-undo if needed).
All advance the low water mark.
Archiving the Log Online
[Figure: staggered allocation of log tables on secondary storage. Log files 1, 2, and 3 are staggered across the log devices and copied to the archive.]
The Safety Spectrum
Just UNDO
transactional storage (no durable log)
Just Online Restart:
keep simplexed durable log.
Online plus Off-line Archive (no single point of failure):
periodic copies of data
duplex log
Electronic vaulting:
archive copies and duplexing is done to remote site.
via fast communications links (or Federal Express).
Multiple Logs?
Transaction Manager has a log (DECdtm, MS-DTC,…)
Transaction Monitor has a log (CICS, Tuxedo, ACMS,...)
Each DB instance (3 Oracle, 2 Informix, 4 Rdb) has a log.
Some have 3 logs: UNDO, REDO, SNAPSHOT.
Cons
Lots of tapes/files.
Lots of IOs at commit
Lots of things to break.
Pros:
Portable
Performance (in the 1 RM case)
You decide
Client/Server Logging
One server design (can be process pair)
Well known log server in the net.
Client sends a BATCH of log records to the server.
Gets back a LSN
Uses "local" LSNs for his objects.
Log servers can be N-plexed processes.
Multi-server design
Client forms a quorum (majority of servers).
Client sends log batch to all, gets back N-LSNs.
If less than majority, client must poll ALL N servers
Servers synchronize their "logical" logs as "sum" of
physical logs (need a majority).
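The majority rule in the multi-server design can be sketched with two hypothetical helpers. A batch counts as durable once more than half of the N servers have acknowledged it; since any two majorities share at least one server, a later read of a majority is guaranteed to meet a server that holds the batch.

```c
#include <assert.h>

/* Smallest number of servers that forms a majority of n_servers. */
int majority(int n_servers)
{
    assert(n_servers > 0);
    return n_servers / 2 + 1;
}

/* acks[i] is nonzero if server i acknowledged the log batch.
   Returns nonzero once a majority has acknowledged. */
int batch_is_durable(const int acks[], int n_servers)
{
    int got = 0;
    for (int i = 0; i < n_servers; i++)
        if (acks[i]) got++;
    return got >= majority(n_servers);
}
```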
Summary
• Log is a sequential file
• Contains entire history of DB
• Many tricks to write it efficiently and carefully
• Many tricks to archive and recover it

More Related Content

What's hot

The Ring programming language version 1.2 book - Part 15 of 84
The Ring programming language version 1.2 book - Part 15 of 84The Ring programming language version 1.2 book - Part 15 of 84
The Ring programming language version 1.2 book - Part 15 of 84
Mahmoud Samir Fayed
 
Commands...
Commands...Commands...
Commands...
BRIJESH SINGH
 
Microsoft F# and functional programming
Microsoft F# and functional programmingMicrosoft F# and functional programming
Microsoft F# and functional programming
Radek Mika
 
JDO 2019: Kubernetes logging techniques with a touch of LogSense - Marcin Stożek
JDO 2019: Kubernetes logging techniques with a touch of LogSense - Marcin StożekJDO 2019: Kubernetes logging techniques with a touch of LogSense - Marcin Stożek
JDO 2019: Kubernetes logging techniques with a touch of LogSense - Marcin Stożek
PROIDEA
 
Nested Locks in the Lock Implementation: The Real-Time Read-Write Semaphores ...
Nested Locks in the Lock Implementation: The Real-Time Read-Write Semaphores ...Nested Locks in the Lock Implementation: The Real-Time Read-Write Semaphores ...
Nested Locks in the Lock Implementation: The Real-Time Read-Write Semaphores ...
Daniel Bristot de Oliveira
 
Linuxppt
LinuxpptLinuxppt
LinuxpptReka
 
3.2 process text streams using filters
3.2 process text streams using filters3.2 process text streams using filters
3.2 process text streams using filters
Acácio Oliveira
 
Ora static and-dynamic-listener
Ora static and-dynamic-listenerOra static and-dynamic-listener
Ora static and-dynamic-listener
liu yulin
 
13 coms 525 tcpip - applications - file transfer protocol
13   coms 525 tcpip - applications - file transfer protocol13   coms 525 tcpip - applications - file transfer protocol
13 coms 525 tcpip - applications - file transfer protocol
Palanivel Kuppusamy
 
Chapter 5 notes
Chapter 5 notesChapter 5 notes
Chapter 5 notes
HarshitParkar6677
 
Show an FSM for an implementation of readLine() that accepts lines that satis...
Show an FSM for an implementation of readLine() that accepts lines that satis...Show an FSM for an implementation of readLine() that accepts lines that satis...
Show an FSM for an implementation of readLine() that accepts lines that satis...
hwbloom111
 
Comparison of Unix and Linux Log File Management Tools by Dusan Baljevic
Comparison of Unix and Linux Log File Management Tools by Dusan BaljevicComparison of Unix and Linux Log File Management Tools by Dusan Baljevic
Comparison of Unix and Linux Log File Management Tools by Dusan Baljevic
Circling Cycle
 
Linux commands
Linux commandsLinux commands
Linux commands
Ajaigururaj R
 
patelchodu
patelchodupatelchodu
patelchodu
sammyy502
 
VTU 3RD SEM UNIX AND SHELL PROGRAMMING SOLVED PAPERS
VTU 3RD SEM UNIX AND SHELL PROGRAMMING SOLVED PAPERSVTU 3RD SEM UNIX AND SHELL PROGRAMMING SOLVED PAPERS
VTU 3RD SEM UNIX AND SHELL PROGRAMMING SOLVED PAPERS
vtunotesbysree
 

What's hot (19)

The Ring programming language version 1.2 book - Part 15 of 84
The Ring programming language version 1.2 book - Part 15 of 84The Ring programming language version 1.2 book - Part 15 of 84
The Ring programming language version 1.2 book - Part 15 of 84
 
Commands...
Commands...Commands...
Commands...
 
Microsoft F# and functional programming
Microsoft F# and functional programmingMicrosoft F# and functional programming
Microsoft F# and functional programming
 
JDO 2019: Kubernetes logging techniques with a touch of LogSense - Marcin Stożek
JDO 2019: Kubernetes logging techniques with a touch of LogSense - Marcin StożekJDO 2019: Kubernetes logging techniques with a touch of LogSense - Marcin Stożek
JDO 2019: Kubernetes logging techniques with a touch of LogSense - Marcin Stożek
 
Linuxppt
LinuxpptLinuxppt
Linuxppt
 
Linuxppt
LinuxpptLinuxppt
Linuxppt
 
Ipc in linux
Ipc in linuxIpc in linux
Ipc in linux
 
Nested Locks in the Lock Implementation: The Real-Time Read-Write Semaphores ...
Nested Locks in the Lock Implementation: The Real-Time Read-Write Semaphores ...Nested Locks in the Lock Implementation: The Real-Time Read-Write Semaphores ...
Nested Locks in the Lock Implementation: The Real-Time Read-Write Semaphores ...
 
Linuxppt
LinuxpptLinuxppt
Linuxppt
 
3.2 process text streams using filters
3.2 process text streams using filters3.2 process text streams using filters
3.2 process text streams using filters
 
F T P
F T PF T P
F T P
 
Ora static and-dynamic-listener
Ora static and-dynamic-listenerOra static and-dynamic-listener
Ora static and-dynamic-listener
 
13 coms 525 tcpip - applications - file transfer protocol
13   coms 525 tcpip - applications - file transfer protocol13   coms 525 tcpip - applications - file transfer protocol
13 coms 525 tcpip - applications - file transfer protocol
 
Chapter 5 notes
Chapter 5 notesChapter 5 notes
Chapter 5 notes
 
Show an FSM for an implementation of readLine() that accepts lines that satis...
Show an FSM for an implementation of readLine() that accepts lines that satis...Show an FSM for an implementation of readLine() that accepts lines that satis...
Show an FSM for an implementation of readLine() that accepts lines that satis...
 
Comparison of Unix and Linux Log File Management Tools by Dusan Baljevic
Comparison of Unix and Linux Log File Management Tools by Dusan BaljevicComparison of Unix and Linux Log File Management Tools by Dusan Baljevic
Comparison of Unix and Linux Log File Management Tools by Dusan Baljevic
 
Linux commands
Linux commandsLinux commands
Linux commands
 
patelchodu
patelchodupatelchodu
patelchodu
 
VTU 3RD SEM UNIX AND SHELL PROGRAMMING SOLVED PAPERS
VTU 3RD SEM UNIX AND SHELL PROGRAMMING SOLVED PAPERSVTU 3RD SEM UNIX AND SHELL PROGRAMMING SOLVED PAPERS
VTU 3RD SEM UNIX AND SHELL PROGRAMMING SOLVED PAPERS
 

Viewers also liked

Transactions
TransactionsTransactions
Transactions
ashish61_scs
 
05 tp mon_orbs
05 tp mon_orbs05 tp mon_orbs
05 tp mon_orbs
ashish61_scs
 
21 domino mohan-1
21 domino mohan-121 domino mohan-1
21 domino mohan-1
ashish61_scs
 
Dynamic Power of QFT - Copy
Dynamic Power of QFT - CopyDynamic Power of QFT - Copy
Dynamic Power of QFT - CopyJacques Benoit
 
Assignment1.2012
Assignment1.2012Assignment1.2012
Assignment1.2012
ashish61_scs
 
4G MOBILE COMMUNICATION SYSTEM
4G MOBILE COMMUNICATION SYSTEM4G MOBILE COMMUNICATION SYSTEM
4G MOBILE COMMUNICATION SYSTEM
lalit kumar
 
03 fault model
03 fault model03 fault model
03 fault model
ashish61_scs
 
15 bufferand records
15 bufferand records15 bufferand records
15 bufferand records
ashish61_scs
 
UPDATED INTERVIEW GUIDE
UPDATED INTERVIEW GUIDEUPDATED INTERVIEW GUIDE
UPDATED INTERVIEW GUIDEKhang Vang
 
Emerging Trends and Tech
Emerging Trends and TechEmerging Trends and Tech
Emerging Trends and Tech
Quick Marketing
 

Viewers also liked (13)

Transactions
TransactionsTransactions
Transactions
 
05 tp mon_orbs
05 tp mon_orbs05 tp mon_orbs
05 tp mon_orbs
 
Solution5.2012
Solution5.2012Solution5.2012
Solution5.2012
 
21 domino mohan-1
21 domino mohan-121 domino mohan-1
21 domino mohan-1
 
Dynamic Power of QFT - Copy
Dynamic Power of QFT - CopyDynamic Power of QFT - Copy
Dynamic Power of QFT - Copy
 
pics
picspics
pics
 
Assignment1.2012
Assignment1.2012Assignment1.2012
Assignment1.2012
 
MarkH-CV 2015
MarkH-CV 2015MarkH-CV 2015
MarkH-CV 2015
 
4G MOBILE COMMUNICATION SYSTEM
4G MOBILE COMMUNICATION SYSTEM4G MOBILE COMMUNICATION SYSTEM
4G MOBILE COMMUNICATION SYSTEM
 
03 fault model
03 fault model03 fault model
03 fault model
 
15 bufferand records
15 bufferand records15 bufferand records
15 bufferand records
 
UPDATED INTERVIEW GUIDE
UPDATED INTERVIEW GUIDEUPDATED INTERVIEW GUIDE
UPDATED INTERVIEW GUIDE
 
Emerging Trends and Tech
Emerging Trends and TechEmerging Trends and Tech
Emerging Trends and Tech
 

Similar to 10a log

Application Logging in the 21st century - 2014.key
Application Logging in the 21st century - 2014.keyApplication Logging in the 21st century - 2014.key
Application Logging in the 21st century - 2014.key
Tim Bunce
 
[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...
[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...
[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...
Insight Technology, Inc.
 
File system is full - what do i do
File system is full - what do i doFile system is full - what do i do
File system is full - what do i do
Nizar Fanany
 
Inside database
Inside databaseInside database
Inside database
Takashi Hoshino
 
Bluestore
BluestoreBluestore
Bluestore
Patrick McGarry
 
Bluestore
BluestoreBluestore
Bluestore
Ceph Community
 
Sql introduction
Sql introductionSql introduction
Sql introduction
vimal_guru
 
Designing Tracing Tools
Designing Tracing ToolsDesigning Tracing Tools
Designing Tracing Tools
Sysdig
 
Kafka Tiered Storage | Satish Duggana and Sriharsha Chintalapani, Uber
Kafka Tiered Storage | Satish Duggana and Sriharsha Chintalapani, UberKafka Tiered Storage | Satish Duggana and Sriharsha Chintalapani, Uber
Kafka Tiered Storage | Satish Duggana and Sriharsha Chintalapani, Uber
HostedbyConfluent
 
Logging, Serilog, Structured Logging, Seq
Logging, Serilog, Structured Logging, SeqLogging, Serilog, Structured Logging, Seq
Logging, Serilog, Structured Logging, Seq
Doruk Uluçay
 
Profiling the logwriter and database writer
Profiling the logwriter and database writerProfiling the logwriter and database writer
Profiling the logwriter and database writerEnkitec
 
Logging Last Resource Optimization for Distributed Transactions in Oracle We...
Logging Last Resource Optimization for Distributed Transactions in  Oracle We...Logging Last Resource Optimization for Distributed Transactions in  Oracle We...
Logging Last Resource Optimization for Distributed Transactions in Oracle We...Gera Shegalov
 
Transaction unit 1 topic 4
Transaction unit 1 topic 4Transaction unit 1 topic 4
Transaction unit 1 topic 4avniS
 
Example Stream Setup
Example  Stream  SetupExample  Stream  Setup
Example Stream Setup
cfministries
 
Trouble shoot with linux syslog
Trouble shoot with linux syslogTrouble shoot with linux syslog
Trouble shoot with linux syslogashok191
 
Elk stack @inbot
Elk stack @inbotElk stack @inbot
Elk stack @inbot
Jilles van Gurp
 
Boost Your Environment With XMLDB - UKOUG 2008 - Marco Gralike
Boost Your Environment With XMLDB - UKOUG 2008 - Marco GralikeBoost Your Environment With XMLDB - UKOUG 2008 - Marco Gralike
Boost Your Environment With XMLDB - UKOUG 2008 - Marco GralikeMarco Gralike
 
Tuning Solr and its Pipeline for Logs: Presented by Rafał Kuć & Radu Gheorghe...
Tuning Solr and its Pipeline for Logs: Presented by Rafał Kuć & Radu Gheorghe...Tuning Solr and its Pipeline for Logs: Presented by Rafał Kuć & Radu Gheorghe...
Tuning Solr and its Pipeline for Logs: Presented by Rafał Kuć & Radu Gheorghe...
Lucidworks
 

Similar to 10a log (20)

Application Logging in the 21st century - 2014.key
Application Logging in the 21st century - 2014.keyApplication Logging in the 21st century - 2014.key
Application Logging in the 21st century - 2014.key
 
[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...
[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...
[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...
 
File system is full - what do i do
File system is full - what do i doFile system is full - what do i do
File system is full - what do i do
 
Ui disk & terminal drivers
Ui disk & terminal driversUi disk & terminal drivers
Ui disk & terminal drivers
 
Aries
AriesAries
Aries
 
Inside database
Inside databaseInside database
Inside database
 
Bluestore
BluestoreBluestore
Bluestore
 
Bluestore
BluestoreBluestore
Bluestore
 
Sql introduction
Sql introductionSql introduction
Sql introduction
 
Designing Tracing Tools
Designing Tracing ToolsDesigning Tracing Tools
Designing Tracing Tools
 
Kafka Tiered Storage | Satish Duggana and Sriharsha Chintalapani, Uber
Kafka Tiered Storage | Satish Duggana and Sriharsha Chintalapani, UberKafka Tiered Storage | Satish Duggana and Sriharsha Chintalapani, Uber
Kafka Tiered Storage | Satish Duggana and Sriharsha Chintalapani, Uber
 
Logging, Serilog, Structured Logging, Seq
Logging, Serilog, Structured Logging, SeqLogging, Serilog, Structured Logging, Seq
Logging, Serilog, Structured Logging, Seq
 
Profiling the logwriter and database writer
Profiling the logwriter and database writerProfiling the logwriter and database writer
Profiling the logwriter and database writer
 
Logging Last Resource Optimization for Distributed Transactions in Oracle We...
Logging Last Resource Optimization for Distributed Transactions in  Oracle We...Logging Last Resource Optimization for Distributed Transactions in  Oracle We...
Logging Last Resource Optimization for Distributed Transactions in Oracle We...
 
Transaction unit 1 topic 4
Transaction unit 1 topic 4Transaction unit 1 topic 4
Transaction unit 1 topic 4
 
Example Stream Setup
Example  Stream  SetupExample  Stream  Setup
Example Stream Setup
 
Trouble shoot with linux syslog
Trouble shoot with linux syslogTrouble shoot with linux syslog
Trouble shoot with linux syslog
 
Elk stack @inbot
Elk stack @inbotElk stack @inbot
Elk stack @inbot
 
Boost Your Environment With XMLDB - UKOUG 2008 - Marco Gralike
Boost Your Environment With XMLDB - UKOUG 2008 - Marco GralikeBoost Your Environment With XMLDB - UKOUG 2008 - Marco Gralike
Boost Your Environment With XMLDB - UKOUG 2008 - Marco Gralike
 
Tuning Solr and its Pipeline for Logs: Presented by Rafał Kuć & Radu Gheorghe...
Tuning Solr and its Pipeline for Logs: Presented by Rafał Kuć & Radu Gheorghe...Tuning Solr and its Pipeline for Logs: Presented by Rafał Kuć & Radu Gheorghe...
Tuning Solr and its Pipeline for Logs: Presented by Rafał Kuć & Radu Gheorghe...
 

More from ashish61_scs

7 concurrency controltwo
7 concurrency controltwo7 concurrency controltwo
7 concurrency controltwo
ashish61_scs
 
22 levine
22 levine22 levine
22 levine
ashish61_scs
 
20 access paths
20 access paths20 access paths
20 access paths
ashish61_scs
 
19 structured files
19 structured files19 structured files
19 structured files
ashish61_scs
 
18 philbe replication stanford99
18 philbe replication stanford9918 philbe replication stanford99
18 philbe replication stanford99
ashish61_scs
 
17 wics99 harkey
17 wics99 harkey17 wics99 harkey
17 wics99 harkey
ashish61_scs
 
16 greg hope_com_wics
16 greg hope_com_wics16 greg hope_com_wics
16 greg hope_com_wics
ashish61_scs
 
14 turing wics
14 turing wics14 turing wics
14 turing wics
ashish61_scs
 
14 scaleabilty wics
14 scaleabilty wics14 scaleabilty wics
14 scaleabilty wics
ashish61_scs
 
13 tm adv
13 tm adv13 tm adv
13 tm adv
ashish61_scs
 
11 tm
11 tm11 tm
10b rm
10b rm10b rm
10b rm
ashish61_scs
 
09 workflow
09 workflow09 workflow
09 workflow
ashish61_scs
 
08 message and_queues_dieter_gawlick
08 message and_queues_dieter_gawlick08 message and_queues_dieter_gawlick
08 message and_queues_dieter_gawlick
ashish61_scs
 
06 07 lock
06 07 lock06 07 lock
06 07 lock
ashish61_scs
 
04 transaction models
04 transaction models04 transaction models
04 transaction models
ashish61_scs
 
02 fault tolerance
02 fault tolerance02 fault tolerance
02 fault tolerance
ashish61_scs
 
01 whirlwind tour
01 whirlwind tour01 whirlwind tour
01 whirlwind tour
ashish61_scs
 

More from ashish61_scs (20)

7 concurrency controltwo
7 concurrency controltwo7 concurrency controltwo
7 concurrency controltwo
 
22 levine
22 levine22 levine
22 levine
 
20 access paths
20 access paths20 access paths
20 access paths
 
19 structured files
19 structured files19 structured files
19 structured files
 
18 philbe replication stanford99
18 philbe replication stanford9918 philbe replication stanford99
18 philbe replication stanford99
 
17 wics99 harkey
17 wics99 harkey17 wics99 harkey
17 wics99 harkey
 
16 greg hope_com_wics
16 greg hope_com_wics16 greg hope_com_wics
16 greg hope_com_wics
 
14 turing wics
14 turing wics14 turing wics
14 turing wics
 
14 scaleabilty wics
14 scaleabilty wics14 scaleabilty wics
14 scaleabilty wics
 
13 tm adv
13 tm adv13 tm adv
13 tm adv
 
11 tm
11 tm11 tm
11 tm
 
10b rm
10b rm10b rm
10b rm
 
09 workflow
09 workflow09 workflow
09 workflow
 
08 message and_queues_dieter_gawlick
08 message and_queues_dieter_gawlick08 message and_queues_dieter_gawlick
08 message and_queues_dieter_gawlick
 
06 07 lock
06 07 lock06 07 lock
06 07 lock
 
04 transaction models
04 transaction models04 transaction models
04 transaction models
 
02 fault tolerance
02 fault tolerance02 fault tolerance
02 fault tolerance
 
01 whirlwind tour
01 whirlwind tour01 whirlwind tour
01 whirlwind tour
 
Solution6.2012
Solution6.2012Solution6.2012
Solution6.2012
 
Solution7.2012
Solution7.2012Solution7.2012
Solution7.2012
 


10a log

  • 1. Gray & Reuter Log 10a: 1 Log Manager. Jim Gray (Microsoft, Gray @ Microsoft.com), Andreas Reuter (International University, Andreas.Reuter@i-u.de). Course schedule (sessions at 9:00, 11:00, 1:30, 3:30, 7:00): Mon: Overview, Faults, Tolerance, T Models, Party. Tue: TP mons, Lock Theory, Lock Techniq, Queues, Workflow. Wed: Log, ResMgr, CICS & Inet, Adv TM, Cyberbrick. Thur: Files & Buffers, COM+, Corba, Replication, Party. Fri: B-tree, Access Paths, Groupware, Benchmark.
  • 2. Gray & Reuter Log 10a: 2 Log Concept • Log is a history of all changes to the state. • Log + old state gives new state • Log + new state gives old state (not in this picture) • Log is a sequential file. • Complete log is the complete history • Current state is just a "cache" of the log records. [Figure: Sunday Master plus Monday Transactions go through the Monday night batch run to give Monday Master; the same step repeats for Tuesday and Wednesday, and old masters go to the Archive.]
  • 3. Gray & Reuter Log 10a: 3 How Log is Used • Recovery from faults: a redundant copy of the state and transitions. • Security audits: who did what to whom. Often too low-level for this. • Performance Monitor & Accounting: but only records changes (not reads). • ISSUES: Who should be allowed to read the log? It is a security hole. Must authorize access on a per-record basis.
  • 4. Gray & Reuter Log 10a: 4 The Log Manager in the Scheme of Things Interesting thing is the cycle: Need log to recover archive to recover log. Break the cycle with a bootstrap file. [Figure: Log Manager shown among the Transaction Manager, Lock Manager, Buffer Manager, Media Manager, File Manager, Archive Manager, SQL & other Resource Managers, and the Operating System File System.]
  • 5. Gray & Reuter Log 10a: 5 Log Is a Sequential File. Encapsulation of the log: it is a shared resource. Startup: Log manager holds startup info for all others. Careful writes: Log manager provides a high-performance, very reliable, semi-infinite, archived sequential file. Some RMs keep private logs anyway (notably PORTABLE DB systems). Then the user or system has to manage multiple logs.
  • 6. Gray & Reuter Log 10a: 6 The Log Table Log table is a sequential set (relation). Log records have a standard part and then a log body. Often want to query the table via one attribute or another: RMID, TRID, timestamp.
      create domain LSN  unsigned integer(64);   -- log sequence number (file #, rba)
      create domain RMID unsigned integer;       -- resource manager identifier
      create domain TRID char(12);               -- transaction identifier
      create table log_table (
          lsn              LSN,                  -- the record's log sequence number
          prev_lsn         LSN,                  -- the lsn of the previous record in log
          timestamp        TIMESTAMP,            -- time log record was created
          resource_manager RMID,                 -- resource mgr that wrote this record
          trid             TRID,                 -- id of transaction that wrote this record
          tran_prev_lsn    LSN,                  -- prev log record of this transaction (or 0)
          body             varchar,              -- log data: rm understands it
          primary key (lsn),                     -- lsn is primary key
          foreign key (prev_lsn)                 -- previous log record in this table
              references a_log_table(lsn),
          foreign key (tran_prev_lsn)            -- transaction's prev log rec also in table
              references a_log_table(lsn)
      ) entry sequenced;                         -- inserts go at end of file
  • 7. Gray & Reuter Log 10a: 7 Log is complete history Log anchor points at chain of each transaction. May maintain other chains. Log records map to sequence of N-plexed files. Old files are archived. Eventually, archive files are discarded (weeks, months, never). [Figure: the Log Table (lsn, prev_lsn, resource_mgr, trid, tran_prev_lsn, body) stored in A files and B files with older files in the Archive; the Log Anchor holds trid, max_lsn, min_lsn...]
  • 8. Gray & Reuter Log 10a: 8 The Log LSN Each log record has a logical sequence number. This number (LSN for Log Sequence Number) plays a key role in many algorithms. Key property MONOTONICITY: If action A happened after action B then LSN(A) > LSN(B).
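The monotonicity property on slide 8 can be illustrated with a small sketch. It assumes the (file #, rba) layout mentioned in the LSN domain comment on slide 6; the type name `lsn_t`, the 32-bit field widths, and both function names are hypothetical and not part of the deck's interface.

```c
#include <stdint.h>

/* Hypothetical LSN layout: log file number plus relative byte address. */
typedef struct {
    uint32_t file_no;   /* which log file the record lives in */
    uint32_t rba;       /* byte offset of the record within that file */
} lsn_t;

/* Returns <0, 0 or >0 as a precedes, equals or follows b in the log. */
int lsn_compare(lsn_t a, lsn_t b)
{
    if (a.file_no != b.file_no)
        return a.file_no < b.file_no ? -1 : 1;
    if (a.rba != b.rba)
        return a.rba < b.rba ? -1 : 1;
    return 0;
}

/* Pack the pair into one 64-bit integer whose numeric order is log order. */
uint64_t lsn_pack(lsn_t l)
{
    return ((uint64_t)l.file_no << 32) | l.rba;
}
```

Because the file number occupies the high-order bits, a record in a later file always compares greater than any record in an earlier file, which is what the algorithms on later slides rely on.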
  • 9. Gray & Reuter Log 10a: 9 Reading The Log
      long log_read_lsn( LSN               lsn,     /* lsn of record to be read */
                         log_record_header header,  /* header fields of record to be read */
                         long              offset,  /* offset into body to start read */
                         pointer           buffer,  /* buffer to receive log data */
                         long              n);      /* length of buffer */
      LSN log_max_lsn(void);       /* returns the current maximum lsn of the log table. */
    Read with C (see next slide) or SQL:
      long sql_count( RMID rmid)   /* count log records written by this rmid */
      {   long rec_count;                           /* count of records */
          exec sql SELECT count (*)                 /* ask sql to scan log counting records */
                   INTO :rec_count                  /* written by the calling resource mgr and */
                   FROM log_table                   /* place count in the rec_count */
                   WHERE resource_manager = :rmid;
          return rec_count;                         /* return the answer. */
      };
  • 10. Gray & Reuter Log 10a: 10 Reading the Log: SQL is easier than C
      long c_count( RMID rmid)     /* count log records written by this rmid */
      {   log_record_header header;                 /* structure to receive log record header */
          LSN  lsn;                                 /* log sequence number of next log rec */
          char buffer[1];                           /* null buffer to receive log record body. */
          long rec_count = 0;                       /* count of records */
          int  n = 1;                               /* size of log body returned */
          if (!log_open(READ)) panic();             /* open the log (authorization check) */
          lsn = log_max_lsn( );                     /* get most recent lsn */
          while (lsn != NullLSN)                    /* scan backward through the log */
          {   n = log_read_lsn( lsn,                /* lsn of record to be read */
                                header,             /* log record header fields */
                                0L, &buffer, 1L );  /* log rec body ignored. */
              if (header.rmid == rmid)              /* if record written by this RMID then */
                  rec_count = rec_count + 1;        /* increment count */
              lsn = header.prev_lsn;                /* go to previous LSN. */
          };                                        /* loop over LSNs */
          logtable_close( );                        /* close log table */
          return rec_count;                         /* return the answer. */
      };
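The backward scan on slide 10 depends on log manager calls that are only declared in the deck. As a self-contained illustration, the sketch below runs the same prev_lsn walk over an in-memory array. Everything here is an assumption made for the example: LSNs are simple array indexes, and `mem_log_header`, `NULL_LSN` and `mem_count` are hypothetical names.

```c
#define NULL_LSN (-1L)   /* hypothetical "no previous record" marker */

/* Stand-in for the log record header; only the fields the scan needs. */
typedef struct {
    long prev_lsn;   /* index of the previous record, or NULL_LSN */
    int  rmid;       /* resource manager that wrote the record */
} mem_log_header;

/* Count records written by rmid, walking prev_lsn from the newest record. */
long mem_count(const mem_log_header *log, long max_lsn, int rmid)
{
    long count = 0;
    for (long lsn = max_lsn; lsn != NULL_LSN; lsn = log[lsn].prev_lsn) {
        if (log[lsn].rmid == rmid)
            count++;
    }
    return count;
}
```

Starting from the maximum LSN and following prev_lsn visits every record exactly once, newest first, which mirrors the loop in `c_count`.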
  • 11. Gray & Reuter Log 10a: 11 Writing The Log Add a log record, Log manager fills in header. LSN log_insert( char * buffer, long n); /* log body is buffer[0..n-1] */ Force log up to a certain LSN to persistent storage: LSN log_flush( LSN lsn, Boolean lazy); /**/ (lazy waits for a batch write or timeout == boxcar) Note: many real interfaces allow some of: empty buffer: to allow RM to fill it in (avoids data copies) incremental copy: build the "buffer" in steps. gather: take log data from many buffers. Few offer SQL access to the log.
  • 12. Gray & Reuter Log 10a: 12 Summary Of Log Structure And Verbs Operations: Open/Close Read(LSN), Insert(body), Flush(LSN) SQL read operations. Log Table header body A file Log pages in buffer pool log page header end of durable log current end of log B file empty page in buffer pool durable storage Pages written in next write
  • 13. Gray & Reuter Log 10a: 13 Log Anchor Logging and Locking Log records never updated: only inserted and read. So no locks needed on log. Semaphore (or something) needed on "end" of log to manage space/growth/LSN for inserts.
      typedef struct {
          filename   tablename;          /* name of log table */
          struct     log_files;          /* A & B file prefix names & active file # */
          xsemaphore lock;               /* semaphore regulates log write */
          LSN        prev_lsn;           /* LSN of most recent write */
          LSN        lsn;                /* LSN of next record */
          LSN        durable_lsn;        /* max lsn in durable storage */
          LSN        TM_anchor_lsn;      /* lsn of trans mgr's last ckpt */
          struct {                       /* array of open log parts */
              long partno;               /* partition number */
              int  os_fnum;              /* operating system file # */
          } part [MAXOPENS];
      } log_anchor ;
  • 14. Gray & Reuter Log 10a: 14 Making Optimistic Log Reads Work Log is duplexed. Log manager reads only one copy of the page. What if the "other" copy has more data? Trick: read BOTH copies of FIRST and LAST page in log. Other pages have "full" flag and a timestamp. IF not full or timestamp < prev_timestamp THEN read other page and take highest timestamp Torn log pages Log page consists of disk sectors (512B). Write may only write some sectors. How detect missing fragments? 1. Checksum? 2. Byte stuffing: stuff a “parity” byte on each page
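Slide 14 lists a checksum as one way to detect a torn page. A minimal sketch of that option follows, assuming a 4096-byte page made of eight 512-byte sectors and an Adler-32-style sum; the page layout, the size, and the names `log_page`, `page_seal` and `page_is_intact` are illustrative assumptions, and a production system would more likely use a CRC.

```c
#include <stdint.h>
#include <stddef.h>

#define PAGE_SIZE 4096   /* assumed: eight 512-byte sectors per log page */

typedef struct {
    uint32_t checksum;                            /* covers data[] only */
    uint8_t  data[PAGE_SIZE - sizeof(uint32_t)];  /* log records */
} log_page;

/* Adler-32-style checksum over n bytes. */
static uint32_t page_checksum(const uint8_t *p, size_t n)
{
    uint32_t a = 1, b = 0;
    for (size_t i = 0; i < n; i++) {
        a = (a + p[i]) % 65521u;
        b = (b + a) % 65521u;
    }
    return (b << 16) | a;
}

/* Call just before the page is written to disc. */
void page_seal(log_page *pg)
{
    pg->checksum = page_checksum(pg->data, sizeof pg->data);
}

/* Call after reading: returns 0 if some sectors hold stale or missing data. */
int page_is_intact(const log_page *pg)
{
    return pg->checksum == page_checksum(pg->data, sizeof pg->data);
}
```

If a write reaches only some sectors, the stored checksum no longer matches the page contents, so the reader knows to fall back to the other copy of the duplexed log.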
  • 15. Gray & Reuter Log 10a: 15 Log Insert Log semaphore covers: incrementing the LSN, finding the log end, allocating space on a page (perhaps allocating new pages), and filling in the page(s).
      LSN log_insert( char * buffer, long n)     /* insert a log record with body buffer[0..n-1] */
      {   LSN lsn;                               /* lsn of the record being inserted */
          /* Acquire the log lock (an exclusive semaphore on the log) */
          Xsem_get(&log_anchor.lock);            /* lock the log end in exclusive mode */
          lsn = log_anchor.lsn;                  /* make a copy of the record's lsn. */
          /* find page and allocate space in it. */
          /* fill in log record header & body */
          /* update the anchors */
          log_anchor.prev_lsn = lsn;             /* this record is now the most recent write */
          log_anchor.lsn.rba = log_anchor.lsn.rba + rec_len;  /* log anchor lsn points past this record */
          Xsem_give(&log_anchor.lock);           /* unlock the log end */
          return lsn;                            /* return lsn of record just inserted */
      };
  • 16. Gray & Reuter Log 10a: 16 Log Write Demon Log Semaphore can be a hotspot so: No IO under semaphore Allocation (OS requests), and Archiving is done in advance. Flush to persistent storage (disc) is done asynchronously. Demons driven by timers and by events (requests) Demons need not touch end-of-log semaphore log daemon to flush (carefully write) log pages as needed log data in shared memory and on disc log daemon to allocate new log files as needed application programs resource managers log code
  • 17. Gray & Reuter Log 10a: 17 Careful Writes If partial pages may be written then subsequent write may invalidate previous write. Standard technique: Serial Writes: write one page then write the second page. Problem: ~ 1/2 disc bandwidth, 2x delay. Ping-Pong technique: Never overwrite good page: Ping-Pong between I and I+1 When complete, assure that page I has final data Never worse than serial write, generally 2x better. Also note the careful techniques for optimistic reads and torn pages. Disc Page Disc Page Disc Page i: i+1: Parallel Ping-Pong Writes New Log
  • 18. Gray & Reuter Log 10a: 18 Group Commit (Boxcaring) Batch processing of log writes. If receive 1,000 log force requests/second why not just execute 50 of them? Response time will be the same (~20ms). IOs will be 20x fewer CPU will be ~ 10x smaller (10x fewer dispatches, 20x fewer OS IO). Without it, systems are limited to about 50tps no ping-pong 100tps ping-pong. With it, systems are limited to disc bandwidth >>10ktps. Group commit threshold can be set automatically.
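The arithmetic behind slide 18 can be checked with a short sketch. It assumes the log daemon issues at most one write per batching interval; the function names and the interval parameter are hypothetical and chosen only for the example.

```c
/* Log writes per second when force requests are batched: at most one
 * write per interval, and never more writes than there are requests. */
double group_commit_ios(double requests_per_sec, double interval_ms)
{
    double batches_per_sec = 1000.0 / interval_ms;
    return requests_per_sec < batches_per_sec ? requests_per_sec
                                              : batches_per_sec;
}

/* Average number of force requests satisfied by each write. */
double requests_per_write(double requests_per_sec, double interval_ms)
{
    return requests_per_sec / group_commit_ios(requests_per_sec, interval_ms);
}
```

With the slide's numbers, 1,000 force requests per second and a 20 ms interval give 50 writes per second, each covering 20 requests, which is the 20x reduction in IOs quoted above.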
  • 19. Gray & Reuter Log 10a: 19 WADS- Giving the Log Disc Zero Latency Log disc is dedicated, so only has rotational latency. Reserve some cylinders on the disc as scratch. For each write: Write at current position on next track (zero latency). When have a full-track (or two) of log data consolidate the write in ram do a single LARGE write (100KB = 1 rotation) to the log. cost of this is seek + rotation ~ 20ms. This reserved area is called the Write Ahead Data Set (WADS). At restart: read cylinders gather recent log data rewrite end of log. RAID Write Cache makes this obsolete (if it works).
  • 20. Gray & Reuter Log 10a: 20 Log: Normal Use Transaction UNDO During Normal Operation Transaction log anchor: needed during normal operation Points to most recent log rec of that transaction. Follow the transaction prev_lsn chain. EASY!
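Slide 20's "follow the transaction prev_lsn chain" can be sketched with an in-memory log. This is an illustration under stated assumptions, not the deck's interface: LSNs are array indexes, transaction ids are small integers, no semaphore is taken, and `undo_log`, `undo_log_insert` and `undo_chain` are hypothetical names.

```c
#include <assert.h>

#define MAX_RECS  64
#define MAX_TRANS 8
#define NULL_LSN  (-1L)

typedef struct {
    int  trid;            /* transaction that wrote the record */
    long tran_prev_lsn;   /* previous record of the same transaction */
} undo_rec;

typedef struct {
    undo_rec recs[MAX_RECS];
    long     next_lsn;                 /* LSN the next insert will receive */
    long     tran_anchor[MAX_TRANS];   /* most recent LSN of each transaction */
} undo_log;

void undo_log_init(undo_log *lg)
{
    lg->next_lsn = 0;
    for (int t = 0; t < MAX_TRANS; t++)
        lg->tran_anchor[t] = NULL_LSN;
}

/* Append a record and link it into its transaction's chain. */
long undo_log_insert(undo_log *lg, int trid)
{
    assert(trid >= 0 && trid < MAX_TRANS);
    assert(lg->next_lsn < MAX_RECS);
    long lsn = lg->next_lsn++;
    lg->recs[lsn].trid = trid;
    lg->recs[lsn].tran_prev_lsn = lg->tran_anchor[trid];
    lg->tran_anchor[trid] = lsn;
    return lsn;
}

/* Collect the LSNs to undo, newest first; returns how many were stored. */
int undo_chain(const undo_log *lg, int trid, long *out, int max_out)
{
    int n = 0;
    for (long lsn = lg->tran_anchor[trid];
         lsn != NULL_LSN && n < max_out;
         lsn = lg->recs[lsn].tran_prev_lsn)
        out[n++] = lsn;
    return n;
}
```

The per-transaction anchor gives the starting point, and each record's tran_prev_lsn skips over records written by other transactions, so undo touches only the records it needs.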
  • 21. Gray & Reuter Log 10a: 21 The Log Anchor: Where It All Starts REDO/UNDO at System / RM Restart. Need to bootstrap the most recent log state. Log manager is the first to restart. Helps Transaction Manager recover. Transaction manager helps Resource Managers recover. Alternate design (each RM has its own log). All this depends on rebuilding the log anchor. [Figure: the Log Anchor points into the log at the Transaction Manager Checkpoint Record, which points to the Resource Manager Checkpoint Records and to the previous Transaction Manager Checkpoint Record.]
  • 22. Gray & Reuter Log 10a: 22 Preparing For Restart: Careful Write of Log Anchor Use the "standard" careful write techniques: Put the anchor in a special well-known place(s) Ping-Pong to 2 or more copies Timestamp each copy N-plex the copies on devices with independent failures. Align copies so that writes are "atomic" Accept most recent copy on pessimistic reads. Now TM and RMs can bootstrap: their anchors are in the log.
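The "timestamp each copy, accept most recent copy" rule on slide 22 might look like the sketch below. The copy layout, the XOR-based validity check, and the names `anchor_copy`, `anchor_seal` and `anchor_pick` are assumptions made for illustration; the deck does not specify how a damaged copy is recognized.

```c
#include <stdint.h>

#define ANCHOR_MAGIC 0x4C4F47414E43484FULL   /* arbitrary constant for the check word */

typedef struct {
    uint64_t timestamp;   /* increases with every anchor write */
    uint64_t payload;     /* stands in for the real anchor contents */
    uint64_t check;       /* timestamp ^ payload ^ ANCHOR_MAGIC */
} anchor_copy;

/* Compute the check word just before a copy is written. */
void anchor_seal(anchor_copy *c)
{
    c->check = c->timestamp ^ c->payload ^ ANCHOR_MAGIC;
}

static int anchor_valid(const anchor_copy *c)
{
    return c->check == (c->timestamp ^ c->payload ^ ANCHOR_MAGIC);
}

/* Pessimistic read: examine every copy and return the index of the
 * newest valid one, or -1 if no copy is usable. */
int anchor_pick(const anchor_copy *copies, int n)
{
    int best = -1;
    for (int i = 0; i < n; i++) {
        if (!anchor_valid(&copies[i]))
            continue;
        if (best < 0 || copies[i].timestamp > copies[best].timestamp)
            best = i;
    }
    return best;
}
```

Because ping-pong writes never overwrite the last good copy, a crash in the middle of an anchor write leaves at least one older valid copy for `anchor_pick` to return.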
  • 23. Gray & Reuter Log 10a: 23 Finding the End of the Log Find the anchor. If using WADS, go to the WADS area and write log end. Else scan forward from the most recent log-anchor lsn: read optimistic all full pages; at a 1/2 full page or bad page, read pessimistic. Now have end of log. Finish the 1/2 finished record at end of log and give to TM. [Figure: log pages showing the end of log, a half-finished record, and an invalid page.]
  • 24. Gray & Reuter Log 10a: 24 Archiving The Log And "Old" Transactions What if transaction/RM low water mark is 1-month old? Abort? Copy aside: copy the undo/redo log records to a side file Copy forward: copy the undo/redo log records forward in the file. Dynamic log: copy undo records aside (so can online-undo if needed). All advance the low water mark.
  • 25. Gray & Reuter Log 10a: 25 Archiving the Log Online Log 1 2 2 3 1 3 1 2 3 Archive Staggered Allocation of Log Tables on Secondary Storage
  • 26. Gray & Reuter Log 10a: 26 The Safety Spectrum Just UNDO transactional storage (no durable log) Just Online Restart: keep simplexed durable log. Online plus Off-line Archive (no single point of failure): periodic copies of data duplex log Electronic vaulting: archive copies and duplexing is done to remote site. via fast communications links (or Federal Express).
  • 27. Gray & Reuter Log 10a: 27 Multiple Logs? Transaction Manager has a log (DECdtm, MS-DTC,…) Transaction Monitor has a log (CICS, Tuxedo, ACMS,...) Each DB instance (3 Oracle, 2 Informix, 4 Rdb) has a log. Some have 3 logs: UNDO, REDO, SNAPSHOT. Cons Lots of tapes/files. Lots of IOs at commit Lots of things to break. Pros: Portable Performance (in the 1 RM case) You decide
  • 28. Gray & Reuter Log 10a: 28 Client/Server Logging One server design (can be process pair) Well known log server in the net. Client sends a BATCH of log records to the server. Gets back a LSN Uses "local" LSNs for his objects. Log servers can be N-plexed processes. Multi-server design Client forms a quorum (majority of servers). Client sends log batch to all, gets back N-LSNs. If less than majority, client must poll ALL N servers Servers synchronize their "logical" logs as "sum" of physical logs (need a majority).
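The majority rule in the multi-server design on slide 28 reduces to a small calculation, sketched here. The function names are hypothetical, and the sketch covers only the counting, not the polling or log synchronization steps.

```c
/* Smallest number of servers that forms a majority of n. */
int quorum_size(int n)
{
    return n / 2 + 1;
}

/* A log batch counts as written once a majority of servers acknowledged it. */
int batch_is_durable(int acks, int n)
{
    return acks >= quorum_size(n);
}
```

Any two majorities of the same N servers share at least one server, which is why a client that reads from a majority is guaranteed to see the most recently acknowledged batch.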
  • 29. Gray & Reuter Log 10a: 29 Summary • Log is a sequential file • Contains entire history of DB • Many tricks to write it efficiently and carefully • Many tricks to archive and recover it