[9fans] could not write super block; waiting 10 seconds

Richard Miller

2012-03-26 10:18:17 UTC

Has anyone else been unsettled by the occasional messages from
fossil saying (1) "could not write super block; waiting 10 seconds"
and (2) "blistAlloc: called on clean block"?

Patch fossil-superblock-write gets rid of them.

(1) When taking a snapshot, blockWrite in cache.c is called to write
an updated super block S, which has a pointer to the root block R
for the new epoch. To maintain consistency on the disk, R must be
written before S, so blockWrite checks whether R is still in the
cache and marked dirty. Very rarely, blockWrite finds R locked (eg
because the flush thread is just now writing it), so it gives up and
returns zero. The zero return is OK when blockWrite is called by
the flush thread, because the flush thread can get on with writing
out other blocks before coming back to try the failed block again.
But when blockWrite is called by superWrite, there's nothing else to
do; hence the 10 second sleep and warning message. The solution is
to add a waitlock parameter to blockWrite, so superWrite can tell it
to wait for a locked dependent block.
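
The waitlock idea in (1) can be sketched roughly as follows. This is not the fossil source or the actual patch: the Block layout and the names block_lock, block_unlock and block_write are hypothetical stand-ins, the lock is a simple spin flag rather than fossil's real locking, and setting dirty to false stands in for the real disk write. It only illustrates how one flag lets the flush thread give up while superWrite waits.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a cache block (not fossil's Block). */
typedef struct Block Block;
struct Block {
    atomic_flag lock;   /* set while some thread holds the block */
    bool dirty;         /* true until the block has been written */
    Block *dep;         /* block that must reach disk first, or NULL */
};

/* Take b's lock. With wait false, try once and report failure;
 * with wait true, spin until the current holder releases it. */
bool block_lock(Block *b, bool wait)
{
    assert(b != NULL);
    while (atomic_flag_test_and_set(&b->lock)) {
        if (!wait)
            return false;
    }
    return true;
}

void block_unlock(Block *b)
{
    assert(b != NULL);
    atomic_flag_clear(&b->lock);
}

/* Write b after its dependency. Returns false only when the dependency
 * is locked and the caller asked not to wait (the flush-thread case). */
bool block_write(Block *b, bool waitlock)
{
    Block *d = b->dep;

    if (d != NULL) {
        if (!block_lock(d, waitlock))
            return false;       /* caller retries this block later */
        if (d->dirty)
            d->dirty = false;   /* stands in for writing R to disk */
        block_unlock(d);
    }
    b->dirty = false;           /* stands in for writing S to disk */
    return true;
}
```

In this sketch the flush thread would call block_write(b, false) and move on to other blocks when it gets false, while superWrite would call block_write(s, true) and never need the sleep-and-retry loop.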

(2) After the new super block S is sent to the disk write queue,
superWrite removes the previous epoch's root block R' from the
active file system. This is normally done by attaching a BList
entry to S in the cache, noting that R' must be marked closed after
S actually goes to the disk. Rarely, S has already been written by
the time blistAlloc is called. In this case the correct thing was
being done (just close R' immediately), but a spurious warning was
produced.
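
The case analysis in (2) might look something like this in miniature. Again this is only an illustration under assumptions: the fixed-size pending array, close_after_write and write_done are invented for the example and do not mirror fossil's BList code. The point is that finding S already clean is a normal outcome to handle quietly, not a condition to warn about.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { MAXPENDING = 8 };    /* arbitrary limit for the sketch */

/* Hypothetical, simplified stand-in for a cache block (not fossil's Block). */
typedef struct Block Block;
struct Block {
    bool dirty;                   /* true until the block has been written */
    bool closed;                  /* true once removed from the active fs */
    Block *pending[MAXPENDING];   /* blocks to close after this one is written */
    size_t npending;
};

/* Arrange for old to be closed once b is on disk.
 * Returns false only if the pending list is full. */
bool close_after_write(Block *b, Block *old)
{
    assert(b != NULL && old != NULL);
    if (!b->dirty) {
        /* b has already been written: close old at once, silently */
        old->closed = true;
        return true;
    }
    if (b->npending == MAXPENDING)
        return false;
    b->pending[b->npending++] = old;
    return true;
}

/* Called when b has actually gone to disk: run the deferred closes. */
void write_done(Block *b)
{
    assert(b != NULL);
    b->dirty = false;
    for (size_t i = 0; i < b->npending; i++)
        b->pending[i]->closed = true;
    b->npending = 0;
}
```

Both orderings end in the same state: if S is still dirty, R' is closed when write_done runs; if S was written first, R' is closed by close_after_write itself.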

Russ Cox

2012-03-26 12:04:03 UTC

Post by Richard Miller
(1) When taking a snapshot, blockWrite in cache.c is called to write
an updated super block S, which has a pointer to the root block R
for the new epoch. To maintain consistency on the disk, R must be
written before S, so blockWrite checks whether R is still in the
cache and marked dirty. Very rarely, blockWrite finds R locked (eg
because the flush thread is just now writing it), so it gives up and
returns zero. The zero return is OK when blockWrite is called by
the flush thread, because the flush thread can get on with writing
out other blocks before coming back to try the failed block again.
But when blockWrite is called by superWrite, there's nothing else to
do; hence the 10 second sleep and warning message. The solution is
to add a waitlock parameter to blockWrite, so superWrite can tell it
to wait for a locked dependent block.
(2) After the new super block S is sent to the disk write queue,
superWrite removes the previous epoch's root block R' from the
active file system. This is normally done by attaching a BList
entry to S in the cache, noting that R' must be marked closed after
S actually goes to the disk. Rarely, S has already been written by
the time blistAlloc is called. In this case the correct thing was
being done (just close R' immediately), but a spurious warning was
produced.

Thank you for cleaning these up. These are both things that
I meant to come back to some day, but I never did.

Russ
