Behaviour of fsync() in case of write-back errors

Thomas Munro

I work on PostgreSQL.  I don't use OpenBSD, but recently I've been
investigating how fsync() reports write-back errors on all operating
systems that people like to run PostgreSQL on:

It seems to me that on OpenBSD, asynchronous write-back errors might
not be reported to userspace in a subsequent call to fsync(), and
synchronous write-back errors that are reported to userspace might not
be reported in a follow-up call to fsync() (that is, retrying will
appear to be successful but in fact your data is gone).
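
To make the second concern concrete, here is a minimal sketch in C of how
an application might defend itself if a failed fsync() cannot safely be
retried. The helper name write_durably() is hypothetical and is not
PostgreSQL code. It assumes the worst case described above: once fsync()
has failed, a later fsync() on the same file may return 0 even though the
dirty data was dropped, so the first failure is treated as final.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes from buf to path, then fsync().
 * Returns 0 only if every step, including fsync(), reported success.
 * Returns -1 with errno set otherwise.
 */
static int
write_durably(const char *path, const void *buf, size_t len)
{
	assert(path != NULL && buf != NULL);

	int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (fd == -1)
		return -1;

	const char *p = buf;
	while (len > 0) {
		ssize_t n = write(fd, p, len);
		if (n == -1) {
			if (errno == EINTR)
				continue;	/* interrupted, try again */
			int saved = errno;
			close(fd);
			errno = saved;
			return -1;
		}
		p += n;
		len -= (size_t)n;
	}

	if (fsync(fd) == -1) {
		/*
		 * Deliberately no retry loop here.  Assumption: the kernel
		 * may have discarded the dirty buffers and cleared the error,
		 * so a second fsync() could "succeed" without the data ever
		 * reaching the disk.
		 */
		int saved = errno;
		close(fd);
		errno = saved;
		return -1;
	}

	return close(fd);
}
```

A caller that gets -1 back should assume the file contents are unknown and
write the data again from its own copy, rather than calling fsync() a
second time and trusting a 0 result.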

Am I wrong?  Perhaps some other code elsewhere will record the error
at device, filesystem or inode level?  I didn't try to test this: I
simply compared the brelse() error handling code with that of FreeBSD
and NetBSD, whose behaviours are known to be correct and incorrect
respectively, according to our assessment of what fsync() *should* do
(or at least the behaviour that PostgreSQL relies on when it reports,
as part of its checkpoint protocol, that your data exists on disk).
OpenBSD certainly appears to be like NetBSD in this respect, so I
thought it was worth pinging your list and asking for an expert
opinion.

Do you think that my suspicion is correct?  Would you consider that to
be a bug?


Thomas Munro