| BUFFERIO(9) | Kernel Developer's Manual | BUFFERIO(9) | 
BUFFERIO, biodone, biowait, getiobuf, putiobuf, nestiobuf_setup,
  nestiobuf_done - block I/O buffer transfers
#include <sys/buf.h>
void
biodone(buf_t *bp);
int
biowait(buf_t *bp);
buf_t *
getiobuf(struct vnode *vp, bool waitok);
void
putiobuf(buf_t *bp);
void
nestiobuf_setup(buf_t *mbp, buf_t *bp, int offset, size_t size);
void
nestiobuf_done(buf_t *mbp, int donebytes, int error);
The BUFFERIO subsystem manages block I/O buffer
  transfers. Each transfer is described by a struct buf, which is
  shared among users of BUFFERIO,
  users of buffercache(9),
  and block device drivers that execute transfers to physical disks.
Users of BUFFERIO wishing to submit a buffer for block
  I/O transfer must obtain a struct buf, e.g. via
  getiobuf(), fill its parameters, and submit it to a
  block device with
  bdev_strategy(9), usually
  via VOP_STRATEGY(9).
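The obtain, fill, submit, wait, release sequence can be modelled in userland as follows. This is only a sketch of the call order: every sk_ name, the flag value, and the toy device that completes immediately are hypothetical stand-ins for getiobuf(), VOP_STRATEGY(9), biowait(), and putiobuf(), not kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Userland stand-ins; every sk_ name is hypothetical, not kernel API. */
#define SK_B_READ 0x1		/* illustrative value only */

struct sk_buf {
	int	 b_flags;
	void	*b_data;
	size_t	 b_bcount;
	size_t	 b_resid;
	long	 b_blkno;
	int	 b_error;
	bool	 done;
};

/* Stand-in for getiobuf(): allocate a zeroed transfer descriptor. */
static struct sk_buf *
sk_getiobuf(void)
{
	return calloc(1, sizeof(struct sk_buf));
}

/* Stand-in for putiobuf(): release the descriptor. */
static void
sk_putiobuf(struct sk_buf *bp)
{
	free(bp);
}

/* Stand-in for the device: fills the buffer and completes at once. */
static void
sk_strategy(struct sk_buf *bp)
{
	memset(bp->b_data, 0xab, bp->b_bcount);
	bp->b_error = 0;
	bp->b_resid = 0;
	bp->done = true;
}

/* Stand-in for biowait(): the toy device has already completed. */
static int
sk_biowait(struct sk_buf *bp)
{
	assert(bp->done);
	return bp->b_error;
}

/* Obtain a buffer, fill its parameters, submit it, wait, release it. */
static int
sk_read_block(void *data, size_t len, long blkno)
{
	struct sk_buf *bp = sk_getiobuf();
	int error;

	if (bp == NULL)
		return ENOMEM;
	bp->b_flags = SK_B_READ;
	bp->b_data = data;
	bp->b_bcount = len;
	bp->b_blkno = blkno;
	sk_strategy(bp);
	error = sk_biowait(bp);
	sk_putiobuf(bp);
	return error;
}
```

In this model the buffer is released only after the wait returns, which mirrors the rule that a buffer passed to putiobuf() must either never have been submitted or have completed its transfer.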
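The parameters to an I/O transfer described by bp are specified by the following struct buf fields:
bp->b_flags
    Flags specifying the type of transfer.
    B_READ
        The transfer is a read from the device. If not set, the transfer
        is a write to the device.
    B_ASYNC
        Asynchronous I/O. The caller must not provide
        bp->b_iodone and must not call biowait(bp).
    For legibility, callers should indicate writes by passing the
    pseudo-flag B_WRITE, which is zero.
bp->b_data
    Pointer to the kernel virtual address of the source or target of the
    transfer.
bp->b_bcount
    Number of bytes requested for transfer.
bp->b_blkno
    Block number at which to do the transfer.
bp->b_iodone
    I/O completion callback. If set, B_ASYNC must not be set
    in bp->b_flags.
Additionally, if the I/O transfer is a write associated with a
    vnode(9)
    vp, then before the user submits it to a block device,
    the user must increment
    vp->v_numoutput. The user
    must not acquire vp's vnode lock between incrementing
    vp->v_numoutput and
    submitting bp to a block device; doing so will
    likely cause deadlock with the syncer.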
Block I/O transfer completion may be notified by the
    bp->b_iodone callback, by
    signalling biowait() waiters, or not at all in the
    B_ASYNC case.
- If the user sets the bp->b_iodone callback to
      a non-NULL function pointer, it will be called in
      soft interrupt context when the I/O transfer is complete. The user
      must not call biowait(bp) in this case.
- If B_ASYNC is set, then the I/O transfer is
      asynchronous and the user will not be notified when it is completed. The
      user must not call biowait(bp) in this case.
- Otherwise, if bp->b_iodone is
      NULL and B_ASYNC is not
      specified, the user may wait for the I/O transfer to complete with
      biowait(bp).
Once an I/O transfer has completed, its struct
    buf may be reused, but the user must first clear the
    BO_DONE flag of
    bp->b_oflags before reusing
    it.
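The three notification cases can be summarized as a small decision function. The following is a userland sketch only: struct sk_buf, the SK_ names, and the flag value are hypothetical stand-ins and do not match the kernel's definitions. It also classifies the combination of a callback with B_ASYNC as invalid, since the field descriptions forbid setting both.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-ins, not the kernel's definitions. */
#define SK_B_ASYNC 0x1		/* hypothetical flag value */

struct sk_buf {
	int	b_flags;
	void	(*b_iodone)(struct sk_buf *);
};

enum sk_completion {
	SK_CALLBACK,	/* b_iodone is called; biowait() is forbidden */
	SK_NONE,	/* B_ASYNC: no notification; biowait() is forbidden */
	SK_BIOWAIT,	/* neither set: the user may call biowait() */
	SK_INVALID	/* b_iodone together with B_ASYNC is not allowed */
};

/* A do-nothing completion callback, used only for illustration. */
static void
sk_noop_iodone(struct sk_buf *bp)
{
	(void)bp;
}

/* Classify how the submitter learns that the transfer has completed. */
static enum sk_completion
sk_completion_mode(const struct sk_buf *bp)
{
	bool async = (bp->b_flags & SK_B_ASYNC) != 0;

	if (bp->b_iodone != NULL)
		return async ? SK_INVALID : SK_CALLBACK;
	return async ? SK_NONE : SK_BIOWAIT;
}
```

Only the SK_BIOWAIT result corresponds to a transfer on which biowait() may be called.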
After initializing the b_flags,
    b_data, and b_bcount
    parameters of an I/O transfer for the buffer, called the
    master buffer, the user can issue smaller transfers for
    segments of the buffer using nestiobuf_setup(). When
    nested I/O transfers complete, in any order, they debit from the amount of
    work left to be done in the master buffer. If any segments of the buffer
    were skipped, the user can report this with
    nestiobuf_done() to debit the skipped part of the
    work.
The master buffer's I/O transfer is completed once all nested
    buffers' I/O transfers have completed and, if any segments were
    skipped, nestiobuf_done() has been called to account
    for them.
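The debit accounting can be illustrated with a small userland model. This is a sketch under stated assumptions, not the kernel implementation: struct sk_master and sk_debit() are hypothetical names, and recording the most recent nonzero error in the master is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a master buffer's bookkeeping. */
struct sk_master {
	size_t	b_bcount;	/* total bytes requested */
	size_t	b_resid;	/* bytes of work still outstanding */
	int	b_error;	/* last nonzero error reported (assumed) */
	bool	done;		/* set once no work remains */
};

/* Start with the whole transfer outstanding. */
static void
sk_master_init(struct sk_master *m, size_t len)
{
	m->b_bcount = len;
	m->b_resid = len;
	m->b_error = 0;
	m->done = false;
}

/*
 * Debit donebytes of work from the master.  Called once per completed
 * nested transfer, in any order, and once for any skipped part.
 */
static void
sk_debit(struct sk_master *m, size_t donebytes, int error)
{
	assert(donebytes <= m->b_resid);
	m->b_resid -= donebytes;
	if (error != 0)
		m->b_error = error;
	if (m->b_resid == 0)
		m->done = true;
}
```

The master is marked done only by the debit that brings the outstanding work to zero, regardless of whether that debit came from a nested completion or from the skipped part.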
For writes associated with a vnode vp,
    nestiobuf_setup() accounts for
    vp->v_numoutput, so the
    caller is not allowed to acquire vp's vnode lock
    before submitting the nested I/O transfer to a block device. However, the
    caller is responsible for accounting the master buffer in
    vp->v_numoutput. This must
    be done very carefully because after incrementing
    vp->v_numoutput, the caller
    is not allowed to acquire vp's vnode lock before
    either calling nestiobuf_done() or submitting the
    last nested I/O transfer to a block device.
For example:
struct buf *mbp, *bp;
struct vnode *devvp;	/* Declared here: still needed after the loop.  */
daddr_t blkno;
size_t skipped = 0;
unsigned i;
int error = 0;

/* Set up the master buffer describing the whole transfer.  */
mbp = getiobuf(vp, true);
mbp->b_data = data;
mbp->b_resid = mbp->b_bcount = datalen;
mbp->b_flags = B_WRITE;

KASSERT(0 < nsegs);
KASSERT(datalen == nsegs*segsz);
for (i = 0; i < nsegs; i++) {
	/* Look up the device block for this segment.  */
	vn_lock(vp, LK_EXCLUSIVE | LK_RETRY);
	error = VOP_BMAP(vp, i*segsz, &devvp, &blkno, NULL);
	VOP_UNLOCK(vp);
	if (error == 0 && blkno == -1)
		error = EIO;
	if (error) {
		/* Give up early, don't try to handle holes.  */
		skipped += datalen - i*segsz;
		break;
	}

	/* Nest a buffer for this segment under the master.  */
	bp = getiobuf(vp, true);
	nestiobuf_setup(mbp, bp, i*segsz, segsz);
	bp->b_blkno = blkno;
	if (i == nsegs - 1)	/* Last segment.  */
		break;		/* Submitted below, after accounting.  */
	VOP_STRATEGY(devvp, bp);
}

/*
 * Account v_numoutput for master write.
 * (Must not vn_lock before last VOP_STRATEGY!)
 */
mutex_enter(vp->v_interlock);
vp->v_numoutput++;
mutex_exit(vp->v_interlock);

if (skipped)
	nestiobuf_done(mbp, skipped, error);	/* Debit the skipped part.  */
else
	VOP_STRATEGY(devvp, bp);		/* Submit the last segment.  */
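Block device drivers implement a strategy method, in the
  d_strategy member of struct
  bdevsw (driver(9)), to
  queue a buffer for disk I/O. The inputs to the strategy method are:
bp->b_flags
    Flags specifying the type of transfer.
    B_READ
        The transfer is a read from the device. If not set, the transfer
        is a write to the device.
bp->b_data
    Pointer to the kernel virtual address of the source or target of the
    transfer.
bp->b_bcount
    Number of bytes requested for transfer.
bp->b_blkno
    Block number at which to do the transfer, relative to the start of
    the partition.
If the strategy method uses bufq(9), it must additionally initialize the following fields before queueing bp with bufq_put(9):
bp->b_rawblkno
    Block number relative to the start of the volume.
When the I/O transfer is complete, whether it succeeded or failed, the strategy method must:
- Set bp->b_error to zero
      on success, or to an errno(2)
      error code on failure.
- Set bp->b_resid to the
      number of bytes remaining to transfer, whether on success or on failure.
      If no bytes were transferred, this must be set to
      bp->b_bcount.
- Call biodone(bp).
biodone(bp)
    Notify that the I/O transfer described by bp has completed.
    To be called by a block device driver. Caller must first set
        bp->b_error to an error
        code and bp->b_resid to
        the number of bytes remaining to transfer.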
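A driver's completion path can be modelled in userland as below. This is a sketch under assumptions: struct sk_buf, sk_biodone(), and sk_finish() are hypothetical stand-ins, and the real biodone() does considerably more than set a flag. The point is only the ordering: the error and the residual count are stored before completion is announced.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the fields a driver fills in on completion. */
struct sk_buf {
	size_t	b_bcount;	/* bytes requested */
	size_t	b_resid;	/* bytes not transferred */
	int	b_error;	/* zero or an errno value */
	bool	done;		/* set by the stand-in for biodone() */
};

/* Stand-in for biodone(): just record that completion was announced. */
static void
sk_biodone(struct sk_buf *bp)
{
	bp->done = true;
}

/*
 * Finish a transfer that moved `transferred' bytes and ended with
 * `error' (zero on success).  Results are stored first, then the
 * completion is announced.
 */
static void
sk_finish(struct sk_buf *bp, size_t transferred, int error)
{
	assert(transferred <= bp->b_bcount);
	bp->b_error = error;
	bp->b_resid = bp->b_bcount - transferred;
	sk_biodone(bp);
}
```

A transfer that fails before moving any data ends with b_resid equal to b_bcount, which matches the requirement stated for the strategy method.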
biowait(bp)
    Wait for the I/O transfer described by bp to complete.
    Returns the value of bp->b_error.
    To be called by a user requesting the I/O transfer.
    Must not be called if bp has a callback
        or is asynchronous, that is, if
        bp->b_iodone is set, or
        if B_ASYNC is set in
        bp->b_flags.
getiobuf(vp, waitok)
    Allocate a struct buf for an I/O transfer. If vp is
      non-NULL, the transfer
      is associated with it. If waitok is false, returns
      NULL if none can be allocated immediately.
    The resulting struct buf pointer must
        eventually be passed to putiobuf() to release
        it. Do not use
        brelse(9).
    The buffer may not be used for an asynchronous I/O transfer,
        because there is no way to know when the transfer has completed
        and the buffer may safely be
        passed to putiobuf(). Asynchronous I/O transfers
        are allowed only for buffers in the
        buffercache(9).
    May sleep if waitok is true.
putiobuf(bp)
    Free bp, which must have been allocated by
      getiobuf(). Either bp must
      never have been submitted to a block device, or the I/O transfer must have
      completed.
The BUFFERIO subsystem is implemented in
  sys/kern/vfs_bio.c.
The BUFFERIO abstraction provides no way to cancel an
  I/O transfer once it has been submitted to a block device.
The BUFFERIO abstraction provides no way
    to do I/O transfers with non-kernel pages, e.g. directly to buffers in
    userland without copying into the kernel first.
The struct buf type is all mixed up with the buffercache(9).
The BUFFERIO abstraction is a totally
    idiotic API design.
The v_numoutput accounting required of
    BUFFERIO callers is asinine.
| September 12, 2019 | NetBSD 10.0 |