Beautiful Code [223]
int
VOP_READ_APV(struct vop_vector *vop, struct vop_read_args *a)
{
[...]
    /*
     * Drill down the filesystem layers to find one
     * that implements the function or a bypass
     */
    while (vop != NULL &&
        vop->vop_read == NULL && vop->vop_bypass == NULL)
        vop = vop->vop_default;
    /* Call the function or the bypass */
    if (vop->vop_read != NULL)
        rc = vop->vop_read(a);
    else
        rc = vop->vop_bypass(&a->a_gen);
Elegantly, at the bottom of all filesystem layers lies a filesystem that returns the Unix "operation not supported" error (EOPNOTSUPP) for any function that wasn't implemented by the filesystems layered on top of it. This is our pinball's drain:
#define VOP_EOPNOTSUPP ((void*)(uintptr_t)vop_eopnotsupp)
struct vop_vector default_vnodeops = {
    .vop_default = NULL,
    .vop_bypass = VOP_EOPNOTSUPP,
};
int
vop_eopnotsupp(struct vop_generic_args *ap)
{
    return (EOPNOTSUPP);
}
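The drill-down loop and the drain can be tried outside the kernel with a small model. The following is only a sketch under simplifying assumptions: the names struct layer, dispatch_read, disk_read, drain_bypass, and the four layer variables are hypothetical stand-ins invented for illustration, not FreeBSD definitions, and the NULL check after the loop is a defensive addition that the kernel excerpt does not show.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct generic_args {
    const char *op_name;
};

struct read_args {
    struct generic_args a_gen;  /* generic header comes first */
    int nbytes;                 /* bytes the caller asks for */
    int nread;                  /* bytes a layer reports as read */
};

struct layer {
    struct layer *vop_default;                 /* next layer down, or NULL */
    int (*vop_read)(struct read_args *);       /* specific operation */
    int (*vop_bypass)(struct generic_args *);  /* catch-all for any operation */
};

/* The drain: whatever reaches the bottom is reported as unsupported. */
static int
drain_bypass(struct generic_args *ap)
{
    (void)ap;
    return (EOPNOTSUPP);
}

/* A layer that really implements read: pretend every byte was read. */
static int
disk_read(struct read_args *a)
{
    a->nread = a->nbytes;
    return (0);
}

static struct layer bottom_layer = { NULL, NULL, drain_bypass };
static struct layer disk_layer = { &bottom_layer, disk_read, NULL };
/* These two layers implement nothing themselves. */
static struct layer over_disk = { &disk_layer, NULL, NULL };
static struct layer over_bottom = { &bottom_layer, NULL, NULL };

static int
dispatch_read(struct layer *vop, struct read_args *a)
{
    assert(a != NULL);
    /* Drill down to a layer that has the function or a bypass. */
    while (vop != NULL &&
        vop->vop_read == NULL && vop->vop_bypass == NULL)
        vop = vop->vop_default;
    if (vop == NULL)            /* defensive check added for this sketch */
        return (EOPNOTSUPP);
    /* Call the function or the bypass. */
    if (vop->vop_read != NULL)
        return (vop->vop_read(a));
    return (vop->vop_bypass(&a->a_gen));
}
```

In this model, a read dispatched on over_disk falls through to disk_read and succeeds, whereas the same read dispatched on over_bottom finds no implementation on its way down and ends in the drain, returning EOPNOTSUPP.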
17.3. From Filesystems to Filesystem Layers
For a concrete example of filesystem layering, consider the case where you mount a remote filesystem on your computer using the NFS (Network File System) protocol. Unfortunately, in your case, the user and group identifiers on the remote system don't match those used on your computer. However, by interposing a umapfs filesystem over the actual NFS implementation, we can specify through external files the correct user and group mappings. Figure 17-3 illustrates how some operating system kernel function calls first get routed through the bypass function of umapfs, umap_bypass, before continuing their journey to the corresponding NFS client functions.
In contrast to the null_bypass function, the implementation of umap_bypass actually does some work before making a call to the underlying layer. The vop_generic_args structure passed as its argument contains a description of the actual arguments for each vnode operation:
/*
 * A generic structure.
 * This can be used by bypass routines to identify generic arguments.
 */
struct vop_generic_args {
    struct vnodeop_desc *a_desc;
    /* other random data follows, presumably */
};
/*
 * This structure describes the vnode operation taking place.
 */
struct vnodeop_desc {
    char *vdesc_name;               /* a readable name for debugging */
    int vdesc_flags;                /* VDESC_* flags */
    vop_bypass_t *vdesc_call;       /* Function to call */
    /*
     * These ops are used by bypass routines to map and locate arguments.
     * Creds and procs are not needed in bypass routines, but sometimes
     * they are useful to (for example) transport layers.
     * Nameidata is useful because it has a cred in it.
     */
    int *vdesc_vp_offsets;          /* list ended by VDESC_NO_OFFSET */
    int vdesc_vpp_offset;           /* return vpp location */
    int vdesc_cred_offset;          /* cred location, if any */
    int vdesc_thread_offset;        /* thread location, if any */
    int vdesc_componentname_offset; /* if any */
};
For instance, the vnodeop_desc structure for the arguments passed to the vop_read operation is the following:
struct vnodeop_desc vop_read_desc = {
    "vop_read",
    0,
    (vop_bypass_t *)VOP_READ_AP,
    vop_read_vp_offsets,
    VDESC_NO_OFFSET,
    VOPARG_OFFSETOF(struct vop_read_args,a_cred),
    VDESC_NO_OFFSET,
    VDESC_NO_OFFSET,
};
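To see how an offset recorded in a descriptor lets generic code reach a field without knowing the concrete argument type, consider the following standalone sketch. It uses the standard offsetof macro and plain pointer arithmetic. All of its names (struct cred, struct op_desc, NO_OFFSET, locate_cred, and the ping operation) are hypothetical simplifications made up for this illustration; the kernel's own macros and structures are richer.

```c
#include <assert.h>
#include <stddef.h>

#define NO_OFFSET (-1)      /* plays the role of VDESC_NO_OFFSET */

/* Hypothetical miniature credential; the real struct ucred is richer. */
struct cred {
    unsigned uid;
};

/* Minimal operation descriptor: only the credential offset is modelled. */
struct op_desc {
    const char *name;
    int cred_offset;        /* byte offset of the cred pointer, or NO_OFFSET */
};

/* Every argument structure starts with a pointer to its descriptor. */
struct generic_args {
    const struct op_desc *a_desc;
};

struct read_args {
    struct generic_args a_gen;
    int nbytes;
    struct cred *a_cred;
};

/* A made-up operation that carries no credentials. */
struct ping_args {
    struct generic_args a_gen;
    int flags;
};

static const struct op_desc read_desc = {
    "read", (int)offsetof(struct read_args, a_cred)
};
static const struct op_desc ping_desc = { "ping", NO_OFFSET };

/*
 * Return the address of the credential pointer inside any argument
 * structure, or NULL when the operation carries no credentials.
 */
static struct cred **
locate_cred(struct generic_args *ap)
{
    if (ap->a_desc->cred_offset == NO_OFFSET)
        return (NULL);
    return ((struct cred **)((char *)ap + ap->a_desc->cred_offset));
}
```

Given only a pointer to the generic header, locate_cred returns the address of the a_cred member of a struct read_args, and NULL for an operation whose descriptor records no credential offset. A single routine can therefore handle every operation's arguments.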
Importantly, apart from the name of the function (used for debugging purposes) and the underlying function to call (VOP_READ_AP), the structure contains in its vdesc_cred_offset field the location of the user credential data field (a_cred) within the read call's arguments. By using this field, umap_bypass can map the credentials of any vnode operation with the following code:
if (descp->vdesc_cred_offset != VDESC_NO_OFFSET) {
    credpp = VOPARG_OFFSETTO(struct ucred**,
        descp->vdesc_cred_offset, ap);
    /* Save old values */
    savecredp = (*credpp);
    if (savecredp != NOCRED)
        (*credpp) = crdup(savecredp);
    credp = *credpp;