Skip to content

Commit

Permalink
Merge branch 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kern…
Browse files Browse the repository at this point in the history
…el/git/viro/vfs

Pull the big VFS changes from Al Viro:
 "This one is *big* and changes quite a few things around VFS.  What's in there:

   - the first of two really major architecture changes - death to open
     intents.

     The former is finally there; it was very long in making, but with
     Miklos getting through really hard and messy final push in
     fs/namei.c, we finally have it.  Unlike his variant, this one
     doesn't introduce struct opendata; what we have instead is
     ->atomic_open() taking preallocated struct file * and passing
     everything via its fields.

     Instead of returning struct file *, it returns -E...  on error, 0
     on success and 1 in "deal with it yourself" case (e.g.  symlink
     found on server, etc.).

     See comments before fs/namei.c:atomic_open().  That made a lot of
     goodies finally possible and quite a few are in that pile:
     ->lookup(), ->d_revalidate() and ->create() do not get struct
     nameidata * anymore; ->lookup() and ->d_revalidate() get lookup
     flags instead, ->create() gets "do we want it exclusive" flag.

     With the introduction of new helper (kern_path_locked()) we are rid
     of all struct nameidata instances outside of fs/namei.c; it's still
     visible in namei.h, but not for long.  Come the next cycle,
     declaration will move either to fs/internal.h or to fs/namei.c
     itself.  [me, miklos, hch]

   - The second major change: behaviour of final fput().  Now we have
     __fput() done without any locks held by caller *and* not from deep
     in call stack.

     That obviously lifts a lot of constraints on the locking in there.
     Moreover, it's legal now to call fput() from atomic contexts (which
     has immediately simplified life for aio.c).  We also don't need
     anti-recursion logics in __scm_destroy() anymore.

     There is a price, though - the damn thing has become partially
     asynchronous.  For fput() from normal process we are guaranteed
     that pending __fput() will be done before the caller returns to
     userland, exits or gets stopped for ptrace.

     For kernel threads and atomic contexts it's done via
     schedule_work(), so theoretically we might need a way to make sure
     it's finished; so far only one such place had been found, but there
     might be more.

     There's flush_delayed_fput() (do all pending __fput()) and there's
     __fput_sync() (fput() analog doing __fput() immediately).  I hope
     we won't need them often; see warnings in fs/file_table.c for
     details.  [me, based on task_work series from Oleg merged last
     cycle]

   - sync series from Jan

   - large part of "death to sync_supers()" work from Artem; the only
     bits missing here are exofs and ext4 ones.  As far as I understand,
     those are going via the exofs and ext4 trees resp.; once they are
     in, we can put ->write_super() to the rest, along with the thread
     calling it.

   - preparatory bits from unionmount series (from dhowells).

   - assorted cleanups and fixes all over the place, as usual.

  This is not the last pile for this cycle; there's at least jlayton's
  ESTALE work and fsfreeze series (the latter - in dire need of fixes,
  so I'm not sure it'll make the cut this cycle).  I'll probably throw
  symlink/hardlink restrictions stuff from Kees into the next pile, too.
  Plus there's a lot of misc patches I hadn't thrown into that one -
  it's large enough as it is..."

* 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (127 commits)
  ext4: switch EXT4_IOC_RESIZE_FS to mnt_want_write_file()
  btrfs: switch btrfs_ioctl_balance() to mnt_want_write_file()
  switch dentry_open() to struct path, make it grab references itself
  spufs: shift dget/mntget towards dentry_open()
  zoran: don't bother with struct file * in zoran_map
  ecryptfs: don't reinvent the wheels, please - use struct completion
  don't expose I_NEW inodes via dentry->d_inode
  tidy up namei.c a bit
  unobfuscate follow_up() a bit
  ext3: pass custom EOF to generic_file_llseek_size()
  ext4: use core vfs llseek code for dir seeks
  vfs: allow custom EOF in generic_file_llseek code
  vfs: Avoid unnecessary WB_SYNC_NONE writeback during sys_sync and reorder sync passes
  vfs: Remove unnecessary flushing of block devices
  vfs: Make sys_sync writeout also block device inodes
  vfs: Create function for iterating over block devices
  vfs: Reorder operations during sys_sync
  quota: Move quota syncing to ->sync_fs method
  quota: Split dquot_quota_sync() to writeback and cache flushing part
  vfs: Move noop_backing_dev_info check from sync into writeback
  ...
  • Loading branch information
torvalds committed Jul 23, 2012
2 parents a6be1fc + 8cae6f7 commit a66d2c8
Show file tree
Hide file tree
Showing 210 changed files with 2,607 additions and 2,396 deletions.
11 changes: 7 additions & 4 deletions Documentation/filesystems/Locking
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@ be able to use diff(1).

--------------------------- dentry_operations --------------------------
prototypes:
int (*d_revalidate)(struct dentry *, struct nameidata *);
int (*d_revalidate)(struct dentry *, unsigned int);
int (*d_hash)(const struct dentry *, const struct inode *,
struct qstr *);
int (*d_compare)(const struct dentry *, const struct inode *,
Expand Down Expand Up @@ -37,9 +37,8 @@ d_manage: no no yes (ref-walk) maybe

--------------------------- inode_operations ---------------------------
prototypes:
int (*create) (struct inode *,struct dentry *,umode_t, struct nameidata *);
struct dentry * (*lookup) (struct inode *,struct dentry *, struct nameid
ata *);
int (*create) (struct inode *,struct dentry *,umode_t, bool);
struct dentry * (*lookup) (struct inode *,struct dentry *, unsigned int);
int (*link) (struct dentry *,struct inode *,struct dentry *);
int (*unlink) (struct inode *,struct dentry *);
int (*symlink) (struct inode *,struct dentry *,const char *);
Expand All @@ -62,6 +61,9 @@ ata *);
int (*removexattr) (struct dentry *, const char *);
int (*fiemap)(struct inode *, struct fiemap_extent_info *, u64 start, u64 len);
void (*update_time)(struct inode *, struct timespec *, int);
int (*atomic_open)(struct inode *, struct dentry *,
struct file *, unsigned open_flag,
umode_t create_mode, int *opened);

locking rules:
all may block
Expand Down Expand Up @@ -89,6 +91,7 @@ listxattr: no
removexattr: yes
fiemap: no
update_time: no
atomic_open: yes

Additionally, ->rmdir(), ->unlink() and ->rename() have ->i_mutex on
victim.
Expand Down
21 changes: 15 additions & 6 deletions Documentation/filesystems/porting
Original file line number Diff line number Diff line change
Expand Up @@ -355,12 +355,10 @@ protects *all* the dcache state of a given dentry.
via rcu-walk path walk (basically, if the file can have had a path name in the
vfs namespace).

i_dentry and i_rcu share storage in a union, and the vfs expects
i_dentry to be reinitialized before it is freed, so an:

INIT_LIST_HEAD(&inode->i_dentry);

must be done in the RCU callback.
Even though i_dentry and i_rcu share storage in a union, we will
initialize the former in inode_init_always(), so just leave it alone in
the callback. It used to be necessary to clean it there, but not anymore
(starting at 3.2).

--
[recommended]
Expand Down Expand Up @@ -433,3 +431,14 @@ release it yourself.
d_alloc_root() is gone, along with a lot of bugs caused by code
misusing it. Replacement: d_make_root(inode). The difference is,
d_make_root() drops the reference to inode if dentry allocation fails.

--
[mandatory]
The witch is dead! Well, 2/3 of it, anyway. ->d_revalidate() and
->lookup() do *not* take struct nameidata anymore; just the flags.
--
[mandatory]
->create() doesn't take struct nameidata *; unlike the previous
two, it gets "is it an O_EXCL or equivalent?" boolean argument. Note that
local filesystems can ignore tha argument - they are guaranteed that the
object doesn't exist. It's remote/distributed ones that might care...
23 changes: 17 additions & 6 deletions Documentation/filesystems/vfs.txt
Original file line number Diff line number Diff line change
Expand Up @@ -341,8 +341,8 @@ This describes how the VFS can manipulate an inode in your
filesystem. As of kernel 2.6.22, the following members are defined:

struct inode_operations {
int (*create) (struct inode *,struct dentry *, umode_t, struct nameidata *);
struct dentry * (*lookup) (struct inode *,struct dentry *, struct nameidata *);
int (*create) (struct inode *,struct dentry *, umode_t, bool);
struct dentry * (*lookup) (struct inode *,struct dentry *, unsigned int);
int (*link) (struct dentry *,struct inode *,struct dentry *);
int (*unlink) (struct inode *,struct dentry *);
int (*symlink) (struct inode *,struct dentry *,const char *);
Expand All @@ -364,6 +364,9 @@ struct inode_operations {
ssize_t (*listxattr) (struct dentry *, char *, size_t);
int (*removexattr) (struct dentry *, const char *);
void (*update_time)(struct inode *, struct timespec *, int);
int (*atomic_open)(struct inode *, struct dentry *,
struct file *, unsigned open_flag,
umode_t create_mode, int *opened);
};

Again, all methods are called without any locks being held, unless
Expand Down Expand Up @@ -476,6 +479,14 @@ otherwise noted.
an inode. If this is not defined the VFS will update the inode itself
and call mark_inode_dirty_sync.

atomic_open: called on the last component of an open. Using this optional
method the filesystem can look up, possibly create and open the file in
one atomic operation. If it cannot perform this (e.g. the file type
turned out to be wrong) it may signal this by returning 1 instead of
usual 0 or -ve . This method is only called if the last
component is negative or needs lookup. Cached positive dentries are
still handled by f_op->open().

The Address Space Object
========================

Expand Down Expand Up @@ -891,7 +902,7 @@ the VFS uses a default. As of kernel 2.6.22, the following members are
defined:

struct dentry_operations {
int (*d_revalidate)(struct dentry *, struct nameidata *);
int (*d_revalidate)(struct dentry *, unsigned int);
int (*d_hash)(const struct dentry *, const struct inode *,
struct qstr *);
int (*d_compare)(const struct dentry *, const struct inode *,
Expand All @@ -910,11 +921,11 @@ struct dentry_operations {
dcache. Most filesystems leave this as NULL, because all their
dentries in the dcache are valid

d_revalidate may be called in rcu-walk mode (nd->flags & LOOKUP_RCU).
d_revalidate may be called in rcu-walk mode (flags & LOOKUP_RCU).
If in rcu-walk mode, the filesystem must revalidate the dentry without
blocking or storing to the dentry, d_parent and d_inode should not be
used without care (because they can go NULL), instead nd->inode should
be used.
used without care (because they can change and, in d_inode case, even
become NULL under us).

If a situation is encountered that rcu-walk cannot handle, return
-ECHILD and it will be called again in ref-walk mode.
Expand Down
48 changes: 18 additions & 30 deletions arch/powerpc/platforms/cell/spufs/inode.c
Original file line number Diff line number Diff line change
Expand Up @@ -317,28 +317,23 @@ spufs_mkdir(struct inode *dir, struct dentry *dentry, unsigned int flags,
return ret;
}

static int spufs_context_open(struct dentry *dentry, struct vfsmount *mnt)
static int spufs_context_open(struct path *path)
{
int ret;
struct file *filp;

ret = get_unused_fd();
if (ret < 0) {
dput(dentry);
mntput(mnt);
goto out;
}
if (ret < 0)
return ret;

filp = dentry_open(dentry, mnt, O_RDONLY, current_cred());
filp = dentry_open(path, O_RDONLY, current_cred());
if (IS_ERR(filp)) {
put_unused_fd(ret);
ret = PTR_ERR(filp);
goto out;
return PTR_ERR(filp);
}

filp->f_op = &spufs_context_fops;
fd_install(ret, filp);
out:
return ret;
}

Expand Down Expand Up @@ -453,6 +448,7 @@ spufs_create_context(struct inode *inode, struct dentry *dentry,
int affinity;
struct spu_gang *gang;
struct spu_context *neighbor;
struct path path = {.mnt = mnt, .dentry = dentry};

ret = -EPERM;
if ((flags & SPU_CREATE_NOSCHED) &&
Expand Down Expand Up @@ -495,11 +491,7 @@ spufs_create_context(struct inode *inode, struct dentry *dentry,
put_spu_context(neighbor);
}

/*
* get references for dget and mntget, will be released
* in error path of *_open().
*/
ret = spufs_context_open(dget(dentry), mntget(mnt));
ret = spufs_context_open(&path);
if (ret < 0) {
WARN_ON(spufs_rmdir(inode, dentry));
if (affinity)
Expand Down Expand Up @@ -556,46 +548,42 @@ spufs_mkgang(struct inode *dir, struct dentry *dentry, umode_t mode)
return ret;
}

static int spufs_gang_open(struct dentry *dentry, struct vfsmount *mnt)
static int spufs_gang_open(struct path *path)
{
int ret;
struct file *filp;

ret = get_unused_fd();
if (ret < 0) {
dput(dentry);
mntput(mnt);
goto out;
}
if (ret < 0)
return ret;

filp = dentry_open(dentry, mnt, O_RDONLY, current_cred());
/*
* get references for dget and mntget, will be released
* in error path of *_open().
*/
filp = dentry_open(path, O_RDONLY, current_cred());
if (IS_ERR(filp)) {
put_unused_fd(ret);
ret = PTR_ERR(filp);
goto out;
return PTR_ERR(filp);
}

filp->f_op = &simple_dir_operations;
fd_install(ret, filp);
out:
return ret;
}

static int spufs_create_gang(struct inode *inode,
struct dentry *dentry,
struct vfsmount *mnt, umode_t mode)
{
struct path path = {.mnt = mnt, .dentry = dentry};
int ret;

ret = spufs_mkgang(inode, dentry, mode & S_IRWXUGO);
if (ret)
goto out;

/*
* get references for dget and mntget, will be released
* in error path of *_open().
*/
ret = spufs_gang_open(dget(dentry), mntget(mnt));
ret = spufs_gang_open(&path);
if (ret < 0) {
int err = simple_rmdir(inode, dentry);
WARN_ON(err);
Expand Down
98 changes: 41 additions & 57 deletions drivers/base/devtmpfs.c
Original file line number Diff line number Diff line change
Expand Up @@ -227,33 +227,24 @@ static int handle_create(const char *nodename, umode_t mode, struct device *dev)

static int dev_rmdir(const char *name)
{
struct nameidata nd;
struct path parent;
struct dentry *dentry;
int err;

err = kern_path_parent(name, &nd);
if (err)
return err;

mutex_lock_nested(&nd.path.dentry->d_inode->i_mutex, I_MUTEX_PARENT);
dentry = lookup_one_len(nd.last.name, nd.path.dentry, nd.last.len);
if (!IS_ERR(dentry)) {
if (dentry->d_inode) {
if (dentry->d_inode->i_private == &thread)
err = vfs_rmdir(nd.path.dentry->d_inode,
dentry);
else
err = -EPERM;
} else {
err = -ENOENT;
}
dput(dentry);
dentry = kern_path_locked(name, &parent);
if (IS_ERR(dentry))
return PTR_ERR(dentry);
if (dentry->d_inode) {
if (dentry->d_inode->i_private == &thread)
err = vfs_rmdir(parent.dentry->d_inode, dentry);
else
err = -EPERM;
} else {
err = PTR_ERR(dentry);
err = -ENOENT;
}

mutex_unlock(&nd.path.dentry->d_inode->i_mutex);
path_put(&nd.path);
dput(dentry);
mutex_unlock(&parent.dentry->d_inode->i_mutex);
path_put(&parent);
return err;
}

Expand Down Expand Up @@ -305,50 +296,43 @@ static int dev_mynode(struct device *dev, struct inode *inode, struct kstat *sta

static int handle_remove(const char *nodename, struct device *dev)
{
struct nameidata nd;
struct path parent;
struct dentry *dentry;
struct kstat stat;
int deleted = 1;
int err;

err = kern_path_parent(nodename, &nd);
if (err)
return err;
dentry = kern_path_locked(nodename, &parent);
if (IS_ERR(dentry))
return PTR_ERR(dentry);

mutex_lock_nested(&nd.path.dentry->d_inode->i_mutex, I_MUTEX_PARENT);
dentry = lookup_one_len(nd.last.name, nd.path.dentry, nd.last.len);
if (!IS_ERR(dentry)) {
if (dentry->d_inode) {
err = vfs_getattr(nd.path.mnt, dentry, &stat);
if (!err && dev_mynode(dev, dentry->d_inode, &stat)) {
struct iattr newattrs;
/*
* before unlinking this node, reset permissions
* of possible references like hardlinks
*/
newattrs.ia_uid = 0;
newattrs.ia_gid = 0;
newattrs.ia_mode = stat.mode & ~0777;
newattrs.ia_valid =
ATTR_UID|ATTR_GID|ATTR_MODE;
mutex_lock(&dentry->d_inode->i_mutex);
notify_change(dentry, &newattrs);
mutex_unlock(&dentry->d_inode->i_mutex);
err = vfs_unlink(nd.path.dentry->d_inode,
dentry);
if (!err || err == -ENOENT)
deleted = 1;
}
} else {
err = -ENOENT;
if (dentry->d_inode) {
struct kstat stat;
err = vfs_getattr(parent.mnt, dentry, &stat);
if (!err && dev_mynode(dev, dentry->d_inode, &stat)) {
struct iattr newattrs;
/*
* before unlinking this node, reset permissions
* of possible references like hardlinks
*/
newattrs.ia_uid = 0;
newattrs.ia_gid = 0;
newattrs.ia_mode = stat.mode & ~0777;
newattrs.ia_valid =
ATTR_UID|ATTR_GID|ATTR_MODE;
mutex_lock(&dentry->d_inode->i_mutex);
notify_change(dentry, &newattrs);
mutex_unlock(&dentry->d_inode->i_mutex);
err = vfs_unlink(parent.dentry->d_inode, dentry);
if (!err || err == -ENOENT)
deleted = 1;
}
dput(dentry);
} else {
err = PTR_ERR(dentry);
err = -ENOENT;
}
mutex_unlock(&nd.path.dentry->d_inode->i_mutex);
dput(dentry);
mutex_unlock(&parent.dentry->d_inode->i_mutex);

path_put(&nd.path);
path_put(&parent);
if (deleted && strchr(nodename, '/'))
delete_path(nodename);
return err;
Expand Down
4 changes: 3 additions & 1 deletion drivers/media/video/zoran/zoran.h
Original file line number Diff line number Diff line change
Expand Up @@ -172,8 +172,10 @@ struct zoran_jpg_settings {
struct v4l2_jpegcompression jpg_comp; /* JPEG-specific capture settings */
};

struct zoran_fh;

struct zoran_mapping {
struct file *file;
struct zoran_fh *fh;
int count;
};

Expand Down
4 changes: 2 additions & 2 deletions drivers/media/video/zoran/zoran_driver.c
Original file line number Diff line number Diff line change
Expand Up @@ -2811,7 +2811,7 @@ static void
zoran_vm_close (struct vm_area_struct *vma)
{
struct zoran_mapping *map = vma->vm_private_data;
struct zoran_fh *fh = map->file->private_data;
struct zoran_fh *fh = map->fh;
struct zoran *zr = fh->zr;
int i;

Expand Down Expand Up @@ -2938,7 +2938,7 @@ zoran_mmap (struct file *file,
res = -ENOMEM;
goto mmap_unlock_and_return;
}
map->file = file;
map->fh = fh;
map->count = 1;

vma->vm_ops = &zoran_vm_ops;
Expand Down
Loading

0 comments on commit a66d2c8

Please sign in to comment.