eventfd: don't take the spinlock in eventfd_poll
The spinlock in eventfd_poll is trying to protect the count of events so
it can decide whether it should return POLLIN, POLLERR, or POLLOUT.  But
because we only take the lock after calling poll_wait and drop it again
before returning, we have the same pile of races with the lock held as we
do with a single unlocked read of ctx->count.

This replaces the lock with a read barrier and single read.
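
As a rough userspace sketch of that pattern (not the kernel code itself: demo_ctx and demo_poll are made-up names, and a C11 acquire load stands in for smp_rmb() plus the plain load), the key point is that every readiness bit is derived from one snapshot of the counter, so the answer may be stale but can never be torn or self-contradictory:

/* Userspace sketch only: demo_ctx/demo_poll are hypothetical stand-ins
 * for the kernel's eventfd_ctx/eventfd_poll. */
#include <poll.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

struct demo_ctx {
        _Atomic uint64_t count;         /* the event counter */
};

static unsigned int demo_poll(struct demo_ctx *ctx)
{
        unsigned int events = 0;
        /* One acquire load replaces the spinlock: every check below uses
         * the same snapshot, which may be stale but cannot be torn. */
        uint64_t count = atomic_load_explicit(&ctx->count, memory_order_acquire);

        if (count > 0)
                events |= POLLIN;       /* a read would return data */
        if (count == UINT64_MAX)
                events |= POLLERR;      /* counter is poisoned */
        if (UINT64_MAX - 1 > count)
                events |= POLLOUT;      /* a write of 1 would not block */
        return events;
}

int main(void)
{
        struct demo_ctx ctx = { .count = 1 };
        printf("events=0x%x\n", demo_poll(&ctx));
        return 0;
}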

eventfd_write does a single bump of ctx->count, so this should not add
new races when events are added.  eventfd_read is similar: it does a
single decrement with the lock held, so we are only making the race
window with concurrent readers slightly larger.

This spinlock is the top CPU user in kernel code during one of our
workloads.  Removing it gives us a ~2% boost.
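
For anyone who wants to see the user-visible semantics from userspace, a minimal test program along these lines (illustrative only, error handling mostly omitted) exercises the bits this function computes: with the counter at zero poll() reports just POLLOUT, and after a write bumps the counter POLLIN shows up as well:

/* Illustrative only: exercise eventfd poll semantics from userspace. */
#include <poll.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/eventfd.h>
#include <unistd.h>

int main(void)
{
        int efd = eventfd(0, 0);
        if (efd < 0) {
                perror("eventfd");
                return 1;
        }

        struct pollfd pfd = { .fd = efd, .events = POLLIN | POLLOUT };

        /* Counter is 0: expect POLLOUT only. */
        poll(&pfd, 1, 0);
        printf("before write: revents=0x%x\n", (unsigned)pfd.revents);

        uint64_t one = 1;
        if (write(efd, &one, sizeof(one)) != sizeof(one))   /* bump the counter */
                perror("write");

        /* Counter is 1: expect POLLIN | POLLOUT. */
        poll(&pfd, 1, 0);
        printf("after write:  revents=0x%x\n", (unsigned)pfd.revents);

        close(efd);
        return 0;
}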

[[email protected]: avoid unused variable warning]
[[email protected]: type bug in eventfd_poll()]
Signed-off-by: Chris Mason <[email protected]>
Cc: Davide Libenzi <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
masoncl authored and torvalds committed Feb 17, 2015
1 parent 7647f14 commit e22553e
12 changes: 6 additions & 6 deletions fs/eventfd.c
@@ -118,18 +118,18 @@ static unsigned int eventfd_poll(struct file *file, poll_table *wait)
 {
         struct eventfd_ctx *ctx = file->private_data;
         unsigned int events = 0;
-        unsigned long flags;
+        u64 count;
 
         poll_wait(file, &ctx->wqh, wait);
+        smp_rmb();
+        count = ctx->count;
 
-        spin_lock_irqsave(&ctx->wqh.lock, flags);
-        if (ctx->count > 0)
+        if (count > 0)
                 events |= POLLIN;
-        if (ctx->count == ULLONG_MAX)
+        if (count == ULLONG_MAX)
                 events |= POLLERR;
-        if (ULLONG_MAX - 1 > ctx->count)
+        if (ULLONG_MAX - 1 > count)
                 events |= POLLOUT;
-        spin_unlock_irqrestore(&ctx->wqh.lock, flags);
 
         return events;
 }
