ck_ec: event count with OS-assisted blocking #133

pkhuong · 2018-10-28T05:35:56Z

I finally convinced myself that this thing works, and will always do so reliably.

It's x86-TSO only, so I had to introduce some weirdness in the makefiles.

~~TODO: update docs for new vtable~~
~~TODO: test the timeval internal utils~~
~~TODO: make the regression tests work on x86~~
TODONT: final style check
TODONT: man page
~~TODO: benchmarks~~
~~TODO: fix build on !linux x86oids, done, except for validate.~~

To discuss: how do we feel about the type generic macros.

I'm gonna leave the man pages for later ;)

pkhuong · 2018-10-29T22:21:18Z

pkhuong@2delilah ~/ck/r/c/benchmark ♪ [ pwd && make check ] ± ck_ec@10ce30… ⌚ 18:20:05
/home/pkhuong/ck/regressions/ck_ec/benchmark
./ck_ec 24 1
SP ec32
SP ec32_value: 3
SP ec32_wait fast: 4
SP ec32_wait timeout: 21
SP ec32_inc: 8
SP ec32_inc slow: 710
SP ec32_add: 8
SP ec32_add slow: 705

MP ec32
MP ec32_value: 2
MP ec32_wait fast: 4
MP ec32_wait timeout: 20
MP ec32_inc: 32
MP ec32_inc slow: 730
MP ec32_add: 32
MP ec32_add slow: 725

SP ec64
SP ec64_value: 3
SP ec64_wait fast: 4
SP ec64_wait timeout: 20
SP ec64_inc: 8
SP ec64_inc slow: 710
SP ec64_add: 8
SP ec64_add slow: 699

MP ec64
MP ec64_value: 1
MP ec64_wait fast: 4
MP ec64_wait timeout: 21
MP ec64_inc: 31
MP ec64_inc slow: 725
MP ec64_add: 32
MP ec64_add slow: 742
1:21.75s total, 75.03s user, 6.70s system, 99% cpu

pkhuong · 2018-11-04T22:05:51Z

Packed the predicate's argument list in a struct for more ABI stability.

sbahra · 2018-11-07T00:32:30Z

@pkhuong This is great work! Look forward to merging it in. I see a few things building:

prop_test_wakeup.c:17:12: warning: initialization of ‘void (*)(const struct ck_ec_wait_state *, const uint32_t *, uint32_t,  const struct timespec *)’ {aka ‘void (*)(const struct ck_ec_wait_state *, const unsigned int *, unsigned int,  const struct timespec *)’} from incompatible pointer type ‘void (*)(const struct ck_ec_ops *, const uint32_t *, uint32_t,  const struct timespec *)’ {aka ‘void (*)(const struct ck_ec_ops *, const unsigned int *, unsigned int,  const struct timespec *)’} [-Wincompatible-pointer-types]

prop_test_wakeup.c:17:12: note: (near initialization for ‘test_ops.wait32’)
prop_test_wakeup.c:18:12: warning: initialization of ‘void (*)(const struct ck_ec_wait_state *, const uint64_t *, uint64_t,  const struct timespec *)’ {aka ‘void (*)(const struct ck_ec_wait_state *, const long unsigned int *, long unsigned int,  const struct timespec *)’} from incompatible pointer type ‘void (*)(const struct ck_ec_ops *, const uint64_t *, uint64_t,  const struct timespec *)’ {aka ‘void (*)(const struct ck_ec_ops *, const long unsigned int *, long unsigned int,  const struct timespec *)’} [-Wincompatible-pointer-types]
  .wait64 = wait64

Less important, but putting on radar just in case it matters:

../../../include/ck_ec.h:815:23: warning: inlining failed in call to ‘ck_ec64_add.isra.6’: call is unlikely and code size would grow [-Winline]
 CK_CC_INLINE uint64_t ck_ec64_add(struct ck_ec64 *ec```

pkhuong · 2018-11-09T20:30:02Z

Fixed the type mismatch. Tried to get inlining smarter wrt dispatching, but I think we want to let the compiler keep ck_ec_add out of line when it makes sense.

pkhuong · 2018-11-13T16:12:51Z

Assign copyright to Google LLC

sbahra · 2018-11-29T04:37:39Z

@pkhuong - Any other follow-up commits or should I start process of merging down?

ck_ec implements 32- and (on 64 bit platforms) 64- bit event counts. Event counts let us easily integrate OS-level blocking (e.g., futexes) in lock-free protocols. Waking up waiters only locks in the OS kernel, and does not happen at all when no waiter is blocked. Waiters only block conditionally, if the event count's value is still equal to some prior value. ck_ec supports multiple producers (wakers) and consumers (waiters), and, on x86-TSO, has a more efficient specialisation for single producer mode. In the latter mode, the overhead compared to a version counter is on the order of 2-3 cycles and 1-2 instructions, in the fast path. The slow path, when there are threads blocked on the event count, consists of one additional atomic instruction and a futex syscall. Similarly, the fast path for consumers, when an update comes quickly, has no overhead compared to spinning on a read-only counter. After a few thousand cycles, consumers (waiters) enter the slow path with one atomic instruction and a few blocking syscalls. The single-producer specialisation requires the x86-TSO memory model, x86's non-atomic read-modify-write instructions, and, ideally a futex-like OS abstraction. On !x86/x86_64 platforms, single producer increments fall back to the multiple producer code path. Fixes #79

sbahra · 2018-12-03T14:01:41Z

Merging in, I expect additional iteration in master, treating as experimental. Thanks Paul!

pkhuong force-pushed the ck_ec branch from 67a69e7 to 65a3bae Compare October 28, 2018 05:38

pkhuong requested a review from sbahra October 28, 2018 05:39

pkhuong force-pushed the ck_ec branch 7 times, most recently from 5a269a7 to 1352f8a Compare October 29, 2018 18:35

pkhuong force-pushed the ck_ec branch 2 times, most recently from ab6db1e to abb9030 Compare October 29, 2018 22:35

pkhuong changed the title ~~ck_ec: futex-backed event count for linux/x86{,_64}~~ ck_ec: event count with OS-assisted blocking Oct 30, 2018

pkhuong force-pushed the ck_ec branch 4 times, most recently from 22ab421 to 8bdc2ea Compare October 30, 2018 01:20

sbahra requested a review from cognet October 30, 2018 01:40

pkhuong force-pushed the ck_ec branch from d21374f to 0f5a248 Compare November 4, 2018 21:37

pkhuong force-pushed the ck_ec branch 4 times, most recently from 90d1cf5 to 62d393d Compare November 5, 2018 19:05

pkhuong force-pushed the ck_ec branch from 62d393d to dab40f3 Compare November 9, 2018 20:29

pkhuong force-pushed the ck_ec branch from dab40f3 to fe9f41a Compare November 9, 2018 20:31

pkhuong force-pushed the ck_ec branch from fe9f41a to 2b754ac Compare December 1, 2018 15:28

sbahra merged commit a16642f into master Dec 3, 2018

sbahra deleted the ck_ec branch May 29, 2019 22:31

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ck_ec: event count with OS-assisted blocking #133

ck_ec: event count with OS-assisted blocking #133

pkhuong commented Oct 28, 2018 •

edited

Loading

pkhuong commented Oct 29, 2018 •

edited

Loading

pkhuong commented Nov 4, 2018

sbahra commented Nov 7, 2018 •

edited

Loading

pkhuong commented Nov 9, 2018

pkhuong commented Nov 13, 2018 •

edited

Loading

sbahra commented Nov 29, 2018

sbahra commented Dec 3, 2018

ck_ec: event count with OS-assisted blocking #133

ck_ec: event count with OS-assisted blocking #133

Conversation

pkhuong commented Oct 28, 2018 • edited Loading

pkhuong commented Oct 29, 2018 • edited Loading

pkhuong commented Nov 4, 2018

sbahra commented Nov 7, 2018 • edited Loading

pkhuong commented Nov 9, 2018

pkhuong commented Nov 13, 2018 • edited Loading

sbahra commented Nov 29, 2018

sbahra commented Dec 3, 2018

pkhuong commented Oct 28, 2018 •

edited

Loading

pkhuong commented Oct 29, 2018 •

edited

Loading

sbahra commented Nov 7, 2018 •

edited

Loading

pkhuong commented Nov 13, 2018 •

edited

Loading