"Synthetic" Benchmark for PIConGPU #131

Open
ax3l opened this issue Oct 5, 2016 · 1 comment

@ax3l
Member

ax3l commented Oct 5, 2016

@psychocoderHPC @slizzered we should add a benchmark setup that closely matches how PIConGPU allocates particle data.

For example: "allocate and free N chunks of a few KB of (particle) data per second" (from T threads).

We will need such a benchmark because on hardware such as Knights Landing and POWER8/9 we could be bound by operator new even on the host side, and we need to know at which level of concurrency this kicks in.
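
A minimal host-side sketch of such a measurement could look like the following. It is only an illustration under assumptions: it uses plain operator new/delete and std::thread rather than mallocMC, and the names `AllocBenchResult` and `runHostAllocBench` as well as all parameters are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical result record; not part of mallocMC or PIConGPU.
struct AllocBenchResult {
    std::size_t totalPairs; // completed allocate+free pairs over all threads
    double seconds;         // wall-clock time from first thread start to last join

    double pairsPerSecond() const {
        return seconds > 0.0 ? static_cast<double>(totalPairs) / seconds : 0.0;
    }
};

// Each of numThreads threads repeats `batches` times:
// allocate chunksPerBatch chunks of chunkBytes bytes, touch them, free them.
AllocBenchResult runHostAllocBench(unsigned numThreads, std::size_t chunkBytes,
                                   std::size_t chunksPerBatch, std::size_t batches)
{
    assert(numThreads > 0 && chunkBytes > 0);
    std::atomic<std::size_t> pairs{0};

    auto worker = [&]() {
        std::vector<char*> chunks(chunksPerBatch, nullptr);
        std::size_t local = 0;
        for (std::size_t b = 0; b < batches; ++b) {
            for (auto& p : chunks) {
                p = new char[chunkBytes];
                // Touch first and last byte so the chunk is really used.
                p[0] = 1;
                p[chunkBytes - 1] = 1;
            }
            for (auto& p : chunks) {
                delete[] p;
                p = nullptr;
                ++local;
            }
        }
        pairs += local; // one atomic update per thread, outside the hot loop
    };

    const auto start = std::chrono::steady_clock::now();
    std::vector<std::thread> threads;
    threads.reserve(numThreads);
    for (unsigned t = 0; t < numThreads; ++t) {
        threads.emplace_back(worker);
    }
    for (auto& th : threads) {
        th.join();
    }
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;

    return {pairs.load(), elapsed.count()};
}
```

Calling `runHostAllocBench(T, 2048, 64, 1000)` for increasing T and comparing `pairsPerSecond()` would show roughly where throughput stops scaling with the thread count. Thread start-up time is included in the measurement, so short runs will understate the rate.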

Related to #96 and #130

@bussmann @juckel this might be an interesting task for the next many-core lecture (HOPS+CO)

ax3l changed the title from "Stynthetic" Benchmark for PIConGPU to "Synthetic" Benchmark for PIConGPU on Dec 5, 2016
@tdd11235813

This is planned as a final project for the GPU students this year. I am currently preparing a plan along these lines:

  • put the benchmark code into /benchmarks
    • measure mallocMC alloc + free performance
    • allocate chunks and perform random + stream access (for the upcoming page migration test)
  • add a new allocation policy that provides unified memory (cudaMallocManaged) (for testing page migration)
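
To illustrate the random + stream access item, here is a rough host-only sketch. It is a stand-in under assumptions, not the planned CUDA code: the chunks are ordinary std::vector buffers instead of mallocMC or cudaMallocManaged allocations, and `streamOrder`, `randomOrder` and `sumChunks` are hypothetical helper names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <random>
#include <vector>

using Chunk = std::vector<std::uint32_t>;

// Stream access: visit the chunks in allocation order 0, 1, 2, ...
std::vector<std::size_t> streamOrder(std::size_t numChunks)
{
    std::vector<std::size_t> order(numChunks);
    std::iota(order.begin(), order.end(), std::size_t{0});
    return order;
}

// Random access: visit every chunk exactly once, in a shuffled order.
std::vector<std::size_t> randomOrder(std::size_t numChunks, unsigned seed)
{
    std::vector<std::size_t> order = streamOrder(numChunks);
    std::mt19937 gen(seed);
    std::shuffle(order.begin(), order.end(), gen);
    return order;
}

// Read every element of every chunk in the given visiting order.
// The returned sum keeps the reads observable and allows a correctness check.
std::uint64_t sumChunks(const std::vector<Chunk>& chunks,
                        const std::vector<std::size_t>& order)
{
    std::uint64_t sum = 0;
    for (std::size_t idx : order) {
        for (std::uint32_t value : chunks[idx]) {
            sum += value;
        }
    }
    return sum;
}
```

Timing `sumChunks` once with `streamOrder` and once with `randomOrder` over the same chunks gives the two access patterns to compare; both orders must return the same sum, which doubles as a sanity check.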

What do you think?

Note that unified memory does not work with IPC.
This is currently only for CUDA, but benchmarks will be necessary for hip-clang too.
