Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Build a Dendrogram class, adapt Louvain/Leiden/ECG to use it #1359

Merged
merged 21 commits into from
Feb 5, 2021
Merged
Show file tree
Hide file tree
Changes from 10 commits
Commits
Show all changes
21 commits
Select commit Hold shift + click to select a range
ab335b7
create a dendogram in the C++ code
ChuckHastings Jan 7, 2021
4d984c6
fix clang format issues
ChuckHastings Jan 7, 2021
9c276ae
update copyright date
ChuckHastings Jan 7, 2021
0ded26d
Merge branch 'branch-0.18' into fea_louvain_dendogram
ChuckHastings Jan 14, 2021
637aee1
New idea for ECG
ChuckHastings Jan 21, 2021
7187bdb
Merge branch 'branch-0.18' into fea_louvain_dendogram
ChuckHastings Jan 21, 2021
347051c
rename ECG to make it consistent
ChuckHastings Jan 21, 2021
755c298
missed renaming ECG in CMakeLists.txt
ChuckHastings Jan 21, 2021
142024c
Merge branch 'branch-0.18' into fea_louvain_dendogram
ChuckHastings Jan 27, 2021
07ad090
fix spelling of dendrogram, fix clang formatting issues
ChuckHastings Jan 27, 2021
becbc63
Merge branch 'branch-0.18' into fea_louvain_dendogram
ChuckHastings Feb 2, 2021
bc3f26d
address comments from PR
ChuckHastings Feb 3, 2021
a3dd26d
Merge branch 'branch-0.18' into fea_louvain_dendogram
ChuckHastings Feb 3, 2021
84d3bf1
add checks for Pascal architecture in Louvain derived tests
ChuckHastings Feb 4, 2021
2e89e2c
reformat and update copyright dates
ChuckHastings Feb 4, 2021
825d6af
a few more flake8 errors
ChuckHastings Feb 4, 2021
4f0453e
Merge branch 'branch-0.18' into fea_louvain_dendogram
ChuckHastings Feb 4, 2021
9afc8de
refactor notebook tests to make it easier to filter tests; filter out…
ChuckHastings Feb 5, 2021
3db5697
delete some unused code in script
ChuckHastings Feb 5, 2021
d40f5a0
misspelled update from Rick
ChuckHastings Feb 5, 2021
9532b13
add early breaks, fix copyright dates
ChuckHastings Feb 5, 2021
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion cpp/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -357,7 +357,7 @@ add_library(cugraph SHARED
src/community/louvain.cu
src/community/leiden.cu
src/community/ktruss.cu
src/community/ECG.cu
src/community/ecg.cu
src/community/triangles_counting.cu
src/community/extract_subgraph_by_vertex.cu
src/cores/core_number.cu
Expand Down
61 changes: 61 additions & 0 deletions cpp/src/community/dendrogram.cuh
Original file line number Diff line number Diff line change
@@ -0,0 +1,61 @@
/*
* Copyright (c) 2021, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#pragma once

#include <memory>
#include <rmm/device_buffer.hpp>
#include <vector>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have any guidance on ordering include statements?

What I do is including more local headers first (rmm headers are more local than std C++ headers) like the following.

#include <rmm/device_buffer.hpp>

#include <memory>
#include <vector>

AFAIK, this is quite widely used in RAPIDS, and I started to follow this practice, and we may better stick to this unless we have other guidelines?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had forgotten to leave the blank line and clang-format reordered them. This will be updated in my next push


namespace cugraph {

template <typename vertex_t>
class Dendrogram {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's our guideline on class naming? IIRC, we're asked to use dendrogram_t instead of Dendrogram.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like class names to start with a capital letter. dendrogram_t should be a type (which technically could be a class )

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a mix in our code base. I prefer the capital letter starting a class name, and using the _t suffix on type parameters in the templates.

We should codify our coding standards better.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I thought the same: naming convention is dendogram_t. If not, then what is the rule of naming types (a class is a type...)?

public:
Dendrogram() : level_size_(), level_ptr_() {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this? This does not do anything special, so can't we just rely on the default constructor (automatically generated by the compiler).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Residual. An earlier version (pre-PR) had something that was actually initialized here. When I separated the code for flattening the Dendrogram out of the class (in preparation for moving to RAFT), the constructor became simpler.

Removed in next push.


void add_level(vertex_t num_verts)
{
cudaStream_t stream{0};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not stream ready. Better follow rmm's approach?

https://github.com/rapidsai/rmm/blob/branch-0.18/include/rmm/device_buffer.hpp#L289

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That might be more than we need, especially since RAFT (where this class will be moved in 0.19) doesn't use RMM the same way. I'll push a change to partially address this. Let me know if you think I should go further than the next push or if that's sufficient for now (since we'll have to refactor this method for RAFT integration).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We at least need a mechanism to specify the stream to execute this. The RMM way is one example (and I think following the RMM style is good to enforce some consistency across RAPIDS projects) but if this is too much and there is a simpler way, I am OK as well. But we at least need a mechanism to specify the stream to execute this.

rmm::mr::device_memory_resource *mr = rmm::mr::get_current_device_resource();

level_ptr_.push_back(
std::make_unique<rmm::device_buffer>(num_verts * sizeof(vertex_t), stream, mr));
level_size_.push_back(num_verts);
}

size_t current_level() const { return level_size_.size() - 1; }

size_t num_levels() const { return level_size_.size(); }

vertex_t *get_level_ptr_unsafe(size_t level) const
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just FYI,

I have used nocheck instead of unsafe, but I think unsafe is a better name.

If no one opposes, I will submit a PR replacing nocheck with unsafe to be more consistent.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Little more on this.

I just realized that I used nocheck following cudf.

https://github.com/rapidsai/cudf/blob/branch-0.18/cpp/include/cudf/column/column_device_view.cuh#L167

cudf is using unsafe as well (e.g. set_bit_unsafe) but nocheck is used in cudf::column_device_view which is more widely used than bit manipulation utilities.

This being said, any preference between nocheck vs unsafe?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should probably switch to nocheck. I recalled using the unsafe suffix in some previous work and it seemed to be appropriate here. However, after your comment I looked back at some of that. Generally what I have done in the past is:

  • nocheck suffix indicating don't check bounds
  • unsafe suffix indicating the method is not thread-safe

That seems like a better and more consistent naming approach.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to nocheck in next push

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is dangerous to have a const getter return a non-const pointer to a member. Something like this, for example, can happen:

template<typename T>
void foo(Dendrogram<T> const& d)
{
  d.get_level_ptr_unsafe(0)[0] = T{3};
}

int main(void)
{
  Dendrogram<int> d;

  foo(d);

  return 0;
}

foo() should not be permitted to modify the const& argument, d. But it can. This breaks the contract that the getter is supposed to fulfill (namely being an immutable getter).
(to compile this simple example, I changed device_buffer to std::vector<vertex_t>, but the idea is the same).

At the very least, get_level_ptr_unsafe() should not be a const member.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, that was careless. Next push will address this.

There should be a const version that returns a const pointer and a non-const version that returns a non-const pointer as it is used in both contexts.

{
return static_cast<vertex_t *>(level_ptr_[level]->data());
}

vertex_t get_level_size_unsafe(size_t level) const { return level_size_[level]; }

vertex_t *current_level_begin() const { return get_level_ptr_unsafe(current_level()); }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment about making this a non-const getter.


vertex_t *current_level_end() const { return current_level_begin() + current_level_size(); }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment about making this a non-const getter.


vertex_t current_level_size() const { return get_level_size_unsafe(current_level()); }

private:
std::vector<vertex_t> level_size_;
std::vector<std::unique_ptr<rmm::device_buffer>> level_ptr_;
};

} // namespace cugraph
88 changes: 61 additions & 27 deletions cpp/src/community/ECG.cu → cpp/src/community/ecg.cu
Original file line number Diff line number Diff line change
Expand Up @@ -18,10 +18,11 @@

#include <rmm/thrust_rmm_allocator.h>
#include <thrust/random.h>
#include <community/louvain.cuh>
#include <converters/permute_graph.cuh>
#include <ctime>
#include <utilities/error.hpp>
#include "utilities/graph_utils.cuh"
#include <utilities/graph_utils.cuh>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better order include statements?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can do that in the next push.


namespace {
template <typename IndexType>
Expand All @@ -42,25 +43,20 @@ binsearch_maxle(const IndexType *vec, const IndexType val, IndexType low, IndexT
}

template <typename IdxT, typename ValT>
__global__ void match_check_kernel(IdxT size,
IdxT num_verts,
IdxT *offsets,
IdxT *indices,
IdxT *permutation,
IdxT *parts,
ValT *weights)
__global__ void match_check_kernel(
IdxT size, IdxT num_verts, IdxT *offsets, IdxT *indices, IdxT *parts, ValT *weights)
Copy link
Contributor

@seunghwak seunghwak Feb 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we better use thrust?

thrust::transform(
...
[...]__device__(auto ...) {
  auto source = thrust::upper_bound(thrust::seq, ...);
  ...
}
);

What do we get by writing a custom kernel?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't know. This custom kernel was already there. My clang-format restructured this line.

I'd rather address this issue in the MNMG ECG implementation when we refactor this than to do this now. I'm adding a FIXME to remind me to do that.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree w/ @seunghwak . The long-held policy has been to avoid custom-written kernels as much as possible. And use thrust / CUB instead.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me restate my response, perhaps I was too terse.

I agree, this seems like a place where we should just use thrust/cub. There's no reason that I can think of that this custom kernel would be better than a thrust/cub implementation. There's nothing in this kernel that takes advantage of features that would make a custom kernel perform better, and generally it increases the maintenance cost of the code.

I modified this file (ecg.cu) solely for the purpose of using the Dendrogram class, since the ECG algorithm calls Louvain and I modified Louvain. I'm reluctant, in general, to rewrite things in the code that I tangentially touch. It tends to make the work continue without getting anything finished and merged into the baseline.

The only reason this was even in the list of changes is because I ran clang-format and it changed the format of these lines of code. I made no changes relevant to this kernel or the calls to it.

I have a branch where I have started the MNMG ECG work. In that branch I will completely refactor how the ECG code works. My intention is to remove this kernel in that branch which hopefully will be merged sometime during release 0.19.

If we think it's critical that I do this for release 0.18, I can certainly take a few hours and refactor this once I'm done addressing the device_uvector request.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are you using "IdxT" rather than "index_t" ? It seems like we are not using capitalized types anywhere else

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an existing kernel that has existed for a while. All I did was reformat.

I'd like to defer this until I finish implementing MNMG ECG when I will be reworking all of this code. (@seunghwak suggested I delete the kernel entirely).

{
IdxT tid = blockIdx.x * blockDim.x + threadIdx.x;
while (tid < size) {
IdxT source = binsearch_maxle(offsets, tid, (IdxT)0, num_verts);
IdxT dest = indices[tid];
if (parts[permutation[source]] == parts[permutation[dest]]) weights[tid] += 1;
if (parts[source] == parts[dest]) weights[tid] += 1;
tid += gridDim.x * blockDim.x;
}
}

struct prg {
__host__ __device__ float operator()(int n)
__device__ float operator()(int n)
{
thrust::default_random_engine rng;
thrust::uniform_real_distribution<float> dist(0.0, 1.0);
Expand Down Expand Up @@ -103,6 +99,31 @@ void get_permutation_vector(T size, T seed, T *permutation, cudaStream_t stream)
rmm::exec_policy(stream)->on(stream), randoms_v.begin(), randoms_v.end(), permutation);
}

template <typename graph_type>
class EcgLouvain : public cugraph::Louvain<graph_type> {
public:
using graph_t = graph_type;
using vertex_t = typename graph_type::vertex_type;
using edge_t = typename graph_type::edge_type;
using weight_t = typename graph_type::weight_type;

EcgLouvain(raft::handle_t const &handle, graph_type const &graph, vertex_t seed)
: cugraph::Louvain<graph_type>(handle, graph), seed_(seed)
{
}

void initialize_dendrogram_level(vertex_t num_vertices)
{
this->dendrogram_->add_level(num_vertices);

get_permutation_vector(
num_vertices, seed_, this->dendrogram_->current_level_begin(), this->stream_);
}

private:
vertex_t seed_;
};

} // anonymous namespace

namespace cugraph {
Expand All @@ -114,37 +135,51 @@ void ecg(raft::handle_t const &handle,
vertex_t ensemble_size,
vertex_t *clustering)
{
using graph_type = GraphCSRView<vertex_t, edge_t, weight_t>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should try and get rid of legacy classes in SG Louvain and ECG to use graph_t. Or the MG path could be used to support the 1 GPU case.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's work currently scheduled for 0.19 that will adapt all of these to use the graph primitives. We should be able to consider getting rid of the legacy versions at that point.


CUGRAPH_EXPECTS(graph.edge_data != nullptr,
"Invalid input argument: louvain expects a weighted graph");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be - ECG only works on a weighted graph. Is there an option to use default weights?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From python all of the clustering algorithms will specify a default weight if no weight is included on the graph.

From C++ the user is expected to provide a weight.

Will fix the old error message in next push.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why public inheritance? If no method is overriden (virtual) then public inheritance is too much of an exposure (hence creating unnecessary coupling). Why not use private inheritance?

Moreover, if Louvain class is meant to be publicly derived (and it seems it is since it exposes at least one virtual method) then it should have a virtual destructor.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, confusion about class naming convention, shouldn't be <small_caps>_t? I understand that some are older (legacy) classes, but perhaps new classed might follow the rule?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added a virtual destructor in the next push. I have clarified a few methods with virtual and override keywords to make the inheritance more clear (and possibly correct, it seems like one of the implementations might have been broken without it, although I got neither compiler errors nor erroneous results - maybe I got lucky).

EcgLouvain and Leiden are both publicly derived from Louvain. They override different virtual methods.
virtual weight_t operator()(size_t max_level, weight_t resolution) is the primary entry point for all 3 classes. EcgLouvain uses the parent version, Leiden provide its own override. EcgLouvain overrides the initialize_dendrogram_level function but otherwise behaves the same as Louvain.

CUGRAPH_EXPECTS(clustering != nullptr, "Invalid input argument: clustering is NULL");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you expand this to state what clustering should be set to, or point to docs

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Improved the error message in next push.


// TODO: New idea... rather than creating a permuted graph
// Observe that because of the up/down behavior... the only difference graph ordering
// has on the computation is in deciding what moves in each up/down pass.
// This *should be* equivalent to the following:
// Instead of initializing each vertex to be in a cluster by itself that
// is equal to cluster id, initialize each vertex to be in a cluster by
// itself equal to a random id. For each random run of a 1-level Louvain,
// the cluster ids would be randomized differently.
//
// For MNMG, we could assume an equal distribution across the GPUs and
// generate to a pattern so that we don't need to do any comms. It wouldn't
// be completely random, but it should result in an appropriate amount of
// randomness to get the desired effect.
//
// Note that this would require implementing the highest level Louvain function,
// or splitting the initialization into an overridable function...
//
//
// MNMG issue... in order to create an MNMG implementation we need to create
// a distributed implementation of get_permutation_vector, preferably without
// comms...
//

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an issue to address this TODO? If so, add reference. If not, should there be? I would prefer not to lose sight that these TODOs need to be addressed

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deleted the TODO (and the others in this file). These were notes to myself related to refactoring ECG for MNMG.

The work described in this note is already in an MNMG ECG branch that stalled when @seunghwak and I discussed a new graph primitive that is required.

cudaStream_t stream{0};

rmm::device_vector<weight_t> ecg_weights_v(graph.edge_data,
graph.edge_data + graph.number_of_edges);

vertex_t size{graph.number_of_vertices};
// TODO: This seed should be a parameter
// TODO: MNMG - should add the rank to the seed so every thread is a separate seed
vertex_t seed{1};

auto permuted_graph = std::make_unique<GraphCSR<vertex_t, edge_t, weight_t>>(
size, graph.number_of_edges, graph.has_data());

// Iterate over each member of the ensemble
for (vertex_t i = 0; i < ensemble_size; i++) {
// Take random permutation of the graph
rmm::device_vector<vertex_t> permutation_v(size);
vertex_t *d_permutation = permutation_v.data().get();

get_permutation_vector(size, seed, d_permutation, stream);
EcgLouvain<graph_type> runner(handle, graph, seed);
seed += size;

detail::permute_graph<vertex_t, edge_t, weight_t>(graph, d_permutation, permuted_graph->view());

// Run one level of Louvain clustering on the random permutation
rmm::device_vector<vertex_t> parts_v(size);
vertex_t *d_parts = parts_v.data().get();

cugraph::louvain(handle, permuted_graph->view(), d_parts, size_t{1});
weight_t wt = runner(size_t{1}, weight_t{1});

// For each edge in the graph determine whether the endpoints are in the same partition
// Keep a sum for each edge of the total number of times its endpoints are in the same partition
Expand All @@ -155,8 +190,7 @@ void ecg(raft::handle_t const &handle,
graph.number_of_vertices,
graph.offsets,
graph.indices,
permutation_v.data().get(),
d_parts,
runner.get_dendrogram().get_level_ptr_unsafe(0),
ecg_weights_v.data().get());
}

Expand Down
52 changes: 52 additions & 0 deletions cpp/src/community/flatten_dendrogram.cuh
Original file line number Diff line number Diff line change
@@ -0,0 +1,52 @@
/*
* Copyright (c) 2021, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#pragma once

#include <rmm/thrust_rmm_allocator.h>
#include <community/dendrogram.cuh>
#include <experimental/graph_functions.hpp>
#include <raft/handle.hpp>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better order include statements?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in next push.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@seunghwak, include statements order is dictated by clang-format

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you put blank lines in between blocks of statements, clang-format will honor the blocks.


namespace cugraph {

template <typename vertex_t, bool multi_gpu>
void partition_at_level(raft::handle_t const &handle,
Dendrogram<vertex_t> const &dendrogram,
vertex_t const *d_vertex_ids,
vertex_t *d_partition,
size_t level)
{
vertex_t local_num_verts = dendrogram.get_level_size_unsafe(0);

thrust::copy(rmm::exec_policy(handle.get_stream())->on(handle.get_stream()),
d_vertex_ids,
d_vertex_ids + local_num_verts,
d_partition);

std::for_each(thrust::make_counting_iterator<size_t>(0),
thrust::make_counting_iterator<size_t>(level),
[&handle, &dendrogram, d_vertex_ids, &d_partition, local_num_verts](size_t l) {
cugraph::experimental::relabel<vertex_t, multi_gpu>(
handle,
std::tuple<vertex_t const *, vertex_t const *>(
d_vertex_ids, dendrogram.get_level_ptr_unsafe(l)),
dendrogram.get_level_size_unsafe(l),
d_partition,
local_num_verts);
});
}

} // namespace cugraph
21 changes: 19 additions & 2 deletions cpp/src/community/leiden.cu
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
/*
* Copyright (c) 2020, NVIDIA CORPORATION.
* Copyright (c) 2020-2021, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
Expand All @@ -14,6 +14,7 @@
* limitations under the License.
*/

#include <community/flatten_dendrogram.cuh>
#include <community/leiden.cuh>

namespace cugraph {
Expand All @@ -30,8 +31,24 @@ std::pair<size_t, weight_t> leiden(raft::handle_t const &handle,
CUGRAPH_EXPECTS(clustering != nullptr, "Invalid input argument: clustering is null");

Leiden<GraphCSRView<vertex_t, edge_t, weight_t>> runner(handle, graph);
weight_t wt = runner(max_level, resolution);

return runner(clustering, max_level, resolution);
thrust::device_vector<vertex_t> vertex_ids_v(graph.number_of_vertices);
Copy link
Contributor

@seunghwak seunghwak Feb 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're not dealing with non-arithmetic types, rmm::device_uvector will be more efficient and more stream ready.

Note that rmm::device_uvector does not invoke default constructor to initialize vector elements, so if your code expects initialization, you need to call thrust::fill().

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed in the next push.


thrust::copy(rmm::exec_policy(handle.get_stream())->on(handle.get_stream()),
thrust::make_counting_iterator<vertex_t>(0), // MNMG - base vertex id
thrust::make_counting_iterator<vertex_t>(
graph.number_of_vertices), // MNMG - base vertex id + number_of_vertices
vertex_ids_v.begin());

partition_at_level<vertex_t, false>(handle,
runner.get_dendrogram(),
vertex_ids_v.data().get(),
clustering,
runner.get_dendrogram().num_levels());

// FIXME: Consider returning the Dendrogram at some point
return std::make_pair(runner.get_dendrogram().num_levels(), wt);
}

// Explicit template instantations
Expand Down
Loading