update v2.0-pre #953

Merged · 65 commits · Apr 14, 2022
Changes from 1 commit
9ac1e78
Update doc URL. (#821)
csukuangfj Sep 8, 2021
bbe0ded
Support indexing 2-axes RaggedTensor, Support slicing for RaggedTenso…
pkufool Sep 14, 2021
2c28070
Prune with max_arcs in IntersectDense (#820)
pkufool Sep 14, 2021
210175c
Release v1.8
pkufool Sep 14, 2021
33a212c
Create a ragged tensor from a regular tensor. (#827)
csukuangfj Sep 15, 2021
971af7d
Trigger GitHub actions manually. (#829)
csukuangfj Sep 16, 2021
646704e
Run GitHub actions on merging. (#830)
csukuangfj Sep 16, 2021
8030001
Support printing ragged tensors in a more compact way. (#831)
csukuangfj Sep 17, 2021
d73a5b5
Add levenshtein alignment (#828)
pkufool Sep 19, 2021
f2fd997
Release v1.9
pkufool Sep 19, 2021
601d663
Support a[b[i]] where both a and b are ragged tensors. (#833)
csukuangfj Sep 25, 2021
8694fee
Display import error solution message on MacOS (#837)
pzelasko Sep 30, 2021
86e5479
Fix installation doc. (#841)
csukuangfj Oct 8, 2021
b72589c
fix typos in the install instructions (#844)
jtrmal Oct 13, 2021
6ac9795
make cmake adhere to the modernized way of finding packages outside d…
jtrmal Oct 13, 2021
2537a3f
import torch first in the smoke tests to prevent SEGFAULT (#846)
jtrmal Oct 14, 2021
cae610a
Add doc about how to install a CPU version of k2. (#850)
csukuangfj Oct 23, 2021
d061bc6
Support PyTorch 1.10. (#851)
csukuangfj Oct 24, 2021
7178d67
Fix test cases for k2.union() (#853)
csukuangfj Oct 26, 2021
e6db5dc
Fix out-of-boundary access (read). (#859)
csukuangfj Nov 2, 2021
e8c589a
Update all the example codes in the docs (#861)
luomingshuang Nov 4, 2021
fd5565d
Fix compilation errors with CUB 1.15. (#865)
csukuangfj Nov 10, 2021
bdcaaf8
Update README. (#873)
csukuangfj Nov 12, 2021
31e1307
Fix ctc graph (make aux_labels of final arcs -1) (#877)
pkufool Nov 19, 2021
12f5915
Fix LICENSE location to k2 folder (#880)
lumaku Nov 24, 2021
a0d75c8
Release v1.11. (#881)
csukuangfj Nov 29, 2021
2cb3eea
Update documentation for hash.h (#887)
danpovey Dec 5, 2021
aab2dd7
Wrap MonotonicLowerBound (#883)
pkufool Dec 14, 2021
5517b3e
Remove extra commas after 'TOPSORTED' property and fix RaggedTensor …
drawfish Dec 25, 2021
5f4cc79
Fix small typos (#896)
danpovey Jan 6, 2022
e799928
Fix k2.ragged.create_ragged_shape2 (#901)
csukuangfj Jan 13, 2022
d6323d5
Add rnnt loss (#891)
pkufool Jan 17, 2022
d3fbb1b
Use more efficient way to fix boundaries (#906)
pkufool Jan 25, 2022
9a91ec6
Release v1.12 (#907)
pkufool Jan 25, 2022
3367c7f
Change the sign of the rnnt_loss and add reduction argument (#911)
pkufool Jan 29, 2022
779a9bd
Fix building doc. (#908)
csukuangfj Jan 29, 2022
47c4b75
Fix building doc (#912)
pkufool Jan 29, 2022
cf32e2d
Support torch 1.10.x (#914)
csukuangfj Feb 8, 2022
9e7b2a9
Update INSTALL.rst (#915)
alexei-v-ivanov Feb 8, 2022
43ed450
Fix torch/cuda/python versions in the doc. (#918)
csukuangfj Feb 10, 2022
f4fefe4
Fix building for CUDA 11.6 (#917)
csukuangfj Feb 10, 2022
56edc82
Implement Unstack (#920)
pkufool Feb 20, 2022
854b792
SubsetRagged & PruneRagged (#919)
pkufool Feb 20, 2022
3cc74f1
Add Hash64 (#895)
pkufool Feb 22, 2022
0feefc7
Modified rnnt (#902)
pkufool Feb 25, 2022
2239c39
Fix Stack (#925)
wgb14 Feb 25, 2022
5ee082e
Fix 'TypeError' of rnnt_loss_pruned function. (#924)
drawfish Feb 27, 2022
36e2b8d
Support torch 1.11.0 and CUDA 11.5 (#931)
csukuangfj Mar 15, 2022
f4b4247
Implement Rnnt decoding (#926)
pkufool Mar 16, 2022
9a0d72c
fix building docs (#933)
pkufool Mar 16, 2022
6833270
Release v1.14
pkufool Mar 16, 2022
613e03d
Remove unused DiscountedCumSum. (#936)
csukuangfj Mar 17, 2022
281378f
Fix compiler warnings. (#937)
csukuangfj Mar 17, 2022
10b9423
Minor fixes for RNN-T decoding. (#938)
csukuangfj Mar 19, 2022
846c39c
Removes arcs with label 0 from the TrivialGraph. (#939)
csukuangfj Mar 29, 2022
0f65420
Implement linear_fsa_with_self_loops. (#940)
csukuangfj Mar 29, 2022
a830c60
Fix the pruning with max-states (#941)
pkufool Mar 30, 2022
8c28c86
Rnnt allow different encoder/decoder dims (#945)
danpovey Apr 3, 2022
d977865
Supporting building k2 on Windows (#946)
csukuangfj Apr 6, 2022
a4d76d2
Fix nightly windows CPU build (#948)
csukuangfj Apr 7, 2022
4fb6b88
Check the versions of PyTorch and CUDA at the import time. (#949)
csukuangfj Apr 8, 2022
9ebd757
More straightforward message when CUDA support is missing (#950)
nshmyrev Apr 11, 2022
3b83183
Implement ArrayOfRagged (#927)
LvHang Apr 12, 2022
1b29f0a
Fix precision (#951)
pkufool Apr 13, 2022
93d528a
Merge branch 'master' into v2.0
pkufool Apr 14, 2022
Implement ArrayOfRagged (#927)
* Implement ArrayOfRagged

* Fix issues and pass tests

* fix style

* change a few statements of functions and move the definition of template Array1OfRagged to header file

* add offsets test code
LvHang authored Apr 12, 2022
commit 3b83183234d0f1d8391872630551c5af7c491ed2
104 changes: 89 additions & 15 deletions k2/csrc/array_of_ragged.cu
@@ -1,5 +1,7 @@
 /**
- * Copyright 2022 Xiaomi Corporation (authors: Wei Kang)
+ * Copyright 2022 Xiaomi Corporation (authors: Daniel Povey, Wei Kang)
+ *           2022 ASLP@NWPU (authors: Hang Lyu)
 *
 * See LICENSE for clarification regarding multiple authors
 *
@@ -20,35 +22,107 @@

 namespace k2 {

-Array1OfRaggedShape::Array1OfRaggedShape(RaggedShape *src, int32_t num_srcs)
-    : num_srcs_(num_srcs) {
-  K2_CHECK_GE(num_srcs, 1);
-  K2_CHECK(src);
-  num_axes_ = src[0].NumAxes();
-  c_ = src[0].Context();
+Array1OfRaggedShape::Array1OfRaggedShape(RaggedShape *srcs, int32_t num_srcs) :
+    num_srcs_(num_srcs) {
+  K2_CHECK_GT(num_srcs, 0);
+  K2_CHECK(srcs);
+
+  // Initialize context and num_axes_.
+  c_ = srcs[0].Context();
+  num_axes_ = srcs[0].NumAxes();
+
+  // Check that they have the same num-axes and a compatible context.
+  for (int32_t i = 1; i < num_srcs_; ++i) {
+    K2_CHECK_EQ(num_axes_, srcs[i].NumAxes());
+    K2_CHECK(c_->IsCompatible(*(srcs[i].Context())));
+  }

-  row_splits_ =
-      Array2<const int32_t *>(GetCpuContext(), num_axes_ - 1, num_srcs_);
+  // Initialize row_splits_, row_ids_ and tot_sizes_.
+  //
+  // Notice: since the Data() function is a __host__ function, it cannot be
+  // called on GPU. That limits us to working on CPU, so row_splits_ and
+  // row_ids_ are populated on CPU, although the operator() of Array2 is a
+  // __host__ and __device__ function. Bear in mind, we cannot access the
+  // GPU data on CPU.
+  row_splits_ = Array2<const int32_t *>(GetCpuContext(),
+                                        num_axes_ - 1, num_srcs_);
   row_ids_ = Array2<const int32_t *>(GetCpuContext(), num_axes_ - 1, num_srcs_);

+  // Notice: no matter whether the return value of TotSize() comes from
+  // 'cached_tot_size' or from the Back() function (i.e. operator[]) of
+  // Array1, it is a CPU value.
   tot_sizes_ = Array1<int32_t>(GetCpuContext(), num_axes_, 0);

   auto row_splits_acc = row_splits_.Accessor(),
       row_ids_acc = row_ids_.Accessor();
+  // Bear in mind, when axis == 0, the TotSize() is row_splits.Dim() - 1.
+  // When 0 < axis < NumAxes(), the TotSize() is row_splits.Back().
   int32_t *tot_sizes_data = tot_sizes_.Data();

   for (int32_t i = 0; i < num_srcs_; ++i) {
-    K2_CHECK_EQ(src[i].NumAxes(), num_axes_);
-    K2_CHECK(c_->IsCompatible(*(src[i].Context())));
     for (int32_t j = 1; j < num_axes_; ++j) {
-      row_splits_acc(j - 1, i) = src[i].RowSplits(j).Data();
-      row_ids_acc(j - 1, i) = src[i].RowIds(j).Data();
-      tot_sizes_data[j] += src[i].TotSize(j);
+      row_splits_acc(j - 1, i) = srcs[i].RowSplits(j).Data();
+      row_ids_acc(j - 1, i) = srcs[i].RowIds(j).Data();
+      tot_sizes_data[j] += srcs[i].TotSize(j);
     }
-    tot_sizes_data[0] += src[i].TotSize(0);
+    tot_sizes_data[0] += srcs[i].TotSize(0);
   }

   row_splits_ = row_splits_.To(c_);
   row_ids_ = row_ids_.To(c_);
   tot_sizes_ = tot_sizes_.To(c_);
+
+
+  // Initialize meta_row_splits_.
+  // We populate this on CPU and transfer to GPU.
+  meta_row_splits_ = Array2<int32_t>(GetCpuContext(), num_axes_, num_srcs_ + 1);
+  offsets_ = Array2<int32_t>(GetCpuContext(), num_axes_ + 1, num_srcs_ + 1);
+
+  auto meta_row_splits_acc = meta_row_splits_.Accessor(),
+      offsets_acc = offsets_.Accessor();
+
+  // Initialize the 1st row of offsets_, which contains 0,1,2,...
+  for (int32_t col = 0; col <= num_srcs_; ++col) {
+    offsets_acc(0, col) = col;
+  }
+  // Initialize the 1st col of meta_row_splits_ and offsets_.
+  for (int32_t row = 0; row < num_axes_; ++row) {
+    meta_row_splits_acc(row, 0) = 0;
+    offsets_acc(row + 1, 0) = 0;
+  }
+
+  // The meta_row_splits_ is the cumulative sum of the tot-sizes of the
+  // individual arrays.
+  for (int32_t i = 0; i < num_axes_; ++i) {
+    for (int32_t j = 1; j <= num_srcs_; ++j) {
+      meta_row_splits_acc(i, j) = meta_row_splits_acc(i, j - 1) +
+                                  srcs[j - 1].TotSize(i);
+      offsets_acc(i + 1, j) = meta_row_splits_acc(i, j);
+    }
+  }
+
+  // Initialize meta_row_ids_.
+  // Elements are in [0, NumSrcs() - 1].
+  meta_row_ids_.resize(num_axes_);
+
+  for (int32_t axis = 0; axis < num_axes_; ++axis) {
+    // The length equals TotSize(axis).
+    meta_row_ids_.at(axis) = Array1<int32_t>(
+        GetCpuContext(), meta_row_splits_acc(axis, num_srcs_));
+    int32_t *meta_row_ids_data = meta_row_ids_[axis].Data();
+
+    int32_t cur_row_start = meta_row_splits_acc(axis, 0);
+    for (int32_t src = 0; src < num_srcs_; ++src) {
+      int32_t next_row_start = meta_row_splits_acc(axis, src + 1);
+      for (; cur_row_start < next_row_start; ++cur_row_start) {
+        meta_row_ids_data[cur_row_start] = src;
+      }
+    }
+    meta_row_ids_[axis] = meta_row_ids_[axis].To(c_);
+  }
+
+  meta_row_splits_ = meta_row_splits_.To(c_);
+  offsets_ = offsets_.To(c_);
 }


}  // namespace k2
119 changes: 83 additions & 36 deletions k2/csrc/array_of_ragged.h
@@ -1,5 +1,6 @@
/**
* Copyright 2022 Xiaomi Corporation (authors: Daniel Povey, Wei Kang)
* 2022 ASLP@NWPU (authors: Hang Lyu)
*
* See LICENSE for clarification regarding multiple authors
*
@@ -24,31 +25,48 @@
 #include <vector>

 #include "k2/csrc/array.h"
+#include "k2/csrc/array_ops.h"
 #include "k2/csrc/context.h"
 #include "k2/csrc/log.h"
-#include "k2/csrc/ragged_ops.h"
+#include "k2/csrc/ragged.h"

 namespace k2 {

+/*
+  Array1OfRagged<T> is a 1-dimensional array of Ragged<T>.
+  It is intended for situations where you want to do some operations on
+  arrays of ragged arrays, without explicitly concatenating them (e.g. to
+  save time). This is a fairly low-level interface, intended to
+  be used mostly by CUDA/C++ implementation code. It is a convenience
+  wrapper that saves you the trouble of creating arrays of pointers.
+ */
+
+
 /*
   Array1OfRaggedShape is a convenience class that gives you easy access
   to pointers-of-pointers for an array of ragged shapes.
 */
 class Array1OfRaggedShape {
  public:
+  // Default constructor.
+  Array1OfRaggedShape() = default;
+
   /*
     Constructor.
     Args:
-       srcs: pointers to the source shapes, a CPU pointer
-       num_srcs: the number of source shapes. All shapes must have the
-                 same NumAxes() and must be on the same device.
+      srcs: pointers to the source shapes, a CPU pointer
+      num_srcs: the number of source shapes. All shapes must have the
+                same NumAxes() and must be on the same device.

     TODO: we'll likely, later, add optional args which dictate which of
     the MetaRowSplits() and MetaRowIds() are to be pre-populated; this should
     enable us to save kernels by combining certain operations across the
     axes.
   */
-  Array1OfRaggedShape(RaggedShape *srcs, int32_t num_srcs);
-  Array1OfRaggedShape() = default;
+  Array1OfRaggedShape(RaggedShape *srcs,
+                      int32_t num_srcs);


   int32_t NumSrcs() const { return num_srcs_; }
   int32_t NumAxes() const { return num_axes_; }
@@ -63,23 +81,35 @@ class Array1OfRaggedShape {
   // Returns device-accessible vector of row-splits for a particular
   // axis, indexed by 0 <= src < num_srcs.
   const int32_t **RowSplits(int32_t axis) {
-    return row_splits_.Row(axis - 1).Data();
+    K2_CHECK_LT(static_cast<uint32_t>(axis),
+                static_cast<uint32_t>(num_axes_));
+    return row_splits_.Row(axis - 1).Data();
   }

   // Returns device-accessible array of row-ids for the individual shapes
   // indexed [axis-1][src], with 0 <= src < num_srcs. The shape of this
   // Array2 is [NumAxes() - 1][NumSrcs()].
-  const Array2<const int32_t *> *RowIds() const { return &row_ids_; }
+  const Array2<const int32_t*> *RowIds() const { return &row_ids_; }


   // Returns device-accessible vector of row-ids for a particular
   // axis, indexed by 0 <= src < num_srcs.
-  const int32_t **RowIds(int32_t axis) { return row_ids_.Row(axis - 1).Data(); }
+  const int32_t **RowIds(int32_t axis) {
+    K2_CHECK_LT(static_cast<uint32_t>(axis),
+                static_cast<uint32_t>(num_axes_));
+    return row_ids_.Row(axis - 1).Data();
+  }


   /* Return the total size on this axis, which is the sum of the TotSize() of
      the individual shapes. Requires 0 <= axis < NumAxes() and
      for axis=0 the returned value is the same as Dim0().
   */
-  int32_t TotSize(int32_t axis) const { return tot_sizes_[axis]; }
+  int32_t TotSize(int32_t axis) const {
+    K2_CHECK_LT(static_cast<uint32_t>(axis),
+                static_cast<uint32_t>(num_axes_));
+    return tot_sizes_[axis];
+  }

   // equivalent to TotSize(0).
   int32_t Dim0() const { return TotSize(0); }
@@ -88,7 +118,7 @@ class Array1OfRaggedShape {
      along the src axis, of the tot-sizes of the individual arrays.
      This Array2 is of shape [NumAxes()][NumSrcs() + 1], indexed [axis][src];
      caution, the indexing is different from RowSplits(), there is no offset.
-     Also, the meta_row_splits0 is a thing, unlike with regular row-splits
+     Also, the meta_row_splits_ is a thing, unlike with regular row-splits
      which start from 1.

      Caution: the lengths of the arrays pointed to by the elements of this
@@ -99,38 +129,47 @@ class Array1OfRaggedShape {
      to GPU, this will be faster than invoking an extra kernel in normal cases
      when the NumSrcs() is small. [Also: see GetRowInfoMulti()].
   */
-  // TODO: implement it...
-  Array2<int32_t> MetaRowSplits();
+  const Array2<int32_t> &MetaRowSplits() const { return meta_row_splits_; }

   // could POSSIBLY add this so this code could be used in functions like
   // Stack(). would be like MetaRowSplits but with an extra 1st row containing
   // 0,1,2,... We could perhaps create it with 1 extra initial row so this is
   // always convenient to output.
-  // TODO: implement it...
-  Array2<int32_t> Offsets();
+  const Array2<int32_t> &Offsets() const { return offsets_; }

   /*
-    Returns the meta-row-splits for a particular axis, with 0 <= axis <
-    NumAxes(); this is the cumulative sum of the TotSize(axis) for all of the
-    sources, with MetaRowSplits(axis).Dim() == NumSrcs() + 1.
+    Returns the meta-row-splits for a particular axis, with
+    0 <= axis < NumAxes();
+    this is the cumulative sum of the TotSize(axis) for all of the sources,
+    with MetaRowSplits(axis).Dim() == NumSrcs() + 1.

     Note: in ragged_ops.cu we refer to this as composed_row_splits
   */
-  // TODO: implement it...
-  Array1<int32_t> MetaRowSplits(int32_t axis);
+  Array1<int32_t> MetaRowSplits(int32_t axis) {
+    K2_CHECK_LT(static_cast<uint32_t>(axis),
+                static_cast<uint32_t>(num_axes_));
+    return meta_row_splits_.Row(axis);
+  }

   /* Return the device-accessible meta-row-ids, which are the row-ids
      corresponding to MetaRowSplits(); this tells us, for indexes into the
-     appended/concatenated array, which source array they belong to, i.e.
-     elements are in [0,NumSrcs()-1].
+     appended/concatenated array, which source array they belong to,
+     i.e. elements are in [0,NumSrcs()-1].

      This cannot be an Array2 because unlike the MetaRowSplits(), all the
      row-ids arrays are of different lengths.

      Note: in ragged_ops.cu we refer to this as composed_row_ids.
   */
-  // TODO: implement it...
-  Array1<int32_t *> MetaRowIds();
+  Array1<const int32_t*> MetaRowIds() {
+    Array1<const int32_t*> ans(GetCpuContext(), num_axes_);
+    const int32_t **ans_data = ans.Data();
+    for (int32_t i = 0; i < num_axes_; ++i) {
+      ans_data[i] = meta_row_ids_[i].Data();
+    }
+    ans = ans.To(c_);
+    return ans;
+  }

   /*
      Returns the meta-row-ids for a particular axis, with 0 <= axis < NumAxes();
@@ -140,18 +179,28 @@ class Array1OfRaggedShape {
      would tell us which source an idx012 with value 100 into axis 2 of
      concatenated array would come from.
   */
-  // TODO: implement it...
-  Array1<int32_t> MetaRowIds(int32_t axis);
+  const Array1<int32_t> &MetaRowIds(int32_t axis) const {
+    K2_CHECK_LT(static_cast<uint32_t>(axis),
+                static_cast<uint32_t>(num_axes_));
+    return meta_row_ids_[axis];
+  }

  private:
   ContextPtr c_;
   int32_t num_srcs_;
   int32_t num_axes_;

   Array2<const int32_t *> row_splits_;  // shape [num_axes_ - 1][num_srcs_]
   Array2<const int32_t *> row_ids_;     // shape [num_axes_ - 1][num_srcs_]
-  Array1<int32_t> tot_sizes_;  // dim num_axes_, this is on CPU
+  Array1<int32_t> tot_sizes_;  // dim num_axes_
+
+  Array2<int32_t> meta_row_splits_;  // shape [num_axes_][num_srcs_ + 1]
+  Array2<int32_t> offsets_;  // shape [num_axes_ + 1][num_srcs_ + 1]
+  std::vector<Array1<int32_t> > meta_row_ids_;  // dim num_axes_
 };



 /*
   Array1OfRagged<T> is a 1-dimensional array of Ragged<T>.
   It is intended for situations where you want to do some operations on
@@ -171,17 +220,14 @@ struct Array1OfRagged {
   int32_t NumSrcs() const { return values.Dim(); }
   ContextPtr &Context() { return shape.Context(); }

+  // Default constructor will not leave this a valid Array1OfRagged object,
+  // you shouldn't do anything with it. Both members will be initialized with
+  // default constructors.
   Array1OfRagged() = default;

-  /*
-    Constructor.
-    Args:
-      srcs: pointers to the source ragged tensors, a CPU pointer
-      num_srcs: the number of source ragged tensors. All ragged tensors must
-                have the same NumAxes() and must be on the same device.
-  */
+  // The 'srcs' should have the same number of axes.
   Array1OfRagged(Ragged<T> *srcs, int32_t num_srcs) {
-    K2_CHECK_GE(num_srcs, 1);
+    K2_CHECK_GT(num_srcs, 0);
     K2_CHECK(srcs);
     values = Array1<T *>(GetCpuContext(), num_srcs);
     T **values_data = values.Data();
@@ -195,6 +241,7 @@ struct Array1OfRagged {
   }
 };

+
 }  // namespace k2

 #endif  // K2_CSRC_ARRAY_OF_RAGGED_H_
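
What `MetaRowSplits()`/`MetaRowIds()` buy the caller is the map from an index into the notional concatenated array back to (source, index-within-source). Below is a minimal host-side sketch of that lookup — a hypothetical helper, not part of the k2 API — using binary search over the meta row-splits where a device kernel would instead read the precomputed meta row-ids array:

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>
#include <vector>

// Given the meta-row-splits for one axis (size num_srcs + 1), map an index
// into the notional concatenated array back to
// (source, index-within-source).
std::pair<int32_t, int32_t> ToSourceIndex(
    const std::vector<int32_t> &meta_row_splits, int32_t i) {
  // The first element > i, minus one, is the source containing position i.
  auto it = std::upper_bound(meta_row_splits.begin(),
                             meta_row_splits.end(), i);
  int32_t src = static_cast<int32_t>(it - meta_row_splits.begin()) - 1;
  return {src, i - meta_row_splits[src]};
}
```

With meta-row-splits {0, 3, 5} (two sources of tot-size 3 and 2), position 4 of the concatenated axis maps to source 1, local index 1 — the same answer `meta_row_ids_` would give in O(1) per lookup, which is why the constructor precomputes it.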