update v2.0-pre #953

Merged: 65 commits, merged Apr 14, 2022

Changes shown below are from 1 commit (bbe0ded); the full list of 65 commits in this pull request follows.
9ac1e78
Update doc URL. (#821)
csukuangfj Sep 8, 2021
bbe0ded
Support indexing 2-axes RaggedTensor, Support slicing for RaggedTenso…
pkufool Sep 14, 2021
2c28070
Prune with max_arcs in IntersectDense (#820)
pkufool Sep 14, 2021
210175c
Release v1.8
pkufool Sep 14, 2021
33a212c
Create a ragged tensor from a regular tensor. (#827)
csukuangfj Sep 15, 2021
971af7d
Trigger GitHub actions manually. (#829)
csukuangfj Sep 16, 2021
646704e
Run GitHub actions on merging. (#830)
csukuangfj Sep 16, 2021
8030001
Support printing ragged tensors in a more compact way. (#831)
csukuangfj Sep 17, 2021
d73a5b5
Add levenshtein alignment (#828)
pkufool Sep 19, 2021
f2fd997
Release v1.9
pkufool Sep 19, 2021
601d663
Support a[b[i]] where both a and b are ragged tensors. (#833)
csukuangfj Sep 25, 2021
8694fee
Display import error solution message on MacOS (#837)
pzelasko Sep 30, 2021
86e5479
Fix installation doc. (#841)
csukuangfj Oct 8, 2021
b72589c
fix typos in the install instructions (#844)
jtrmal Oct 13, 2021
6ac9795
make cmake adhere to the modernized way of finding packages outside d…
jtrmal Oct 13, 2021
2537a3f
import torch first in the smoke tests to prevent SEGFAULT (#846)
jtrmal Oct 14, 2021
cae610a
Add doc about how to install a CPU version of k2. (#850)
csukuangfj Oct 23, 2021
d061bc6
Support PyTorch 1.10. (#851)
csukuangfj Oct 24, 2021
7178d67
Fix test cases for k2.union() (#853)
csukuangfj Oct 26, 2021
e6db5dc
Fix out-of-boundary access (read). (#859)
csukuangfj Nov 2, 2021
e8c589a
Update all the example codes in the docs (#861)
luomingshuang Nov 4, 2021
fd5565d
Fix compilation errors with CUB 1.15. (#865)
csukuangfj Nov 10, 2021
bdcaaf8
Update README. (#873)
csukuangfj Nov 12, 2021
31e1307
Fix ctc graph (make aux_labels of final arcs -1) (#877)
pkufool Nov 19, 2021
12f5915
Fix LICENSE location to k2 folder (#880)
lumaku Nov 24, 2021
a0d75c8
Release v1.11. (#881)
csukuangfj Nov 29, 2021
2cb3eea
Update documentation for hash.h (#887)
danpovey Dec 5, 2021
aab2dd7
Wrap MonotonicLowerBound (#883)
pkufool Dec 14, 2021
5517b3e
Remove extra commas after 'TOPSORTED' property and fix RaggedTensor …
drawfish Dec 25, 2021
5f4cc79
Fix small typos (#896)
danpovey Jan 6, 2022
e799928
Fix k2.ragged.create_ragged_shape2 (#901)
csukuangfj Jan 13, 2022
d6323d5
Add rnnt loss (#891)
pkufool Jan 17, 2022
d3fbb1b
Use more efficient way to fix boundaries (#906)
pkufool Jan 25, 2022
9a91ec6
Release v1.12 (#907)
pkufool Jan 25, 2022
3367c7f
Change the sign of the rnnt_loss and add reduction argument (#911)
pkufool Jan 29, 2022
779a9bd
Fix building doc. (#908)
csukuangfj Jan 29, 2022
47c4b75
Fix building doc (#912)
pkufool Jan 29, 2022
cf32e2d
Support torch 1.10.x (#914)
csukuangfj Feb 8, 2022
9e7b2a9
Update INSTALL.rst (#915)
alexei-v-ivanov Feb 8, 2022
43ed450
Fix torch/cuda/python versions in the doc. (#918)
csukuangfj Feb 10, 2022
f4fefe4
Fix building for CUDA 11.6 (#917)
csukuangfj Feb 10, 2022
56edc82
Implement Unstack (#920)
pkufool Feb 20, 2022
854b792
SubsetRagged & PruneRagged (#919)
pkufool Feb 20, 2022
3cc74f1
Add Hash64 (#895)
pkufool Feb 22, 2022
0feefc7
Modified rnnt (#902)
pkufool Feb 25, 2022
2239c39
Fix Stack (#925)
wgb14 Feb 25, 2022
5ee082e
Fix 'TypeError' of rnnt_loss_pruned function. (#924)
drawfish Feb 27, 2022
36e2b8d
Support torch 1.11.0 and CUDA 11.5 (#931)
csukuangfj Mar 15, 2022
f4b4247
Implement Rnnt decoding (#926)
pkufool Mar 16, 2022
9a0d72c
fix building docs (#933)
pkufool Mar 16, 2022
6833270
Release v1.14
pkufool Mar 16, 2022
613e03d
Remove unused DiscountedCumSum. (#936)
csukuangfj Mar 17, 2022
281378f
Fix compiler warnings. (#937)
csukuangfj Mar 17, 2022
10b9423
Minor fixes for RNN-T decoding. (#938)
csukuangfj Mar 19, 2022
846c39c
Removes arcs with label 0 from the TrivialGraph. (#939)
csukuangfj Mar 29, 2022
0f65420
Implement linear_fsa_with_self_loops. (#940)
csukuangfj Mar 29, 2022
a830c60
Fix the pruning with max-states (#941)
pkufool Mar 30, 2022
8c28c86
Rnnt allow different encoder/decoder dims (#945)
danpovey Apr 3, 2022
d977865
Support building k2 on Windows (#946)
csukuangfj Apr 6, 2022
a4d76d2
Fix nightly windows CPU build (#948)
csukuangfj Apr 7, 2022
4fb6b88
Check the versions of PyTorch and CUDA at the import time. (#949)
csukuangfj Apr 8, 2022
9ebd757
More straightforward message when CUDA support is missing (#950)
nshmyrev Apr 11, 2022
3b83183
Implement ArrayOfRagged (#927)
LvHang Apr 12, 2022
1b29f0a
Fix precision (#951)
pkufool Apr 13, 2022
93d528a
Merge branch 'master' into v2.0
pkufool Apr 14, 2022
Support indexing 2-axes RaggedTensor, Support slicing for RaggedTensor (#825)

* Support indexing 2-axes RaggedTensor, Support slicing for RaggedTensor

* Fix compiling errors

* Fix unit test

* Change RaggedTensor.data to RaggedTensor.values

* Fix style

* Add docs

* Run nightly-cpu when pushing code to nightly-cpu branch
pkufool authored Sep 14, 2021
commit bbe0dedc67cf82021ec8277c5e863d2f07ecce49
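The net effect of this commit, as a quick sketch in the doctest style used by the docstrings below (outputs are illustrative, assuming a k2 build that includes this commit):

>>> import torch
>>> import k2.ragged as k2r
>>> a = k2r.RaggedTensor('[ [1 3] [9] [8] ]')
>>> a[0]      # 2-axis tensor: integer indexing now returns a 1-D torch.Tensor
tensor([1, 3], dtype=torch.int32)
>>> b = k2r.RaggedTensor('[ [[1 3] [] [9]] [[8]] [[10 11]] ]')
>>> b[0:2]    # slicing along axis 0 (step must be 1) keeps the number of axes
[ [ [ 1 3 ] [ ] [ 9 ] ] [ [ 8 ] ] ]
>>> b.values  # the underlying storage, formerly `b.data`; memory is shared
tensor([ 1,  3,  9,  8, 10, 11], dtype=torch.int32)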
3 changes: 3 additions & 0 deletions .github/workflows/nightly-cpu.yml
@@ -17,6 +17,9 @@
 name: nightly-cpu
 
 on:
+  push:
+    branches:
+      - nightly-cpu
   schedule:
     # minute (0-59)
     # hour (0-23)
45 changes: 39 additions & 6 deletions k2/python/csrc/torch/v2/any.cu
@@ -70,15 +70,48 @@ void PybindRaggedAny(py::module &m) {
 
   any.def(
       "__getitem__",
-      [](RaggedAny &self, int32_t i) -> RaggedAny {
-        return self.Index(/*axis*/ 0, i);
+      [](RaggedAny &self, int32_t i) -> py::object {
+        if (self.any.NumAxes() > 2) {
+          RaggedAny ragged = self.Index(/*axis*/ 0, i);
+          return py::cast(ragged);
+        } else {
+          K2_CHECK_EQ(self.any.NumAxes(), 2);
+          Array1<int32_t> row_split = self.any.RowSplits(1).To(GetCpuContext());
+          const int32_t *row_split_data = row_split.Data();
+          int32_t begin = row_split_data[i],
+                  end = row_split_data[i + 1];
+          Dtype t = self.any.GetDtype();
+          FOR_REAL_AND_INT32_TYPES(t, T, {
+            Array1<T> array =
+                self.any.Specialize<T>().values.Arange(begin, end);
+            torch::Tensor tensor = ToTorch(array);
+            return py::cast(tensor);
+          });
+        }
+        // Unreachable code
+        return py::none();
       },
       py::arg("i"), kRaggedAnyGetItemDoc);
 
+  any.def(
+      "__getitem__",
+      [](RaggedAny &self, const py::slice &slice) -> RaggedAny {
+        py::ssize_t start = 0, stop = 0, step = 0, slicelength = 0;
+        if (!slice.compute(self.any.Dim0(), &start, &stop, &step, &slicelength))
+          throw py::error_already_set();
+        int32_t istart = static_cast<int32_t>(start);
+        int32_t istop = static_cast<int32_t>(stop);
+        int32_t istep = static_cast<int32_t>(step);
+        K2_CHECK_EQ(istep, 1) << "Only support slicing with step 1, given : "
+                              << istep;
+
+        return self.Arange(/*axis*/ 0, istart, istop);
+      }, py::arg("key"), kRaggedAnyGetItemSliceDoc);
+
   any.def("index",
-          static_cast<RaggedAny (RaggedAny::*)(RaggedAny &, bool)>(
+          static_cast<RaggedAny (RaggedAny::*)(RaggedAny &)>(
              &RaggedAny::Index),
-          py::arg("indexes"), py::arg("remove_axis") = true,
+          py::arg("indexes"),
          kRaggedAnyRaggedIndexDoc);
 
   any.def("index",
@@ -325,8 +358,8 @@ void PybindRaggedAny(py::module &m) {
   // Return the underlying memory of this tensor.
   // No data is copied. Memory is shared.
   any.def_property_readonly(
-      "data", [](RaggedAny &self) -> torch::Tensor { return self.Data(); },
-      kRaggedAnyDataDoc);
+      "values", [](RaggedAny &self) -> torch::Tensor { return self.Data(); },
+      kRaggedAnyValuesDoc);
 
   any.def_property_readonly(
       "shape", [](RaggedAny &self) -> RaggedShape { return self.any.shape; },
70 changes: 44 additions & 26 deletions k2/python/csrc/torch/v2/doc/any.h
@@ -350,9 +350,6 @@ Select the i-th sublist along axis 0.
 Caution:
   Support for autograd is to be implemented.
 
-Note:
-  It requires that this tensor has at least 3 axes.
-
 >>> import torch
 >>> import k2.ragged as k2r
 >>> a = k2r.RaggedTensor('[ [[1 3] [] [9]] [[8]] ]')
@@ -363,11 +360,45 @@
 >>> a[1]
 [ [ 8 ] ]
 
+>>> a = k2r.RaggedTensor('[ [1 3] [9] [8] ]')
+>>> a
+[ [ 1 3 ] [ 9 ] [ 8 ] ]
+>>> a[0]
+tensor([1, 3], dtype=torch.int32)
+>>> a[1]
+tensor([9], dtype=torch.int32)
+
 Args:
   i:
     The i-th sublist along axis 0.
 Returns:
-  Return a new ragged tensor with one fewer axis.
+  Return a new ragged tensor with one fewer axis. If `num_axes == 2`, the
+  return value will be a 1D tensor.
 )doc";
 
+static constexpr const char *kRaggedAnyGetItemSliceDoc = R"doc(
+Slice sublists along axis 0 with the given range. Only slicing with step 1
+is supported.
+
+Caution:
+  Support for autograd is to be implemented.
+
+>>> import torch
+>>> import k2.ragged as k2r
+>>> a = k2r.RaggedTensor('[ [[1 3] [] [9]] [[8]] [[10 11]] ]')
+>>> a
+[ [ [ 1 3 ] [ ] [ 9 ] ] [ [ 8 ] ] [ [ 10 11 ] ] ]
+>>> a[0:2]
+[ [ [ 1 3 ] [ ] [ 9 ] ] [ [ 8 ] ] ]
+>>> a[1:2]
+[ [ [ 8 ] ] ]
+
+Args:
+  key:
+    Slice containing integer constants.
+Returns:
+  Return a new ragged tensor with the same number of axes as the original
+  tensor, containing only the sublists within the range.
+)doc";
+
 static constexpr const char *kRaggedAnyCloneDoc = R"doc(
@@ -644,23 +675,23 @@ device(type='cuda', index=0)
 >>> b.device == torch.device('cuda:0')
 )doc";
 
-static constexpr const char *kRaggedAnyDataDoc = R"doc(
+static constexpr const char *kRaggedAnyValuesDoc = R"doc(
 Return the underlying memory as a 1-D tensor.
 
 >>> import torch
 >>> import k2.ragged as k2r
 >>> a = k2r.RaggedTensor([[1, 2], [], [5], [], [8, 9, 10]])
->>> a.data
+>>> a.values
 tensor([ 1, 2, 5, 8, 9, 10], dtype=torch.int32)
->>> isinstance(a.data, torch.Tensor)
+>>> isinstance(a.values, torch.Tensor)
 True
->>> a.data[0] = -1
+>>> a.values[0] = -1
 >>> a
 [ [ -1 2 ] [ ] [ 5 ] [ ] [ 8 9 10 ] ]
->>> a.data[3] = -3
+>>> a.values[3] = -3
 >>> a
 [ [ -1 2 ] [ ] [ 5 ] [ ] [ -3 9 10 ] ]
->>> a.data[2] = -2
+>>> a.values[2] = -2
 >>> a
 [ [ -1 2 ] [ ] [ -2 ] [ ] [ -3 9 10 ] ]
 )doc";
@@ -1301,24 +1332,18 @@ Index a ragged tensor with a ragged tensor.
 >>> import k2.ragged as k2r
 >>> src = k2r.RaggedTensor([[10, 11], [12, 13.5]])
 >>> indexes = k2r.RaggedTensor([[0, 1]])
->>> src.index(indexes, remove_axis=True)
-[ [ 10 11 12 13.5 ] ]
->>> src.index(indexes, remove_axis=False)
+>>> src.index(indexes)
 [ [ [ 10 11 ] [ 12 13.5 ] ] ]
 >>> i = k2r.RaggedTensor([[0], [1], [0, 0]])
->>> src.index(i, remove_axis=True)
-[ [ 10 11 ] [ 12 13.5 ] [ 10 11 10 11 ] ]
->>> src.index(i, remove_axis=False)
+>>> src.index(i)
 [ [ [ 10 11 ] ] [ [ 12 13.5 ] ] [ [ 10 11 ] [ 10 11 ] ] ]
 
 **Example 2**:
 
 >>> import k2.ragged as k2r
 >>> src = k2r.RaggedTensor([ [[1, 0], [], [2]], [[], [3], [0, 0, 1]], [[1, 2], [-1]]])
 >>> i = k2r.RaggedTensor([[[0, 2], [1]], [[0]]])
->>> src.index(i, remove_axis=True)
-[ [ [ [ 1 0 2 ] [ 1 2 -1 ] ] [ [ 3 0 0 1 ] ] ] [ [ [ 1 0 2 ] ] ] ]
+>>> src.index(i)
 [ [ [ [ [ 1 0 ] [ ] [ 2 ] ] [ [ 1 2 ] [ -1 ] ] ] [ [ [ ] [ 3 ] [ 0 0 1 ] ] ] ] [ [ [ [ 1 0 ] [ ] [ 2 ] ] ] ] ]
 
 Args:
@@ -1328,13 +1353,6 @@ Index a ragged tensor with a ragged tensor.
   Caution:
     Its dtype has to be ``torch.int32``.
 
-  remove_axis:
-    If ``True``, then we remove the last-but-one axis,
-    which has the effect of appending lists, e.g.
-    if ``self`` is ``[[ 10 11 ] [ 12 13 ]]`` and ``indexes``
-    is ``[[0 1]]``, this function will give us ``[[ 10 11 12 13 ]]``.
-    If ``False`` the answer will have at least 3 axes, e.g., ``[[[10 11]] [12 13]]]``,
-    in this case.
 Returns:
   Return indexed tensor.
 )doc";
4 changes: 2 additions & 2 deletions k2/python/csrc/torch/v2/ragged_any.cu
@@ -560,13 +560,13 @@ torch::optional<torch::Tensor> RaggedAny::Sort(
   return ans;
 }
 
-RaggedAny RaggedAny::Index(RaggedAny &indexes,
-                           bool remove_axis /* = true*/) /*const*/ {
+RaggedAny RaggedAny::Index(RaggedAny &indexes) /*const*/ {
   K2_CHECK_EQ(indexes.any.GetDtype(), kInt32Dtype)
       << "Unsupported dtype: " << TraitsOf(indexes.any.GetDtype()).Name();
 
   DeviceGuard guard(any.Context());
 
+  bool remove_axis = false;
   Dtype t = any.GetDtype();
   FOR_REAL_AND_INT32_TYPES(t, T, {
     return RaggedAny(k2::Index<T>(any.Specialize<T>(),
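With `remove_axis` hard-coded to `false` here, `RaggedTensor.index` now always keeps the extra axis; callers that want the old appending behavior apply `remove_axis` themselves, as the k2/python/k2/utils.py and index_test.py changes below do. A sketch of the two styles (values taken from the docstring examples above):

>>> import k2.ragged as k2r
>>> src = k2r.RaggedTensor([[10, 11], [12, 13.5]])
>>> indexes = k2r.RaggedTensor([[0, 1]])
>>> kept = src.index(indexes)            # result has one more axis than `indexes`
>>> kept
[ [ [ 10 11 ] [ 12 13.5 ] ] ]
>>> kept.remove_axis(kept.num_axes - 2)  # equivalent of the old remove_axis=True
[ [ 10 11 12 13.5 ] ]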
2 changes: 1 addition & 1 deletion k2/python/csrc/torch/v2/ragged_any.h
@@ -226,7 +226,7 @@ struct RaggedAny {
                       bool need_new2old_indexes = false);
 
   /// Wrapper for k2::Index
-  RaggedAny Index(RaggedAny &indexes, bool remove_axis = true) /*const*/;
+  RaggedAny Index(RaggedAny &indexes) /*const*/;
 
   /// Wrapper for k2::Index
   std::pair<RaggedAny, torch::optional<torch::Tensor>> Index(
2 changes: 1 addition & 1 deletion k2/python/k2/autograd_utils.py
@@ -105,7 +105,7 @@ def backward(ctx, out_fsa_scores_grad: torch.Tensor
                            dtype=torch.float32,
                            device=unused_in_fsa_scores.device,
                            requires_grad=False)
-    _k2.index_add(arc_map.data, expanded, ans)
+    _k2.index_add(arc_map.values, expanded, ans)
 
     return (
         None,  # out_fsa
2 changes: 1 addition & 1 deletion k2/python/k2/fsa.py
@@ -1394,7 +1394,7 @@ def set_scores_stochastic_(self, scores) -> None:
 
         # Note we use `to` here since `scores` and `self.scores` may not
         # be on the same device.
-        self.scores = ragged_scores.data.to(self.scores.device)
+        self.scores = ragged_scores.values.to(self.scores.device)
 
     def convert_attr_to_ragged_(self, name: str,
                                 remove_eps: bool = True) -> 'Fsa':
4 changes: 2 additions & 2 deletions k2/python/k2/fsa_algo.py
@@ -466,7 +466,7 @@ def shortest_path(fsa: Fsa, use_double_scores: bool) -> Fsa:
     '''
     entering_arcs = fsa._get_entering_arcs(use_double_scores)
     ragged_arc, ragged_int = _k2.shortest_path(fsa.arcs, entering_arcs)
-    arc_map = ragged_int.data
+    arc_map = ragged_int.values
 
     out_fsa = k2.utils.fsa_from_unary_function_tensor(fsa, ragged_arc, arc_map)
     return out_fsa
@@ -1016,7 +1016,7 @@ def ctc_graph(symbols: Union[List[List[int]], k2.RaggedTensor],
     if isinstance(symbols, k2.RaggedTensor):
         assert device is None
         assert symbols.num_axes == 2
-        symbol_values = symbols.data
+        symbol_values = symbols.values
     else:
         symbol_values = torch.tensor(
             [it for symbol in symbols for it in symbol],
3 changes: 2 additions & 1 deletion k2/python/k2/utils.py
@@ -169,7 +169,7 @@ def convert_aux_label_to_symbol(
     if end == begin:
         return ':<eps>'
 
-    labels = aux_labels.data[begin:end]
+    labels = aux_labels.values[begin:end]
     ans = []
     for label in labels.tolist():
         if label == -1:
@@ -538,6 +538,7 @@ def fsa_from_unary_function_ragged(src: Fsa,
             # We currently don't support float ragged attributes
             assert value.dtype == torch.int32
             new_value = value.index(arc_map)
+            new_value = new_value.remove_axis(new_value.num_axes - 2)
             setattr(dest, name, new_value)
 
     for name, value in src.named_non_tensor_attr():
7 changes: 4 additions & 3 deletions k2/python/tests/index_test.py
@@ -145,6 +145,7 @@ def test(self):
                                         device=device)
             ragged_index = k2.RaggedTensor(index_shape, index_values)
             ans = src.index(ragged_index)
+            ans = ans.remove_axis(1)
             expected_row_splits = torch.tensor([0, 5, 5, 5, 9],
                                                dtype=torch.int32,
                                                device=device)
@@ -153,7 +154,7 @@ def test(self):
             expected_values = torch.tensor([1, 2, 4, 5, 6, 3, 3, 1, 2],
                                            dtype=torch.int32,
                                            device=device)
-            self.assertTrue(torch.allclose(ans.data, expected_values))
+            self.assertTrue(torch.allclose(ans.values, expected_values))
 
             # index with tensor
             tensor_index = torch.tensor([0, 3, 2, 1, 2, 1],
@@ -168,7 +169,7 @@ def test(self):
             expected_values = torch.tensor([1, 2, 4, 5, 6, 3, 3],
                                            dtype=torch.int32,
                                            device=device)
-            self.assertTrue(torch.allclose(ans.data, expected_values))
+            self.assertTrue(torch.allclose(ans.values, expected_values))
 
 
 class TestIndexTensorWithRaggedInt(unittest.TestCase):
@@ -203,7 +204,7 @@ def test(self):
             expected_values = torch.tensor([1, 4, 3, 4, 6, 2, 4],
                                            dtype=torch.int32,
                                            device=device)
-            self.assertTrue(torch.allclose(ans.data, expected_values))
+            self.assertTrue(torch.allclose(ans.values, expected_values))
 
 
 if __name__ == '__main__':