This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Remove defunct demo #9060

Merged
merged 7 commits into from
Dec 18, 2017


simoncorstonoliver
Contributor

I have spent ~4 hours working through the demo trying to get it working. Its problems are so severe that I don’t think it’s worth trying to salvage. This example might make a good blog entry for people who are thoroughly versed in Kaldi, an open source automatic speech recognition (ASR) project, and who are interested in an MXNet wrapper for it, but it does not stand alone as an introductory speech recognition example suitable for inclusion in the MXNet repo. Alternatively, an ASR expert could perhaps rework this example and resubmit it if it illustrated general MXNet concepts.

Specific criticisms that lead me to propose removing the demo altogether:

  • Depends on Kaldi, which is a bear to install and compile (but I got there in the end).
  • The example includes a Python wrapper around Kaldi which does not compile. Editing the makefile to remove the link dependency on ../thread/kaldi-thread.a allowed it to compile, but I have no idea whether the result works as intended.
  • The discussion of preparing data gives no hint of where to download compatible data or transcripts. It tells you to run Kaldi scripts that barf and ask where your Fisher transcripts are.
  • The example has vague instructions to go read the Kaldi docs but doesn't even give a title or topic to look for.
  • The tutorial is full of acronyms that appear to have ASR-specific meanings, which are never explained and which a user of MXNet would not reasonably be expected to know, e.g. AMI does not mean "Amazon Machine Image". Similarly, TIMIT is a speech corpus.

@piiswrong
Contributor

@Soonhwan-Kwon

@Soonhwan-Kwon
Contributor

Soonhwan-Kwon commented Dec 15, 2017

@piiswrong @simoncorstonoliver
Our team looked at the example for 2 weeks about a year ago, and we skipped a few steps to make the example work.
As mentioned in the pull request, there are some problems.
First, Kaldi preprocessing of the dataset took days.
Second, we had to make many detours to fill the gaps left by the vague instructions.

But we referenced its network architecture and its method for reading configuration,
and it was quite helpful.

@simoncorstonoliver
Contributor Author

simoncorstonoliver commented Dec 15, 2017

@Soonhwan-Kwon @piiswrong
Glad to hear the tutorial was helpful for your project but the question really is whether it belongs in the repo. We're trying to improve the quality of examples and would like to move to a world where the examples can be automatically validated for each release. If preprocessing the data using the external ASR tool Kaldi takes days then I propose that the tutorial should not be included in the repo.

If someone with expertise in ASR were to prepare some sample data that could be used with the reference network architecture, then perhaps a new example tutorial could be created in the future, but for now let's remove it so it doesn't cause confusion on the web site for people who are new to MXNet.

@piiswrong
Contributor

Doesn't the speech_recognition example work? @simoncorstonoliver

@piiswrong piiswrong merged commit 5858d62 into apache:master Dec 18, 2017
@simoncorstonoliver
Contributor Author

I was not able to verify that it works. @Soonhwan-Kwon says that people with more expertise than me were eventually able to get it working after filling in the missing/vague information and training for several days, so I would say no: as it stands, it doesn't work.

@Soonhwan-Kwon
Contributor

@piiswrong @simoncorstonoliver I'm afraid you've confused me with the author who wrote the speech-demo example (maybe pluskid?). The example I made is speech_recognition.

rahul003 pushed a commit to rahul003/mxnet that referenced this pull request Jun 4, 2018
* Update for MXNet 1.0. PEP8 fixes to code and misc improvements. Tested under Python 2.7 and Python 3.6 on Sagemaker.

* Remove defunct tutorial page

* Remove defunct demo

* Remove duplicate material
zheng-da pushed a commit to zheng-da/incubator-mxnet that referenced this pull request Jun 28, 2018
* Update for MXNet 1.0. PEP8 fixes to code and misc improvements. Tested under Python 2.7 and Python 3.6 on Sagemaker.

* Remove defunct tutorial page

* Remove defunct demo

* Remove duplicate material
3 participants