"An Improved Deep Embedding Learning Method for Short Duration Speaker Verification" pytorch implementation

qqueing/speaker_embedding-pytorch

An Improved Deep Embedding Learning Method for Short Duration Speaker Verification - Pytorch Implementation

This is a PyTorch implementation of the model (modified cross-conv. pooling) presented by Zhifu Gao et al. in An Improved Deep Embedding Learning Method for Short Duration Speaker Verification.

I am sorry that most of the code, except the model, is old and messy, because I have only tried it on a private database. There is no problem with performance or operation, though, as long as the input matches the expected size: batch X 1 X feature dim. X frame.

The model in the original paper is very large. The cross-conv. pooling layer output is 512 x 512 = 262144 values, which forces a small batch size and a long training time, among other costs. I recommend using a smaller size, about 128 x 128.
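To make the size difference concrete, here is a small arithmetic sketch in plain Python. The helper names and the 512-unit layer that follows the pooling are assumptions made for illustration only; they are not taken from the paper or from this repository.

```python
def pooled_dim(channels_a: int, channels_b: int) -> int:
    """Flattened size of a channels_a x channels_b cross-conv. pooling output."""
    return channels_a * channels_b


def dense_weights(in_dim: int, out_dim: int) -> int:
    """Weight count (bias excluded) of a fully connected layer in_dim -> out_dim."""
    return in_dim * out_dim


# Hypothetical size of the layer that consumes the pooled vector.
# This value is an assumption for illustration, not a setting from the paper.
NEXT_LAYER_UNITS = 512

for channels in (512, 128):
    dim = pooled_dim(channels, channels)
    weights = dense_weights(dim, NEXT_LAYER_UNITS)
    print(f"{channels} x {channels}: {dim} pooled values, "
          f"{weights:,} weights in a following {NEXT_LAYER_UNITS}-unit layer")
```

Going from 512 x 512 to 128 x 128 shrinks the pooled vector by a factor of 16, and any fully connected layer that reads it shrinks by the same factor.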

I hope this code helps researchers reach higher scores.

Data input

  • batch X 1 X feature dim. X frame.
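The layout above can be checked before calling the model. The following is a minimal sketch in plain Python; `check_input_shape` and `add_channel_axis` are hypothetical helpers written for this example and are not part of this repository.

```python
from typing import Sequence, Tuple


def check_input_shape(shape: Sequence[int]) -> Tuple[int, int, int, int]:
    """Validate a (batch, 1, feature_dim, frames) shape and return it as a tuple."""
    if len(shape) != 4:
        raise ValueError(f"expected 4 dimensions, got {len(shape)}")
    batch, channel, feat_dim, frames = shape
    if channel != 1:
        raise ValueError(f"channel axis must have size 1, got {channel}")
    if min(batch, feat_dim, frames) < 1:
        raise ValueError("batch, feature dim. and frame must all be at least 1")
    return (batch, channel, feat_dim, frames)


def add_channel_axis(shape: Sequence[int]) -> Tuple[int, int, int, int]:
    """Map a (batch, feature_dim, frames) shape to (batch, 1, feature_dim, frames)."""
    if len(shape) != 3:
        raise ValueError(f"expected 3 dimensions, got {len(shape)}")
    batch, feat_dim, frames = shape
    return check_input_shape((batch, 1, feat_dim, frames))


# Example values only: 8 utterances, 64 feature bins, 300 frames.
print(add_channel_axis((8, 64, 300)))
```

With a PyTorch tensor `x`, the same check can be applied by passing `tuple(x.shape)`.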

Credits

Original paper:

  • Gao's paper:
@inproceedings{,
  author    = {Zhifu Gao and Yan Song and Ian McLoughlin and Wu Guo and Lirong Dai},
  title     = {An Improved Deep Embedding Learning Method for Short Duration Speaker Verification},
  booktitle = {Interspeech 2018},
  year      = {2018},
}

This repository also uses part of other code:

Features

  • This repository contains only the model implementation. The data loader and the other code were recycled from this code

Authors

[email protected] (or [email protected])
