Tags: daqingyi770923/BERTopic

v0.9.1

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.

v0.9.0

v0.9 (MaartenGr#176)

* Get most representative documents per topic: `topic_model.get_representative_docs(topic=1)`
* Added `normalize_frequency` parameter to `visualize_topics_per_class` and `visualize_topics_over_time` 
* Return flat probabilities by default; calculate the probabilities of all topics per document only if `calculate_probabilities` is True
* Implemented a guided BERTopic by defining seed topics: `BERTopic(seed_topic_list=seed_topic_list)`
* Fix loading embedding model
* Fix probability mapping
* Improve accuracy of probabilities
* Additional FAQs
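
The v0.9 additions can be combined in one short sketch. The seed words, the `docs` argument, and the helper name `fit_guided_model` are illustrative assumptions; only `seed_topic_list`, `calculate_probabilities`, and `get_representative_docs(topic=1)` come from the notes above. BERTopic is imported inside the function so the snippet can be read and loaded without the library installed.

```python
from typing import List

# Hypothetical seed topics: each inner list holds words that should pull
# documents toward one topic. The words are made up for illustration.
seed_topic_list: List[List[str]] = [
    ["drug", "cancer", "health"],
    ["space", "launch", "orbit"],
]


def fit_guided_model(docs: List[str]):
    """Fit a guided BERTopic model and return it together with its outputs."""
    # Imported lazily; requires `pip install bertopic`.
    from bertopic import BERTopic

    topic_model = BERTopic(
        seed_topic_list=seed_topic_list,
        # Without this flag only flat probabilities are returned (see notes above).
        calculate_probabilities=True,
    )
    topics, probabilities = topic_model.fit_transform(docs)

    # Assumes a topic with id 1 was actually found in `docs`.
    representative_docs = topic_model.get_representative_docs(topic=1)
    return topic_model, topics, probabilities, representative_docs
```

Calling `fit_guided_model(docs)` needs a reasonably large list of documents; very small corpora may not produce a topic with id 1.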

v0.8.1

v0.8.1 (MaartenGr#138)

* Fix tests
* Add Kaggle example
* Add interactive visualizations to API documentation
* Set transformers in Flair

v0.8.0

v0.8 (MaartenGr#120)

Visualization Update:
* Topic Hierarchy: `topic_model.visualize_hierarchy()`
* Topic Similarity Heatmap: `topic_model.visualize_heatmap()`
* Topic Representation Barchart: `topic_model.visualize_barchart()`
* Term Score Decline: `topic_model.visualize_term_rank()`

Improvements:  
* Created the `bertopic.plotting` library to make visualizations easy to extend
* Improved automatic topic reduction by using HDBSCAN to detect similar topics
* Sort topic ids by frequency: -1 is the outlier class and typically contains the most documents; after that, 0 is the largest topic, 1 the second largest, and so on
* Update MkDocs with the new visualizations
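
The frequency-based ordering of topic ids can be illustrated with a small standalone sketch. This is not BERTopic's internal code; the function name `sort_topic_ids` and the sample labels are hypothetical, and ties are broken by the original id purely to keep the example deterministic.

```python
from collections import Counter
from typing import Dict, List


def sort_topic_ids(labels: List[int]) -> Dict[int, int]:
    """Map old topic ids to new ones so that 0 is the most frequent topic.

    The outlier class -1 keeps its id regardless of how many documents it has.
    """
    counts = Counter(label for label in labels if label != -1)
    # Most frequent first; ties broken by the original id.
    ordered = sorted(counts, key=lambda topic: (-counts[topic], topic))
    mapping = {old: new for new, old in enumerate(ordered)}
    if -1 in labels:
        mapping[-1] = -1
    return mapping


# Hypothetical cluster labels for ten documents.
labels = [3, 3, 3, -1, -1, -1, -1, 7, 7, 2]
mapping = sort_topic_ids(labels)
relabeled = [mapping[label] for label in labels]
print(mapping)    # {3: 0, 7: 1, 2: 2, -1: -1}
print(relabeled)  # [0, 0, 0, -1, -1, -1, -1, 1, 1, 2]
```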

Fixes:
* Fix typo MaartenGr#113, MaartenGr#117
* Fix MaartenGr#121 
* Fix mapping of topics after reduction (it now excludes 0) (MaartenGr#103)

v0.7.0

v0.7 (MaartenGr#87)

Highlights:  

* (semi-)supervised topic modeling
* Added spaCy, Gensim, and USE (TFHub) backends
* Use a different backend for document embeddings and word embeddings
* Create your own backends with `bertopic.backend.BaseEmbedder`
* Calculate and visualize topics per class
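
A custom backend might look roughly like the sketch below. It assumes that a `BaseEmbedder` subclass only needs to provide an `embed(documents, verbose=False)` method returning a 2-D NumPy array; that signature is my understanding of the backend interface and is not stated in these notes. `toy_embed`, `ToyEmbedder`, and `make_toy_backend` are hypothetical names, and the character-count "embedding" is a placeholder with no semantic value.

```python
from typing import List


def toy_embed(documents: List[str], dim: int = 8) -> List[List[float]]:
    """Placeholder embedding: bucketed character counts, scaled to sum to 1."""
    vectors = []
    for doc in documents:
        vec = [0.0] * dim
        for ch in doc:
            vec[ord(ch) % dim] += 1.0
        total = sum(vec) or 1.0  # avoid dividing by zero for empty documents
        vectors.append([value / total for value in vec])
    return vectors


def make_toy_backend():
    """Build a backend object wrapping `toy_embed` (requires bertopic and numpy)."""
    # Imported lazily so the module loads without bertopic installed.
    import numpy as np
    from bertopic.backend import BaseEmbedder

    class ToyEmbedder(BaseEmbedder):
        # Assumed interface: return one row per document.
        def embed(self, documents, verbose=False):
            return np.array(toy_embed(list(documents)))

    return ToyEmbedder()
```

The resulting object would then be passed as `BERTopic(embedding_model=make_toy_backend())`; in practice the body of `embed` would call a real embedding model.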

Fixes:

* Fixed issues with the Torch requirement
* Prevent saving term frequency matrix in CTFIDF class
* Fixed DTM not working when reducing topics (MaartenGr#96)
* Moved visualization dependencies to base BERTopic
* `pip install bertopic[visualization]` becomes `pip install bertopic`
* Allow precomputed embeddings in bertopic.find_topics() (MaartenGr#79)

v0.6.0

Fix typo (MaartenGr#69)

v0.5.0

v0.5 (MaartenGr#46)

* Add Flair to allow for more (custom) token/document embeddings
* Option to use custom UMAP, HDBSCAN, and CountVectorizer
* Added low_memory parameter to reduce memory during computation
* Improved verbosity (shows progress bar)
* Improved testing
* Use the newest version of sentence-transformers, as it speeds up encoding significantly
* Return the figure of visualize_topics()
* Expose all parameters with a single function: get_params()
* Option to disable saving of the `embedding_model`, which should significantly reduce the size of a saved BERTopic model
* Add FAQ page

v0.4.2

Fixed embedding parameter not working (MaartenGr#36)

v0.4.1

Fix language bug (MaartenGr#34)

v0.4.0

Update documentation