Releases: bab2min/tomotopy
0.13.0
- New features
  - Major features of the Topic Model Viewer, tomotopy.viewer.open_viewer(), are ready now.
  - tomotopy.LDAModel.get_hash() is added. You can get a 128-bit hash value of the model.
  - Added an argument ngram_list to tomotopy.utils.SimpleTokenizer.
- Bug fixes
  - Fixed an inconsistent spans bug after Corpus.concat_ngrams is called.
  - Optimized the bottleneck of tomotopy.LDAModel.load() and tomotopy.LDAModel.save(), improving their speed by more than 10 times.
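As a rough illustration of the save/load path and the new hash method, the sketch below writes a model to a temporary file and reads it back. The helper name `save_and_reload` and the `demo` function are hypothetical and not part of tomotopy. The notes do not say what type `get_hash()` returns or whether the value survives a round trip, so the demo only prints both values.

```python
import os
import tempfile


def save_and_reload(model, loader):
    """Save `model` to a temporary file and load it back with `loader`.

    `loader` is any callable that takes a file path, for example
    tomotopy.LDAModel.load. (Hypothetical helper, not a tomotopy API.)
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "model.bin")
        model.save(path)
        # Load while the temporary directory still exists.
        return loader(path)


def demo():
    # Imported lazily so the helper above can be used without tomotopy.
    import tomotopy as tp

    model = tp.LDAModel(k=2, seed=0)
    for words in (["apple", "banana", "cherry"],
                  ["dog", "cat", "bird"],
                  ["apple", "dog", "banana"]):
        model.add_doc(words)
    model.train(10)

    restored = save_and_reload(model, tp.LDAModel.load)
    # get_hash() is described above as a 128-bit hash of the model;
    # print both values rather than assume they are equal.
    print(model.get_hash(), restored.get_hash())
```

Calling `demo()` requires tomotopy 0.13.0 or later, since `get_hash()` is introduced in this release.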
0.12.7
0.12.6
- New features
  - Added some convenience features to tomotopy.LDAModel.train and tomotopy.LDAModel.set_word_prior.
    - LDAModel.train now has new arguments callback, callback_interval and show_progress to monitor the training progress.
    - LDAModel.set_word_prior can now accept a Dict[int, float] as its argument prior.
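A minimal sketch of how the new training arguments might be used to monitor progress. It assumes the callback is invoked as `callback(model, current_iteration, total_iterations)`, which these notes do not spell out. `ProgressRecorder` and `train_with_recorder` are hypothetical names chosen for illustration.

```python
from typing import Any, List, Tuple


class ProgressRecorder:
    """Collects (current_iteration, total_iterations) pairs during training.

    Assumption: the training loop calls this object as
    callback(model, current_iteration, total_iterations).
    """

    def __init__(self) -> None:
        self.history: List[Tuple[int, int]] = []

    def __call__(self, model: Any, current: int, total: int) -> None:
        self.history.append((current, total))


def train_with_recorder(docs, k=2, iterations=20, interval=5):
    """Train a small LDA model and return (model, recorder)."""
    # Imported lazily so the recorder can be used without tomotopy.
    import tomotopy as tp

    model = tp.LDAModel(k=k, seed=42)
    for words in docs:
        model.add_doc(words)

    recorder = ProgressRecorder()
    # callback and callback_interval are the arguments named above.
    model.train(iterations, callback=recorder, callback_interval=interval)
    return model, recorder
```

After `train_with_recorder(docs)` returns, `recorder.history` holds one entry per callback invocation, which can be logged or plotted.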
0.12.5
0.12.4
0.12.3
New features
- Now, inserting an empty document using tomotopy.LDAModel.add_doc() just ignores it instead of raising an exception. If the newly added argument ignore_empty_words is set to False, an exception is raised as before. (#161)
- tomotopy.HDPModel.purge_dead_topics() method is added to remove non-live topics from the model. (#152)
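The empty-document behavior described above can be sketched as follows. `add_docs_skipping_empty` and `demo` are hypothetical names, and the sketch assumes an ignored document does not increase `len(model.docs)`. The exception type raised in strict mode is not stated here, so the demo catches a generic `Exception`.

```python
def add_docs_skipping_empty(model, docs):
    """Add each token list to `model`; return how many documents were stored.

    Empty token lists are passed through on purpose: with
    ignore_empty_words=True they are expected to be skipped, not to raise.
    """
    before = len(model.docs)
    for words in docs:
        model.add_doc(words, ignore_empty_words=True)
    return len(model.docs) - before


def demo():
    # Imported lazily so the helper above can be used without tomotopy.
    import tomotopy as tp

    model = tp.LDAModel(k=2, seed=0)
    stored = add_docs_skipping_empty(model, [["a", "b"], [], ["b", "c"]])
    print("stored documents:", stored)

    try:
        # Strict mode: an empty document raises, as in earlier versions.
        model.add_doc([], ignore_empty_words=False)
    except Exception as exc:  # exact exception type is an unknown here
        print("rejected:", type(exc).__name__)
```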
Bug fixes
- Fixed an issue that prevents setting user defined values for nuSq in tomotopy.SLDAModel (by @jucendrero). (#174)
- Fixed an issue where tomotopy.utils.Coherence did not work for tomotopy.DTModel. (#164)
- Fixed an issue that often crashed when calling make_doc() before calling train(). (#166)
- Resolved the problem that the results of tomotopy.DMRModel and tomotopy.GDMRModel are different even when the seed is fixed. (#63)
- The parameter optimization process of tomotopy.DMRModel and tomotopy.GDMRModel has been improved.
- Fixed an issue that sometimes crashed when calling tomotopy.PTModel.copy().
0.12.2
- An issue where calling convert_to_lda of tomotopy.HDPModel with min_cf > 0, min_df > 0 or rm_top > 0 causes a crash has been fixed.
- A new argument from_pseudo_doc is added to tomotopy.Document.get_topics and tomotopy.Document.get_topic_dist. This argument is only valid for documents of PTModel; it controls the source used for computing the topic distribution.
- The default value for argument p of tomotopy.PTModel has been changed. The new default value is k * 10.
- Using documents generated by make_doc without calling infer doesn't cause a crash anymore, but just prints warning messages.
- An issue where the internal C++ code isn't compiled in a clang C++17 environment has been fixed.
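To avoid the warning mentioned above, a document created with make_doc should be passed through infer before its topics are read. The sketch below shows that order. `infer_topic_dist` is a hypothetical helper, and it assumes `infer` returns a `(topic_distribution, log_likelihood)` pair for a single document.

```python
def infer_topic_dist(model, words):
    """Return the inferred topic distribution for an unseen token list.

    `model` is expected to be an already trained topic model.
    """
    # Step 1: wrap the tokens in a document bound to the model's vocabulary.
    doc = model.make_doc(words)
    # Step 2: run inference; reading topics from `doc` before this step
    # is what triggers the warning described above.
    topic_dist, _log_likelihood = model.infer(doc)
    return list(topic_dist)
```

With tomotopy installed, this would be called as `infer_topic_dist(trained_model, ["apple", "banana"])`. Tokens the model never saw during training may be dropped, so very short inputs can yield an uninformative distribution.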
0.12.1
- An issue where tomotopy.LDAModel.set_word_prior() causes a crash has been fixed.
- Now tomotopy.LDAModel.perplexity and tomotopy.LDAModel.ll_per_word return the accurate value when TermWeight is not ONE.
- tomotopy.LDAModel.used_vocab_weighted_freq was added, which returns term-weighted frequencies of words.
- Now tomotopy.LDAModel.summary() shows not only the entropy of words, but also the entropy of term-weighted words.
0.12.0
- Now tomotopy.DMRModel and tomotopy.GDMRModel support multiple values of metadata (see https://github.com/bab2min/tomotopy/blob/main/examples/dmr_multi_label.py).
- The performance of tomotopy.GDMRModel was improved.
- A copy() method has been added for all topic models to do a deep copy.
- An issue was fixed where words that are excluded from training (by min_cf, min_df) have an incorrect topic id. Now all excluded words have -1 as their topic id.
- Now all exceptions and warnings generated by tomotopy follow standard Python types.
- Compiler requirements have been raised to C++14.