liangcd/jieba_rb

JiebaRb

A Ruby extension for CppJieba, a C++ library for Chinese word segmentation.

Installation

Add this line to your application's Gemfile:

gem 'jieba_rb'

And then execute:

$ bundle

Or install it yourself as:

$ gem install jieba_rb

Word Segmentation Usage

Mix segment mode (HMM combined with max probability; this is the default):

require 'jieba_rb'
seg = JiebaRb::Segment.new  # equivalent to "JiebaRb::Segment.new mode: :mix"
words = seg.cut "令狐冲是云计算行业的专家"
# 令狐冲 是 云 计算 行业 的 专家

Mix segment mode with a user-defined dictionary:

seg = JiebaRb::Segment.new mode: :mix, user_dict: "ext/cppjieba/dict/user.dict.utf8"
words = seg.cut "令狐冲是云计算行业的专家"
# 令狐冲 是 云计算 行业 的 专家

HMM or max probability (MP) segment mode:

seg = JiebaRb::Segment.new mode: :hmm  # or mode: :mp
words = seg.cut "令狐冲是云计算行业的专家"
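
The comments in the examples above show the tokens separated by spaces. A minimal sketch of producing that form, assuming `cut` returns an Array of Strings: `segment_to_s` and `FakeSegment` are hypothetical names that are not part of jieba_rb, and the stand-in class only exists so the snippet runs without the gem installed.

```ruby
# Hypothetical helper, not part of jieba_rb: joins the tokens returned by
# any object that responds to #cut into one separator-delimited string.
# Assumes #cut returns an Array of Strings.
def segment_to_s(segmenter, text, separator: " ")
  segmenter.cut(text).join(separator)
end

# Stand-in segmenter so the sketch runs without the gem installed.
# It ignores its input and returns the tokens shown in the
# user-dictionary example above.
class FakeSegment
  def cut(_text)
    %w[令狐冲 是 云计算 行业 的 专家]
  end
end

puts segment_to_s(FakeSegment.new, "令狐冲是云计算行业的专家")
```

With the gem installed, the same helper should accept a real segmenter, for example `segment_to_s(JiebaRb::Segment.new, text)`.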

Keyword Extractor Usage

  • Only TF-IDF is currently supported.
    keyword = JiebaRb::Keyword.new
    keywords_weights = keyword.extract "我是拖拉机学院手扶拖拉机专业的。不用多久,我就会升职加薪,当上CEO,走上人生巅峰。", 5

    [
      ["CEO", 11.739204307083542],
      ["升职", 10.8561552143],
      ["加薪", 10.642581114],
      ["手扶拖拉机", 10.0088573539],
      ["巅峰", 9.49395840471]
    ]
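
The result is an array of `[keyword, weight]` pairs, so it can be post-processed with plain Ruby. The sketch below copies the sample output above as literal data instead of calling the gem; the 10.0 threshold and the variable names `weights` and `strong` are arbitrary choices for illustration.

```ruby
# Sample data copied from the example output above, so this runs
# without the gem installed.
keywords_weights = [
  ["CEO", 11.739204307083542],
  ["升职", 10.8561552143],
  ["加薪", 10.642581114],
  ["手扶拖拉机", 10.0088573539],
  ["巅峰", 9.49395840471]
]

# Turn the pairs into a Hash for lookup by keyword.
weights = keywords_weights.to_h

# Keep only the keywords whose weight reaches the (arbitrary) threshold.
strong = keywords_weights.select { |_word, weight| weight >= 10.0 }.map(&:first)

puts weights["升职"]
puts strong.join(", ")
```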

Contributing

  1. Fork it ( http://github.com//jieba_rb/fork )
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create new Pull Request
