v2.1.0 • Bayes + LSI
Text classification for Ruby made simple
Bayesian classification and Latent Semantic Indexing for categorizing documents, detecting spam, analyzing sentiment, and building semantic search.
Bayesian Classification
Train categories with examples, classify new text with probability scores.
Semantic Search
Find related documents using Latent Semantic Indexing and SVD.
Native Extensions
5-50x faster matrix operations with optional C extensions.
Persistence Framework
Pluggable storage backends for files, memory, Redis, and custom stores.
See it in action
Train a Classifier
require 'classifier'
classifier = Classifier::Bayes.new 'Spam', 'Ham'
classifier.train_spam "Buy cheap viagra now!!!"
classifier.train_spam "You've won $1,000,000!"
classifier.train_ham "Meeting tomorrow at 3pm"
classifier.train_ham "Project deadline reminder"
Create categories and train with example text
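To make the training step less of a black box, here is a minimal pure-Ruby sketch of what a naive Bayes classifier typically does with training text: it counts words per category, then scores new text by summing log word likelihoods. `TinyBayes` and its methods are hypothetical names for illustration, add-one smoothing is an assumption, and category priors are left out; this is not the gem's implementation.

```ruby
# TinyBayes is a hypothetical teaching class, not the classifier gem's code.
class TinyBayes
  def initialize(*categories)
    # One word-count table per category; unseen words read as 0.
    @word_counts = categories.to_h { |c| [c, Hash.new(0)] }
  end

  # Training just increments per-category word counts.
  def train(category, text)
    tokenize(text).each { |word| @word_counts.fetch(category)[word] += 1 }
  end

  # Sum of log word likelihoods with add-one smoothing (assumed here;
  # category priors are omitted for brevity).
  def scores(text)
    vocab_size = @word_counts.values.flat_map(&:keys).uniq.size
    @word_counts.transform_values do |counts|
      total = counts.values.sum
      tokenize(text).sum do |word|
        Math.log((counts[word] + 1.0) / (total + vocab_size))
      end
    end
  end

  # The best category is the one with the highest (least negative) score.
  def classify(text)
    scores(text).max_by { |_, score| score }.first
  end

  private

  def tokenize(text)
    text.downcase.scan(/[a-z0-9]+/)
  end
end

bayes = TinyBayes.new("Spam", "Ham")
bayes.train("Spam", "Buy cheap viagra now!!!")
bayes.train("Spam", "You've won $1,000,000!")
bayes.train("Ham", "Meeting tomorrow at 3pm")
bayes.train("Ham", "Project deadline reminder")

puts bayes.classify("Buy now") # prints "Spam"
```

Because every score is a sum of logarithms of values below 1, the scores come out negative, and the category closest to zero wins.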
Classify Text
classifier.classify "Claim your free prize now"
# => "Spam"
classifier.classify "Quarterly review scheduled"
# => "Ham"
# Get per-category scores (higher means more likely)
classifier.classifications "Limited time offer"
# => {"Spam" => -4.2, "Ham" => -8.7}
Get the best category for new text
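The negative scores in the example are easier to compare as probabilities that sum to 1. The sketch below assumes the scores behave like unnormalized natural-log values and applies a standard softmax to them. `normalize_log_scores` is a hypothetical helper written for this example, not a method of the gem.

```ruby
# Hypothetical helper, not part of the classifier gem.
# Assumes each score is an unnormalized natural-log value for its category.
def normalize_log_scores(scores)
  max = scores.values.max
  # Subtract the max before exponentiating to avoid floating-point underflow.
  weights = scores.transform_values { |s| Math.exp(s - max) }
  total = weights.values.sum
  weights.transform_values { |w| w / total }
end

scores = { "Spam" => -4.2, "Ham" => -8.7 }
probs = normalize_log_scores(scores)
best = probs.max_by { |_, p| p }.first

puts best
probs.each { |category, p| puts format("%s: %.3f", category, p) }
# prints:
# Spam
# Spam: 0.989
# Ham: 0.011
```

Only the gap between the scores matters here: a difference of 4.5 on a log scale corresponds to roughly a 90:1 ratio in favor of "Spam".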
Semantic Search with LSI
lsi = Classifier::LSI.new
doc1 = "Ruby programming language"
doc2 = "Python snake reptile"
doc3 = "Rails web framework Ruby"
lsi.add_item doc1
lsi.add_item doc2
lsi.add_item doc3
lsi.search "web development with Ruby"
# => [doc3, doc1]  # Semantically related results
Find documents by meaning, not just keywords
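For contrast with meaning-based search, the sketch below ranks the same three documents by plain keyword overlap, using cosine similarity over raw term counts. This is deliberately not LSI and not the gem's code: it is the simpler baseline that LSI builds on, and `term_counts` and `cosine` are hypothetical helpers.

```ruby
# Illustrative only: term-count cosine similarity, not LSI and not the gem's code.

# Turn text into a { term => count } hash.
def term_counts(text)
  text.downcase.scan(/[a-z]+/).tally
end

# Cosine similarity between two sparse term-count vectors.
def cosine(a, b)
  dot = a.sum { |term, count| count * b.fetch(term, 0) }
  norm = ->(v) { Math.sqrt(v.values.sum { |c| c * c }) }
  denom = norm.call(a) * norm.call(b)
  denom.zero? ? 0.0 : dot / denom
end

docs = [
  "Ruby programming language",
  "Python snake reptile",
  "Rails web framework Ruby"
]

query = term_counts("web development with Ruby")

# Score every document, drop those sharing no terms, and sort best-first.
ranked = docs
  .map { |doc| [doc, cosine(query, term_counts(doc))] }
  .select { |_, score| score > 0 }
  .sort_by { |_, score| -score }

ranked.each { |doc, score| puts format("%.3f  %s", score, doc) }
# prints:
# 0.500  Rails web framework Ruby
# 0.289  Ruby programming language
```

A keyword ranking like this can only match documents that share exact terms with the query. LSI additionally factors the term-document matrix with SVD and compares documents in the reduced space, so related documents can match even when they use different words.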
Find Related Documents
lsi.find_related doc1, 3
# => [doc3, ...]  # Documents similar to doc1
# Classify by learned categories
lsi.add_item "Ruby gem tutorial", :ruby
lsi.add_item "Python pip package", :python
lsi.classify "Installing bundler gem"
# => :ruby
Discover similar content automatically
Start in seconds
$ gem install classifier
Or add to your Gemfile:
gem 'classifier'