Prior Versions of this plugin
These are kept here mainly for historical reasons; please use the code in the svn repository instead.
Original code by Kasper Weibel
This code was originally published by Kasper Weibel in an email on the Ruby on Rails mailing list. It has been modified to work on multiple ActiveRecord objects, but it has not been thoroughly tested yet. The result is the acts_as_ferret mixin for ActiveRecord.
Use it as follows: in any model.rb, add acts_as_ferret:
class Foo < ActiveRecord::Base
  acts_as_ferret
end
All CRUD operations will be performed on both ActiveRecord (as usual) and a Ferret index for further searching.
The following class method is then available on the model, for example from your controllers:
Foo.find_by_contents(query) # query is a string representing your query
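To make the mechanics concrete, here is a minimal plain-Ruby sketch of the pattern: a class-level declaration that hooks saves and deletes into a shared index and adds a find_by_contents class method. It uses neither Rails nor Ferret. TinyIndex, ActsAsIndexed, acts_as_indexed and Note are hypothetical names made up for this illustration, and the in-memory hash only stands in for the real Ferret index.

```ruby
# Illustration only: an in-memory stand-in for the Ferret index.
# TinyIndex, ActsAsIndexed and Note are hypothetical names, not plugin code.
class TinyIndex
  def initialize
    @postings = {} # word => array of [class_name, id] keys
  end

  def add(class_name, id, text)
    delete(class_name, id) # re-adding a record replaces its old entry
    text.to_s.downcase.scan(/\w+/).uniq.each do |word|
      (@postings[word] ||= []) << [class_name, id]
    end
  end

  def delete(class_name, id)
    @postings.each_value { |keys| keys.delete([class_name, id]) }
  end

  # ids of records of +class_name+ that contain every word in +query+
  def search(class_name, query)
    words = query.to_s.downcase.scan(/\w+/)
    return [] if words.empty?
    hits = words.map { |word| @postings[word] || [] }.inject { |a, b| a & b }
    hits.select { |name, _id| name == class_name }.map { |_name, id| id }.sort
  end
end

module ActsAsIndexed
  INDEX = TinyIndex.new # one index shared by all classes, as in the plugin

  # class-level macro, analogous to acts_as_ferret
  def acts_as_indexed
    extend ClassMethods
    include InstanceMethods
  end

  module ClassMethods
    def records
      @records ||= {}
    end

    def find_by_contents(query)
      INDEX.search(name, query).map { |id| records[id] }
    end
  end

  module InstanceMethods
    # plays the role of the after_create / after_update callbacks
    def save
      self.class.records[id] = self
      INDEX.add(self.class.name, id, body)
      self
    end

    # plays the role of the after_destroy callback
    def destroy
      self.class.records.delete(id)
      INDEX.delete(self.class.name, id)
      self
    end
  end
end

class Note
  extend ActsAsIndexed
  acts_as_indexed

  attr_reader :id, :body

  def initialize(id, body)
    @id = id
    @body = body
  end
end

Note.new(1, "Ferret is a search library").save
Note.new(2, "Rails plugins live in vendor/plugins").save
found = Note.find_by_contents("search ferret")
puts found.map { |note| note.id }.inspect
```

The class name stored with each entry plays the same role as the ferret_table field in the listing: it keeps results from one model out of another model's searches even though all models share one index.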
The plugin follows the usual plugin structure and consists of 2 files:
{RAILS_ROOT}/vendor/plugins/acts_as_ferret/init.rb
{RAILS_ROOT}/vendor/plugins/acts_as_ferret/lib/acts_as_ferret.rb
The Ferret DB is stored in:
{RAILS_ROOT}/db/index.db
(Does this hurt scalability with multiple round-robin servers that do not share a common disk? Would it be too intensive to fit this into a central DB table?)
Here follows the code:
# CODE for init.rb
require 'acts_as_ferret'
# END init.rb
# Copyright (c) 2006 Kasper Weibel Nielsen-Refs
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and associated documentation files (the
# "Software"), to deal in the Software without restriction, including
# without limitation the rights to use, copy, modify, merge, publish,
# distribute, sublicense, and/or sell copies of the Software, and to
# permit persons to whom the Software is furnished to do so, subject to
# the following conditions:
# The above copyright notice and this permission notice shall be
# included in all copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
# LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
# WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
# CODE for acts_as_ferret.rb
require 'active_record'
require 'ferret'
module FerretMixin
  module Acts #:nodoc:
    module ARFerret #:nodoc:
      def self.append_features(base)
        super
        base.extend(MacroMethods)
      end
      # declare the class level helper methods
      # which will load the relevant instance methods defined below when invoked
      module MacroMethods
        def acts_as_ferret
          extend FerretMixin::Acts::ARFerret::ClassMethods
          class_eval do
            include FerretMixin::Acts::ARFerret::ClassMethods
            after_create :ferret_create
            after_update :ferret_update
            after_destroy :ferret_destroy
          end
        end
      end
      module ClassMethods
        include Ferret
        INDEX_DIR = "#{RAILS_ROOT}/db/index.db"
        def self.reloadable?; false end
        # Finds instances by file contents.
        def find_by_contents(query, options = {})
          index_searcher ||= Search::IndexSearcher.new(INDEX_DIR)
          query_parser ||= QueryParser.new(index_searcher.reader.get_field_names.to_a)
          query = query_parser.parse(query + " +ferret_table:#{self.table_name}")
          result = []
          index_searcher.search_each(query) do |doc, score|
            id = index_searcher.reader.get_document(doc)[:id]
            res = self.find(id)
            result << res if res
          end
          return result
        end
        # private
        def ferret_create
          # code to update or add to the index
          index ||= Index::Index.new(:key => [:id, :ferret_table],
                                     :path => INDEX_DIR,
                                     :auto_flush => true)
          index << self.to_doc
        end
        alias :ferret_update :ferret_create
        def ferret_destroy
          # code to delete from index
          index ||= Index::Index.new(:key => [:id, :ferret_table],
                                     :path => INDEX_DIR,
                                     :auto_flush => true)
          index.query_delete("+id:#{self.id} +ferret_table:#{self.table_name}")
        end
        def to_doc
          # Churn through the complete Active Record and add it to the Ferret document
          doc = Ferret::Document::Document.new
          doc << Ferret::Document::Field.new(:ferret_table, self.table_name, Ferret::Document::Field::Store::YES, Ferret::Document::Field::Index::UNTOKENIZED)
          self.attributes.each_pair do |key,val|
            if key == :id
              doc << Ferret::Document::Field.new(key, val.to_s, Ferret::Document::Field::Store::YES, Ferret::Document::Field::Index::UNTOKENIZED)
            else
              doc << Ferret::Document::Field.new(key, val.to_s, Ferret::Document::Field::Store::NO, Ferret::Document::Field::Index::TOKENIZED)
            end
          end
          return doc
        end
      end
    end
  end
end
# reopen ActiveRecord and include all the above to make
# them available to all our models if they want it
ActiveRecord::Base.class_eval do
  include FerretMixin::Acts::ARFerret
end
# END acts_as_ferret.rb
Alternate Version by Thomas Lockney
The code listed above has a few issues as discussed in this email thread. I've been working on some enhancements, but it's still a work in progress. Here's the code I have so far. There are definitely bugs, but I'll update the code here as I work through them and add other features.
A couple of notes about this implementation:
* The class based querying is broken, but then again so is the implementation in the code listed above.
* It would be nice to allow for the use of both the filesystem based indexing AND the in-memory approach, but currently I only allow for a string path to the index. I think this should be a straightforward fix, but it's not in there yet.
* I'm still working on implementing the code that allows for passing a Query object to the find_by_contents method.
* There are certainly a lot of other options for the index that need to be allowed for. I'm thinking that this could be implemented as a hash that can be set in environment.rb and then overridden in the case of per-class indexes.
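The last note, a hash of index options set once and overridden per class, could look roughly like the following plain-Ruby sketch. FERRET_DEFAULTS and ferret_options_for are hypothetical names invented for this illustration, and the option keys are simply the ones that already appear in the listing below; nothing here is part of the plugin.

```ruby
# Hypothetical application-wide defaults, as might be set once in
# environment.rb. The constant name is made up for this sketch.
FERRET_DEFAULTS = {
  :index_dir         => "index",
  :auto_flush        => true,
  :create_if_missing => true
}.freeze

# Returns the effective options for one model (hypothetical helper).
# Per-class values win over the defaults; unknown keys are rejected so
# that a typo in an option name fails loudly instead of being ignored.
def ferret_options_for(overrides = {})
  unknown = overrides.keys - FERRET_DEFAULTS.keys
  unless unknown.empty?
    raise ArgumentError, "unknown option(s): #{unknown.inspect}"
  end
  FERRET_DEFAULTS.merge(overrides)
end

# A model with its own index directory; everything else is inherited.
post_options = ferret_options_for(:index_dir => "index/posts")
puts post_options[:index_dir]
puts post_options[:auto_flush]
```

Because merge returns a new hash, the frozen defaults are never modified, so one model's overrides cannot leak into another model's configuration.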
# CODE for acts_as_ferret.rb
require 'active_record'
require 'ferret'
module FerretMixin
  module Acts #:nodoc:
    module ARFerret #:nodoc:
      mattr_accessor :index_dir
      @@index_dir ||= "#{RAILS_ROOT}/index"
      def self.append_features(base)
        super
        base.extend(MacroMethods)
      end
      # declare the class level helper methods
      # which will load the relevant instance methods defined below when invoked
      module MacroMethods
        def define_to_field_method(field, options = {})
          default_opts = { :store => Field::Store::YES,
                           :index => Field::Index::UNTOKENIZED,
                           :term_vector => Field::TermVector::NO,
                           :binary => false,
                           :boost => 1.0}
          default_opts.update(options) if options.is_a?(Hash)
          fields_for_ferret << field
          define_method ("#{field}_to_ferret".to_sym) do
            val = self[field] || self.instance_variable_get("@#{field.to_s}".to_sym)
            logger.debug("Adding field #{field} with value '#{val}' to index")
            Ferret::Document::Field.new(field.to_s,
                                        val,
                                        default_opts[:store],
                                        default_opts[:index],
                                        default_opts[:term_vector],
                                        default_opts[:binary],
                                        default_opts[:boost])
          end
        end
        def acts_as_ferret(options={})
          configuration = {:fields => :all, :index_dir => FerretMixin::Acts::ARFerret::index_dir}
          configuration.update(options) if options.is_a?(Hash)
          extend FerretMixin::Acts::ARFerret::SingletonMethods
          class_eval <<-EOV
            include FerretMixin::Acts::ARFerret::SingletonMethods
            after_create :ferret_create
            after_update :ferret_update
            after_destroy :ferret_destroy
            cattr_accessor :fields_for_ferret
            cattr_accessor :class_index_dir
            @@fields_for_ferret = Array.new
            @@class_index_dir = configuration[:index_dir]
            # private
            if configuration[:fields].respond_to?(:each_pair)
              configuration[:fields].each_pair do |key,val|
                define_to_field_method(key,val)
              end
            elsif configuration[:fields].respond_to?(:each)
              configuration[:fields].each do |field|
                define_to_field_method(field)
              end
            else
              #need to handle :all case
            end
          EOV
        end
      end
      module SingletonMethods
        include Ferret
        def self.reloadable?; false end
        def ferret_index
          @@index ||= Index::Index.new(:key => [:id, :ferret_class],
                                       :path => class_index_dir,
                                       :auto_flush => true,
                                       :create_if_missing => true)
        end
        # Finds instances by file contents.
        def find_by_contents(q, options = {})
          index_searcher ||= Search::IndexSearcher.new(FerretMixin::Acts::ARFerret::index_dir)
          query_parser ||= QueryParser.new(index_searcher.reader.get_field_names.to_a)
          query = Search::BooleanQuery.new
          if (q.is_a?(Search::Query))
            query << Search::BooleanClause.new(q)
          else
            query << Search::BooleanClause.new(query_parser.parse(q))
          end
          query << Search::BooleanClause.new(Search::TermQuery.new(Index::Term.new("ferret_class", self.class.name)))
          result = []
          index_searcher.search_each(query) do |doc, score|
            id = index_searcher.reader.get_document(doc)["id"]
            res = self.find(id)
            result << res
          end
          return result
        end
        def ferret_create
          ferret_index << self.to_doc
        end
        alias :ferret_update :ferret_create
        def ferret_destroy
          # code to delete from index
          begin
            ferret_index.query_delete("+id:#{self.id} +ferret_class:#{self.class.name}")
          rescue
            logger.warn("Could not find indexed value for this object")
          end
        end
        def to_doc
          # Churn through the complete Active Record and add it to the Ferret document
          doc = Document::Document.new
          # store the table_name for every item indexed
          doc << Document::Field.new("ferret_class", "#{self.class.name}", Document::Field::Store::YES, Document::Field::Index::UNTOKENIZED)
          # store the id of each item
          doc << Document::Field.new("id", self.id, Document::Field::Store::YES, Document::Field::Index::UNTOKENIZED)
          # iterate through the fields and add them to the document
          fields_for_ferret.each do |field|
            doc << self.send("#{field}_to_ferret")
          end
          return doc
        end
      end
    end
  end
end
# reopen ActiveRecord and include all the above to make
# them available to all our models if they want it
ActiveRecord::Base.class_eval do
  include FerretMixin::Acts::ARFerret
end
# END acts_as_ferret.rb
Third Version by Jens Kraemer - integrating Ferret with Typo
Jens integrated Ferret into his Typo installation, using the acts_as_ferret implementations above as a starting point. See this post for more info and the code.
