[
  {
    "path": ".github/workflows/test.yml",
    "content": "name: Test\n\non: [push, pull_request]\n\njobs:\n  test:\n\n    runs-on: ${{ matrix.os }}\n\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-18.04, ubuntu-20.04]\n        ruby: [2.6, 2.7, 3.0]\n    services:\n      redis:\n        image: redis\n        options: >-\n          --health-cmd \"redis-cli ping\"\n          --health-interval 10s\n          --health-timeout 5s\n          --health-retries 5\n        ports:\n          - 6379:6379\n    \n    steps:\n    - uses: actions/checkout@v2\n    - name: Set up Ruby ${{ matrix.ruby }}\n      uses: ruby/setup-ruby@v1\n      with:\n        bundler-cache: true\n        ruby-version: ${{ matrix.ruby }}\n    - name: Install dependencies\n      run: bundle install\n    - name: Run tests\n      run: bundle exec rake\n"
  },
  {
    "path": ".gitignore",
    "content": "bin/\n*.gem\nGemfile.lock\next/Makefile\n"
  },
  {
    "path": "Changelog.md",
    "content": "# Predictor Changelog\nAll notable changes to this project will be documented in this file.\n\n## [Unreleased]\n### Changed\n- Support rake version 11.0 or higher＆rspec version 3.4.0 or higher\n- Fix title of README\n- Change a test with github actions\n- Made it possible to run tests on ubuntu-18.04 and ubuntu-20.04\n- Fix the homepage entry in predictor.gemspec\n\n### **BREAKING CHANGES**\n- Ruby 2.1 ~ 2.5 will no longer be supported because of eol\n\n## [2.3.0] - 2014-09-06\n- The logic for processing item similarities was ported to a Lua script. Use `Predictor.processing_technique(:lua)` to use the Lua script for all similarity calculations, or use `MyRecommender.processing_technique(:lua)` to use it for specific recommenders. It is substantially faster than the default (old) Ruby mechanism, but has the disadvantage of blocking the Redis server while it runs.\n- An alternate method of calculating item similarities was added, which uses a ZUNIONSTORE across item sets. The results are similar to those achieved by using the Ruby or Lua scripts, but faster. Use `Predictor.processing_technique(:union)` to use the ZUNIONSTORE technique for all similarity calculations, or use `MyRecommender.processing_technique(:union)` to use it for specific recommenders.\n\n## [2.2.0] - 2014-06-24\n- The namespace used for keys in Redis is now configurable on a global or per-class basis. See the readme for more information. If you were overriding the redis_prefix instance method before, it is recommended that you use the new redis_prefix class method instead.\n- Data stored in Redis is now namespaced by the class name of the recommender it is stored by. This change ensures that different recommenders with input matrices of the same name don't overwrite each others' data. After upgrading you'll need to either reindex your data in Redis or configure Predictor to use the naming system you were using before. 
If you were using the defaults before and you're not worried about matrix name collisions, you can mimic the old behavior with:\n```ruby\n  class MyRecommender\n    include Predictor::Base\n    redis_prefix [nil]\n  end\n```\n- The #predictions_for method on recommenders now accepts a :boost option to give more weight to items with particular attributes. See the readme for more information.\n\n## [2.1.0] - 2014-06-19\n- The similarity limit now defaults to 128, instead of being unlimited. This is intended to save space in Redis. See the Readme for more information. It is strongly recommended that you run `ensure_similarity_limit_is_obeyed!` to shrink existing similarity sets.\n\n## [2.0.0] - 2014-04-17\n**Rewrite of 1.0.0 and contains several breaking changes!**\n\nVersion 1.0.0 (which really should have been 0.0.1) contained several issues that made compatibility with v2 not worth the trouble. These include:\n- In v1, similarities were cached per input_matrix, and Predictor::Base utilized those caches when determining similarities and predictions. This quickly ate up Redis memory with even a semi-large dataset, as each input_matrix had a significant memory requirement. v2 caches similarities at the root (Recommender::Base), which means you can add any number of input matrices with little impact on memory usage.\n- Added the ability to limit the number of items stored in the similarity cache (via the 'limit_similarities_to' option). Now that similarities are cached at the root, this is possible and can greatly help memory usage.\n- Removed bang methods from input_matrix (add_set!, add_single!, etc.). These called process! for you previously, but since the cache is no longer kept at the input_matrix level, process! has to be called at the root (Recommender::Base).\n- Bug fix: Fixed bug where a call to delete_item! on the input matrix didn't update the similarity cache.\n- Other minor fixes."
  },
  {
    "path": "Gemfile",
    "content": "source 'https://rubygems.org'\n\ngemspec\n"
  },
  {
    "path": "LICENSE",
    "content": "The MIT License (MIT)\n\nCopyright (c) 2014 Pathgather\n\nPermission is hereby granted, free of charge, to any person obtaining a copy of\nthis software and associated documentation files (the \"Software\"), to deal in\nthe Software without restriction, including without limitation the rights to\nuse, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of\nthe Software, and to permit persons to whom the Software is furnished to do so,\nsubject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS\nFOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR\nCOPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER\nIN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN\nCONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.\n"
  },
  {
    "path": "README.md",
    "content": "# Predictor\n\nFast and efficient recommendations and predictions using Ruby & Redis. Developed by and used at [Pathgather](http://pathgather.com) to generate course similarities and content recommendations to users.\n\n![Test](https://github.com/nyagato-00/predictor/workflows/Test/badge.svg?branch=master)\n\nOriginally forked and based on [Recommendify](https://github.com/paulasmuth/recommendify) by Paul Asmuth, so a huge thanks to him for his contributions to Recommendify. Predictor has been almost completely rewritten to\n* Be much, much more performant and efficient by using Redis for most logic.\n* Provide item similarities such as \"Users that read this book also read ...\"\n* Provide personalized predictions based on a user's past history, such as \"You read these 10 books, so you might also like to read ...\"\n\nAt the moment, Predictor uses the [Jaccard index](http://en.wikipedia.org/wiki/Jaccard_index) or the [Sorenson-Dice coefficient](http://en.wikipedia.org/wiki/S%C3%B8rensen%E2%80%93Dice_coefficient) (default is Jaccard) to determine similarities between items. There are other ways to do this, which we intend to implement eventually, but if you want to beat us to the punch, pull requests are quite welcome :)\n\nNotice\n---------------------\nThis is the readme for Predictor 2.0, which contains a few breaking changes from 1.0. The 1.0 readme can be found [here](https://github.com/Pathgather/predictor/blob/master/docs/READMEv1.md). 
See below for how to upgrade to 2.0.\n\nInstallation\n---------------------\nIn your Gemfile:\n```ruby\ngem 'predictor'\n```\nGetting Started\n---------------------\nThe first step is to configure Predictor with your Redis instance.\n```ruby\n# in config/initializers/predictor.rb\nPredictor.redis = Redis.new(:url => ENV[\"PREDICTOR_REDIS\"])\n\n# Or, to improve performance, add hiredis as your driver (you'll need to install the hiredis gem first)\nPredictor.redis = Redis.new(:url => ENV[\"PREDICTOR_REDIS\"], :driver => :hiredis)\n```\n\nInputting Data\n---------------------\nCreate a class and include the Predictor::Base module. Define an input_matrix for each relationship you'd like to keep track of. This can be anything you think is a significant metric for the item: page views, purchases, categories the item belongs to, etc.\n\nBelow, we're building a recommender to recommend courses based off of:\n* Users that have taken a course. If 2 courses were taken by the same user, this is 3 times as important to us as the courses sharing the same topic. This will lead to sets like:\n  * \"user1\" -> \"course-1\", \"course-3\",\n  * \"user2\" -> \"course-1\", \"course-4\"\n* Tags and their courses. This will lead to sets like:\n  * \"rails\" -> \"course-1\", \"course-2\",\n  * \"microeconomics\" -> \"course-3\", \"course-4\"\n* Topics and their courses. This will lead to sets like:\n  * \"computer science\" -> \"course-1\", \"course-2\",\n  * \"economics and finance\" -> \"course-3\", \"course-4\"\n\n```ruby\nclass CourseRecommender\n  include Predictor::Base\n\n  input_matrix :users, weight: 3.0\n  input_matrix :tags, weight: 2.0\n  input_matrix :topics, weight: 1.0, measure: :sorensen_coefficient # Use Sørensen–Dice instead of Jaccard\nend\n```\n\nNow, we just need to update our matrices when courses are created, users take a course, topics are changed, etc:\n```ruby\nrecommender = CourseRecommender.new\n\n# Add a single course to topic-1's items. 
If topic-1 already exists as a set ID, this just adds course-1 to the set\nrecommender.add_to_matrix!(:topics, \"topic-1\", \"course-1\")\n\n# If your dataset is even remotely large, add_to_matrix! could take some time, as it must calculate the similarity scores\n# for course-1 and other courses that share a set with course-1. If this is the case, use add_to_matrix and\n# process the items at a more convenient time, perhaps in a background job\nrecommender.topics.add_to_set(\"topic-1\", \"course-1\", \"course-2\") # Same as recommender.add_to_matrix(:topics, \"topic-1\", \"course-1\", \"course-2\")\nrecommender.process_items!(\"course-1\", \"course-2\")\n```\n\nAs noted above, it's important to remember that if you don't use the bang method 'add_to_matrix!', you'll need to manually update your similarities. If your dataset is even remotely large, you'll probably want to do this:\n* If you want to update the similarities for certain item(s):\n  ````\n  recommender.process_items!(item1, item2, etc)\n  ````\n* If you want to update all similarities for all items:\n  ````\n  recommender.process!\n  ````\n\nRetrieving Similarities and Recommendations\n---------------------\nNow that your matrices have been initialized with several relationships, you can start generating similarities and recommendations! First, let's start with similarities, which will use the weights we specify on each matrix to determine which courses share the most in common with a given course.\n\n```ruby\nrecommender = CourseRecommender.new\n\n# Return all similarities for course-1 (ordered by most similar to least).\nrecommender.similarities_for(\"course-1\")\n\n# Need to paginate? Not a problem! 
Specify an offset and a limit\nrecommender.similarities_for(\"course-1\", offset: 10, limit: 10) # Gets similarities 11-20\n\n# Want scores?\nrecommender.similarities_for(\"course-1\", with_scores: true)\n\n# Want to ignore a certain set of courses in similarities?\nrecommender.similarities_for(\"course-1\", exclusion_set: [\"course-2\"])\n```\n\nThe above examples are great for situations like \"Users that viewed this also liked ...\", but what if you wanted to recommend courses to a user based on the courses they've already taken? Not a problem!\n\n```ruby\nrecommender = CourseRecommender.new\n\n# User has taken course-1 and course-2. Let's see what else they might like...\nrecommender.predictions_for(item_set: [\"course-1\", \"course-2\"])\n\n# Already have the set you need stored in an input matrix? In our case, we do (the users matrix stores the courses a user has taken), so we can just do:\nrecommender.predictions_for(\"user-1\", matrix_label: :users)\n\n# Paginate too!\nrecommender.predictions_for(\"user-1\", matrix_label: :users, offset: 10, limit: 10)\n\n# Gimme some scores and ignore course-2....that course-2 is one sketchy fella\nrecommender.predictions_for(\"user-1\", matrix_label: :users, with_scores: true, exclusion_set: [\"course-2\"])\n```\n\nDeleting Items\n---------------------\nIf your data is deleted from your persistent storage, you certainly don't want to recommend it to a user. To ensure that doesn't happen, simply call delete_pair_from_matrix! or delete_from_matrix! on the relevant matrix, or delete_item! 
if the item is completely gone:\n```ruby\nrecommender = CourseRecommender.new\n\n# User removed course-1 from topic-1, but course-1 still exists\nrecommender.delete_pair_from_matrix!(:topics, \"topic-1\", \"course-1\")\n\n# User removed course-1 from all topics\nrecommender.delete_from_matrix!(:topics, \"course-1\")\n\n# course-1 was permanently deleted\nrecommender.delete_item!(\"course-1\")\n\n# Something crazy has happened, so let's just start fresh and wipe out all previously stored similarities:\nrecommender.clean!\n```\n\nLimiting Similarities\n---------------------\nBy default, Predictor caches 128 similarities for each item, because that is the maximum size at which the similarity sorted sets can be kept in a [memory-efficient format](http://redis.io/topics/memory-optimization). If you want to keep more similarities than that, and you don't mind using more memory, you may want to increase the similarity limit, like so:\n\n```ruby\nclass CourseRecommender\n  include Predictor::Base\n\n  limit_similarities_to 500\n  input_matrix :users, weight: 3.0\n  input_matrix :tags, weight: 2.0\n  input_matrix :topics, weight: 1.0\nend\n```\n\nThe memory penalty can be heavy, though. In our testing, similarity caches for 1,000 objects varied in size like so:\n\n```\nlimit_similarities_to(128) # 8.5 MB (this is the default)\nlimit_similarities_to(129) # 22.74 MB\nlimit_similarities_to(500) # 76.72 MB\n```\n\nIf you decide you need to store more than 128 similarities, you may want to see the Redis documentation linked above and consider increasing `zset-max-ziplist-entries` in your configuration.\n\nPredictions fetched with the predictions_for call utilize the similarity caches, so if you're using predictions_for, make sure you set the limit high enough so that intelligent predictions can be generated. 
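As an aside, the scores stored in these similarity sets are the Jaccard or Sørensen–Dice values mentioned in the introduction. Purely for intuition (this is an illustrative sketch, not Predictor's actual implementation, which performs these calculations against Redis), the two measures work like this in plain Ruby:

```ruby
# Illustrative only: Predictor computes similarity scores inside Redis,
# not with plain Ruby sets like this.
require 'set'

# Jaccard index: size of the intersection divided by size of the union
def jaccard(a, b)
  a, b = Set.new(a), Set.new(b)
  union = a | b
  return 0.0 if union.empty?
  (a & b).size.to_f / union.size
end

# Sørensen–Dice coefficient: twice the intersection size divided by the
# sum of both set sizes
def sorensen_dice(a, b)
  a, b = Set.new(a), Set.new(b)
  return 0.0 if a.empty? && b.empty?
  2.0 * (a & b).size / (a.size + b.size)
end

jaccard(%w[course-1 course-2 course-3], %w[course-2 course-3 course-4])       # => 0.5
sorensen_dice(%w[course-1 course-2 course-3], %w[course-2 course-3 course-4]) # => 0.6666666666666666
```

Higher scores mean more overlap between the items' sets, which is why a larger similarity cache gives predictions_for more meaningful candidates to draw from.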
If you aren't using predictions and are just using similarities, then feel free to set this to the maximum number of similarities you'd possibly want to show!\n\nYou can also use `limit_similarities_to(nil)` to remove the limit entirely. This means if you have 10,000 items, and each item is somehow related to every other, you'll have 10,000 sets each with 9,999 items, which will run up your Redis bill quite quickly. Removing the limit is not recommended unless you're sure you know what you're doing.\n\nIf at some point you decide to lower your similarity limits, you'll want to be sure to shrink the size of the sorted sets already in Redis. You can do this with `CourseRecommender.new.ensure_similarity_limit_is_obeyed!`.\n\nBoost\n---------------------\nWhat if you want to recommend courses to users based not only on what courses they've taken, but on other attributes of courses that they may be interested in? You can do that by passing the :boost argument to predictions_for:\n\n```ruby\nclass CourseRecommender\n  include Predictor::Base\n\n  # Courses are compared to one another by the users taking them, their tags, and their topics.\n  input_matrix :users,  weight: 3.0\n  input_matrix :tags,   weight: 2.0\n  input_matrix :topics, weight: 2.0\nend\n\nrecommender = CourseRecommender.new\n\n# We want to find recommendations for Billy, who's told us that he's\n# especially interested in free, interactive courses on Photoshop. So, we give\n# a boost to courses that are tagged as free and interactive and have\n# Photoshop as a topic:\nrecommender.predictions_for(\"Billy\", matrix_label: :users, boost: {tags: ['free', 'interactive'], topics: [\"Photoshop\"]})\n\n# We can also modify how much these tags and topics matter by specifying a\n# weight. 
The default is 1.0, but if that's too much we can just tweak it:\nrecommender.predictions_for(\"Billy\", matrix_label: :users, boost: {tags: {values: ['free', 'interactive'], weight: 0.4}, topics: {values: [\"Photoshop\"], weight: 0.3}})\n```\n\nKey Prefixes\n---------------------\nAs of 2.2.0, there is much more control available over the format of the keys Predictor will use in Redis. By default, the CourseRecommender given as an example above will use keys like \"predictor:CourseRecommender:users:items:user1\". You can configure the global namespace like so:\n\n```ruby\n  Predictor.redis_prefix 'my_namespace' # => \"my_namespace:CourseRecommender:users:items:user1\"\n  # Or, for a multitenanted setup:\n  Predictor.redis_prefix { \"user-#{User.current.id}\" } # => \"user-7:CourseRecommender:users:items:user1\"\n```\n\nYou can also configure the namespace used by each class you create:\n\n```ruby\n  class CourseRecommender\n    include Predictor::Base\n    redis_prefix \"courses\" # => \"predictor:courses:users:items:user1\"\n    redis_prefix { \"courses_for_user-#{User.current.id}\" } # => \"predictor:courses_for_user-7:users:items:user1\"\n  end\n```\n\nYou can also configure the namespace used by each instance you create in addition to class and global namespace:\n\n```ruby\n  class CourseRecommender\n    include Predictor::Base\n\n    def initialize(prefix)\n      @prefix = prefix\n    end\n\n    # Simply override this instance method with the prefix you want\n    def get_redis_prefix\n      @prefix\n    end\n  end\n\n  recommender = CourseRecommender.new(\"super\")\n  recommender.redis_prefix # \"predictor:CourseRecommender:super\"\n```\n\nProcessing Items\n---------------------\nAs of 2.3.0, there are now multiple techniques available for processing item similarities. 
You can choose between them by setting a global default like `Predictor.processing_technique(:lua)` or setting a technique for certain classes like `CourseRecommender.processing_technique(:union)`. There are three options:\n- :ruby - This is the default, and is how Predictor calculated similarities before 2.3.0. With this technique the Jaccard and Sørensen calculations are performed in Ruby, with frequent calls to Redis to retrieve simple values. It is somewhat slow.\n- :lua - This option performs the Jaccard and Sørensen calculations in a Lua script on the Redis server. It is substantially faster than the :ruby technique, but blocks the Redis server while each set of calculations is run. The period of blocking will vary based on the size and disposition of your data, but each call may take up to several hundred milliseconds. If your application requires your Redis server to always return results quickly, and you're not able to simply run calculations during off-hours, you should use a different strategy.\n- :union - This option skips Jaccard and Sørensen entirely, and uses a simpler technique involving a ZUNIONSTORE across many item sets to calculate similarities. The results are different from, but similar to, those of the Jaccard and Sørensen algorithms. It is even faster than the :lua option and does not have the same problem of blocking Redis for long periods of time, but before using it you should sample the output to ensure that it is good enough for your application.\n\nPredictor now contains a benchmarking script that you can use to compare the speed of these options. An example output from the processing of a relatively small dataset is:\n\n```\nruby = 21.098 seconds\nlua = 2.106 seconds\nunion = 0.741 seconds\n```\n\nUpgrading from 1.0 to 2.0\n---------------------\nAs mentioned, 2.0.0 is quite a bit different from 1.0.0, so simply upgrading with no changes likely won't work. My apologies for this. 
I promise this won't happen in future releases, as I'm much more confident in this Predictor release than the last. Anywho, upgrading really shouldn't be that much of a pain if you follow these steps:\n\n* Change predictor.matrix.add_set! and predictor.matrix.add_single! calls to predictor.add_to_matrix!. For example:\n```ruby\n# Change\npredictor.topics.add_single!(\"topic-1\", \"course-1\")\n# to\npredictor.add_to_matrix!(:topics, \"topic-1\", \"course-1\")\n\n# Change\npredictor.tags.add_set!(\"tag-1\", [\"course-1\", \"course-2\"])\n# to\npredictor.add_to_matrix!(:tags, \"tag-1\", \"course-1\", \"course-2\")\n```\n* Change predictor.matrix.process! or predictor.matrix.process_item! calls to just predictor.process! or predictor.process_items!\n```ruby\n# Change\npredictor.topics.process_item!(\"course-1\")\n# to\npredictor.process_items!(\"course-1\")\n```\n* Change predictor.matrix.delete_item! calls to predictor.delete_from_matrix!. This will update similarities too, so you may want to queue this to run in a background job.\n```ruby\n# Change\npredictor.topics.delete_item!(\"course-1\")\n# to delete_from_matrix! if you want to update similarities to account for the deleted item (in v1, this was a bug and didn't occur)\npredictor.delete_from_matrix!(:topics, \"course-1\")\n```\n* Regenerate your recommendations, as Redis keys have changed for Predictor 2. You can use recommender.clean! to clear out old similarities, then run your rake task (or whatever you've set up) to create new similarities.\n\nAbout Pathgather\n---------------------\nPathgather is an NYC-based startup building a platform that dramatically accelerates learning for enterprises by bringing employees, training content, and existing enterprise systems into one engaging platform.\n\nEvery Friday, we work on open-source software (our own or other projects). Want to join our always-growing team? 
Peruse our [current opportunities](http://www.pathgather.com/jobs/) or reach out to us at <tech@pathgather.com>!\n\nProblems? Issues? Want to help out?\n---------------------\nJust submit a GitHub issue or pull request! We'd love to have you help out, as the most common library to use for this need, Recommendify, was last updated 2 years ago. We'll be sure to keep this maintained, but we could certainly use your help!\n\nThe MIT License (MIT)\n---------------------\nCopyright (c) 2014 Pathgather\n\nPermission is hereby granted, free of charge, to any person obtaining a copy of\nthis software and associated documentation files (the \"Software\"), to deal in\nthe Software without restriction, including without limitation the rights to\nuse, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of\nthe Software, and to permit persons to whom the Software is furnished to do so,\nsubject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS\nFOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR\nCOPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER\nIN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN\nCONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.\n"
  },
  {
    "path": "Rakefile",
    "content": "require 'bundler/gem_tasks'\n\nrequire 'rspec/core/rake_task'\nRSpec::Core::RakeTask.new(:spec)\n\ntask :default => :spec\n\nDir[\"./benchmark/*.rb\"].sort.each &method(:require)\n"
  },
  {
    "path": "benchmark/process.rb",
    "content": "namespace :benchmark do\n  task :process do\n    require 'predictor'\n    require 'pry'\n    require 'logger'\n\n    Predictor.redis = Redis.new #logger: Logger.new(STDOUT)\n    Predictor.redis_prefix \"predictor-benchmark\"\n\n    def flush!\n      keys = Predictor.redis.keys(\"predictor-benchmark*\")\n      Predictor.redis.del(keys) if keys.any?\n    end\n\n    class ItemRecommender\n      include Predictor::Base\n\n      input_matrix :users, weight: 2.0\n      input_matrix :parts, weight: 1.0\n    end\n\n    flush!\n\n    items = (1..200).map { |i| \"item-#{i}\" }\n    users = (1..100).map { |i| \"user-#{i}\" }\n    parts = (1..100).map { |i| \"part-#{i}\" }\n\n    r = ItemRecommender.new\n\n    start = Time.now\n    users.each { |user| r.users.add_to_set user, *items.sample(40) }\n    parts.each { |part| r.parts.add_to_set part, *items.sample(40) }\n    elapsed = Time.now - start\n\n    puts \"add_to_set = #{elapsed.round(3)} seconds\"\n\n    [:ruby, :lua, :union].each do |technique|\n      start = Time.now\n      Predictor.processing_technique technique\n      r.process!\n      elapsed = Time.now - start\n      puts \"#{technique} = #{elapsed.round(3)} seconds\"\n    end\n\n    flush!\n  end\nend\n"
  },
  {
    "path": "docs/READMEv1.md",
    "content": "=======\nPredictor\n=========\n\nFast and efficient recommendations and predictions using Ruby & Redis. Used in production over at [Pathgather](http://pathgather.com) to generate course similarities and content recommendations to users.\n\n![](https://www.codeship.io/projects/5aeeedf0-6053-0131-2319-5ede98f174ff/status)\n\nOriginally forked and based on [Recommendify](https://github.com/paulasmuth/recommendify) by Paul Asmuth, so a huge thanks to him for his contributions to Recommendify. Predictor has been almost completely rewritten to\n* Be much, much more performant and efficient by using Redis for most logic.\n* Provide item similarities such as \"Users that read this book also read ...\"\n* Provide personalized predictions based on a user's past history, such as \"You read these 10 books, so you might also like to read ...\"\n\nAt the moment, Predictor uses the [Jaccard index](http://en.wikipedia.org/wiki/Jaccard_index) to determine similarities between items. There are other ways to do this, which we intend to implement eventually, but if you want to beat us to the punch, pull requests are quite welcome :)\n\nInstallation\n---------------------\n```ruby\ngem install predictor\n````\nor in your Gemfile:\n````\ngem 'predictor'\n```\nGetting Started\n---------------------\nFirst step is to configure Predictor with your Redis instance.\n```ruby\n# in config/initializers/predictor.rb\nPredictor.redis = Redis.new(:url => ENV[\"PREDICTOR_REDIS\"])\n\n# Or, to improve performance, add hiredis as your driver (you'll need to install the hiredis gem first)\nPredictor.redis = Redis.new(:url => ENV[\"PREDICTOR_REDIS\"], :driver => :hiredis)\n```\nInputting Data\n---------------------\nCreate a class and include the Predictor::Base module. Define an input_matrix for each relationship you'd like to keep track of. 
This can be anything you think is a significant metric for the item: page views, purchases, categories the item belongs to, etc.\n\nBelow, we're building a recommender to recommend courses based off of:\n* Users that have taken a course. If 2 courses were taken by the same user, this is 3 times as important to us as the courses sharing the same topic. This will lead to sets like:\n  * \"user1\" -> \"course-1\", \"course-3\",\n  * \"user2\" -> \"course-1\", \"course-4\"\n* Tags and their courses. This will lead to sets like:\n  * \"rails\" -> \"course-1\", \"course-2\",\n  * \"microeconomics\" -> \"course-3\", \"course-4\"\n* Topics and their courses. This will lead to sets like:\n  * \"computer science\" -> \"course-1\", \"course-2\",\n  * \"economics and finance\" -> \"course-3\", \"course-4\"\n\n```ruby\nclass CourseRecommender\n  include Predictor::Base\n\n  input_matrix :users, weight: 3.0\n  input_matrix :tags, weight: 2.0\n  input_matrix :topics, weight: 1.0\nend\n```\n\nNow, we just need to update our matrices when courses are created, users take a course, topics are changed, etc:\n```ruby\nrecommender = CourseRecommender.new\n\n# Add a single course to topic-1's items. If topic-1 already exists as a set ID, this just adds course-1 to the set\nrecommender.topics.add_single!(\"topic-1\", \"course-1\")\n\n# If your matrix is quite large, add_single! could take some time, as it must calculate the similarity scores\n# for course-1 across all other courses. If this is the case, use add_single and process the item at a more\n# convenient time, perhaps in a background job\nrecommender.topics.add_single(\"topic-1\", \"course-1\")\nrecommender.topics.process_item!(\"course-1\")\n\n# Add an array of courses to tag-1. 
Again, these will simply be added to tag-1's existing set, if it exists.\n# If not, the tag-1 set will be initialized with course-1 and course-2\nrecommender.tags.add_set!(\"tag-1\", [\"course-1\", \"course-2\"])\n\n# Or, just add the set and process whenever you like\nrecommender.tags.add_set(\"tag-1\", [\"course-1\", \"course-2\"])\n[\"course-1\", \"course-2\"].each { |course| recommender.topics.process_item!(course) }\n```\n\nAs noted above, it's important to remember that if you don't use the bang methods (add_set! and add_single!), you'll need to manually update your similarities (the bang methods will likely suffice for most use cases though). You can do so a variety of ways.\n* If you want to simply update the similarities for a single item in a specific matrix:\n  ````\n  recommender.matrix.process_item!(item)\n  ````\n* If you want to update the similarities for all items in a specific matrix:\n  ````\n  recommender.matrix.process!\n  ````\n* If you want to update the similarities for a single item in all matrices:\n  ````\n  recommender.process_item!(item)\n  ````\n* If you want to update all similarities in all matrices:\n  ````\n  recommender.process!\n  ````\n\nRetrieving Similarities and Recommendations\n---------------------\nNow that your matrices have been initialized with several relationships, you can start generating similarities and recommendations! First, let's start with similarities, which will use the weights we specify on each matrix to determine which courses share the most in common with a given course.\n\n![Course Alternative](http://pathgather.github.io/predictor/images/course-alts.png)\n\n```ruby\nrecommender = CourseRecommender.new\n\n# Return all similarities for course-1 (ordered by most similar to least).\nrecommender.similarities_for(\"course-1\")\n\n# Need to paginate? Not a problem! 
Specify an offset and a limit\nrecommender.similarities_for(\"course-1\", offset: 10, limit: 10) # Gets similarities 11-20\n\n# Want scores?\nrecommender.similarities_for(\"course-1\", with_scores: true)\n\n# Want to ignore a certain set of courses in similarities?\nrecommender.similarities_for(\"course-1\", exclusion_set: [\"course-2\"])\n```\n\nThe above examples are great for situations like \"Users that viewed this also liked ...\", but what if you wanted to recommend courses to a user based on the courses they've already taken? Not a problem!\n\n![Course Recommendations](http://pathgather.github.io/predictor/images/suggested.png)\n\n```ruby\nrecommender = CourseRecommender.new\n\n# User has taken course-1 and course-2. Let's see what else they might like...\nrecommender.predictions_for(item_set: [\"course-1\", \"course-2\"])\n\n# Already have the set you need stored in an input matrix? In our case, we do (the users matrix stores the courses a user has taken), so we can just do:\nrecommender.predictions_for(\"user-1\", matrix_label: :users)\n\n# Paginate too!\nrecommender.predictions_for(\"user-1\", matrix_label: :users, offset: 10, limit: 10)\n\n# Gimme some scores and ignore user-2....that user-2 is one sketchy fella\nrecommender.predictions_for(\"user-1\", matrix_label: :users, with_scores: true, exclusion_set: [\"user-2\"])\n```\n\nDeleting Items\n---------------------\nIf your data is deleted from your persistent storage, you certainly don't want to recommend that data to a user. To ensure that doesn't happen, simply call delete_item! 
on the recommender as a whole, or call delete_from_matrix! to remove an item's relationships from a single matrix:\n```ruby\nrecommender = CourseRecommender.new\n\n# course-1 was removed from the topics matrix, but course-1 still exists elsewhere\nrecommender.delete_from_matrix!(:topics, \"course-1\")\n\n# course-1 was permanently deleted\nrecommender.delete_item!(\"course-1\")\n\n# Something crazy has happened, so let's just start fresh and wipe out all previously stored similarities:\nrecommender.clean!\n```\n\nMemory Management\n---------------------\nPredictor works by caching a sorted set of similarities for each item. With an even semi-large dataset, this can really eat up Redis's memory. To limit the number of similarities cached per item, use limit_similarities_to when defining your recommender (the default limit is 128; pass nil to remove the limit entirely).\n```ruby\nclass CourseRecommender\n  include Predictor::Base\n\n  limit_similarities_to 300\n\n  input_matrix :users, weight: 3.0\n  input_matrix :tags, weight: 2.0\n  input_matrix :topics, weight: 1.0\nend\n```\n\nThis will ensure that only the top 300 similarities are cached for each item. This can greatly reduce your memory usage, and if you're just using Predictor for scenarios where you show, say, the top 5 similar items, then this can be hugely helpful. But note, **don't set the limit to 5 in that case**. Predictions are computed on the fly from each item's cached similarities, so you need a large enough cache per item to determine an intelligent prediction list.\n\n*Note*: This is a bit of a hack, and there are most certainly other ways to improve Predictor's memory usage for large datasets, but each appears to require a more significant change than the trivial implementation of limit_similarities_to above. 
PRs that experiment with these other ways are quite welcome :)\n\nOh, and if you decide to tinker with your limit to try and find a sweet spot, there's a helpful method that trims the cached similarities down to the current limit, so you can avoid regenerating them all. Of course, this only helps if you are decreasing the limit. If you're increasing it, you'll need to process your similarities all over again.\n```ruby\nrecommender.ensure_similarity_limit_is_obeyed!  # Remove cached similarities that exceed the current limit\n```\n\nProblems? Issues? Want to help out?\n---------------------\nJust submit a GitHub issue or pull request! We'd love to have you help out, as the most common library to use for this need, Recommendify, was last updated 2 years ago. We'll be sure to keep this maintained, but we could certainly use your help!\n\nThe MIT License (MIT)\n---------------------\nCopyright (c) 2014 Pathgather\n\nPermission is hereby granted, free of charge, to any person obtaining a copy of\nthis software and associated documentation files (the \"Software\"), to deal in\nthe Software without restriction, including without limitation the rights to\nuse, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of\nthe Software, and to permit persons to whom the Software is furnished to do so,\nsubject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS\nFOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR\nCOPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER\nIN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN\nCONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.\n\n"
  },
  {
    "path": "lib/predictor/base.rb",
    "content": "module Predictor::Base\n  def self.included(base)\n    base.extend(ClassMethods)\n  end\n\n  module ClassMethods\n    def input_matrix(key, opts={})\n      @matrices ||= {}\n      @matrices[key] = opts\n    end\n\n    def limit_similarities_to(val)\n      @similarity_limit_set = true\n      @similarity_limit     = val\n    end\n\n    def similarity_limit\n      @similarity_limit_set ? @similarity_limit : 128\n    end\n\n    def reset_similarity_limit!\n      @similarity_limit_set = nil\n      @similarity_limit     = nil\n    end\n\n    def input_matrices=(val)\n      @matrices = val\n    end\n\n    def input_matrices\n      @matrices\n    end\n\n    def redis_prefix(prefix = nil, &block)\n      @redis_prefix = block_given? ? block : prefix\n    end\n\n    def get_redis_prefix\n      if @redis_prefix\n        if @redis_prefix.respond_to?(:call)\n          @redis_prefix.call\n        else\n          @redis_prefix\n        end\n      else\n        to_s\n      end\n    end\n\n    def processing_technique(technique)\n      @technique = technique\n    end\n\n    def get_processing_technique\n      @technique || Predictor.get_processing_technique\n    end\n  end\n\n  def input_matrices\n    @input_matrices ||= Hash[self.class.input_matrices.map{ |key, opts|\n      opts.merge!(:key => key, :base => self)\n      [ key, Predictor::InputMatrix.new(opts) ]\n    }]\n  end\n\n  def get_redis_prefix\n    nil # Override in subclass.\n  end\n\n  def redis_prefix\n    [Predictor.get_redis_prefix, self.class.get_redis_prefix, self.get_redis_prefix].compact\n  end\n\n  def similarity_limit\n    self.class.similarity_limit\n  end\n\n  def redis_key(*append)\n    ([redis_prefix] + append).flatten.compact.join(\":\")\n  end\n\n  def method_missing(method, *args)\n    if input_matrices.has_key?(method)\n      input_matrices[method]\n    else\n      raise NoMethodError.new(method.to_s)\n    end\n  end\n\n  def respond_to?(method, include_all = false)\n    
input_matrices.has_key?(method) ? true : super\n  end\n\n  def all_items\n    Predictor.redis.smembers(redis_key(:all_items))\n  end\n\n  def add_to_matrix(matrix, set, *items)\n    items = items.flatten if items.count == 1 && items[0].is_a?(Array)  # Old syntax\n    input_matrices[matrix].add_to_set(set, *items)\n  end\n\n  def add_to_matrix!(matrix, set, *items)\n    items = items.flatten if items.count == 1 && items[0].is_a?(Array)  # Old syntax\n    add_to_matrix(matrix, set, *items)\n    process_items!(*items)\n  end\n\n  def related_items(item)\n    keys = []\n    input_matrices.each do |key, matrix|\n      sets = Predictor.redis.smembers(matrix.redis_key(:sets, item))\n      keys.concat(sets.map { |set| matrix.redis_key(:items, set) })\n    end\n\n    keys.empty? ? [] : (Predictor.redis.sunion(keys) - [item.to_s])\n  end\n\n  def predictions_for(set=nil, item_set: nil, matrix_label: nil, with_scores: false, on: nil, offset: 0, limit: -1, exclusion_set: [], boost: {})\n    fail \"item_set or matrix_label and set is required\" unless item_set || (matrix_label && set)\n\n    on = Array(on)\n\n    if matrix_label\n      matrix = input_matrices[matrix_label]\n      item_set = Predictor.redis.smembers(matrix.redis_key(:items, set))\n    end\n\n    item_keys = []\n    weights   = []\n\n    item_set.each do |item|\n      item_keys << redis_key(:similarities, item)\n      weights   << 1.0\n    end\n\n    boost.each do |matrix_label, values|\n      m = input_matrices[matrix_label]\n\n      # Passing plain sets to zunionstore is undocumented, but tested and supported:\n      # https://github.com/antirez/redis/blob/2.8.11/tests/unit/type/zset.tcl#L481-L489\n\n      case values\n      when Hash\n        values[:values].each do |value|\n          item_keys << m.redis_key(:items, value)\n          weights   << values[:weight]\n        end\n      when Array\n        values.each do |value|\n          item_keys << m.redis_key(:items, value)\n          weights   << 1.0\n       
 end\n      else\n        raise \"Bad value for boost: #{boost.inspect}\"\n      end\n    end\n\n    return [] if item_keys.empty?\n\n    predictions = nil\n\n    Predictor.redis.multi do |multi|\n      multi.zunionstore 'temp', item_keys, weights: weights\n      multi.zrem 'temp', item_set if item_set.any?\n      multi.zrem 'temp', exclusion_set if exclusion_set.length > 0\n\n      if on.any?\n        multi.zadd 'temp2', on.map{ |val| [0.0, val] }\n        multi.zinterstore 'temp', ['temp', 'temp2']\n        multi.del 'temp2'\n      end\n\n      predictions = multi.zrevrange 'temp', offset, limit == -1 ? limit : offset + (limit - 1), with_scores: with_scores\n      multi.del 'temp'\n    end\n\n    predictions.value\n  end\n\n  def similarities_for(item, with_scores: false, offset: 0, limit: -1, exclusion_set: [])\n    neighbors = nil\n    Predictor.redis.multi do |multi|\n      multi.zunionstore 'temp', [1, redis_key(:similarities, item)]\n      multi.zrem 'temp', exclusion_set if exclusion_set.length > 0\n      neighbors = multi.zrevrange('temp', offset, limit == -1 ? 
limit : offset + (limit - 1), with_scores: with_scores)\n      multi.del 'temp'\n    end\n    return neighbors.value\n  end\n\n  def sets_for(item)\n    keys = input_matrices.map{ |k,m| m.redis_key(:sets, item) }\n    Predictor.redis.sunion keys\n  end\n\n  def process_item!(item)\n    process_items!(item)  # Old method\n  end\n\n  def process_items!(*items)\n    items = items.flatten if items.count == 1 && items[0].is_a?(Array) # Old syntax\n\n    case self.class.get_processing_technique\n    when :lua\n      matrix_data = {}\n      input_matrices.each do |name, matrix|\n        matrix_data[name] = {weight: matrix.weight, measure: matrix.measure_name}\n      end\n      matrix_json = JSON.dump(matrix_data)\n\n      items.each do |item|\n        Predictor.process_lua_script(redis_key, matrix_json, similarity_limit, item)\n      end\n    when :union\n      items.each do |item|\n        keys    = []\n        weights = []\n\n        input_matrices.each do |key, matrix|\n          k = matrix.redis_key(:sets, item)\n          item_keys = Predictor.redis.smembers(k).map { |set| matrix.redis_key(:items, set) }\n\n          # Queue the SCARDs on the MULTI object so they are sent in the transaction.\n          counts = Predictor.redis.multi do |multi|\n            item_keys.each { |item_key| multi.scard(item_key) }\n          end\n\n          item_keys.zip(counts).each do |key, count|\n            unless count.zero?\n              keys << key\n              weights << matrix.weight / count\n            end\n          end\n        end\n\n        Predictor.redis.multi do |multi|\n          key = redis_key(:similarities, item)\n          multi.del(key)\n\n          if keys.any?\n            multi.zunionstore(key, keys, weights: weights)\n            multi.zrem(key, item)\n            multi.zremrangebyrank(key, 0, -(similarity_limit + 1)) if similarity_limit\n            multi.zunionstore key, [key] # Rewrite zset for optimized storage.\n          end\n        end\n      end\n    else # Default to old behavior, processing things in Ruby.\n      items.each do |item|\n        
related_items(item).each { |related_item| cache_similarity(item, related_item) }\n      end\n    end\n\n    return self\n  end\n\n  def process!\n    process_items!(*all_items)\n    return self\n  end\n\n  def delete_from_matrix!(matrix, item)\n    # Deleting from a specific matrix, so get related_items, delete, then update the similarity of those related_items\n    items = related_items(item)\n    input_matrices[matrix].delete_item(item)\n    items.each { |related_item| cache_similarity(item, related_item) }\n    return self\n  end\n\n  def delete_pair_from_matrix!(matrix, set, item)\n    items = related_items(item)\n    input_matrices[matrix].remove_from_set(set, item)\n    items.each { |related_item| cache_similarity(item, related_item) }\n    return self\n  end\n\n  def add_item(item)\n    Predictor.redis.sadd(redis_key(:all_items), item)\n  end\n\n  def delete_item!(item)\n    Predictor.redis.srem(redis_key(:all_items), item)\n    Predictor.redis.watch(redis_key(:similarities, item)) do\n      items = related_items(item)\n      Predictor.redis.multi do |multi|\n        items.each do |related_item|\n          multi.zrem(redis_key(:similarities, related_item), item)\n        end\n        multi.del redis_key(:similarities, item)\n      end\n    end\n\n    input_matrices.each do |k,m|\n      m.delete_item(item)\n    end\n    return self\n  end\n\n  def clean!\n    keys = Predictor.redis.keys(redis_key('*'))\n    unless keys.empty?\n      Predictor.redis.del(keys)\n    end\n  end\n\n  def ensure_similarity_limit_is_obeyed!\n    if similarity_limit\n      items = all_items\n      Predictor.redis.multi do |multi|\n        items.each do |item|\n          key = redis_key(:similarities, item)\n          multi.zremrangebyrank(key, 0, -(similarity_limit + 1))\n          multi.zunionstore key, [key] # Rewrite zset to take advantage of ziplist implementation.\n        end\n      end\n    end\n  end\n\n  private\n\n  def cache_similarity(item1, item2)\n    score = 0\n    
input_matrices.each do |key, matrix|\n      score += (matrix.score(item1, item2) * matrix.weight)\n    end\n    if score > 0\n      add_similarity_if_necessary(item1, item2, score)\n      add_similarity_if_necessary(item2, item1, score)\n    else\n      Predictor.redis.multi do |multi|\n        multi.zrem(redis_key(:similarities, item1), item2)\n        multi.zrem(redis_key(:similarities, item2), item1)\n      end\n    end\n  end\n\n  def add_similarity_if_necessary(item, similarity, score)\n    store = true\n    key = redis_key(:similarities, item)\n    if similarity_limit\n      if Predictor.redis.zrank(key, similarity).nil? && Predictor.redis.zcard(key) >= similarity_limit\n        # Similarity is not already stored and we are at limit of similarities\n        lowest_scored_item = Predictor.redis.zrangebyscore(key, \"0\", \"+inf\", limit: [0, 1], with_scores: true)\n        unless lowest_scored_item.empty?\n          # If score is less than or equal to the lowest score, don't store it. Otherwise, make room by removing the lowest scored similarity\n          score <= lowest_scored_item[0][1] ? store = false : Predictor.redis.zrem(key, lowest_scored_item[0][0])\n        end\n      end\n    end\n    Predictor.redis.zadd(key, score, similarity) if store\n  end\nend\n"
  },
  {
    "path": "lib/predictor/distance.rb",
    "content": "module Predictor\n  module Distance\n    extend self\n\n    def jaccard_index(key_1, key_2, redis = Predictor.redis)\n      x, y = nil\n\n      redis.multi do |multi|\n        x = multi.sinterstore 'temp', [key_1, key_2]\n        y = multi.sunionstore 'temp', [key_1, key_2]\n        multi.del 'temp'\n      end\n\n      y.value > 0 ? (x.value.to_f/y.value.to_f) : 0.0\n    end\n\n    def sorensen_coefficient(key_1, key_2, redis = Predictor.redis)\n      x, y, z = nil\n\n      redis.multi do |multi|\n        x = multi.sinterstore 'temp', [key_1, key_2]\n        y = multi.scard key_1\n        z = multi.scard key_2\n        multi.del 'temp'\n      end\n\n      denom = (y.value + z.value)\n      denom > 0 ? (2 * (x.value) / denom.to_f) : 0.0\n    end\n  end\nend\n"
  },
  {
    "path": "lib/predictor/input_matrix.rb",
    "content": "module Predictor\n  class InputMatrix\n    def initialize(opts)\n      @opts = opts\n    end\n\n    def measure_name\n      @opts.fetch(:measure, :jaccard_index)\n    end\n\n    def base\n      @opts[:base]\n    end\n\n    def parent_redis_key(*append)\n      base.redis_key(*append)\n    end\n\n    def redis_key(*append)\n      base.redis_key(@opts.fetch(:key), *append)\n    end\n\n    def weight\n      (@opts[:weight] || 1).to_f\n    end\n\n    def add_to_set(set, *items)\n      items = items.flatten if items.count == 1 && items[0].is_a?(Array)\n      if items.any?\n        Predictor.redis.multi do |redis|\n          redis.sadd(parent_redis_key(:all_items), items)\n          redis.sadd(redis_key(:items, set), items)\n\n          items.each do |item|\n            # add the set to the item's set--inverting the sets\n            redis.sadd(redis_key(:sets, item), set)\n          end\n        end\n      end\n    end\n\n    # Delete a specific relationship\n    def remove_from_set(set, item)\n      Predictor.redis.multi do |redis|\n        redis.srem(redis_key(:items, set), item)\n        redis.srem(redis_key(:sets, item), set)\n      end\n    end\n\n    def add_set(set, items)\n      add_to_set(set, *items)\n    end\n\n    def add_single(set, item)\n      add_to_set(set, item)\n    end\n\n    def items_for(set)\n      Predictor.redis.smembers redis_key(:items, set)\n    end\n\n    def sets_for(item)\n      Predictor.redis.sunion redis_key(:sets, item)\n    end\n\n    def related_items(item)\n      sets = Predictor.redis.smembers(redis_key(:sets, item))\n      keys = sets.map { |set| redis_key(:items, set) }\n      keys.length > 0 ? 
Predictor.redis.sunion(keys) - [item.to_s] : []\n    end\n\n    # delete item from the matrix\n    def delete_item(item)\n      Predictor.redis.watch(redis_key(:sets, item)) do\n        sets = Predictor.redis.smembers(redis_key(:sets, item))\n        Predictor.redis.multi do |multi|\n          sets.each do |set|\n            multi.srem(redis_key(:items, set), item)\n          end\n\n          multi.del redis_key(:sets, item)\n        end\n      end\n    end\n\n    def score(item1, item2)\n      Distance.send(measure_name, redis_key(:sets, item1), redis_key(:sets, item2), Predictor.redis)\n    end\n\n    def calculate_jaccard(item1, item2)\n      warn 'InputMatrix#calculate_jaccard is now deprecated. Use InputMatrix#score instead'\n      Distance.jaccard_index(redis_key(:sets, item1), redis_key(:sets, item2), Predictor.redis)\n    end\n  end\nend\n"
  },
  {
    "path": "lib/predictor/predictor.rb",
    "content": "module Predictor\n  @@redis = nil\n  @@redis_prefix = nil\n\n  def self.redis=(redis)\n    @@redis = redis\n  end\n\n  def self.redis\n    return @@redis unless @@redis.nil?\n    raise \"redis not configured! - Predictor.redis = Redis.new\"\n  end\n\n  def self.redis_prefix(prefix = nil, &block)\n    @@redis_prefix = block_given? ? block : prefix\n  end\n\n  def self.get_redis_prefix\n    if @@redis_prefix\n      if @@redis_prefix.respond_to?(:call)\n        @@redis_prefix.call\n      else\n        @@redis_prefix\n      end\n    else\n      'predictor'\n    end\n  end\n\n  def self.capitalize(str_or_sym)\n  \tstr = str_or_sym.to_s.each_char.to_a\n  \tstr.first.upcase + str[1..-1].join(\"\").downcase\n  end\n\n  def self.constantize(klass)\n    Object.module_eval(\"Predictor::#{klass}\", __FILE__, __LINE__)\n  end\n\n  def self.processing_technique(algorithm)\n    @technique = algorithm\n  end\n\n  def self.get_processing_technique\n    @technique || :ruby\n  end\n\n  def self.process_lua_script(*args)\n    @process_sha ||= redis.script(:load, PROCESS_ITEMS_LUA_SCRIPT)\n    redis.evalsha(@process_sha, argv: args)\n  end\n\n  PROCESS_ITEMS_LUA_SCRIPT = <<-LUA\n    local redis_prefix = ARGV[1]\n    local input_matrices = cjson.decode(ARGV[2])\n    local similarity_limit = tonumber(ARGV[3])\n    local item = ARGV[4]\n    local keys = {}\n\n    for name, options in pairs(input_matrices) do\n      local key = table.concat({redis_prefix, name, 'sets', item}, ':')\n      local sets = redis.call('SMEMBERS', key)\n      for _, set in ipairs(sets) do\n        table.insert(keys, table.concat({redis_prefix, name, 'items', set}, ':'))\n      end\n    end\n\n    -- Account for empty tables.\n    if next(keys) == nil then\n      return nil\n    end\n\n    local related_items = redis.call('SUNION', unpack(keys))\n\n    local function add_similarity_if_necessary(item, similarity, score)\n      local store = true\n      local key = table.concat({redis_prefix, 
'similarities', item}, ':')\n\n      if similarity_limit ~= nil then\n        -- ZRANK returns false (not nil) in Redis Lua scripts when the member is missing.\n        local zrank = redis.call('ZRANK', key, similarity)\n\n        if zrank == false then\n          local zcard = redis.call('ZCARD', key)\n\n          if zcard >= similarity_limit then\n            -- Similarity is not already stored and we are at limit of similarities.\n\n            local lowest_scored_item = redis.call('ZRANGEBYSCORE', key, '0', '+inf', 'withscores', 'limit', 0, 1)\n\n            if #lowest_scored_item > 0 then\n              -- If score is less than or equal to the lowest score, don't store it. Otherwise, make room by removing the lowest scored similarity\n              if score <= tonumber(lowest_scored_item[2]) then\n                store = false\n              else\n                redis.call('ZREM', key, lowest_scored_item[1])\n              end\n            end\n          end\n        end\n      end\n\n      if store then\n        redis.call('ZADD', key, score, similarity)\n      end\n    end\n\n    for i, related_item in ipairs(related_items) do\n      -- Disregard the current item.\n      if related_item ~= item then\n        local score = 0.0\n\n        for name, matrix in pairs(input_matrices) do\n          local s = 0.0\n\n          local key_1 = table.concat({redis_prefix, name, 'sets', item}, ':')\n          local key_2 = table.concat({redis_prefix, name, 'sets', related_item}, ':')\n\n          if matrix.measure == 'jaccard_index' then\n            local x = tonumber(redis.call('SINTERSTORE', 'temp', key_1, key_2))\n            local y = tonumber(redis.call('SUNIONSTORE', 'temp', key_1, key_2))\n            redis.call('DEL', 'temp')\n\n            if y > 0 then\n              s = s + (x / y)\n            end\n          elseif matrix.measure == 'sorensen_coefficient' then\n            local x = redis.call('SINTERSTORE', 'temp', key_1, key_2)\n            local y = redis.call('SCARD', key_1)\n            local z = redis.call('SCARD', key_2)\n\n            
redis.call('DEL', 'temp')\n\n            local denom = y + z\n            if denom > 0 then\n              s = s + (2 * x / denom)\n            end\n          else\n            error(\"Bad matrix.measure: \" .. matrix.measure)\n          end\n\n          score = score + (s * matrix.weight)\n        end\n\n        if score > 0 then\n          add_similarity_if_necessary(item, related_item, score)\n          add_similarity_if_necessary(related_item, item, score)\n        else\n          redis.call('ZREM', table.concat({redis_prefix, 'similarities', item}, ':'), related_item)\n          redis.call('ZREM', table.concat({redis_prefix, 'similarities', related_item}, ':'), item)\n        end\n      end\n    end\n  LUA\nend\n"
  },
  {
    "path": "lib/predictor/version.rb",
    "content": "module Predictor\n  VERSION = \"2.3.1\"\nend\n"
  },
  {
    "path": "lib/predictor.rb",
    "content": "require 'json'\nrequire \"redis\"\nrequire \"predictor/predictor\"\nrequire \"predictor/distance\"\nrequire \"predictor/input_matrix\"\nrequire \"predictor/base\"\n"
  },
  {
    "path": "predictor.gemspec",
    "content": "# -*- encoding: utf-8 -*-\nrequire File.expand_path('../lib/predictor/version', __FILE__)\n\nGem::Specification.new do |s|\n  s.name        = \"predictor\"\n  s.version     = Predictor::VERSION\n  s.platform    = Gem::Platform::RUBY\n  s.authors     = [\"Pathgather\"]\n  s.email       = [\"tech@pathgather.com\"]\n  s.homepage    = \"https://github.com/nyagato-00/predictor\"\n  s.description = s.summary = \"Fast and efficient recommendations and predictions using Redis\"\n  s.licenses    = [\"MIT\"]\n\n  s.add_dependency \"redis\", \">= 3.0.0\"\n\n  s.add_development_dependency \"rspec\", \">= 3.4.0\"\n  s.add_development_dependency \"rake\", \">= 11.0\"\n  s.add_development_dependency \"pry\"\n  s.add_development_dependency \"yard\"\n\n  s.files         = `git ls-files`.split(\"\\n\") - [\".gitignore\", \".rspec\", \".travis.yml\"]\n  s.test_files    = `git ls-files -- spec/*`.split(\"\\n\")\n  s.require_paths = [\"lib\"]\nend\n"
  },
  {
    "path": "spec/base_spec.rb",
    "content": "require 'spec_helper'\n\ndescribe Predictor::Base do\n  before(:each) do\n    flush_redis!\n    BaseRecommender.input_matrices = {}\n    BaseRecommender.reset_similarity_limit!\n    BaseRecommender.redis_prefix(nil)\n    UserRecommender.input_matrices = {}\n    UserRecommender.reset_similarity_limit!\n    BaseRecommender.processing_technique nil\n    UserRecommender.processing_technique nil\n    Predictor.processing_technique nil\n  end\n\n  describe \"configuration\" do\n    it \"should add an input_matrix by 'key'\" do\n      BaseRecommender.input_matrix(:myinput)\n      expect(BaseRecommender.input_matrices.keys).to eq([:myinput])\n    end\n\n    it \"should default the similarity_limit to 128\" do\n      expect(BaseRecommender.similarity_limit).to eq(128)\n    end\n\n    it \"should allow the similarity limit to be configured\" do\n      BaseRecommender.limit_similarities_to(500)\n      expect(BaseRecommender.similarity_limit).to eq(500)\n    end\n\n    it \"should allow the similarity limit to be removed\" do\n      BaseRecommender.limit_similarities_to(nil)\n      expect(BaseRecommender.similarity_limit).to eq(nil)\n    end\n\n    it \"should retrieve an input_matrix on a new instance\" do\n      BaseRecommender.input_matrix(:myinput)\n      sm = BaseRecommender.new\n      expect{ sm.myinput }.not_to raise_error\n    end\n\n    it \"should retrieve an input_matrix on a new instance and correctly overload respond_to?\" do\n      BaseRecommender.input_matrix(:myinput)\n      sm = BaseRecommender.new\n      expect(sm.respond_to?(:process!)).to be_truthy\n      expect(sm.respond_to?(:myinput)).to be_truthy\n      expect(sm.respond_to?(:fnord)).to be_falsey\n    end\n\n    it \"should retrieve an input_matrix on a new instance and intialize the correct class\" do\n      BaseRecommender.input_matrix(:myinput)\n      sm = BaseRecommender.new\n      expect(sm.myinput).to be_a(Predictor::InputMatrix)\n    end\n\n    it \"should accept a custom 
processing_technique, or default to Predictor's default\" do\n      expect(BaseRecommender.get_processing_technique).to eq(:ruby)\n      Predictor.processing_technique :lua\n      expect(BaseRecommender.get_processing_technique).to eq(:lua)\n      BaseRecommender.processing_technique :union\n      expect(BaseRecommender.get_processing_technique).to eq(:union)\n    end\n  end\n\n  describe \"redis_key\" do\n    it \"should vary based on the class name\" do\n      expect(BaseRecommender.new.redis_key).to eq('predictor-test:BaseRecommender')\n      expect(UserRecommender.new.redis_key).to eq('predictor-test:UserRecommender')\n    end\n  end\n\n  describe \"redis_key\" do\n    it \"should vary based on the class name\" do\n      expect(BaseRecommender.new.redis_key).to eq('predictor-test:BaseRecommender')\n      expect(UserRecommender.new.redis_key).to eq('predictor-test:UserRecommender')\n    end\n\n    it \"should be able to mimic the old naming defaults\" do\n      BaseRecommender.redis_prefix([nil])\n      expect(BaseRecommender.new.redis_key(:key)).to eq('predictor-test:key')\n    end\n\n    it \"should respect the Predictor prefix configuration setting\" do\n      br = BaseRecommender.new\n\n      expect(br.redis_key).to eq(\"predictor-test:BaseRecommender\")\n      expect(br.redis_key(:another)).to eq(\"predictor-test:BaseRecommender:another\")\n      expect(br.redis_key(:another, :key)).to eq(\"predictor-test:BaseRecommender:another:key\")\n      expect(br.redis_key(:another, [:set, :of, :keys])).to eq(\"predictor-test:BaseRecommender:another:set:of:keys\")\n\n      i = 0\n      Predictor.redis_prefix { i += 1 }\n      expect(br.redis_key).to eq(\"1:BaseRecommender\")\n      expect(br.redis_key(:another)).to eq(\"2:BaseRecommender:another\")\n      expect(br.redis_key(:another, :key)).to eq(\"3:BaseRecommender:another:key\")\n      expect(br.redis_key(:another, [:set, :of, :keys])).to eq(\"4:BaseRecommender:another:set:of:keys\")\n\n      Predictor.redis_prefix 
nil\n      expect(br.redis_key).to eq(\"predictor:BaseRecommender\")\n      expect(br.redis_key(:another)).to eq(\"predictor:BaseRecommender:another\")\n      expect(br.redis_key(:another, :key)).to eq(\"predictor:BaseRecommender:another:key\")\n      expect(br.redis_key(:another, [:set, :of, :keys])).to eq(\"predictor:BaseRecommender:another:set:of:keys\")\n\n      Predictor.redis_prefix [nil]\n      expect(br.redis_key).to eq(\"BaseRecommender\")\n      expect(br.redis_key(:another)).to eq(\"BaseRecommender:another\")\n      expect(br.redis_key(:another, :key)).to eq(\"BaseRecommender:another:key\")\n      expect(br.redis_key(:another, [:set, :of, :keys])).to eq(\"BaseRecommender:another:set:of:keys\")\n\n      Predictor.redis_prefix { [1, 2, 3] }\n      expect(br.redis_key).to eq(\"1:2:3:BaseRecommender\")\n      expect(br.redis_key(:another)).to eq(\"1:2:3:BaseRecommender:another\")\n      expect(br.redis_key(:another, :key)).to eq(\"1:2:3:BaseRecommender:another:key\")\n      expect(br.redis_key(:another, [:set, :of, :keys])).to eq(\"1:2:3:BaseRecommender:another:set:of:keys\")\n\n      Predictor.redis_prefix 'predictor-test'\n      expect(br.redis_key).to eq(\"predictor-test:BaseRecommender\")\n      expect(br.redis_key(:another)).to eq(\"predictor-test:BaseRecommender:another\")\n      expect(br.redis_key(:another, :key)).to eq(\"predictor-test:BaseRecommender:another:key\")\n      expect(br.redis_key(:another, [:set, :of, :keys])).to eq(\"predictor-test:BaseRecommender:another:set:of:keys\")\n    end\n\n    it \"should respect the class prefix configuration setting\" do\n      br = BaseRecommender.new\n\n      BaseRecommender.redis_prefix('base')\n      expect(br.redis_key).to eq(\"predictor-test:base\")\n      expect(br.redis_key(:another)).to eq(\"predictor-test:base:another\")\n      expect(br.redis_key(:another, :key)).to eq(\"predictor-test:base:another:key\")\n      expect(br.redis_key(:another, [:set, :of, :keys])).to 
eq(\"predictor-test:base:another:set:of:keys\")\n\n      i = 0\n      BaseRecommender.redis_prefix { i += 1 }\n      expect(br.redis_key).to eq(\"predictor-test:1\")\n      expect(br.redis_key(:another)).to eq(\"predictor-test:2:another\")\n      expect(br.redis_key(:another, :key)).to eq(\"predictor-test:3:another:key\")\n      expect(br.redis_key(:another, [:set, :of, :keys])).to eq(\"predictor-test:4:another:set:of:keys\")\n\n      BaseRecommender.redis_prefix(nil)\n      expect(br.redis_key).to eq(\"predictor-test:BaseRecommender\")\n      expect(br.redis_key(:another)).to eq(\"predictor-test:BaseRecommender:another\")\n      expect(br.redis_key(:another, :key)).to eq(\"predictor-test:BaseRecommender:another:key\")\n      expect(br.redis_key(:another, [:set, :of, :keys])).to eq(\"predictor-test:BaseRecommender:another:set:of:keys\")\n    end\n\n    it \"should respect the instance prefix configuration setting\" do\n      br = PrefixRecommender.new(\"foo\")\n\n      expect(br.redis_key).to eq(\"predictor-test:PrefixRecommender:foo\")\n      expect(br.redis_key(:another)).to eq(\"predictor-test:PrefixRecommender:foo:another\")\n      expect(br.redis_key(:another, :key)).to eq(\"predictor-test:PrefixRecommender:foo:another:key\")\n      expect(br.redis_key(:another, [:set, :of, :keys])).to eq(\"predictor-test:PrefixRecommender:foo:another:set:of:keys\")\n\n\n      br.prefix = nil\n      expect(br.redis_key).to eq(\"predictor-test:PrefixRecommender\")\n      expect(br.redis_key(:another)).to eq(\"predictor-test:PrefixRecommender:another\")\n\n    end\n  end\n\n  describe \"all_items\" do\n    it \"returns all items across all matrices\" do\n      BaseRecommender.input_matrix(:anotherinput)\n      BaseRecommender.input_matrix(:yetanotherinput)\n      sm = BaseRecommender.new\n      sm.add_to_matrix(:anotherinput, 'a', \"foo\", \"bar\")\n      sm.add_to_matrix(:yetanotherinput, 'b', \"fnord\", \"shmoo\", \"bar\")\n      expect(sm.all_items).to include('foo', 'bar', 
'fnord', 'shmoo')\n      expect(sm.all_items.length).to eq(4)\n    end\n\n    it \"doesn't return items from other recommenders\" do\n      BaseRecommender.input_matrix(:anotherinput)\n      BaseRecommender.input_matrix(:yetanotherinput)\n      UserRecommender.input_matrix(:anotherinput)\n      UserRecommender.input_matrix(:yetanotherinput)\n      sm = BaseRecommender.new\n      sm.add_to_matrix(:anotherinput, 'a', \"foo\", \"bar\")\n      sm.add_to_matrix(:yetanotherinput, 'b', \"fnord\", \"shmoo\", \"bar\")\n      expect(sm.all_items).to include('foo', 'bar', 'fnord', 'shmoo')\n      expect(sm.all_items.length).to eq(4)\n\n      ur = UserRecommender.new\n      expect(ur.all_items).to eq([])\n    end\n  end\n\n  describe \"add_to_matrix\" do\n    it \"calls add_to_set on the given matrix\" do\n      BaseRecommender.input_matrix(:anotherinput)\n      sm = BaseRecommender.new\n      expect(sm.anotherinput).to receive(:add_to_set).with('a', 'foo', 'bar')\n      sm.add_to_matrix(:anotherinput, 'a', 'foo', 'bar')\n    end\n\n    it \"adds the items to the all_items storage\" do\n      BaseRecommender.input_matrix(:anotherinput)\n      sm = BaseRecommender.new\n      sm.add_to_matrix(:anotherinput, 'a', 'foo', 'bar')\n      expect(sm.all_items).to include('foo', 'bar')\n    end\n  end\n\n  describe \"add_to_matrix!\" do\n    it \"calls add_to_matrix and process_items! 
for the given items\" do\n      BaseRecommender.input_matrix(:anotherinput)\n      sm = BaseRecommender.new\n      expect(sm).to receive(:add_to_matrix).with(:anotherinput, 'a', 'foo')\n      expect(sm).to receive(:process_items!).with('foo')\n      sm.add_to_matrix!(:anotherinput, 'a', 'foo')\n    end\n  end\n\n  describe \"related_items\" do\n    it \"returns items in the sets across all matrices that the given item is also in\" do\n      BaseRecommender.input_matrix(:anotherinput)\n      BaseRecommender.input_matrix(:yetanotherinput)\n      BaseRecommender.input_matrix(:finalinput)\n      sm = BaseRecommender.new\n      sm.anotherinput.add_to_set('a', \"foo\", \"bar\")\n      sm.yetanotherinput.add_to_set('b', \"fnord\", \"shmoo\", \"bar\")\n      sm.finalinput.add_to_set('c', \"nada\")\n      sm.process!\n      expect(sm.related_items(\"bar\")).to include(\"foo\", \"fnord\", \"shmoo\")\n      expect(sm.related_items(\"bar\").length).to eq(3)\n    end\n  end\n\n  describe \"predictions_for\" do\n    it \"accepts an :on option to return scores of specific objects\" do\n      BaseRecommender.input_matrix(:users, weight: 4.0)\n      BaseRecommender.input_matrix(:tags, weight: 1.0)\n      sm = BaseRecommender.new\n      sm.users.add_to_set('me', \"foo\", \"bar\", \"fnord\")\n      sm.users.add_to_set('not_me', \"foo\", \"shmoo\")\n      sm.users.add_to_set('another', \"fnord\", \"other\")\n      sm.users.add_to_set('another', \"nada\")\n      sm.tags.add_to_set('tag1', \"foo\", \"fnord\", \"shmoo\")\n      sm.tags.add_to_set('tag2', \"bar\", \"shmoo\", \"other\")\n      sm.tags.add_to_set('tag3', \"shmoo\", \"nada\")\n      sm.process!\n      predictions = sm.predictions_for('me', matrix_label: :users, on: 'other', with_scores: true)\n      expect(predictions).to eq([['other', 3.0]])\n      predictions = sm.predictions_for('me', matrix_label: :users, on: ['other'], with_scores: true)\n      expect(predictions).to eq([['other', 3.0]])\n      predictions = 
sm.predictions_for('me', matrix_label: :users, on: ['other', 'nada'], with_scores: true)\n      expect(predictions).to eq([['other', 3.0], ['nada', 2.0]])\n      predictions = sm.predictions_for(item_set: [\"foo\", \"bar\", \"fnord\"], on: ['other', 'nada'], with_scores: true)\n      expect(predictions).to eq([['other', 3.0], ['nada', 2.0]])\n      predictions = sm.predictions_for(item_set: [\"foo\", \"bar\", \"fnord\"], on: ['other', 'nada'])\n      expect(predictions).to eq(['other', 'nada'])\n      predictions = sm.predictions_for('me', matrix_label: :users, on: ['shmoo', 'other', 'nada'], offset: 1, limit: 1, with_scores: true)\n      expect(predictions).to eq([[\"other\", 3.0]])\n      predictions = sm.predictions_for('me', matrix_label: :users, on: ['shmoo', 'other', 'nada'], offset: 1, with_scores: true)\n      expect(predictions).to eq([['other', 3.0], ['nada', 2.0]])\n    end\n  end\n\n  [:ruby, :lua, :union].each do |technique|\n    describe \"predictions_for with #{technique} processing\" do\n      before do\n        Predictor.processing_technique(technique)\n      end\n\n      it \"returns relevant predictions\" do\n        BaseRecommender.input_matrix(:users, weight: 4.0)\n        BaseRecommender.input_matrix(:tags, weight: 1.0)\n        sm = BaseRecommender.new\n        sm.users.add_to_set('me', \"foo\", \"bar\", \"fnord\")\n        sm.users.add_to_set('not_me', \"foo\", \"shmoo\")\n        sm.users.add_to_set('another', \"fnord\", \"other\")\n        sm.users.add_to_set('another', \"nada\")\n        sm.tags.add_to_set('tag1', \"foo\", \"fnord\", \"shmoo\")\n        sm.tags.add_to_set('tag2', \"bar\", \"shmoo\")\n        sm.tags.add_to_set('tag3', \"shmoo\", \"nada\")\n        sm.process!\n        predictions = sm.predictions_for('me', matrix_label: :users)\n        expect(predictions).to eq([\"shmoo\", \"other\", \"nada\"])\n        predictions = sm.predictions_for(item_set: [\"foo\", \"bar\", \"fnord\"])\n        expect(predictions).to 
eq([\"shmoo\", \"other\", \"nada\"])\n        predictions = sm.predictions_for('me', matrix_label: :users, offset: 1, limit: 1)\n        expect(predictions).to eq([\"other\"])\n        predictions = sm.predictions_for('me', matrix_label: :users, offset: 1)\n        expect(predictions).to eq([\"other\", \"nada\"])\n      end\n\n      it \"accepts a :boost option\" do\n        BaseRecommender.input_matrix(:users, weight: 4.0)\n        BaseRecommender.input_matrix(:tags, weight: 1.0)\n        sm = BaseRecommender.new\n        sm.users.add_to_set('me', \"foo\", \"bar\", \"fnord\")\n        sm.users.add_to_set('not_me', \"foo\", \"shmoo\")\n        sm.users.add_to_set('another', \"fnord\", \"other\")\n        sm.users.add_to_set('another', \"nada\")\n        sm.tags.add_to_set('tag1', \"foo\", \"fnord\", \"shmoo\")\n        sm.tags.add_to_set('tag2', \"bar\", \"shmoo\")\n        sm.tags.add_to_set('tag3', \"shmoo\", \"nada\")\n        sm.process!\n\n        # Syntax #1: Tags passed as array, weights assumed to be 1.0\n        predictions = sm.predictions_for('me', matrix_label: :users, boost: {tags: ['tag3']})\n        expect(predictions).to eq([\"shmoo\", \"nada\", \"other\"])\n        predictions = sm.predictions_for(item_set: [\"foo\", \"bar\", \"fnord\"], boost: {tags: ['tag3']})\n        expect(predictions).to eq([\"shmoo\", \"nada\", \"other\"])\n        predictions = sm.predictions_for('me', matrix_label: :users, offset: 1, limit: 1, boost: {tags: ['tag3']})\n        expect(predictions).to eq([\"nada\"])\n        predictions = sm.predictions_for('me', matrix_label: :users, offset: 1, boost: {tags: ['tag3']})\n        expect(predictions).to eq([\"nada\", \"other\"])\n\n        # Syntax #2: Weights explicitly set.\n        predictions = sm.predictions_for('me', matrix_label: :users, boost: {tags: {values: ['tag3'], weight: 1.0}})\n        expect(predictions).to eq([\"shmoo\", \"nada\", \"other\"])\n        predictions = sm.predictions_for(item_set: [\"foo\", 
\"bar\", \"fnord\"], boost: {tags: {values: ['tag3'], weight: 1.0}})\n        expect(predictions).to eq([\"shmoo\", \"nada\", \"other\"])\n        predictions = sm.predictions_for('me', matrix_label: :users, offset: 1, limit: 1, boost: {tags: {values: ['tag3'], weight: 1.0}})\n        expect(predictions).to eq([\"nada\"])\n        predictions = sm.predictions_for('me', matrix_label: :users, offset: 1, boost: {tags: {values: ['tag3'], weight: 1.0}})\n        expect(predictions).to eq([\"nada\", \"other\"])\n\n        # Make sure weights are actually being passed to Redis.\n        shmoo, nada, other = sm.predictions_for('me', matrix_label: :users, boost: {tags: {values: ['tag3'], weight: 10000.0}}, with_scores: true)\n        expect(shmoo[0]).to eq('shmoo')\n        expect(shmoo[1]).to be > 10000\n        expect(nada[0]).to eq('nada')\n        expect(nada[1]).to be > 10000\n        expect(other[0]).to eq('other')\n        expect(other[1]).to be < 10\n      end\n\n      it \"accepts a :boost option, even with an empty item set\" do\n        BaseRecommender.input_matrix(:users, weight: 4.0)\n        BaseRecommender.input_matrix(:tags, weight: 1.0)\n        sm = BaseRecommender.new\n        sm.users.add_to_set('not_me', \"foo\", \"shmoo\")\n        sm.users.add_to_set('another', \"fnord\", \"other\")\n        sm.users.add_to_set('another', \"nada\")\n        sm.tags.add_to_set('tag1', \"foo\", \"fnord\", \"shmoo\")\n        sm.tags.add_to_set('tag2', \"bar\", \"shmoo\")\n        sm.tags.add_to_set('tag3', \"shmoo\", \"nada\")\n        sm.process!\n\n        # Syntax #1: Tags passed as array, weights assumed to be 1.0\n        predictions = sm.predictions_for('me', matrix_label: :users, boost: {tags: ['tag3']})\n        expect(predictions).to eq([\"shmoo\", \"nada\"])\n        predictions = sm.predictions_for(item_set: [], boost: {tags: ['tag3']})\n        expect(predictions).to eq([\"shmoo\", \"nada\"])\n        predictions = sm.predictions_for('me', matrix_label: 
:users, offset: 1, limit: 1, boost: {tags: ['tag3']})\n        expect(predictions).to eq([\"nada\"])\n        predictions = sm.predictions_for('me', matrix_label: :users, offset: 1, boost: {tags: ['tag3']})\n        expect(predictions).to eq([\"nada\"])\n\n        # Syntax #2: Weights explicitly set.\n        predictions = sm.predictions_for('me', matrix_label: :users, boost: {tags: {values: ['tag3'], weight: 1.0}})\n        expect(predictions).to eq([\"shmoo\", \"nada\"])\n        predictions = sm.predictions_for(item_set: [], boost: {tags: {values: ['tag3'], weight: 1.0}})\n        expect(predictions).to eq([\"shmoo\", \"nada\"])\n        predictions = sm.predictions_for('me', matrix_label: :users, offset: 1, limit: 1, boost: {tags: {values: ['tag3'], weight: 1.0}})\n        expect(predictions).to eq([\"nada\"])\n        predictions = sm.predictions_for('me', matrix_label: :users, offset: 1, boost: {tags: {values: ['tag3'], weight: 1.0}})\n        expect(predictions).to eq([\"nada\"])\n      end\n    end\n\n    describe \"process_items! 
with #{technique} processing\" do\n      before do\n        Predictor.processing_technique(technique)\n      end\n\n      context \"with no similarity_limit\" do\n        it \"calculates the similarity between the item and all related_items (other items in a set the given item is in)\" do\n          BaseRecommender.input_matrix(:myfirstinput)\n          BaseRecommender.input_matrix(:mysecondinput)\n          BaseRecommender.input_matrix(:mythirdinput, weight: 3.0)\n          sm = BaseRecommender.new\n          sm.myfirstinput.add_to_set 'set1', 'item1', 'item2'\n          sm.mysecondinput.add_to_set 'set2', 'item2', 'item3'\n          sm.mythirdinput.add_to_set 'set3', 'item2', 'item3'\n          sm.mythirdinput.add_to_set 'set4', 'item1', 'item2', 'item3'\n          expect(sm.similarities_for('item2')).to be_empty\n          sm.process_items!('item2')\n          similarities = sm.similarities_for('item2')\n          expect(similarities).to eq([\"item3\", \"item1\"])\n        end\n      end\n\n      context \"with a similarity_limit\" do\n        it \"calculates the similarity between the item and all related_items (other items in a set the given item is in), but obeys the similarity_limit\" do\n          BaseRecommender.input_matrix(:myfirstinput)\n          BaseRecommender.input_matrix(:mysecondinput)\n          BaseRecommender.input_matrix(:mythirdinput, weight: 3.0)\n          BaseRecommender.limit_similarities_to(1)\n          sm = BaseRecommender.new\n          sm.myfirstinput.add_to_set 'set1', 'item1', 'item2'\n          sm.mysecondinput.add_to_set 'set2', 'item2', 'item3'\n          sm.mythirdinput.add_to_set 'set3', 'item2', 'item3'\n          sm.mythirdinput.add_to_set 'set4', 'item1', 'item2', 'item3'\n          expect(sm.similarities_for('item2')).to be_empty\n          sm.process_items!('item2')\n          similarities = sm.similarities_for('item2')\n          expect(similarities).to include(\"item3\")\n          expect(similarities.length).to eq(1)\n 
       end\n      end\n    end\n  end\n\n  describe \"similarities_for\" do\n    it \"should not throw an exception for non-existent items\" do\n      sm = BaseRecommender.new\n      expect(sm.similarities_for(\"not_existing_item\").length).to eq(0)\n    end\n\n    it \"correctly weighs and sums input matrices\" do\n      BaseRecommender.input_matrix(:users, weight: 1.0)\n      BaseRecommender.input_matrix(:tags, weight: 2.0)\n      BaseRecommender.input_matrix(:topics, weight: 4.0)\n\n      sm = BaseRecommender.new\n\n      sm.users.add_to_set('user1', \"c1\", \"c2\", \"c4\")\n      sm.users.add_to_set('user2', \"c3\", \"c4\")\n      sm.topics.add_to_set('topic1', \"c1\", \"c4\")\n      sm.topics.add_to_set('topic2', \"c2\", \"c3\")\n      sm.tags.add_to_set('tag1', \"c1\", \"c2\", \"c4\")\n      sm.tags.add_to_set('tag2', \"c1\", \"c4\")\n\n      sm.process!\n      expect(sm.similarities_for(\"c1\", with_scores: true)).to eq([[\"c4\", 6.5], [\"c2\", 2.0]])\n      expect(sm.similarities_for(\"c2\", with_scores: true)).to eq([[\"c3\", 4.0], [\"c1\", 2.0], [\"c4\", 1.5]])\n      expect(sm.similarities_for(\"c3\", with_scores: true)).to eq([[\"c2\", 4.0], [\"c4\", 0.5]])\n      expect(sm.similarities_for(\"c4\", with_scores: true, exclusion_set: [\"c3\"])).to eq([[\"c1\", 6.5], [\"c2\", 1.5]])\n    end\n  end\n\n  describe \"sets_for\" do\n    it \"should return all the sets the given item is in\" do\n      BaseRecommender.input_matrix(:set1)\n      BaseRecommender.input_matrix(:set2)\n      sm = BaseRecommender.new\n      sm.set1.add_to_set \"item1\", \"foo\", \"bar\"\n      sm.set1.add_to_set \"item2\", \"nada\", \"bar\"\n      sm.set2.add_to_set \"item3\", \"bar\", \"other\"\n      expect(sm.sets_for(\"bar\").length).to eq(3)\n      expect(sm.sets_for(\"bar\")).to include(\"item1\", \"item2\", \"item3\")\n      expect(sm.sets_for(\"other\")).to eq([\"item3\"])\n    end\n  end\n\n  describe \"process!\" do\n    it \"should call process_items! for all items\" do\n     
 BaseRecommender.input_matrix(:anotherinput)\n      BaseRecommender.input_matrix(:yetanotherinput)\n      sm = BaseRecommender.new\n      sm.anotherinput.add_to_set('a', \"foo\", \"bar\")\n      sm.yetanotherinput.add_to_set('b', \"fnord\", \"shmoo\")\n      expect(sm.all_items).to include(\"foo\", \"bar\", \"fnord\", \"shmoo\")\n      expect(sm).to receive(:process_items!).with(*sm.all_items)\n      sm.process!\n    end\n  end\n\n  describe \"delete_pair_from_matrix!\" do\n    it \"should call remove_from_set on the matrix\" do\n      BaseRecommender.input_matrix(:anotherinput)\n      sm = BaseRecommender.new\n      sm.anotherinput.add_to_set('a', \"foo\")\n      sm.anotherinput.add_to_set('a', \"bar\")\n      sm.anotherinput.add_to_set('a', \"shmoo\")\n      sm.process!\n      expect(sm.similarities_for('bar')).to include('foo', 'shmoo')\n      expect(sm.anotherinput).to receive(:remove_from_set).with('a', 'foo')\n      sm.delete_pair_from_matrix!(:anotherinput, 'a', 'foo')\n    end\n\n    it \"updates similarities\" do\n      BaseRecommender.input_matrix(:anotherinput)\n      sm = BaseRecommender.new\n      sm.anotherinput.add_to_set('a', \"foo\")\n      sm.anotherinput.add_to_set('a', \"bar\")\n      sm.anotherinput.add_to_set('a', \"shmoo\")\n      sm.process!\n      expect(sm.similarities_for('bar')).to include('foo', 'shmoo')\n      sm.delete_pair_from_matrix!(:anotherinput, 'a', 'foo')\n      expect(sm.similarities_for('bar')).to eq(['shmoo'])\n    end\n  end\n\n  describe \"delete_from_matrix!\" do\n    it \"calls delete_item on the matrix\" do\n      BaseRecommender.input_matrix(:anotherinput)\n      BaseRecommender.input_matrix(:yetanotherinput)\n      sm = BaseRecommender.new\n      sm.anotherinput.add_to_set('a', \"foo\", \"bar\")\n      sm.yetanotherinput.add_to_set('b', \"bar\", \"shmoo\")\n      sm.process!\n      expect(sm.similarities_for('bar')).to include('foo', 'shmoo')\n      expect(sm.anotherinput).to receive(:delete_item).with('foo')\n      
sm.delete_from_matrix!(:anotherinput, 'foo')\n    end\n\n    it \"updates similarities\" do\n      BaseRecommender.input_matrix(:anotherinput)\n      BaseRecommender.input_matrix(:yetanotherinput)\n      sm = BaseRecommender.new\n      sm.anotherinput.add_to_set('a', \"foo\", \"bar\")\n      sm.yetanotherinput.add_to_set('b', \"bar\", \"shmoo\")\n      sm.process!\n      expect(sm.similarities_for('bar')).to include('foo', 'shmoo')\n      sm.delete_from_matrix!(:anotherinput, 'foo')\n      expect(sm.similarities_for('bar')).to eq(['shmoo'])\n    end\n  end\n\n  describe \"delete_item!\" do\n    it \"should call delete_item on each input_matrix\" do\n      BaseRecommender.input_matrix(:myfirstinput)\n      BaseRecommender.input_matrix(:mysecondinput)\n      sm = BaseRecommender.new\n      expect(sm.myfirstinput).to receive(:delete_item).with(\"fnorditem\")\n      expect(sm.mysecondinput).to receive(:delete_item).with(\"fnorditem\")\n      sm.delete_item!(\"fnorditem\")\n    end\n\n    it \"should remove the item from all_items\" do\n      BaseRecommender.input_matrix(:anotherinput)\n      sm = BaseRecommender.new\n      sm.anotherinput.add_to_set('a', \"foo\", \"bar\")\n      sm.process!\n      expect(sm.all_items).to include('foo')\n      sm.delete_item!('foo')\n      expect(sm.all_items).not_to include('foo')\n    end\n\n    it \"should remove the item's similarities and also remove the item from related_items' similarities\" do\n      BaseRecommender.input_matrix(:anotherinput)\n      BaseRecommender.input_matrix(:yetanotherinput)\n      sm = BaseRecommender.new\n      sm.anotherinput.add_to_set('a', \"foo\", \"bar\")\n      sm.yetanotherinput.add_to_set('b', \"bar\", \"shmoo\")\n      sm.process!\n      expect(sm.similarities_for('bar')).to include('foo', 'shmoo')\n      expect(sm.similarities_for('shmoo')).to include('bar')\n      sm.delete_item!('shmoo')\n      expect(sm.similarities_for('bar')).not_to include('shmoo')\n      
expect(sm.similarities_for('shmoo')).to be_empty\n    end\n  end\n\n  describe \"clean!\" do\n    it \"should clean out the Redis storage for this Predictor\" do\n      BaseRecommender.input_matrix(:set1)\n      BaseRecommender.input_matrix(:set2)\n      sm = BaseRecommender.new\n      sm.set1.add_to_set \"item1\", \"foo\", \"bar\"\n      sm.set1.add_to_set \"item2\", \"nada\", \"bar\"\n      sm.set2.add_to_set \"item3\", \"bar\", \"other\"\n\n      expect(Predictor.redis.keys(sm.redis_key('*'))).not_to be_empty\n      sm.clean!\n      expect(Predictor.redis.keys(sm.redis_key('*'))).to be_empty\n    end\n  end\n\n  describe \"ensure_similarity_limit_is_obeyed!\" do\n    it \"should shorten similarities to the given limit and rewrite the zset\" do\n      BaseRecommender.limit_similarities_to(nil)\n\n      BaseRecommender.input_matrix(:myfirstinput)\n      sm = BaseRecommender.new\n      sm.myfirstinput.add_to_set *(['set1'] + 130.times.map{|i| \"item#{i}\"})\n      expect(sm.similarities_for('item2')).to be_empty\n      sm.process_items!('item2')\n      expect(sm.similarities_for('item2').length).to eq(129)\n\n      redis = Predictor.redis\n      key = sm.redis_key(:similarities, 'item2')\n      expect(redis.zcard(key)).to eq(129)\n      expect(redis.object(:encoding, key)).to eq('skiplist') # Inefficient\n\n      BaseRecommender.reset_similarity_limit!\n      sm.ensure_similarity_limit_is_obeyed!\n\n      expect(redis.zcard(key)).to eq(128)\n      expect(redis.object(:encoding, key)).to eq('ziplist') # Efficient\n    end\n  end\nend\n"
  },
  {
    "path": "spec/input_matrix_spec.rb",
    "content": "require 'spec_helper'\n\ndescribe Predictor::InputMatrix do\n  let(:options) { @default_options.merge(@options) }\n\n  before(:each) { @options = {} }\n\n  before(:all) do\n    @base = BaseRecommender.new\n    @default_options = { base: @base, key: \"mymatrix\" }\n    @matrix = Predictor::InputMatrix.new(@default_options)\n  end\n\n  before(:each) do\n    flush_redis!\n  end\n\n  describe \"redis_key\" do\n    it \"should respect the global namespace configuration\" do\n      expect(@matrix.redis_key).to eq(\"predictor-test:BaseRecommender:mymatrix\")\n      expect(@matrix.redis_key(:another)).to eq(\"predictor-test:BaseRecommender:mymatrix:another\")\n      expect(@matrix.redis_key(:another, :key)).to eq(\"predictor-test:BaseRecommender:mymatrix:another:key\")\n      expect(@matrix.redis_key(:another, [:set, :of, :keys])).to eq(\"predictor-test:BaseRecommender:mymatrix:another:set:of:keys\")\n\n      i = 0\n      Predictor.redis_prefix { i += 1 }\n      expect(@matrix.redis_key).to eq(\"1:BaseRecommender:mymatrix\")\n      expect(@matrix.redis_key(:another)).to eq(\"2:BaseRecommender:mymatrix:another\")\n      expect(@matrix.redis_key(:another, :key)).to eq(\"3:BaseRecommender:mymatrix:another:key\")\n      expect(@matrix.redis_key(:another, [:set, :of, :keys])).to eq(\"4:BaseRecommender:mymatrix:another:set:of:keys\")\n\n      Predictor.redis_prefix(nil)\n      expect(@matrix.redis_key).to eq(\"predictor:BaseRecommender:mymatrix\")\n      expect(@matrix.redis_key(:another)).to eq(\"predictor:BaseRecommender:mymatrix:another\")\n      expect(@matrix.redis_key(:another, :key)).to eq(\"predictor:BaseRecommender:mymatrix:another:key\")\n      expect(@matrix.redis_key(:another, [:set, :of, :keys])).to eq(\"predictor:BaseRecommender:mymatrix:another:set:of:keys\")\n\n      Predictor.redis_prefix('predictor-test')\n      expect(@matrix.redis_key).to eq(\"predictor-test:BaseRecommender:mymatrix\")\n      expect(@matrix.redis_key(:another)).to 
eq(\"predictor-test:BaseRecommender:mymatrix:another\")\n      expect(@matrix.redis_key(:another, :key)).to eq(\"predictor-test:BaseRecommender:mymatrix:another:key\")\n      expect(@matrix.redis_key(:another, [:set, :of, :keys])).to eq(\"predictor-test:BaseRecommender:mymatrix:another:set:of:keys\")\n    end\n\n    it \"should respect the class-level configuration\" do\n      i = 0\n      BaseRecommender.redis_prefix { i += 1 }\n      expect(@matrix.redis_key).to eq(\"predictor-test:1:mymatrix\")\n      expect(@matrix.redis_key(:another)).to eq(\"predictor-test:2:mymatrix:another\")\n      expect(@matrix.redis_key(:another, :key)).to eq(\"predictor-test:3:mymatrix:another:key\")\n      expect(@matrix.redis_key(:another, [:set, :of, :keys])).to eq(\"predictor-test:4:mymatrix:another:set:of:keys\")\n\n      BaseRecommender.redis_prefix([nil])\n      expect(@matrix.redis_key).to eq(\"predictor-test:mymatrix\")\n      expect(@matrix.redis_key(:another)).to eq(\"predictor-test:mymatrix:another\")\n      expect(@matrix.redis_key(:another, :key)).to eq(\"predictor-test:mymatrix:another:key\")\n      expect(@matrix.redis_key(:another, [:set, :of, :keys])).to eq(\"predictor-test:mymatrix:another:set:of:keys\")\n\n      BaseRecommender.redis_prefix(['a', 'b'])\n      expect(@matrix.redis_key).to eq(\"predictor-test:a:b:mymatrix\")\n      expect(@matrix.redis_key(:another)).to eq(\"predictor-test:a:b:mymatrix:another\")\n      expect(@matrix.redis_key(:another, :key)).to eq(\"predictor-test:a:b:mymatrix:another:key\")\n      expect(@matrix.redis_key(:another, [:set, :of, :keys])).to eq(\"predictor-test:a:b:mymatrix:another:set:of:keys\")\n\n      BaseRecommender.redis_prefix(nil)\n      expect(@matrix.redis_key).to eq(\"predictor-test:BaseRecommender:mymatrix\")\n      expect(@matrix.redis_key(:another)).to eq(\"predictor-test:BaseRecommender:mymatrix:another\")\n      expect(@matrix.redis_key(:another, :key)).to eq(\"predictor-test:BaseRecommender:mymatrix:another:key\")\n  
     expect(@matrix.redis_key(:another, [:set, :of, :keys])).to eq(\"predictor-test:BaseRecommender:mymatrix:another:set:of:keys\")\n    end\n  end\n\n  describe \"weight\" do\n    it \"returns the weight configured or a default of 1\" do\n      expect(@matrix.weight).to eq(1.0)  # default weight\n      matrix = Predictor::InputMatrix.new(redis_prefix: \"predictor-test\", key: \"mymatrix\", weight: 5.0)\n      expect(matrix.weight).to eq(5.0)\n    end\n  end\n\n  describe \"add_to_set\" do\n    it \"adds each member of the set to the key's 'items' set\" do\n      expect(@matrix.items_for(\"item1\")).not_to include(\"foo\", \"bar\", \"fnord\", \"blubb\")\n      @matrix.add_to_set \"item1\", \"foo\", \"bar\", \"fnord\", \"blubb\"\n      expect(@matrix.items_for(\"item1\")).to include(\"foo\", \"bar\", \"fnord\", \"blubb\")\n    end\n\n    it \"does not crash if the set of items is empty\" do\n      @matrix.add_to_set \"item1\"\n      @matrix.add_to_set \"item1\", []\n    end\n\n    it \"adds the key to each set member's 'sets' set\" do\n      expect(@matrix.sets_for(\"foo\")).not_to include(\"item1\")\n      expect(@matrix.sets_for(\"bar\")).not_to include(\"item1\")\n      expect(@matrix.sets_for(\"fnord\")).not_to include(\"item1\")\n      expect(@matrix.sets_for(\"blubb\")).not_to include(\"item1\")\n      @matrix.add_to_set \"item1\", \"foo\", \"bar\", \"fnord\", \"blubb\"\n      expect(@matrix.sets_for(\"foo\")).to include(\"item1\")\n      expect(@matrix.sets_for(\"bar\")).to include(\"item1\")\n      expect(@matrix.sets_for(\"fnord\")).to include(\"item1\")\n      expect(@matrix.sets_for(\"blubb\")).to include(\"item1\")\n    end\n  end\n\n  describe \"items_for\" do\n    it \"returns the items in the given set ID\" do\n      @matrix.add_to_set \"item1\", [\"foo\", \"bar\", \"fnord\", \"blubb\"]\n      expect(@matrix.items_for(\"item1\")).to include(\"foo\", \"bar\", \"fnord\", \"blubb\")\n      @matrix.add_to_set \"item2\", [\"foo\", \"bar\", \"snafu\", 
\"nada\"]\n      expect(@matrix.items_for(\"item2\")).to include(\"foo\", \"bar\", \"snafu\", \"nada\")\n      expect(@matrix.items_for(\"item1\")).not_to include(\"snafu\", \"nada\")\n    end\n  end\n\n  describe \"sets_for\" do\n    it \"returns the set IDs the given item is in\" do\n      @matrix.add_to_set \"item1\", [\"foo\", \"bar\", \"fnord\", \"blubb\"]\n      @matrix.add_to_set \"item2\", [\"foo\", \"bar\", \"snafu\", \"nada\"]\n      expect(@matrix.sets_for(\"foo\")).to include(\"item1\", \"item2\")\n      expect(@matrix.sets_for(\"snafu\")).to eq([\"item2\"])\n    end\n  end\n\n  describe \"related_items\" do\n    it \"returns the items in sets the given item is also in\" do\n      @matrix.add_to_set \"item1\", [\"foo\", \"bar\", \"fnord\", \"blubb\"]\n      @matrix.add_to_set \"item2\", [\"foo\", \"bar\", \"snafu\", \"nada\"]\n      @matrix.add_to_set \"item3\", [\"nada\", \"other\"]\n      expect(@matrix.related_items(\"bar\")).to include(\"foo\", \"fnord\", \"blubb\", \"snafu\", \"nada\")\n      expect(@matrix.related_items(\"bar\").length).to eq(5)\n      expect(@matrix.related_items(\"other\")).to eq([\"nada\"])\n      expect(@matrix.related_items(\"snafu\")).to include(\"foo\", \"bar\", \"nada\")\n      expect(@matrix.related_items(\"snafu\").length).to eq(3)\n    end\n  end\n\n  describe \"delete_item\" do\n    before do\n      @matrix.add_to_set \"item1\", \"foo\", \"bar\", \"fnord\", \"blubb\"\n      @matrix.add_to_set \"item2\", \"foo\", \"bar\", \"snafu\", \"nada\"\n      @matrix.add_to_set \"item3\", \"nada\", \"other\"\n    end\n\n    it \"should delete the item from sets it is in\" do\n      expect(@matrix.items_for(\"item1\")).to include(\"bar\")\n      expect(@matrix.items_for(\"item2\")).to include(\"bar\")\n      expect(@matrix.sets_for(\"bar\")).to include(\"item1\", \"item2\")\n      @matrix.delete_item(\"bar\")\n      expect(@matrix.items_for(\"item1\")).not_to include(\"bar\")\n      expect(@matrix.items_for(\"item2\")).not_to 
include(\"bar\")\n      expect(@matrix.sets_for(\"bar\")).to be_empty\n    end\n  end\n\n  describe \"#score\" do\n    let(:matrix) { Predictor::InputMatrix.new(options) }\n\n    context \"default\" do\n      it \"scores as Jaccard index by default\" do\n        matrix.add_to_set \"item1\", \"foo\", \"bar\", \"fnord\", \"blubb\"\n        matrix.add_to_set \"item2\", \"bar\", \"fnord\", \"shmoo\", \"snafu\"\n        matrix.add_to_set \"item3\", \"bar\", \"nada\", \"snafu\"\n\n        expect(matrix.score(\"bar\", \"snafu\")).to eq(2.0/3.0)\n      end\n\n      it \"scores as Jaccard index when given the option\" do\n        matrix = Predictor::InputMatrix.new(options.merge(measure: :jaccard_index))\n        matrix.add_to_set \"item1\", \"foo\", \"bar\", \"fnord\", \"blubb\"\n        matrix.add_to_set \"item2\", \"bar\", \"fnord\", \"shmoo\", \"snafu\"\n        matrix.add_to_set \"item3\", \"bar\", \"nada\", \"snafu\"\n\n        expect(matrix.score(\"bar\", \"snafu\")).to eq(2.0/3.0)\n      end\n\n      it \"should handle missing sets\" do\n        matrix.add_to_set \"item1\", \"foo\", \"bar\", \"fnord\", \"blubb\"\n\n        expect(matrix.score(\"is\", \"missing\")).to eq(0.0)\n      end\n    end\n\n    context \"sorensen_coefficient\" do\n      before { @options[:measure] = :sorensen_coefficient }\n\n      it \"should calculate the correct Sørensen coefficient\" do\n        matrix.add_to_set \"item1\", \"foo\", \"bar\", \"fnord\", \"blubb\"\n        matrix.add_to_set \"item2\", \"fnord\", \"shmoo\", \"snafu\"\n        matrix.add_to_set \"item3\", \"bar\", \"nada\", \"snafu\"\n\n        expect(matrix.score(\"bar\", \"snafu\")).to eq(2.0/4.0)\n      end\n\n      it \"should handle missing sets\" do\n        matrix.add_to_set \"item1\", \"foo\", \"bar\", \"fnord\", \"blubb\"\n\n        expect(matrix.score(\"is\", \"missing\")).to eq(0.0)\n      end\n    end\n  end\n\n  private\n\n  def add_two_item_test_data!(matrix)\n    matrix.add_to_set(\"user42\", \"fnord\", \"blubb\")\n    
matrix.add_to_set(\"user44\", \"blubb\")\n    matrix.add_to_set(\"user46\", \"fnord\")\n    matrix.add_to_set(\"user48\", \"fnord\", \"blubb\")\n    matrix.add_to_set(\"user50\", \"fnord\")\n  end\n\n  def add_three_item_test_data!(matrix)\n    matrix.add_to_set(\"user42\", \"fnord\", \"blubb\", \"shmoo\")\n    matrix.add_to_set(\"user44\", \"blubb\")\n    matrix.add_to_set(\"user46\", \"fnord\", \"shmoo\")\n    matrix.add_to_set(\"user48\", \"fnord\", \"blubb\")\n    matrix.add_to_set(\"user50\", \"fnord\", \"shmoo\")\n  end\n\nend\n"
  },
  {
    "path": "spec/predictor_spec.rb",
    "content": "require 'spec_helper'\n\ndescribe Predictor do\n\n  it \"should store a redis connection\" do\n    Predictor.redis = \"asd\"\n    expect(Predictor.redis).to eq(\"asd\")\n  end\n\n  it \"should raise an exception if unconfigured redis connection is accessed\" do\n    Predictor.redis = nil\n    expect{ Predictor.redis }.to raise_error(/not configured/i)\n  end\n\nend\n"
  },
  {
    "path": "spec/spec_helper.rb",
    "content": "require \"predictor\"\nrequire \"pry\"\n\ndef flush_redis!\n  Predictor.redis = Redis.new\n  Predictor.redis.keys(\"predictor-test*\").each do |k|\n    Predictor.redis.del(k)\n  end\nend\n\nPredictor.redis_prefix \"predictor-test\"\n\nclass BaseRecommender\n  include Predictor::Base\nend\n\nclass UserRecommender\n  include Predictor::Base\nend\n\nclass TestRecommender\n  include Predictor::Base\n\n  input_matrix :jaccard_one\nend\n\nclass PrefixRecommender\n  include Predictor::Base\n\n  def initialize(prefix)\n    @prefix = prefix\n  end\n\n  def prefix=(new_prefix)\n    @prefix = new_prefix\n  end\n\n  def get_redis_prefix\n    @prefix\n  end\nend\n\nclass Predictor::TestInputMatrix\n  def initialize(opts)\n    @opts = opts\n  end\n\n  def method_missing(method, *args)\n    @opts[method]\n  end\nend\n"
  }
]