forked from stojg/recommend

PHP library that can be used for recommendations to users

Recommend

This library makes it easier to find recommendations and similarities between different things. There are a couple of use cases for it:

  • Recommend a list of music albums/artists to a user
  • Recommend an article that is similar to the current one that a user is reading
  • Find other users that have the same values as another user (think matchmaking ;)

Installation

The easiest way to install this library in your project is with Composer:

composer require stojg/recommend

Usage

Suppose we have some data in which users have rated artists on a scale of one to five:

$artistRatings = array(
	"Abe" => array(
		"Blues Traveler" => 3,
		"Broken Bells" => 2,
		"Norah Jones" => 4,
		"Phoenix" => 5,
		"Slightly Stoopid" => 1,
		"The Strokes" => 2,
		"Vampire Weekend" => 2
	),
	"Blair" => array(
		"Blues Traveler" => 2,
		"Broken Bells" => 3,
		"Deadmau5" => 4,
		"Phoenix" => 2,
		"Slightly Stoopid" => 3,
		"Vampire Weekend" => 3
	),
	"Clair" => array(
		"Blues Traveler" => 5,
		"Broken Bells" => 1,
		"Deadmau5" => 1,
		"Norah Jones" => 3,
		"Phoenix" => 5,
		"Slightly Stoopid" => 1
	)
);

Start by loading this data into the Data class:

$data = new \stojg\recommend\Data($artistRatings);

If we want to find artists that Blair might like, we execute the recommend method.

$recommendations = $data->recommend('Blair', new \stojg\recommend\strategy\Manhattan());
var_export($recommendations);

The result of that computation would be:

array (
  0 => array (
	'key' => 'Norah Jones',
	'value' => 4,
  ),
  1 => array (
	'key' => 'The Strokes',
	'value' => 2,
  )
)

This means that Blair might like Norah Jones. The Strokes, on the other hand, are less likely to fit her taste.

The Recommender works by finding someone in $artistRatings who has rated artists similarly to Blair. In this case that turns out to be Abe, so it then looks for artists that Abe has rated but Blair has not, and returns them as a list of recommendations.

How the 'nearest' neighbour is found depends on which strategy is chosen and on how big and dense the dataset is.
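The nearest-neighbour idea described above can be sketched in plain PHP. This is only an illustration and not the library's actual code: the function names `manhattanDistance` and `recommendFor` are made up for this example, and it assumes that distance is measured only over artists both users have rated.

```php
<?php
// Illustrative sketch only; function names are hypothetical and this is
// not the library's implementation.
$artistRatings = [
    'Abe' => ['Blues Traveler' => 3, 'Broken Bells' => 2, 'Norah Jones' => 4,
        'Phoenix' => 5, 'Slightly Stoopid' => 1, 'The Strokes' => 2, 'Vampire Weekend' => 2],
    'Blair' => ['Blues Traveler' => 2, 'Broken Bells' => 3, 'Deadmau5' => 4,
        'Phoenix' => 2, 'Slightly Stoopid' => 3, 'Vampire Weekend' => 3],
    'Clair' => ['Blues Traveler' => 5, 'Broken Bells' => 1, 'Deadmau5' => 1,
        'Norah Jones' => 3, 'Phoenix' => 5, 'Slightly Stoopid' => 1],
];

/** Sum of absolute rating differences over the items both users rated. */
function manhattanDistance(array $a, array $b): int
{
    $distance = 0;
    foreach (array_intersect_key($a, $b) as $item => $rating) {
        $distance += abs($rating - $b[$item]);
    }
    return $distance;
}

/** Items the nearest neighbour rated and $user did not, best rated first. */
function recommendFor(string $user, array $ratings): array
{
    $nearest = null;
    $best = PHP_INT_MAX;
    foreach ($ratings as $other => $otherRatings) {
        // Skip the user themselves and anyone with no rated items in common.
        if ($other === $user || !array_intersect_key($ratings[$user], $otherRatings)) {
            continue;
        }
        $distance = manhattanDistance($ratings[$user], $otherRatings);
        if ($distance < $best) {
            $best = $distance;
            $nearest = $other;
        }
    }
    if ($nearest === null) {
        return [];
    }
    $unseen = array_diff_key($ratings[$nearest], $ratings[$user]);
    arsort($unseen); // highest rating first
    $result = [];
    foreach ($unseen as $item => $rating) {
        $result[] = ['key' => $item, 'value' => $rating];
    }
    return $result;
}

$recommendations = recommendFor('Blair', $artistRatings);
```

With the example data, Abe's distance to Blair is 8 and Clair's is 13, so Abe is the nearest neighbour and the sketch yields Norah Jones (4) followed by The Strokes (2).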

Dataset

As a general rule, the bigger the dataset, the better. It has to be an array in the following format:

array(
	'uniqueID' => array(
		'objectID' => (int)'rating'
	)
);
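If your ratings start out as flat rows (for example from a database query), they need to be grouped into this nested shape first. A minimal sketch, assuming hypothetical rows of the form `[user, item, rating]`; the helper name `toDataset` is made up for this example:

```php
<?php
// Hypothetical input: one row per rating, with the rating as a string
// as it might come back from a database driver.
$rows = [
    ['Abe', 'Phoenix', '5'],
    ['Abe', 'Norah Jones', '4'],
    ['Blair', 'Phoenix', '2'],
];

/** Group flat [user, item, rating] rows into user => [item => rating]. */
function toDataset(array $rows): array
{
    $dataset = [];
    foreach ($rows as [$user, $item, $rating]) {
        $dataset[$user][$item] = (int) $rating; // ratings are stored as integers
    }
    return $dataset;
}

$dataset = toDataset($rows);
```

The resulting `$dataset` has the same shape as `$artistRatings` in the usage example.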

Strategies

There are currently three strategies (four, if you count Manhattan separately). Which one to pick depends on how the data is organized and populated.

Minkowski

If the data is dense (almost every objectID has a non-zero rating) and the magnitude of the ratings is important, this is a good strategy.

It takes a "dimension" of 1 or higher. The higher the dimension, the more a large difference in a single rating dominates the "score".

Manhattan

Manhattan is a shortcut for a Minkowski with a dimension of one.
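The Minkowski distance of dimension r is the r-th root of the sum of the absolute rating differences, each raised to the power r. The sketch below illustrates that formula in plain PHP and is not the library's code: `minkowskiDistance` is a hypothetical name, and comparing only the items both users have rated is an assumption.

```php
<?php
// Illustrative sketch only; not the library's implementation.

/** Minkowski distance of dimension $r over the items both users rated. */
function minkowskiDistance(array $a, array $b, int $r): float
{
    $sum = 0.0;
    foreach (array_intersect_key($a, $b) as $item => $rating) {
        $sum += pow(abs($rating - $b[$item]), $r);
    }
    return pow($sum, 1 / $r);
}

$abe = ['Blues Traveler' => 3, 'Broken Bells' => 2, 'Phoenix' => 5,
    'Slightly Stoopid' => 1, 'Vampire Weekend' => 2];
$blair = ['Blues Traveler' => 2, 'Broken Bells' => 3, 'Phoenix' => 2,
    'Slightly Stoopid' => 3, 'Vampire Weekend' => 3];

$manhattan = minkowskiDistance($abe, $blair, 1); // dimension 1: Manhattan
$euclidean = minkowskiDistance($abe, $blair, 2); // dimension 2: Euclidean
```

For these two users the differences are 1, 1, 3, 2 and 1, so dimension 1 gives 8 and dimension 2 gives 4. At dimension 2 the single difference of 3 (Phoenix) already accounts for 9 of the 16 under the root, which is the effect of a higher dimension.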

Paerson

Use this strategy, which is based on the Pearson correlation coefficient, if the data is subject to grade inflation.

For example, if I rate most items between 2 and 4 and you rate them between 4 and 5, this strategy tries to compensate for the fact that my worst (2) is equal to your worst (4).
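To make the grade-inflation point concrete, here is a small sketch of the Pearson correlation coefficient in plain PHP. It is an illustration under assumptions and not the library's code: `pearsonCorrelation` is a hypothetical name, and the correlation is computed only over items both users have rated.

```php
<?php
// Illustrative sketch only; not the library's implementation.

/** Pearson correlation (-1 to 1) over the items both users rated. */
function pearsonCorrelation(array $a, array $b): float
{
    $common = array_keys(array_intersect_key($a, $b));
    $n = count($common);
    if ($n === 0) {
        return 0.0;
    }
    $meanA = $meanB = 0.0;
    foreach ($common as $item) {
        $meanA += $a[$item];
        $meanB += $b[$item];
    }
    $meanA /= $n;
    $meanB /= $n;

    $cov = $varA = $varB = 0.0;
    foreach ($common as $item) {
        $da = $a[$item] - $meanA; // deviation from each user's own average
        $db = $b[$item] - $meanB;
        $cov += $da * $db;
        $varA += $da * $da;
        $varB += $db * $db;
    }
    $denominator = sqrt($varA * $varB);
    return $denominator == 0.0 ? 0.0 : $cov / $denominator;
}

// I rate between 2 and 4, you rate between 4 and 5, but we agree on ordering.
$me  = ['Blues Traveler' => 2, 'Broken Bells' => 4, 'Norah Jones' => 2, 'Phoenix' => 4];
$you = ['Blues Traveler' => 4, 'Broken Bells' => 5, 'Norah Jones' => 4, 'Phoenix' => 5];

$similarity = pearsonCorrelation($me, $you);
```

Because each rating is compared with that user's own average, the two users above come out as perfectly correlated (1.0) even though their raw ratings never match.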

Cosine

This is the strategy to pick if the data is sparse.

For example, if there is a list of ten thousand artists, it is quite likely that each user has only listened to and rated a few of them.

It basically disregards the null values so they don't influence the similarity score.
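A minimal sketch of cosine similarity in plain PHP shows why missing ratings do not matter. It is an illustration and not the library's code: `cosineSimilarity` is a hypothetical name, and treating an unrated item as zero is an assumption. An item that only one user has rated adds nothing to the dot product.

```php
<?php
// Illustrative sketch only; not the library's implementation.

/** Cosine similarity, treating items a user has not rated as zero. */
function cosineSimilarity(array $a, array $b): float
{
    $dot = 0.0;
    // Only items rated by both users contribute to the dot product.
    foreach (array_intersect_key($a, $b) as $item => $rating) {
        $dot += $rating * $b[$item];
    }
    $normA = sqrt(array_sum(array_map(fn($x) => $x * $x, $a)));
    $normB = sqrt(array_sum(array_map(fn($x) => $x * $x, $b)));
    if ($normA == 0.0 || $normB == 0.0) {
        return 0.0;
    }
    return $dot / ($normA * $normB);
}

$a = ['Phoenix' => 3, 'Deadmau5' => 4];
$b = ['Phoenix' => 6, 'Deadmau5' => 8];
$c = ['Norah Jones' => 5];

$same = cosineSimilarity($a, $b);     // same direction
$disjoint = cosineSimilarity($a, $c); // no artists in common
```

Here `$a` and `$b` point in the same direction and score 1.0, while `$a` and `$c` share no artists and score 0.0.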
