GridSearch feature (NicolasHug#7)
* Ignored .idea files generated by PyCharm

* Create GridSearch class:
 - Import itertools
 - Add an __init__ function that initializes class instance
 - Add evaluate method that evaluates the parameters on a dataset

* Fixed evaluate() in GridSearch:
 - tests on correct algorithm instance

* Created GridSearch testing file:
 - Added a function to make sure the combinations returned are correct by checking their number

* Updated GridSearch evaluate():
 - Added best score, best parameters and best index attributes
 - Removed 1 attribute and added cv_results_  attribute analogous to sklearn implementation
 - Added tests for best attributes for RMSE and FCP measures

* First draft of non-negative matrix factorization

* Added verbose parameter to evaluate method:
 - Defaults to True
 - Added some local variables needed for verbose messages
 - Changed the loop to use enumerate to follow a similar code structure

* More doc for NMF, plus some tests

* tests for CoClustering algorithm

* update README.md

* Update README.md

* Added a biased version for NMF

* Update CONTRIBUTING.md

* update TODO.md

* Changed the GridSearch evaluate method to accept multiple measures:
 - Best attributes are now dicts with measures as keys
 - Change the test to adapt to the new parameters of evaluate
 - Add absolute value to tests

* Added parameter documentation to the GridSearch class and refactored GridSearch parameters:
 - Added parameter documentation
 - Renamed the algo parameter to algo_class
 - Changed the default measures from ['RMSE'] to ['rmse', 'mae'], matching evaluate()

* Made GridSearch best attributes not case sensitive:
 - Removed duplicate definition of attributes
 - Changed definition from dict to Case insensitive dict
 - Added a test to make sure input parameters and output attributes are not case sensitive

* Corrected an if condition that might lead to an undesired situation

* Added a clip option to the predict method

* Added params and measures as keys for cv_results_

* Created 3 verbosity levels:
 - 0: Do not print anything
 - 1: Print params when a combination starts and mean scores when it finishes
 - 2: Print the same info as 1, plus the score on each fold

* Added best estimator attribute:
 - Best algorithm instance for a given measure
 - Gives the user the ability to use it like any other algorithm class instance
 - Added a test for this attribute

* Added documentation for the GridSearch class

* Remove @classmethod attribute.
Corrected test cases: the old evaluate method and grid search evaluate give the best results

* Added CaseInsensitiveDefaultDictForBestResults class:
 - It is a clone of CaseInsensitiveDefaultDict but without overriding __str__ method
 - Users can now print the dict output normally for the best
 - Replaced the usage of CaseInsensitiveDefaultDict with CaseInsensitiveDefaultDictForBestResults in the GridSearch class

* Added User-Guide for GridSearch feature:
 - Added an example file that contains the code of the user-guide
 - Edited the getting started .rst file to add the guide

* Refactored some parts of the code:
 - Used enumerate instead of an index to count in the loop
 - Changed cv_results_ to a defaultdict(list)
 - Reduced the populating of scores and parameters to one block

* Refactored code to use evaluate() method:
 - No need to manually iterate over folds
 - Some verbose print statements avoided

* Addressed a set of simple enhancements:
 - Reduced the number of iterations in some test functions to reduce testing time
 - Added reference to GridSearchCV from sklearn
 - Fixed test_measure_is_not_case_sensitive to actually fail if we have a bad key
 - Added a few comments
 - Changed the verbose behaviour of GridSearch evaluate
 - Reduced line sizes to less than 80 chars

* Changed measure to upper case from the start

* Make grid search test and example PEP-8 compliant.
 - One import in the example file is intentionally left at the end of the file

* Fixed errors and warning when building docs:
 - Renamed GridSearch attributes by removing the trailing underscore (solved the errors)
 - Gave different names to the code blocks (solved the warnings)

* Removed specifying unicode character 'u' from gridsearch test
mahermalaeb authored and NicolasHug committed Jan 2, 2017
1 parent 39930ff commit 714be0b
Showing 6 changed files with 322 additions and 4 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -15,6 +15,7 @@ surprise/prediction_algorithms/optimize_baselines.c
surprise/prediction_algorithms/slope_one.c
surprise/prediction_algorithms/co_clustering.c
*.so
.idea/*

Gemfile.lock
_site
2 changes: 1 addition & 1 deletion doc/source/evaluate.rst
@@ -5,4 +5,4 @@ evaluate module

.. automodule:: surprise.evaluate
:members:
:exclude-members: CaseInsensitiveDefaultDict
:exclude-members: CaseInsensitiveDefaultDict, CaseInsensitiveDefaultDictForBestResults
61 changes: 61 additions & 0 deletions doc/source/getting_started.rst
@@ -94,6 +94,67 @@ Advanced usage
We will here get a little deeper on what can `Surprise
<https://nicolashug.github.io/Surprise/>`_ do for you.

.. _tuning_algorithm_parameters:

Tune algorithm parameters
~~~~~~~~~~~~~~~~~~~~~~~~~
The :func:`evaluate() <surprise.evaluate.evaluate>` function gives us the
results for one set of parameters passed to the algorithm. If you want to
try the algorithm with different sets of parameters, the
:class:`GridSearch <surprise.evaluate.GridSearch>` class comes to the rescue.
Given a ``dict`` with parameter names as keys and lists of values as values,
this class exhaustively tries all the combinations of parameters and helps
find the best combination for an accuracy measure. It is analogous to
`GridSearchCV <http://scikit-learn.org/stable/modules/generated/sklearn.model
_selection.GridSearchCV.html>`_ from sklearn.

For instance, suppose that we want to tune the parameters of the
:class:`SVD <surprise.prediction_algorithms.matrix_factorization.SVD>`
algorithm. Some of its parameters are ``n_epochs``, ``lr_all`` and
``reg_all``, so we define a parameter grid as follows:

.. literalinclude:: ../../examples/grid_search_usage.py
:caption: From file ``examples/grid_search_usage.py``
:name: grid_search_usage.py
:lines: 13-14

Next we define a :class:`GridSearch <surprise.evaluate.GridSearch>` instance
and give it
:class:`SVD <surprise.prediction_algorithms.matrix_factorization.SVD>` as the
algorithm class and ``param_grid`` as the parameters to tune. We will compute
both the RMSE and FCP values for all the combinations, hence the following
definition:

.. literalinclude:: ../../examples/grid_search_usage.py
:caption: From file ``examples/grid_search_usage.py``
:name: grid_search_usage2.py
:lines: 16

Now that the :class:`GridSearch <surprise.evaluate.GridSearch>` instance is
ready, we can evaluate it on the data: we first prepare the dataset and then
call the ``evaluate()`` method of
:class:`GridSearch <surprise.evaluate.GridSearch>`:

.. literalinclude:: ../../examples/grid_search_usage.py
:caption: From file ``examples/grid_search_usage.py``
:name: grid_search_usage3.py
:lines: 19-22

Everything is now ready to read the results. For example, we can get the best
RMSE and FCP scores and the parameter combinations that produced them as
follows:

.. literalinclude:: ../../examples/grid_search_usage.py
:caption: From file ``examples/grid_search_usage.py``
:name: grid_search_usage4.py
:lines: 24-32

For further analysis, we can easily load all the results into a pandas
``DataFrame`` as follows:

.. literalinclude:: ../../examples/grid_search_usage.py
:caption: From file ``examples/grid_search_usage.py``
:name: grid_search_usage5.py
:lines: 34-36

.. _iterate_over_folds:

Manually iterate over folds
37 changes: 37 additions & 0 deletions examples/grid_search_usage.py
@@ -0,0 +1,37 @@
"""
This module describes how to manually train and test an algorithm without using
the evaluate() function.
"""

from __future__ import (absolute_import, division, print_function,
                        unicode_literals)

from surprise.evaluate import GridSearch
from surprise.prediction_algorithms import SVD
from surprise.dataset import Dataset

param_grid = {'n_epochs': [5, 10], 'lr_all': [0.002, 0.005],
              'reg_all': [0.4, 0.6]}

gridSearch = GridSearch(SVD, param_grid, measures=['RMSE', 'FCP'])

# Prepare Data
data = Dataset.load_builtin('ml-100k')
data.split(n_folds=3)

gridSearch.evaluate(data)

# best RMSE score
print(gridSearch.best_score['RMSE'])
# combination of parameters that gave the best RMSE score
print(gridSearch.best_params['RMSE'])

# best FCP score
print(gridSearch.best_score['FCP'])
# combination of parameters that gave the best FCP score
print(gridSearch.best_params['FCP'])

import pandas as pd

results_df = pd.DataFrame.from_dict(gridSearch.cv_results)
print(results_df)
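
The cv_results dict passed to pd.DataFrame.from_dict above is filled in by GridSearch.evaluate() (see surprise/evaluate.py below) with a 'params' list, a 'scores' list and flattened per-parameter and per-measure lists, one entry per parameter combination. A rough sketch of the resulting frame for a hypothetical two-combination grid; the numbers are invented and the exact columns depend on the grid and measures used:

    import pandas as pd

    # Hypothetical cv_results layout with invented scores.
    cv_results = {
        'params': [{'n_epochs': 5, 'lr_all': 0.002},
                   {'n_epochs': 10, 'lr_all': 0.002}],
        'scores': [{'RMSE': 0.99, 'FCP': 0.68},
                   {'RMSE': 0.96, 'FCP': 0.70}],
        'n_epochs': [5, 10],
        'lr_all': [0.002, 0.002],
        'RMSE': [0.99, 0.96],
        'FCP': [0.68, 0.70],
    }

    results_df = pd.DataFrame.from_dict(cv_results)
    print(results_df)  # one row per parameter combination
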
155 changes: 152 additions & 3 deletions surprise/evaluate.py
@@ -1,6 +1,5 @@
"""
The :mod:`evaluate` module defines the :func:`evaluate` function.
"""
"""The :mod:`evaluate` module defines the :func:`evaluate` function and
:class:`GridSearch` class """

from __future__ import (absolute_import, division, print_function,
                        unicode_literals)
@@ -11,6 +10,7 @@
import numpy as np
from six import iteritems
from six import itervalues
from itertools import product

from . import accuracy
from .dump import dump
@@ -129,3 +129,152 @@ def __str__(self):
                       for (key, vals) in iteritems(self))

        return s


class GridSearch:
    """Evaluate the performance of the algorithm on all the combinations of
    parameters given to it. It is analogous to
    `GridSearchCV <http://scikit-learn.org/stable/modules/generated/sklearn.
    model_selection.GridSearchCV.html>`_ from sklearn.

    Used to study the effect of parameters on algorithms and to extract the
    best parameters.

    Depending on the nature of the ``data`` parameter, it may or may not
    perform cross validation.

    Parameters:
        algo_class(:obj:`AlgoBase \
            <surprise.prediction_algorithms.algo_base.AlgoBase>`):
            The algorithm to evaluate.
        param_grid (dict):
            The dictionary has algo_class parameters as keys (string) and
            lists of parameter values as the desired values to try. All
            combinations will be evaluated with the desired algorithm.
        measures(list of string):
            The performance measures to compute. Allowed names are function
            names as defined in the :mod:`accuracy <surprise.accuracy>`
            module. Default is ``['rmse', 'mae']``.
        verbose(int):
            Level of verbosity. If 0, nothing is printed. If 1 (default),
            accuracy measures for each parameters combination are printed,
            with the combination values. If 2, fold accuracy values are also
            printed.

    Attributes:
        cv_results (dict of arrays):
            A dict that contains all parameters and accuracy information for
            each combination. Can be imported into a pandas ``DataFrame``.
        best_estimator (dict of AlgoBase):
            Using an accuracy measure as key, get the estimator that gave the
            best accuracy results for the chosen measure.
        best_score (dict of floats):
            Using an accuracy measure as key, get the best score achieved for
            that measure.
        best_params (dict of dicts):
            Using an accuracy measure as key, get the parameters combination
            that gave the best accuracy results for the chosen measure.
        best_index (dict of ints):
            Using an accuracy measure as key, get the index into
            ``cv_results`` of the combination that gave the best accuracy
            results for that measure.
    """

    def __init__(self, algo_class, param_grid, measures=['rmse', 'mae'],
                 verbose=1):
        self.best_params = CaseInsensitiveDefaultDictForBestResults(list)
        self.best_index = CaseInsensitiveDefaultDictForBestResults(list)
        self.best_score = CaseInsensitiveDefaultDictForBestResults(list)
        self.best_estimator = CaseInsensitiveDefaultDictForBestResults(list)
        self.cv_results = defaultdict(list)
        self.algo_class = algo_class
        self.param_grid = param_grid
        self.measures = [measure.upper() for measure in measures]
        self.verbose = verbose
        self.param_combinations = [dict(zip(param_grid, v)) for v in
                                   product(*param_grid.values())]

    def evaluate(self, data):
        """Runs the grid search on the dataset.

        Class instance attributes can be accessed after evaluate is done.

        Args:
            data (:obj:`Dataset <surprise.dataset.Dataset>`): The dataset on
                which to evaluate the algorithm.
        """

        params = []
        scores = []

        # evaluate each combination of parameters using the evaluate method
        for combination_index, combination in enumerate(
                self.param_combinations):
            params.append(combination)

            if self.verbose >= 1:
                num_of_combinations = len(self.param_combinations)
                print('Parameters combination {} from {}'.
                      format(combination_index + 1, num_of_combinations))
                print('params: ', combination)

            # the algorithm to use along with the combination parameters
            algo_instance = self.algo_class(**combination)
            evaluate_results = evaluate(algo_instance, data,
                                        measures=self.measures,
                                        verbose=(self.verbose == 2))

            # measures as keys and fold averages as values
            mean_score = {}
            for measure in self.measures:
                mean_score[measure] = np.mean(evaluate_results[measure])
            scores.append(mean_score)

            if self.verbose == 1:
                print('-' * 12)
                print('-' * 12)
                for measure in self.measures:
                    print('Mean {0:4s}: {1:1.4f}'.format(
                        measure, mean_score[measure]))
                print('-' * 12)
                print('-' * 12)

        # Add all scores and parameters lists to the dict
        self.cv_results['params'] = params
        self.cv_results['scores'] = scores

        # Add accuracy measures and algorithm parameters as keys to the dict
        for param, score in zip(params, scores):
            for param_key, score_key in zip(param.keys(), score.keys()):
                self.cv_results[param_key].append(param[param_key])
                self.cv_results[score_key].append(score[score_key])

        # Get the best results
        for measure in self.measures:
            if measure == 'FCP':
                best_dict = max(self.cv_results['scores'],
                                key=lambda x: x[measure])
            else:
                best_dict = min(self.cv_results['scores'],
                                key=lambda x: x[measure])
            self.best_score[measure] = best_dict[measure]
            self.best_index[measure] = self.cv_results['scores'].index(
                best_dict)
            self.best_params[measure] = self.cv_results['params'][
                self.best_index[measure]]
            self.best_estimator[measure] = self.algo_class(
                **self.best_params[measure])


class CaseInsensitiveDefaultDictForBestResults(defaultdict):
    """Same as CaseInsensitiveDefaultDict but without overriding __str__,
    because it is not relevant to the "best" attributes."""

    def __setitem__(self, key, value):
        super(CaseInsensitiveDefaultDictForBestResults, self).__setitem__(
            key.lower(), value)

    def __getitem__(self, key):
        return super(CaseInsensitiveDefaultDictForBestResults,
                     self).__getitem__(key.lower())
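
GridSearch.__init__ above builds the parameter combinations with itertools.product over the grid's value lists, and dict(zip(...)) pairs each value tuple back with its parameter names. A minimal standalone sketch of that idiom, using a made-up two-parameter grid not tied to any Surprise class:

    from itertools import product

    param_grid = {'n_epochs': [5, 10], 'lr_all': [0.002, 0.005]}

    # One dict per combination: keys come from param_grid, values from the
    # cartesian product of the value lists (2 x 2 = 4 combinations here).
    combinations = [dict(zip(param_grid, values))
                    for values in product(*param_grid.values())]

    print(len(combinations))  # 4
    print(combinations[0])    # e.g. {'n_epochs': 5, 'lr_all': 0.002}

Because keys() and values() of the same dict iterate in a consistent order, each value tuple lines up with the right parameter name.
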
70 changes: 70 additions & 0 deletions tests/test_grid_search.py
@@ -0,0 +1,70 @@
"""
Module for testing SearchGrid class.
"""

from __future__ import (absolute_import, division, print_function,
                        unicode_literals)

import os

from surprise.evaluate import GridSearch
from surprise.dataset import Dataset
from surprise.dataset import Reader
from surprise.prediction_algorithms import SVD
from surprise.evaluate import evaluate

# the test and train files are from the ml-100k dataset (10% of u1.base and
# 10 % of u1.test)
train_file = os.path.join(os.path.dirname(__file__), './u1_ml100k_train')
test_file = os.path.join(os.path.dirname(__file__), './u1_ml100k_test')
data = Dataset.load_from_folds([(train_file, test_file)], Reader('ml-100k'))


def test_grid_search_cv_results():
    param_grid = {'n_epochs': [2, 4], 'lr_all': [0.002, 0.005],
                  'reg_all': [0.4, 0.6]}
    grid_search = GridSearch(SVD, param_grid)
    grid_search.evaluate(data)
    assert len(grid_search.cv_results['params']) == 8


def test_best_rmse():
    param_grid = {'n_epochs': [5, 10], 'lr_all': [0.002, 0.005],
                  'reg_all': [0.4, 0.6]}
    grid_search = GridSearch(SVD, param_grid)
    grid_search.evaluate(data)
    assert grid_search.best_index['RMSE'] == 7
    assert grid_search.best_params['RMSE'] == {
        'lr_all': 0.005, 'reg_all': 0.6, 'n_epochs': 10}
    assert (abs(grid_search.best_score['RMSE'] - 1.0751)) < 0.0001


def test_best_fcp():
    param_grid = {'n_epochs': [5, 10], 'lr_all': [0.002, 0.005],
                  'reg_all': [0.4, 0.6]}
    grid_search = GridSearch(SVD, param_grid, measures=['FCP'])
    grid_search.evaluate(data)
    assert grid_search.best_index['FCP'] == 7
    assert grid_search.best_params['FCP'] == {
        'lr_all': 0.005, 'reg_all': 0.6, 'n_epochs': 10}
    assert (abs(grid_search.best_score['FCP'] - 0.5922)) < 0.0001


def test_measure_is_not_case_sensitive():
    param_grid = {'n_epochs': [2], 'lr_all': [0.002, 0.005],
                  'reg_all': [0.4, 0.6]}
    grid_search = GridSearch(SVD, param_grid, measures=['FCP', 'mae', 'rMSE'])
    grid_search.evaluate(data)
    assert isinstance(grid_search.best_index['fcp'], int)
    assert isinstance(grid_search.best_params['MAE'], dict)
    assert isinstance(grid_search.best_score['RmSe'], float)


def test_best_estimator():
    param_grid = {'n_epochs': [5], 'lr_all': [0.002, 0.005],
                  'reg_all': [0.4, 0.6]}
    grid_search = GridSearch(SVD, param_grid, measures=['FCP', 'mae', 'rMSE'])
    grid_search.evaluate(data)
    best_estimator = grid_search.best_estimator['MAE']
    assert evaluate(
        best_estimator, data)['MAE'] == grid_search.best_score['MAE']
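
The assertion that len(grid_search.cv_results['params']) == 8 in test_grid_search_cv_results simply reflects the size of the grid: two candidate values for each of three parameters. A quick standalone sanity check of that count (a sketch, no dataset or Surprise import needed):

    param_grid = {'n_epochs': [2, 4], 'lr_all': [0.002, 0.005],
                  'reg_all': [0.4, 0.6]}

    n_combinations = 1
    for values in param_grid.values():
        n_combinations *= len(values)

    print(n_combinations)  # 2 * 2 * 2 == 8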
