Recommendation Algorithm

PyTorch on Angel currently supports a series of recommendation algorithms.

In detail, the following methods are currently implemented:

We use DeepFM as an example to illustrate the detailed process of running an algorithm. The steps are similar for the other algorithms.

Example of DeepFM

  1. **Generate the PyTorch script model** First, go to the python/recommendation directory and execute the following command:

    python --input_dim 148 --n_fields 13 --embedding_dim 10 --fc_dims 10 5 1

    The parameters are:

    • input_dim: the feature dimension of the data
    • n_fields: the number of fields in the data
    • embedding_dim: the dimension of the embedding layer
    • fc_dims: the dimensions of the fully connected (fc) layers in DeepFM. "10 5 1" indicates a two-layer MLP composed of one 10x5 layer and one 5x1 layer.
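The fc_dims convention can be sketched in a few lines of Python. This is an illustration only, taking the description above at face value: consecutive entries are paired into (input, output) layer shapes. The helper name `layer_shapes` is hypothetical and not part of the project's scripts, and whether the real script adds a further layer between the embedding output and the first dimension is not stated here.

```python
from typing import List, Tuple


def layer_shapes(fc_dims: List[int]) -> List[Tuple[int, int]]:
    """Pair consecutive fc_dims entries into (in_dim, out_dim) layer shapes.

    Hypothetical helper: mirrors the README's reading of "10 5 1" as one
    10x5 layer followed by one 5x1 layer.
    """
    if len(fc_dims) < 2:
        raise ValueError("need at least two dimensions to form one layer")
    return list(zip(fc_dims, fc_dims[1:]))


# The value passed as "--fc_dims 10 5 1" in the command above.
shapes = layer_shapes([10, 5, 1])
for in_dim, out_dim in shapes:
    print(f"{in_dim}x{out_dim}")
```

Under this reading, a list of n dimensions always yields n - 1 layers, and the last entry (1 here) is the size of the final output.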

    This Python script generates a TorchScript model that holds the dataflow graph of DeepFM. This file is named

  2. **Prepare the input data** The input data of DeepFM must be in libsvm or libffm format. Each line of the input data represents one data sample.

    label feature1:value1 feature2:value2

    In PyTorch on Angel, multi-hot fields are allowed, which means a field can appear multiple times in one data sample.

    label field1:feature1:value1 field2:feature2:value2
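To make the two line formats concrete, here is a minimal Python sketch that parses one sample of each. The function names `parse_libsvm` and `parse_libffm` are hypothetical and used only for illustration. The sketch also assumes integer field and feature indices and a numeric label, which the formats above do not spell out. It is not the loader used by PyTorch on Angel.

```python
from typing import List, Tuple


def parse_libsvm(line: str) -> Tuple[float, List[Tuple[int, float]]]:
    """Parse 'label feature:value ...' into (label, [(feature, value), ...])."""
    label, *tokens = line.split()
    features = []
    for token in tokens:
        feature, value = token.split(":")
        features.append((int(feature), float(value)))
    return float(label), features


def parse_libffm(line: str) -> Tuple[float, List[Tuple[int, int, float]]]:
    """Parse 'label field:feature:value ...' into (label, [(field, feature, value), ...])."""
    label, *tokens = line.split()
    features = []
    for token in tokens:
        field, feature, value = token.split(":")
        features.append((int(field), int(feature), float(value)))
    return float(label), features


# libsvm sample: two features.
svm_sample = parse_libsvm("1 3:1 7:0.5")

# libffm sample: field 1 appears twice, i.e. a multi-hot field.
ffm_sample = parse_libffm("0 1:3:1 1:4:1 2:9:0.5")
```

In the libffm sample, the repeated field index 1 is what a multi-hot field looks like on disk: the same field carries two active features in one sample.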
  3. **Train the model** After obtaining the model file and the input data, we can submit a task through Spark on Angel to train the model. The command is:

    source ./  
    $SPARK_HOME/bin/spark-submit \
          --master yarn-cluster \
          --conf \
          --conf \
          --conf$SONA_ANGEL_JARS \
          --conf \
          --conf \
          --conf spark.driver.extraJavaOptions=-Djava.library.path=$JAVA_LIBRARY_PATH:.:./torch/angel_libtorch \
          --conf spark.executor.extraJavaOptions=-Djava.library.path=$JAVA_LIBRARY_PATH:.:./torch/angel_libtorch \
          --conf spark.executor.extraLibraryPath=./torch/angel_libtorch \
          --conf spark.driver.extraLibraryPath=./torch/angel_libtorch \
          --conf spark.executorEnv.OMP_NUM_THREADS=2 \
          --conf spark.executorEnv.MKL_NUM_THREADS=2 \
          --queue $queue \
          --name "deepfm on angel" \
          --jars $SONA_SPARK_JARS  \
          --archives\  #path to c++ library files
          --files \   #path to pytorch script model
          --driver-memory 5g \
          --num-executors 5 \
          --executor-cores 1 \
          --executor-memory 5g \
          --class com.tencent.angel.pytorch.examples.supervised.RecommendationExample \
          ./pytorch-on-angel-*.jar \   # jar from Compiling java submodule
          trainInput:$input batchSize:128 \
          stepSize:0.001 numEpoch:10 testRatio:0.1 \
          angelModelOutputPath:$output

    Description for the parameters:

    • trainInput: the input path (hdfs) for training data
    • batchSize: batch size for each optimizing step
    • torchModelPath: the name of the generated torch model
    • stepSize: learning rate
    • numEpoch: the number of epochs in the training process
    • testRatio: the fraction of training examples used for testing
    • angelModelOutputPath: the output path (hdfs) for the trained model
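The application arguments at the end of the command use a `key:value` form. As a rough illustration of how such arguments can be read, the Python sketch below splits each argument on its first colon only, so that values containing colons (such as an hdfs URI) stay intact. The helper `parse_params` and the example paths are assumptions made for this sketch. It does not reproduce the argument parser used by the example class.

```python
from typing import Dict, List


def parse_params(args: List[str]) -> Dict[str, str]:
    """Split 'key:value' arguments on the first colon only.

    Hypothetical helper for illustration; values are kept as strings.
    """
    params = {}
    for arg in args:
        key, sep, value = arg.partition(":")
        if not sep or not key:
            raise ValueError(f"expected key:value, got {arg!r}")
        params[key] = value
    return params


# Example argument list; the hdfs paths are made-up placeholders.
args = [
    "trainInput:hdfs://namenode/data/train",
    "batchSize:128",
    "stepSize:0.001",
    "numEpoch:10",
    "testRatio:0.1",
    "angelModelOutputPath:hdfs://namenode/model/deepfm",
]
params = parse_params(args)

# Convert the numeric settings explicitly where they are needed.
batch_size = int(params["batchSize"])
test_ratio = float(params["testRatio"])
```

Splitting on the first colon matters here because a naive `split(":")` would cut `hdfs://namenode/data/train` into several pieces.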