Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-20587][ML] Improve performance of ML ALS recommendForAll #17845

Closed
wants to merge 4 commits into from

Conversation

MLnick
Copy link
Contributor

@MLnick MLnick commented May 3, 2017

This PR is a DataFrame version of #17742 for SPARK-11968, for improving the performance of recommendAll methods.

How was this patch tested?

Existing unit tests.

@MLnick
Copy link
Contributor Author

MLnick commented May 3, 2017

cc @mpjlu

Also @srowen @sethah @jkbradley

@MLnick
Copy link
Contributor Author

MLnick commented May 3, 2017

Some quick perf numbers:

Using ml-latest dataset (~24 million ratings, ~260k users, ~39k movies); 4x workers 48 cores 100GB RAM each.

rank = 10, k = 10

master this PR
recommendForAllUsers
369s 16s
recommendForAllItems
547s 15s

So 23-37x improvement.

@SparkQA
Copy link

SparkQA commented May 3, 2017

Test build #76424 has finished for PR 17845 at commit baeadd0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Contributor

@sethah sethah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

first pass on style.

score += srcFactor(k) * dstFactor(k)
k += 1
}
pq += { (dstId, score) }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pq += dstId -> score?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure

@@ -389,6 +436,17 @@ class ALSModel private[ml] (
)
recs.select($"id" as srcOutputColumn, $"recommendations" cast arrayType)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point - may as well fix it while here

*/
private def blockify(
factors: Dataset[(Int, Array[Float])],
/* TODO make blockSize a param? */blockSize: Int = 4096): Dataset[Seq[(Int, Array[Float])]] = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just put the comment in the doc and reference a JIRA.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure

* relatively efficient, the approach implemented here is significantly more efficient.
*
* This approach groups factors into blocks and computes the top-k elements per block,
* using Level 1 BLAS (dot) and an efficient [[BoundedPriorityQueue]]. It then computes the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

below we say that blas is not used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about "... using dot product instead of gemm and an efficient ..."

val pq = new BoundedPriorityQueue[(Int, Float)](num)(Ordering.by(_._2))
srcIter.foreach { case (srcId, srcFactor) =>
dstIter.foreach { case (dstId, dstFactor) =>
/**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't use doc notation. Maybe we can reduce it to:

            /*
             * The below code is equivalent to
             *   `val score = blas.sdot(rank, srcFactor, 1, dstFactor, 1)`
             * The handwritten version is as or more efficient as BLAS calls in this case. 
             */

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good

@MLnick
Copy link
Contributor Author

MLnick commented May 4, 2017

Thanks @sethah will update shortly

@SparkQA
Copy link

SparkQA commented May 4, 2017

Test build #76447 has finished for PR 17845 at commit cf35eea.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@MLnick
Copy link
Contributor Author

MLnick commented May 8, 2017

jenkins retest this please

@SparkQA
Copy link

SparkQA commented May 8, 2017

Test build #76571 has finished for PR 17845 at commit cf35eea.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Member

@jkbradley jkbradley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM; thanks for doing this! Feel free to merge or address my 1 comment

val m = srcIter.size
val n = math.min(dstIter.size, num)
val output = new Array[(Int, Int, Float)](m * n)
var j = 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: You could combine j and i; you really just need 1 counter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

j iterates through src ids while i iterates through dst ids in the queue for each src id. So I don't think they can be combined.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyway the iter.next() code is a bit ugly and since it's at most k elements it's not really performance critical, so could just use foreach I think

@jkbradley
Copy link
Member

One more comment I'll copy from the other PR: I'm not a fan of custom BLAS implementations scattered throughout MLlib. Could you please follow up by putting the dot as a private API in BLAS.scala and adding unit tests?

@MLnick
Copy link
Contributor Author

MLnick commented May 9, 2017

Merged to master/branch-2.2

Thanks @mpjlu for the original work on the approach!

asfgit pushed a commit that referenced this pull request May 9, 2017
This PR is a `DataFrame` version of #17742 for [SPARK-11968](https://issues.apache.org/jira/browse/SPARK-11968), for improving the performance of `recommendAll` methods.

## How was this patch tested?

Existing unit tests.

Author: Nick Pentreath <[email protected]>

Closes #17845 from MLnick/ml-als-perf.

(cherry picked from commit 10b00ab)
Signed-off-by: Nick Pentreath <[email protected]>
@asfgit asfgit closed this in 10b00ab May 9, 2017
asfgit pushed a commit that referenced this pull request May 16, 2017
Small clean ups from #17742 and #17845.

## How was this patch tested?

Existing unit tests.

Author: Nick Pentreath <[email protected]>

Closes #17919 from MLnick/SPARK-20677-als-perf-followup.

(cherry picked from commit 25b4f41)
Signed-off-by: Nick Pentreath <[email protected]>
ghost pushed a commit to dbtsai/spark that referenced this pull request May 16, 2017
Small clean ups from apache#17742 and apache#17845.

## How was this patch tested?

Existing unit tests.

Author: Nick Pentreath <[email protected]>

Closes apache#17919 from MLnick/SPARK-20677-als-perf-followup.
robert3005 pushed a commit to palantir/spark that referenced this pull request May 19, 2017
Small clean ups from apache#17742 and apache#17845.

## How was this patch tested?

Existing unit tests.

Author: Nick Pentreath <[email protected]>

Closes apache#17919 from MLnick/SPARK-20677-als-perf-followup.
liyichao pushed a commit to liyichao/spark that referenced this pull request May 24, 2017
This PR is a `DataFrame` version of apache#17742 for [SPARK-11968](https://issues.apache.org/jira/browse/SPARK-11968), for improving the performance of `recommendAll` methods.

## How was this patch tested?

Existing unit tests.

Author: Nick Pentreath <[email protected]>

Closes apache#17845 from MLnick/ml-als-perf.
liyichao pushed a commit to liyichao/spark that referenced this pull request May 24, 2017
Small clean ups from apache#17742 and apache#17845.

## How was this patch tested?

Existing unit tests.

Author: Nick Pentreath <[email protected]>

Closes apache#17919 from MLnick/SPARK-20677-als-perf-followup.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants