-
Notifications
You must be signed in to change notification settings - Fork 8.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MAPREDUCE-7386. Maven parallel builds (skipping tests) fail #4415
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should go ahead and explicitly add dependencies like that, just for some random build strategy, do a normal mvn clean install, if that works that is more than enough
Building the entire distribution in 10 minutes really facilitates
development. I think you're underestimating the savings that come from a
parallel build.
There are many organizations that run the tests less frequently since they
take so long to complete. I mentioned the skipping of tests specifically
since they still aren't capable of running safely in a parallel build and
therefore would have to be handled independently.
…On Tue, Jun 7, 2022 at 6:28 PM Ayush Saxena ***@***.***> wrote:
***@***.**** requested changes on this pull request.
I don't think we should go ahead and explicitly add dependencies like
that, just for some random build strategy, do a normal mvn clean install,
if that works that is more than enough
—
Reply to this email directly, view it on GitHub
<#4415 (review)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAC3RCA4ZXMQXCUOMZLPIYTVN7EHVANCNFSM5YEBVLHA>
.
You are receiving this because you authored the thread.Message ID:
***@***.***>
|
Patch LGTM... Looks like a build bug fix that allows us use mvn -T,--threads so builds can run faster locally for devs... @ayushtkn would you mind saying a bit more on why you object to this PR? Thanks. |
I was just worried like why are we adding dependecies for the build sake, maven should take care right? |
@ayushtkn makes sense. I was thinking they are probably needed at assembly time because mvn's dependency accounting when threading is likely a little lacking and it looks like these additions would be in the final artifact anyway, but maybe it would be good to confirm no difference in the product before and after this PR? Perhaps you have that info @snmvaughan ? Thanks @ayushtkn |
I'd be happy to build distributions serially and in parallel, and confirm that there are no differences. @saintstack is correct. The parallel build fails because the dependencies are needed, but aren't documented. Maven just needs more information in order to avoid a race condition which doesn't occur when run serially. |
💔 -1 overall
This message was automatically generated. |
I was able to perform a pkgdiff against a published distribution vs a local parallel build, and the only changes were those involving expected differences (e.g. timestamps, git checksums, etc.). |
i do a mvn -T 1C build by parallel by default and it always seems to work (spark builds hang, fwiw). maybe the 2C option, by doubling the parallelism, triggers these. |
In my testing I ran into several race conditions:
The changes provided ensures that these scenarios don't happen by making the implied dependencies explicit. |
💔 -1 overall
This message was automatically generated. |
I just repeated a build of mvn clean install -Pdist,native -Drequire.snappy -Drequire.zstd -Drequire.openssl -Drequire.isal -DskipTests -Dtar -Dmaven.javadoc.skip=true -T 1C [ERROR] Failed to execute goal org.apache.maven.plugins:maven-assembly-plugin:2.4:single (package-mapreduce) on project hadoop-mapreduce: Failed to create assembly: Artifact: org.apache.hadoop:hadoop-mapreduce-client-core:jar:3.4.0-SNAPSHOT (included by module) does not have an artifact with a file. Please ensure the package phase is run before the assembly is generated. -> [Help 1] In addition, the pkgdiff tests I performed previously demonstrated that without the native library dependencies that the native libraries would occasionally not be included in the distribution. |
I tried with -T1C, and first it failed with:
This is yarn stuff and when I do a normal build it passes. Second time it got hung here:
Third time it passed The behaviour is quite random, it never failed at Mapreduce though, but I don't think with such kind of behaviour, we can integrate this with our Pre-Commits. We can't trade off time save with random build failures. :-( May be this can stay as a dev-only enhancement for those who build locally using these options. |
unfortunate. how about making that focus on parallel test running for a speedup. that's fairly painful to do if anything is making assumptions about exclusive paths or ports, but once working is wonderful for anyone working on a single module, as well as jenkins builds |
I've been focusing on speeding up the build process in order to aid in development. There are several reasons why the tests can't safely run in parallel, which is why I've started looking into short-term opportunities for speedup in the meantime. My plan of attack has been to do the following:
In addition, I've been starting small by limiting the test runs to a focused subset. Currently I execute tests for |
@steveloughran Are we all building in the same environment? I.e., differences in maven version or maven dependency plugin version or maven assembly plugin version? |
Pardon @ayushtkn are you satisfied with the above justification? Do you have further changes you'd like to see or are you satisfied? If the latter, mind updating your vote accordingly? I agree that ideally there's an automated audit that correctly resolves and hardens the dependency graph. I've tried using various features of the dependency plugin for this in the past, but found it too unreliably to use for static build-time checking. |
@ndimiduk I pose no objections to this. Can be merged if others are convinced |
@ayushtkn What other random behavior did your see? |
@ndimiduk my setup
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all the new declarations aren't scoped; should they be listed as provided? I'm not sure they need to be, as the modules you are importing them into are not JAR ones. just making sure
+1 pending clarification
@steveloughran maven version matches my own, which is where I expected to find the difference. |
Testing refactoring of |
@ndimiduk just general race conditions then |
💔 -1 overall
This message was automatically generated. |
looks good; some duplicate declarations in yarn tho'. related?
|
The dependency issue in |
720c191
to
2227bb0
Compare
💔 -1 overall
This message was automatically generated. |
2227bb0
to
b4d622b
Compare
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 from me.
Does anyone else have any concerns before I merge?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like a nice improvement to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Just one question...
hadoop-project/pom.xml
Outdated
@@ -167,7 +167,7 @@ | |||
|
|||
<!-- Plugin versions and config --> | |||
<maven-surefire-plugin.argLine>-Xmx2048m -XX:+HeapDumpOnOutOfMemoryError</maven-surefire-plugin.argLine> | |||
<maven-surefire-plugin.version>3.0.0-M1</maven-surefire-plugin.version> | |||
<maven-surefire-plugin.version>3.0.0-M7</maven-surefire-plugin.version> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needed? Intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is needed because the Surefire plugging 3.0.0-M1 wasn't including the launcher as part of it's setup, so the Yarn tests were failing. Switching to 3.0.0-M7 fixes that issue, and demonstrates that the parallel build operate correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok. should we make this a separate PR so it stands out and can be backported more easily?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've cherry-picked just the upgrade to 3.0.0-M7.
#4795
b066b5c
to
b621007
Compare
I've rebased this pull request to pick-up the Surefire upgrade to 3.0.0-M7 from HADOOP-18417 |
b621007
to
26e04a6
Compare
Rebased and re-running tests. |
💔 -1 overall
This message was automatically generated. |
Parallel builds failed because the assembly plugin did not understand the required dependencies.
Use dependencies in the mapreduce project for the purpose of assembly plugin and packaging. This moves all assembly dependencies added for parallel builds up to the module.
26e04a6
to
ebcd845
Compare
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for this in trunk; if all is good there after 1-2 weeks branch-3.3.
Woo! |
@ndimiduk woo? |
Description of PR
Parallel builds fail because the assembly plugin does not understand the required dependencies.
How was this patch tested?
This patch has been applied to a local build that runs in the Hadoop development environment, and is used to build a distribution. The same changes have been applied to trunk, branch-3.3, and branch-3.3.3.
For code changes:
LICENSE
,LICENSE-binary
,NOTICE-binary
files?