transpose next_land_dynamic for greater speed and clarity #247
Here's another one I prepared yesterday.
This uses the fact that only very few coordinates are actually `dist` away from land. It iterates over these coordinates and checks whether the distance of any of their neighbours is still unknown. Only then is an update made. This is the more logical thing to do: it checks for `dist` directly at the beginning of the loop iterating over `dist`, and it checks `next_land[ny,nx]` in the inner loop and makes the change to that same coordinate.
The original implementation had this backwards. It checked for unknown distances in the outer loop and for the known distance in the inner loop, which makes for a lot more checks because there are usually far more coordinates with an unknown distance than coordinates that are exactly `dist` away from land.
I think that change in ordering is where the main speedup comes from. Doing the more logical thing also saves two comparisons in the inner loop and is likely to be a tiny bit more cache friendly, but I don't think these things matter too much. Using numpy here makes the code look a bit cleaner but mainly it hides away some of the loops, so I don't think it is a great speedup in and of itself. I didn't measure that in detail though. The code is now reasonably fast and it looks cleaner so that was where I stopped investigating.
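For readers who want to see the ordering without opening the diff, here is a minimal sketch of the idea, not the code from this PR. The function name `distance_to_land`, the `UNKNOWN` sentinel, the 4-neighbour stencil and the wrap-around at the map edges are all assumptions made for illustration; only `next_land`, `dist`, `ny` and `nx` are names taken from the description above.

```python
import numpy as np

UNKNOWN = -1  # hypothetical sentinel for "distance not known yet"

def distance_to_land(is_land, max_dist):
    """Return, per cell, the number of 4-neighbour steps to the nearest land.

    Cells farther away than max_dist keep the value UNKNOWN.
    Assumes the map wraps around at its edges.
    """
    height, width = is_land.shape
    next_land = np.full(is_land.shape, UNKNOWN, dtype=int)
    next_land[is_land] = 0

    for dist in range(max_dist):
        # Outer check: numpy picks out the few cells exactly `dist` from land.
        ys, xs = np.nonzero(next_land == dist)
        for y, x in zip(ys, xs):
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = (y + dy) % height, (x + dx) % width
                # Inner check: only touch neighbours that are still unknown,
                # and write to the same coordinate that was just tested.
                if next_land[ny, nx] == UNKNOWN:
                    next_land[ny, nx] = dist + 1
    return next_land
```

Each pass only visits the current frontier, so the work per `dist` scales with the number of cells at that distance rather than with the number of cells that are still unknown.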
Cheers
Alex