-
-
Notifications
You must be signed in to change notification settings - Fork 366
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Correct implementation of rescale_discrete_levels #1078
Correct implementation of rescale_discrete_levels #1078
Conversation
@@ -172,20 +172,22 @@ def eq_hist(data, mask=None, nbins=256*256): | |||
|
|||
# Run more accurate value counting if data is of boolean or integer type | |||
# and unique value array is smaller than nbins. | |||
if data2.dtype == bool or (np.issubdtype(data2.dtype, np.integer) and data2.max() < nbins): | |||
if data2.dtype == bool or (np.issubdtype(data2.dtype, np.integer) and data2.ptp() < nbins): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed from data2.max()
to data2,ptp()
to cover the unusual but possible use case of a small number of discrete levels but a non-zero minimum.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, though there's always a non-zero minimum because zero values are treated as NaN, so really this better covers the case when there is a minimum other than 1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
@@ -172,20 +172,22 @@ def eq_hist(data, mask=None, nbins=256*256): | |||
|
|||
# Run more accurate value counting if data is of boolean or integer type | |||
# and unique value array is smaller than nbins. | |||
if data2.dtype == bool or (np.issubdtype(data2.dtype, np.integer) and data2.max() < nbins): | |||
if data2.dtype == bool or (np.issubdtype(data2.dtype, np.integer) and data2.ptp() < nbins): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, though there's always a non-zero minimum because zero values are treated as NaN, so really this better covers the case when there is a minimum other than 1.
Original implementation of
rescale_discrete_levels
forhow='eq_hist'
used the maximum value in the agg rather than the number of discrete levels. This PR corrects it.The number of discrete levels is only calculated if an agg has bool or int
dtype
, and only ifmax(agg)-min(agg)
is less than 256. Rescaling is only ever applied if there are fewer than 100 discrete levels. It is possible to fool this logic by using, for example, an agg containing unique values of1
and300
. This could be fixed, if desired, by performing thenp.unique()
call earlier if thedtype
is integer.There are minor changes to images returned by
shade
that are indistinguishable to the naked eye for the standard use case ofagg=ds.count()
.