
Performance issue when calculating loss #1255

Closed
findmyway opened this issue Jul 1, 2020 · 9 comments

@findmyway
Contributor

I notice that many loss functions in this package are written like this:

mae(ŷ, y) = sum(abs.(ŷ .- y)) * 1 // length(y)

mse(ŷ, y) = sum((ŷ .- y).^2) * 1 // length(y)

msle(ŷ, y; ϵ=eps(eltype(ŷ))) = sum((log.(ŷ .+ ϵ) .- log.(y .+ ϵ)).^2) * 1 // length(y)

They all multiply by 1 // length(y) instead of dividing. Originally I thought it was redundant, but I recently hit a performance issue and found that writing it this way avoids the slowdown. Can anyone explain why we need this?

@CarloLucibello
Member

Can you post some benchmarks?

@DhairyaLGandhi
Member

That was done to ensure type stability and to prevent the unnecessary type promotions that can occur when mixing Float64 and Float32, for example.
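
A minimal sketch of the promotion behaviour this comment alludes to (the variable names are illustrative, not from Flux; note that // parses with higher precedence than *, so sum(...) * 1 // length(y) means sum(...) * (1 // length(y))):

```julia
# The forward value has the same type either way:
s = 1.5f0            # a Float32 "sum", standing in for sum(abs.(ŷ .- y))
n = 4                # an Int, standing in for length(y)
@assert typeof(s / n) == Float32
@assert typeof(s * 1 // n) == Float32

# The difference shows up when gradients are scaled by the constant factor:
# 1 / n is a Float64, so multiplying Float32 gradients by it promotes them,
# while 1 // n is a Rational{Int}, which leaves them as Float32.
g = 1.0f0            # an illustrative Float32 gradient entry
@assert typeof(g * (1 / n)) == Float64
@assert typeof(g * (1 // n)) == Float32
```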

@findmyway
Contributor Author

> That was done to ensure type stability and prevent unnecessary type promotions that can occur by mixing Float64 and Float32 for example

I'm afraid it's not related to Float64 and Float32 here.

> Can you post some benchmarks?

Here's the example:

using Flux, BenchmarkTools

m = Dense(32, 32)
x = rand(Float32, 32, 32)
y = rand(Float32, 32)
@benchmark gradient(Flux.params(m)) do
    sum(abs.(m(x) .- y)) / length(y)
end
BenchmarkTools.Trial: 
  memory estimate:  114.19 KiB
  allocs estimate:  3197
  --------------
  minimum time:     136.582 μs (0.00% GC)
  median time:      140.018 μs (0.00% GC)
  mean time:        153.322 μs (7.62% GC)
  maximum time:     12.375 ms (95.82% GC)
  --------------
  samples:          10000
  evals/sample:     1
@benchmark gradient(Flux.params(m)) do
    sum(abs.(m(x) .- y)) * 1 // length(y) 
end
BenchmarkTools.Trial: 
  memory estimate:  97.72 KiB
  allocs estimate:  3196
  --------------
  minimum time:     78.068 μs (0.00% GC)
  median time:      82.311 μs (0.00% GC)
  mean time:        94.679 μs (11.36% GC)
  maximum time:     12.579 ms (96.22% GC)
  --------------
  samples:          10000
  evals/sample:     1

Note that as the model becomes larger, the performance difference quickly grows (in my case, it is about two orders of magnitude).

@CarloLucibello
Member

Hmm, I cannot reproduce:

julia> @btime gradient(Flux.params(m)) do
           sum(abs.(m(x) .- y)) / length(y) 
       end
  74.090 μs (3192 allocations: 97.61 KiB)
Grads(...)

julia> @btime gradient(Flux.params(m)) do
           sum(abs.(m(x) .- y)) * 1 // length(y) 
       end
  74.381 μs (3195 allocations: 97.73 KiB)
Grads(...)

@CarloLucibello
Member

Actually, my last measurement was on Zygote 0.4.20; on newer Zygote versions I can reproduce the performance difference. @oxinabox, could this be due to ChainRules?

@CarloLucibello
Member

This is quite relevant, since I removed all the 1 // length(y) factors in #1150.

@oxinabox
Member

oxinabox commented Jul 1, 2020

Hard to say without a lot more information.
It shouldn't be.

@CarloLucibello
Member

Using mean instead is fine. This is good, because that's now the default way we perform aggregation in losses:

julia> using Flux, BenchmarkTools

julia> m = Dense(32, 32)
Dense(32, 32)

julia> x = rand(Float32, 32, 32);

julia> y = rand(Float32, 32);

julia> @btime gradient(Flux.params(m)) do
           sum(abs.(m(x) .- y)) / length(y) 
       end
  124.031 μs (3197 allocations: 114.19 KiB)
Grads(...)

julia> @btime gradient(Flux.params(m)) do
           sum(abs.(m(x) .- y)) * 1 // length(y) 
       end
  73.434 μs (3196 allocations: 97.72 KiB)
Grads(...)

julia> @btime gradient(Flux.params(m)) do
           mean(abs.(m(x) .- y)) 
       end
  73.004 μs (3190 allocations: 105.67 KiB)
Grads(...)
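
A sketch of what mean-based aggregation looks like for the losses quoted at the top of the thread (these definitions are illustrative, not Flux's actual source; mean divides by the length internally, so no explicit 1 // length(y) is needed):

```julia
using Statistics  # provides mean

# Illustrative mean-based variants of mae and mse from the original post.
mae_mean(ŷ, y) = mean(abs.(ŷ .- y))
mse_mean(ŷ, y) = mean((ŷ .- y) .^ 2)

ŷ = Float32[1, 2, 3]
y = Float32[1, 4, 3]
@assert mae_mean(ŷ, y) == 2.0f0 / 3
@assert mse_mean(ŷ, y) == 4.0f0 / 3
@assert mae_mean(ŷ, y) isa Float32  # aggregation preserves the element type
```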

@CarloLucibello
Member

CarloLucibello commented Jul 11, 2020

The integer division problem has been fixed in ChainRules:

julia> using Flux, BenchmarkTools
[ Info: Precompiling Flux [587475ba-b771-5e3f-ad9e-33799f191a9c]

julia> m = Dense(32, 32)
Dense(32, 32)

julia> x = rand(Float32, 32, 32);

julia> y = rand(Float32, 32);

julia> @btime gradient(Flux.params(m)) do
           sum(abs.(m(x) .- y)) / length(y) 
       end
  75.177 μs (3193 allocations: 105.61 KiB)
Grads(...)

julia> @btime gradient(Flux.params(m)) do
           sum(abs.(m(x) .- y)) * 1 // length(y) 
       end
  75.825 μs (3196 allocations: 105.72 KiB)
Grads(...)

julia> @btime gradient(Flux.params(m)) do
           mean(abs.(m(x) .- y)) 
       end
  74.869 μs (3190 allocations: 113.67 KiB)
Grads(...)
