Skip to content

Commit

Permalink
Revert "defer call to mem_cgroup_sk_alloc()"
Browse files Browse the repository at this point in the history
This patch effectively reverts commit 9f1c267 ("net: memcontrol:
defer call to mem_cgroup_sk_alloc()").

Moving mem_cgroup_sk_alloc() to the inet_csk_accept() completely breaks
memcg socket memory accounting, as packets received before memcg
pointer initialization are not accounted and are causing refcounting
underflow on socket release.

Actually the free-after-use problem was fixed by
commit c0576e3 ("net: call cgroup_sk_alloc() earlier in
sk_clone_lock()") for the cgroup pointer.

So, let's revert it and call mem_cgroup_sk_alloc() just before
cgroup_sk_alloc(). This is safe, as we hold a reference to the socket
we're cloning, and it holds a reference to the memcg.

Also, let's drop BUG_ON(mem_cgroup_is_root()) check from
mem_cgroup_sk_alloc(). I see no reasons why bumping the root
memcg counter is a good reason to panic, and there are no realistic
ways to hit it.

Signed-off-by: Roman Gushchin <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Tejun Heo <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
  • Loading branch information
rgushchin authored and davem330 committed Feb 3, 2018
1 parent 4db428a commit edbe69e
Show file tree
Hide file tree
Showing 3 changed files with 15 additions and 5 deletions.
14 changes: 14 additions & 0 deletions mm/memcontrol.c
Original file line number Diff line number Diff line change
Expand Up @@ -5747,6 +5747,20 @@ void mem_cgroup_sk_alloc(struct sock *sk)
if (!mem_cgroup_sockets_enabled)
return;

/*
* Socket cloning can throw us here with sk_memcg already
* filled. It won't however, necessarily happen from
* process context. So the test for root memcg given
* the current task's memcg won't help us in this case.
*
* Respecting the original socket's memcg is a better
* decision in this case.
*/
if (sk->sk_memcg) {
css_get(&sk->sk_memcg->css);
return;
}

rcu_read_lock();
memcg = mem_cgroup_from_task(current);
if (memcg == root_mem_cgroup)
Expand Down
5 changes: 1 addition & 4 deletions net/core/sock.c
Original file line number Diff line number Diff line change
Expand Up @@ -1683,16 +1683,13 @@ struct sock *sk_clone_lock(const struct sock *sk, const gfp_t priority)
newsk->sk_dst_pending_confirm = 0;
newsk->sk_wmem_queued = 0;
newsk->sk_forward_alloc = 0;

/* sk->sk_memcg will be populated at accept() time */
newsk->sk_memcg = NULL;

atomic_set(&newsk->sk_drops, 0);
newsk->sk_send_head = NULL;
newsk->sk_userlocks = sk->sk_userlocks & ~SOCK_BINDPORT_LOCK;
atomic_set(&newsk->sk_zckey, 0);

sock_reset_flag(newsk, SOCK_DONE);
mem_cgroup_sk_alloc(newsk);
cgroup_sk_alloc(&newsk->sk_cgrp_data);

rcu_read_lock();
Expand Down
1 change: 0 additions & 1 deletion net/ipv4/inet_connection_sock.c
Original file line number Diff line number Diff line change
Expand Up @@ -475,7 +475,6 @@ struct sock *inet_csk_accept(struct sock *sk, int flags, int *err, bool kern)
}
spin_unlock_bh(&queue->fastopenq.lock);
}
mem_cgroup_sk_alloc(newsk);
out:
release_sock(sk);
if (req)
Expand Down

0 comments on commit edbe69e

Please sign in to comment.