Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2020-02-21

The following pull-request contains BPF updates for your *net-next* tree.

We've added 25 non-merge commits during the last 4 day(s) which contain
a total of 33 files changed, 2433 insertions(+), 161 deletions(-).

The main changes are:

1) Allow for adding TCP listen sockets into sock_map/hash so they can be used
   with reuseport BPF programs, from Jakub Sitnicki (a program sketch follows
   the verifier.c diff below).

2) Add a new bpf_program__set_attach_target() helper so libbpf users can
   specify the attach target (tracepoint/function) dynamically, from Eelco
   Chaudron (a usage sketch follows the file list below).

3) Add bpf_read_branch_records() BPF helper, which helps use cases like
   profile-guided optimizations, from Daniel Xu (a program sketch follows the
   bpf.h diff below).

4) Enable bpf_perf_event_read_value() in all tracing programs, from Song Liu
   (a tracepoint sketch follows the bpf_trace.c diff below).

5) Relax the mandatory BTF check when BTF is used only by libbpf itself, e.g.
   to process BTF-defined maps, from Andrii Nakryiko.

6) Move the BPF selftests -mcpu compilation attribute from 'probe' to 'v3', as
   it has been observed that the former fails in envs with low memlock, from
   Yonghong Song.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
davem330 committed Feb 21, 2020
2 parents e65ee2f + eb1e147 commit b105e8e
Showing 33 changed files with 2,433 additions and 161 deletions.
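For 2) above, the new libbpf helper lets a loader choose the attach point at
run time instead of hard-coding it in the program's ELF section name. A minimal
sketch of one possible caller; the object name, program name and target kernel
function are illustrative, not from this commit, and error handling is trimmed
(exact return conventions vary across libbpf versions):

#include <bpf/libbpf.h>
#include <stdio.h>

int main(void)
{
        struct bpf_object *obj = bpf_object__open("fentry_probe.o");
        struct bpf_program *prog;
        struct bpf_link *link;

        prog = bpf_object__find_program_by_name(obj, "probe_entry");

        /* attach_prog_fd == 0: attach to a kernel function by name
         * rather than to another BPF program. */
        if (bpf_program__set_attach_target(prog, 0, "tcp_v4_connect"))
                return 1;

        if (bpf_object__load(obj))
                return 1;

        link = bpf_program__attach_trace(prog);
        if (libbpf_get_error(link))
                return 1;

        printf("attached\n");
        return 0;
}

The same compiled object can thus be retargeted at a different function without
rebuilding it, which is the point of the helper.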
29 changes: 12 additions & 17 deletions Documentation/bpf/bpf_devel_QA.rst
@@ -20,11 +20,11 @@ Reporting bugs
 Q: How do I report bugs for BPF kernel code?
 --------------------------------------------
 A: Since all BPF kernel development as well as bpftool and iproute2 BPF
-   loader development happens through the netdev kernel mailing list,
+   loader development happens through the bpf kernel mailing list,
    please report any found issues around BPF to the following mailing
    list:
 
-     netdev@vger.kernel.org
+     bpf@vger.kernel.org
 
    This may also include issues related to XDP, BPF tracing, etc.
 
@@ -46,17 +46,12 @@ Submitting patches
 
 Q: To which mailing list do I need to submit my BPF patches?
 ------------------------------------------------------------
-A: Please submit your BPF patches to the netdev kernel mailing list:
+A: Please submit your BPF patches to the bpf kernel mailing list:
 
-     netdev@vger.kernel.org
-
-   Historically, BPF came out of networking and has always been maintained
-   by the kernel networking community. Although these days BPF touches
-   many other subsystems as well, the patches are still routed mainly
-   through the networking community.
+     bpf@vger.kernel.org
 
    In case your patch has changes in various different subsystems (e.g.
-   tracing, security, etc), make sure to Cc the related kernel mailing
+   networking, tracing, security, etc), make sure to Cc the related kernel mailing
    lists and maintainers from there as well, so they are able to review
    the changes and provide their Acked-by's to the patches.
 
@@ -168,7 +163,7 @@ a BPF point of view.
    Be aware that this is not a final verdict that the patch will
    automatically get accepted into net or net-next trees eventually:
 
-   On the netdev kernel mailing list reviews can come in at any point
+   On the bpf kernel mailing list reviews can come in at any point
    in time. If discussions around a patch conclude that they cannot
    get included as-is, we will either apply a follow-up fix or drop
    them from the trees entirely. Therefore, we also reserve to rebase
@@ -494,15 +489,15 @@ A: You need cmake and gcc-c++ as build requisites for LLVM. Once you have
    that set up, proceed with building the latest LLVM and clang version
    from the git repositories::
 
-     $ git clone http://llvm.org/git/llvm.git
-     $ cd llvm/tools
-     $ git clone --depth 1 http://llvm.org/git/clang.git
-     $ cd ..; mkdir build; cd build
-     $ cmake .. -DLLVM_TARGETS_TO_BUILD="BPF;X86" \
+     $ git clone https://github.com/llvm/llvm-project.git
+     $ mkdir -p llvm-project/llvm/build/install
+     $ cd llvm-project/llvm/build
+     $ cmake .. -G "Ninja" -DLLVM_TARGETS_TO_BUILD="BPF;X86" \
+                -DLLVM_ENABLE_PROJECTS="clang" \
                 -DBUILD_SHARED_LIBS=OFF \
                 -DCMAKE_BUILD_TYPE=Release \
                 -DLLVM_BUILD_RUNTIME=OFF
-     $ make -j $(getconf _NPROCESSORS_ONLN)
+     $ ninja
 
    The built binaries can then be found in the build/bin/ directory, where
    you can point the PATH variable to.
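Following the updated instructions above, pointing PATH at the freshly built
binaries amounts to something like the following (clone location assumed to be
the home directory; adjust as needed)::

     $ export PATH="$HOME/llvm-project/llvm/build/bin:$PATH"
     $ clang --version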
20 changes: 3 additions & 17 deletions include/linux/skmsg.h
@@ -352,29 +352,15 @@ static inline void sk_psock_update_proto(struct sock *sk,
 	psock->saved_write_space = sk->sk_write_space;
 
 	psock->sk_proto = sk->sk_prot;
-	sk->sk_prot = ops;
+	/* Pairs with lockless read in sk_clone_lock() */
+	WRITE_ONCE(sk->sk_prot, ops);
 }
 
 static inline void sk_psock_restore_proto(struct sock *sk,
                                           struct sk_psock *psock)
 {
 	sk->sk_prot->unhash = psock->saved_unhash;
-
-	if (psock->sk_proto) {
-		struct inet_connection_sock *icsk = inet_csk(sk);
-		bool has_ulp = !!icsk->icsk_ulp_data;
-
-		if (has_ulp) {
-			tcp_update_ulp(sk, psock->sk_proto,
-				       psock->saved_write_space);
-		} else {
-			sk->sk_prot = psock->sk_proto;
-			sk->sk_write_space = psock->saved_write_space;
-		}
-		psock->sk_proto = NULL;
-	} else {
-		sk->sk_write_space = psock->saved_write_space;
-	}
+	tcp_update_ulp(sk, psock->sk_proto, psock->saved_write_space);
 }
 
 static inline void sk_psock_set_state(struct sk_psock *psock,
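The WRITE_ONCE() added above pairs with a lockless READ_ONCE() on the socket
clone path. A condensed sketch of the idiom; the reader side lives in
sk_clone_lock() and is not part of this hunk, so the exact surrounding code is
illustrative:

        /* Writer (sk_psock_update_proto): publish the proto override so a
         * concurrent lockless reader never observes a half-updated value. */
        WRITE_ONCE(sk->sk_prot, ops);

        /* Reader (clone path, no lock held): sample the pointer exactly once
         * and work with the local copy from then on. */
        struct proto *prot = READ_ONCE(osk->sk_prot);

Without the ONCE accessors the compiler may reload or cache the plain access,
which is the race the new comment warns about.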
37 changes: 35 additions & 2 deletions include/net/sock.h
@@ -527,10 +527,43 @@ enum sk_pacing {
 	SK_PACING_FQ		= 2,
 };
 
+/* Pointer stored in sk_user_data might not be suitable for copying
+ * when cloning the socket. For instance, it can point to a reference
+ * counted object. sk_user_data bottom bit is set if pointer must not
+ * be copied.
+ */
+#define SK_USER_DATA_NOCOPY	1UL
+#define SK_USER_DATA_PTRMASK	~(SK_USER_DATA_NOCOPY)
+
+/**
+ * sk_user_data_is_nocopy - Test if sk_user_data pointer must not be copied
+ * @sk: socket
+ */
+static inline bool sk_user_data_is_nocopy(const struct sock *sk)
+{
+	return ((uintptr_t)sk->sk_user_data & SK_USER_DATA_NOCOPY);
+}
+
 #define __sk_user_data(sk) ((*((void __rcu **)&(sk)->sk_user_data)))
 
-#define rcu_dereference_sk_user_data(sk)	rcu_dereference(__sk_user_data((sk)))
-#define rcu_assign_sk_user_data(sk, ptr)	rcu_assign_pointer(__sk_user_data((sk)), ptr)
+#define rcu_dereference_sk_user_data(sk)			\
+({								\
+	void *__tmp = rcu_dereference(__sk_user_data((sk)));	\
+	(void *)((uintptr_t)__tmp & SK_USER_DATA_PTRMASK);	\
+})
+#define rcu_assign_sk_user_data(sk, ptr)			\
+({								\
+	uintptr_t __tmp = (uintptr_t)(ptr);			\
+	WARN_ON_ONCE(__tmp & ~SK_USER_DATA_PTRMASK);		\
+	rcu_assign_pointer(__sk_user_data((sk)), __tmp);	\
+})
+#define rcu_assign_sk_user_data_nocopy(sk, ptr)			\
+({								\
+	uintptr_t __tmp = (uintptr_t)(ptr);			\
+	WARN_ON_ONCE(__tmp & ~SK_USER_DATA_PTRMASK);		\
+	rcu_assign_pointer(__sk_user_data((sk)),		\
+			   __tmp | SK_USER_DATA_NOCOPY);	\
+})
 
 /*
  * SK_CAN_REUSE and SK_NO_REUSE on a socket mean that the socket is OK
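The macros above stash a flag in bit 0 of sk_user_data, which is free because
the stored pointers are at least word-aligned. A standalone userspace sketch of
the same tagging scheme (plain C for illustration, not kernel code):

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define NOCOPY  1UL
#define PTRMASK (~NOCOPY)

int main(void)
{
        long value = 42;        /* long is aligned, so bit 0 of &value is 0 */
        uintptr_t slot = (uintptr_t)&value | NOCOPY;    /* store: tag */

        long *p = (long *)(slot & PTRMASK);             /* load: strip tag */
        assert(*p == 42);

        printf("nocopy=%lu value=%ld\n", slot & NOCOPY, *p);
        return 0;
}

This is also why readers must now go through rcu_dereference_sk_user_data()
instead of touching the raw field: the raw value may carry the tag bit.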
2 changes: 0 additions & 2 deletions include/net/sock_reuseport.h
@@ -55,6 +55,4 @@ static inline bool reuseport_has_conns(struct sock *sk, bool set)
 	return ret;
 }
 
-int reuseport_get_id(struct sock_reuseport *reuse);
-
 #endif /* _SOCK_REUSEPORT_H */
7 changes: 7 additions & 0 deletions include/net/tcp.h
@@ -2203,6 +2203,13 @@ int tcp_bpf_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
 		    int nonblock, int flags, int *addr_len);
 int __tcp_bpf_recvmsg(struct sock *sk, struct sk_psock *psock,
 		      struct msghdr *msg, int len, int flags);
+#ifdef CONFIG_NET_SOCK_MSG
+void tcp_bpf_clone(const struct sock *sk, struct sock *newsk);
+#else
+static inline void tcp_bpf_clone(const struct sock *sk, struct sock *newsk)
+{
+}
+#endif
 
 /* Call BPF_SOCK_OPS program that returns an int. If the return value
  * is < 0, then the BPF op failed (for example if the loaded BPF
25 changes: 24 additions & 1 deletion include/uapi/linux/bpf.h
@@ -2890,6 +2890,25 @@ union bpf_attr {
  *		Obtain the 64bit jiffies
  *	Return
  *		The 64 bit jiffies
+ *
+ * int bpf_read_branch_records(struct bpf_perf_event_data *ctx, void *buf, u32 size, u64 flags)
+ *	Description
+ *		For an eBPF program attached to a perf event, retrieve the
+ *		branch records (struct perf_branch_entry) associated to *ctx*
+ *		and store it in the buffer pointed by *buf* up to size
+ *		*size* bytes.
+ *	Return
+ *		On success, number of bytes written to *buf*. On error, a
+ *		negative value.
+ *
+ *		The *flags* can be set to **BPF_F_GET_BRANCH_RECORDS_SIZE** to
+ *		instead return the number of bytes required to store all the
+ *		branch entries. If this flag is set, *buf* may be NULL.
+ *
+ *		**-EINVAL** if arguments invalid or **size** not a multiple
+ *		of sizeof(struct perf_branch_entry).
+ *
+ *		**-ENOENT** if architecture does not support branch records.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -3010,7 +3029,8 @@ union bpf_attr {
 	FN(probe_read_kernel_str),	\
 	FN(tcp_send_ack),		\
 	FN(send_signal_thread),		\
-	FN(jiffies64),
+	FN(jiffies64),			\
+	FN(read_branch_records),
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
@@ -3089,6 +3109,9 @@ enum bpf_func_id {
 /* BPF_FUNC_sk_storage_get flags */
 #define BPF_SK_STORAGE_GET_F_CREATE	(1ULL << 0)
 
+/* BPF_FUNC_read_branch_records flags. */
+#define BPF_F_GET_BRANCH_RECORDS_SIZE	(1ULL << 0)
+
 /* Mode for BPF_FUNC_skb_adjust_room helper. */
 enum bpf_adj_room_mode {
 	BPF_ADJ_ROOM_NET,
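A sketch of a perf_event BPF program calling the new helper, first querying the
required size and then copying the records; the section layout and sizes are
illustrative, and an LBR-capable x86 CPU plus a suitably configured perf event
are assumed:

// SPDX-License-Identifier: GPL-2.0
#include <linux/bpf.h>
#include <linux/bpf_perf_event.h>
#include <linux/perf_event.h>
#include <bpf/bpf_helpers.h>

/* 16 entries keeps the buffer within the 512-byte BPF stack limit. */
#define MAX_ENTRIES     16

SEC("perf_event")
int lbr_snapshot(struct bpf_perf_event_data *ctx)
{
        struct perf_branch_entry entries[MAX_ENTRIES] = {};
        int total, copied;

        /* With BPF_F_GET_BRANCH_RECORDS_SIZE, buf may be NULL and the
         * helper reports how many bytes all branch entries need. */
        total = bpf_read_branch_records(ctx, NULL, 0,
                                        BPF_F_GET_BRANCH_RECORDS_SIZE);
        if (total < 0)
                return 0;       /* e.g. -ENOENT on unsupported archs */

        /* size must be a multiple of sizeof(struct perf_branch_entry). */
        copied = bpf_read_branch_records(ctx, entries, sizeof(entries), 0);
        if (copied > 0)
                bpf_printk("copied %d of %d bytes\n", copied, total);

        return 0;
}

char _license[] SEC("license") = "GPL";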
5 changes: 0 additions & 5 deletions kernel/bpf/reuseport_array.c
@@ -305,11 +305,6 @@ int bpf_fd_reuseport_array_update_elem(struct bpf_map *map, void *key,
 	if (err)
 		goto put_file_unlock;
 
-	/* Ensure reuse->reuseport_id is set */
-	err = reuseport_get_id(reuse);
-	if (err < 0)
-		goto put_file_unlock;
-
 	WRITE_ONCE(nsk->sk_user_data, &array->ptrs[index]);
 	rcu_assign_pointer(array->ptrs[index], nsk);
 	free_osk = osk;
10 changes: 7 additions & 3 deletions kernel/bpf/verifier.c
@@ -3693,14 +3693,16 @@ static int check_map_func_compatibility(struct bpf_verifier_env *env,
 		if (func_id != BPF_FUNC_sk_redirect_map &&
 		    func_id != BPF_FUNC_sock_map_update &&
 		    func_id != BPF_FUNC_map_delete_elem &&
-		    func_id != BPF_FUNC_msg_redirect_map)
+		    func_id != BPF_FUNC_msg_redirect_map &&
+		    func_id != BPF_FUNC_sk_select_reuseport)
 			goto error;
 		break;
 	case BPF_MAP_TYPE_SOCKHASH:
 		if (func_id != BPF_FUNC_sk_redirect_hash &&
 		    func_id != BPF_FUNC_sock_hash_update &&
 		    func_id != BPF_FUNC_map_delete_elem &&
-		    func_id != BPF_FUNC_msg_redirect_hash)
+		    func_id != BPF_FUNC_msg_redirect_hash &&
+		    func_id != BPF_FUNC_sk_select_reuseport)
 			goto error;
 		break;
 	case BPF_MAP_TYPE_REUSEPORT_SOCKARRAY:
@@ -3774,7 +3776,9 @@ static int check_map_func_compatibility(struct bpf_verifier_env *env,
 			goto error;
 		break;
 	case BPF_FUNC_sk_select_reuseport:
-		if (map->map_type != BPF_MAP_TYPE_REUSEPORT_SOCKARRAY)
+		if (map->map_type != BPF_MAP_TYPE_REUSEPORT_SOCKARRAY &&
+		    map->map_type != BPF_MAP_TYPE_SOCKMAP &&
+		    map->map_type != BPF_MAP_TYPE_SOCKHASH)
 			goto error;
 		break;
 	case BPF_FUNC_map_peek_elem:
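With the verifier now accepting sockmap/sockhash for bpf_sk_select_reuseport(),
an sk_reuseport program can steer incoming connections to a socket stored in a
SOCKMAP. A minimal sketch; the map layout and fixed key are illustrative, and
userspace is assumed to have inserted a (possibly listening) TCP socket at
key 0:

// SPDX-License-Identifier: GPL-2.0
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct {
        __uint(type, BPF_MAP_TYPE_SOCKMAP);
        __uint(max_entries, 1);
        __type(key, __u32);
        __type(value, __u64);
} redir_map SEC(".maps");

SEC("sk_reuseport")
int select_listener(struct sk_reuseport_md *reuse_md)
{
        __u32 key = 0;

        /* Hand the connection to the socket stored at key 0. */
        if (bpf_sk_select_reuseport(reuse_md, &redir_map, &key, 0) == 0)
                return SK_PASS;

        return SK_DROP;
}

char _license[] SEC("license") = "GPL";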
45 changes: 43 additions & 2 deletions kernel/trace/bpf_trace.c
@@ -843,6 +843,8 @@ tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_send_signal_proto;
 	case BPF_FUNC_send_signal_thread:
 		return &bpf_send_signal_thread_proto;
+	case BPF_FUNC_perf_event_read_value:
+		return &bpf_perf_event_read_value_proto;
 	default:
 		return NULL;
 	}
@@ -858,8 +860,6 @@ kprobe_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_get_stackid_proto;
 	case BPF_FUNC_get_stack:
 		return &bpf_get_stack_proto;
-	case BPF_FUNC_perf_event_read_value:
-		return &bpf_perf_event_read_value_proto;
 #ifdef CONFIG_BPF_KPROBE_OVERRIDE
 	case BPF_FUNC_override_return:
 		return &bpf_override_return_proto;
@@ -1028,6 +1028,45 @@ static const struct bpf_func_proto bpf_perf_prog_read_value_proto = {
 	.arg3_type	= ARG_CONST_SIZE,
 };
 
+BPF_CALL_4(bpf_read_branch_records, struct bpf_perf_event_data_kern *, ctx,
+	   void *, buf, u32, size, u64, flags)
+{
+#ifndef CONFIG_X86
+	return -ENOENT;
+#else
+	static const u32 br_entry_size = sizeof(struct perf_branch_entry);
+	struct perf_branch_stack *br_stack = ctx->data->br_stack;
+	u32 to_copy;
+
+	if (unlikely(flags & ~BPF_F_GET_BRANCH_RECORDS_SIZE))
+		return -EINVAL;
+
+	if (unlikely(!br_stack))
+		return -EINVAL;
+
+	if (flags & BPF_F_GET_BRANCH_RECORDS_SIZE)
+		return br_stack->nr * br_entry_size;
+
+	if (!buf || (size % br_entry_size != 0))
+		return -EINVAL;
+
+	to_copy = min_t(u32, br_stack->nr * br_entry_size, size);
+	memcpy(buf, br_stack->entries, to_copy);
+
+	return to_copy;
+#endif
+}
+
+static const struct bpf_func_proto bpf_read_branch_records_proto = {
+	.func		= bpf_read_branch_records,
+	.gpl_only	= true,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_PTR_TO_CTX,
+	.arg2_type	= ARG_PTR_TO_MEM_OR_NULL,
+	.arg3_type	= ARG_CONST_SIZE_OR_ZERO,
+	.arg4_type	= ARG_ANYTHING,
+};
+
 static const struct bpf_func_proto *
 pe_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 {
@@ -1040,6 +1079,8 @@ pe_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_get_stack_proto_tp;
 	case BPF_FUNC_perf_prog_read_value:
 		return &bpf_perf_prog_read_value_proto;
+	case BPF_FUNC_read_branch_records:
+		return &bpf_read_branch_records_proto;
 	default:
 		return tracing_func_proto(func_id, prog);
 	}
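Before this change bpf_perf_event_read_value() was wired up only for kprobe
programs; moving it into tracing_func_proto() makes it available to tracepoint,
raw tracepoint and perf_event programs too. A hedged tracepoint sketch,
assuming userspace has populated the array with perf event FDs obtained via
perf_event_open() (map name and tracepoint are illustrative):

// SPDX-License-Identifier: GPL-2.0
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct {
        __uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
        __uint(max_entries, 64);
        __type(key, int);
        __type(value, __u32);
} cycles SEC(".maps");

SEC("tracepoint/sched/sched_switch")
int on_switch(void *ctx)
{
        struct bpf_perf_event_value val = {};

        /* Read this CPU's counter, including enabled/running times so
         * userspace can scale for counter multiplexing. */
        if (!bpf_perf_event_read_value(&cycles, BPF_F_CURRENT_CPU,
                                       &val, sizeof(val)))
                bpf_printk("counter=%llu running=%llu\n",
                           val.counter, val.running);

        return 0;
}

char _license[] SEC("license") = "GPL";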
27 changes: 11 additions & 16 deletions net/core/filter.c
@@ -8620,6 +8620,7 @@ struct sock *bpf_run_sk_reuseport(struct sock_reuseport *reuse, struct sock *sk,
 BPF_CALL_4(sk_select_reuseport, struct sk_reuseport_kern *, reuse_kern,
 	   struct bpf_map *, map, void *, key, u32, flags)
 {
+	bool is_sockarray = map->map_type == BPF_MAP_TYPE_REUSEPORT_SOCKARRAY;
 	struct sock_reuseport *reuse;
 	struct sock *selected_sk;
 
@@ -8628,26 +8629,20 @@ BPF_CALL_4(sk_select_reuseport, struct sk_reuseport_kern *, reuse_kern,
 		return -ENOENT;
 
 	reuse = rcu_dereference(selected_sk->sk_reuseport_cb);
-	if (!reuse)
-		/* selected_sk is unhashed (e.g. by close()) after the
-		 * above map_lookup_elem(). Treat selected_sk has already
-		 * been removed from the map.
+	if (!reuse) {
+		/* reuseport_array has only sk with non NULL sk_reuseport_cb.
+		 * The only (!reuse) case here is - the sk has already been
+		 * unhashed (e.g. by close()), so treat it as -ENOENT.
+		 *
+		 * Other maps (e.g. sock_map) do not provide this guarantee and
+		 * the sk may never be in the reuseport group to begin with.
 		 */
-		return -ENOENT;
+		return is_sockarray ? -ENOENT : -EINVAL;
+	}
 
 	if (unlikely(reuse->reuseport_id != reuse_kern->reuseport_id)) {
-		struct sock *sk;
-
-		if (unlikely(!reuse_kern->reuseport_id))
-			/* There is a small race between adding the
-			 * sk to the map and setting the
-			 * reuse_kern->reuseport_id.
-			 * Treat it as the sk has not been added to
-			 * the bpf map yet.
-			 */
-			return -ENOENT;
+		struct sock *sk = reuse_kern->sk;
 
-		sk = reuse_kern->sk;
 		if (sk->sk_protocol != selected_sk->sk_protocol)
 			return -EPROTOTYPE;
 		else if (sk->sk_family != selected_sk->sk_family)
2 changes: 1 addition & 1 deletion net/core/skmsg.c
@@ -512,7 +512,7 @@ struct sk_psock *sk_psock_init(struct sock *sk, int node)
 	sk_psock_set_state(psock, SK_PSOCK_TX_ENABLED);
 	refcount_set(&psock->refcnt, 1);
 
-	rcu_assign_sk_user_data(sk, psock);
+	rcu_assign_sk_user_data_nocopy(sk, psock);
 	sock_hold(sk);
 
 	return psock;
