Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge pull request #1 from torvalds/master #400

Closed
wants to merge 1 commit into from
Closed

Merge pull request #1 from torvalds/master #400

wants to merge 1 commit into from

Conversation

gspu
Copy link

@gspu gspu commented Apr 1, 2017

up2date

@gspu gspu closed this Apr 1, 2017
@gspu gspu reopened this Apr 1, 2017
@gspu gspu closed this Apr 1, 2017
amery pushed a commit to linux-sunxi/linux-sunxi that referenced this pull request Apr 4, 2017
[ Upstream commit af30922 ]

If a block device is closed while iterate_bdevs() is handling it, the
following NULL pointer dereference occurs because bdev->b_disk is NULL
in bdev_get_queue(), which is called from blk_get_backing_dev_info() (in
turn called by the mapping_cap_writeback_dirty() call in
__filemap_fdatawrite_range()):

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000508
 IP: [<ffffffff81314790>] blk_get_backing_dev_info+0x10/0x20
 PGD 9e62067 PUD 9ee8067 PMD 0
 Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
 Modules linked in:
 CPU: 1 PID: 2422 Comm: sync Not tainted 4.5.0-rc7+ torvalds#400
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
 task: ffff880009f4d700 ti: ffff880009f5c000 task.ti: ffff880009f5c000
 RIP: 0010:[<ffffffff81314790>]  [<ffffffff81314790>] blk_get_backing_dev_info+0x10/0x20
 RSP: 0018:ffff880009f5fe68  EFLAGS: 00010246
 RAX: 0000000000000000 RBX: ffff88000ec17a38 RCX: ffffffff81a4e940
 RDX: 7fffffffffffffff RSI: 0000000000000000 RDI: ffff88000ec176c0
 RBP: ffff880009f5fe68 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000001 R11: 0000000000000000 R12: ffff88000ec17860
 R13: ffffffff811b25c0 R14: ffff88000ec178e0 R15: ffff88000ec17a38
 FS:  00007faee505d700(0000) GS:ffff88000fb00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: 0000000000000508 CR3: 0000000009e8a000 CR4: 00000000000006e0
 Stack:
  ffff880009f5feb8 ffffffff8112e7f5 0000000000000000 7fffffffffffffff
  0000000000000000 0000000000000000 7fffffffffffffff 0000000000000001
  ffff88000ec178e0 ffff88000ec17860 ffff880009f5fec8 ffffffff8112e81f
 Call Trace:
  [<ffffffff8112e7f5>] __filemap_fdatawrite_range+0x85/0x90
  [<ffffffff8112e81f>] filemap_fdatawrite+0x1f/0x30
  [<ffffffff811b25d6>] fdatawrite_one_bdev+0x16/0x20
  [<ffffffff811bc402>] iterate_bdevs+0xf2/0x130
  [<ffffffff811b2763>] sys_sync+0x63/0x90
  [<ffffffff815d4272>] entry_SYSCALL_64_fastpath+0x12/0x76
 Code: 0f 1f 44 00 00 48 8b 87 f0 00 00 00 55 48 89 e5 <48> 8b 80 08 05 00 00 5d
 RIP  [<ffffffff81314790>] blk_get_backing_dev_info+0x10/0x20
  RSP <ffff880009f5fe68>
 CR2: 0000000000000508
 ---[ end trace 2487336ceb3de62d ]---

The crash is easily reproducible by running the following command, if an
msleep(100) is inserted before the call to func() in iterate_devs():

 while :; do head -c1 /dev/nullb0; done > /dev/null & while :; do sync; done

Fix it by holding the bd_mutex across the func() call and only calling
func() if the bdev is opened.

Cc: [email protected]
Fixes: 5c0d6b6 ("vfs: Create function for iterating over block devices")
Reported-and-tested-by: Wei Fang <[email protected]>
Signed-off-by: Rabin Vincent <[email protected]>
Signed-off-by: Jan Kara <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
fuzeman pushed a commit to fuzeman/linux-ubuntu-zesty that referenced this pull request Apr 6, 2017
commit af30922 upstream.

If a block device is closed while iterate_bdevs() is handling it, the
following NULL pointer dereference occurs because bdev->b_disk is NULL
in bdev_get_queue(), which is called from blk_get_backing_dev_info() (in
turn called by the mapping_cap_writeback_dirty() call in
__filemap_fdatawrite_range()):

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000508
 IP: [<ffffffff81314790>] blk_get_backing_dev_info+0x10/0x20
 PGD 9e62067 PUD 9ee8067 PMD 0
 Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
 Modules linked in:
 CPU: 1 PID: 2422 Comm: sync Not tainted 4.5.0-rc7+ torvalds#400
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
 task: ffff880009f4d700 ti: ffff880009f5c000 task.ti: ffff880009f5c000
 RIP: 0010:[<ffffffff81314790>]  [<ffffffff81314790>] blk_get_backing_dev_info+0x10/0x20
 RSP: 0018:ffff880009f5fe68  EFLAGS: 00010246
 RAX: 0000000000000000 RBX: ffff88000ec17a38 RCX: ffffffff81a4e940
 RDX: 7fffffffffffffff RSI: 0000000000000000 RDI: ffff88000ec176c0
 RBP: ffff880009f5fe68 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000001 R11: 0000000000000000 R12: ffff88000ec17860
 R13: ffffffff811b25c0 R14: ffff88000ec178e0 R15: ffff88000ec17a38
 FS:  00007faee505d700(0000) GS:ffff88000fb00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: 0000000000000508 CR3: 0000000009e8a000 CR4: 00000000000006e0
 Stack:
  ffff880009f5feb8 ffffffff8112e7f5 0000000000000000 7fffffffffffffff
  0000000000000000 0000000000000000 7fffffffffffffff 0000000000000001
  ffff88000ec178e0 ffff88000ec17860 ffff880009f5fec8 ffffffff8112e81f
 Call Trace:
  [<ffffffff8112e7f5>] __filemap_fdatawrite_range+0x85/0x90
  [<ffffffff8112e81f>] filemap_fdatawrite+0x1f/0x30
  [<ffffffff811b25d6>] fdatawrite_one_bdev+0x16/0x20
  [<ffffffff811bc402>] iterate_bdevs+0xf2/0x130
  [<ffffffff811b2763>] sys_sync+0x63/0x90
  [<ffffffff815d4272>] entry_SYSCALL_64_fastpath+0x12/0x76
 Code: 0f 1f 44 00 00 48 8b 87 f0 00 00 00 55 48 89 e5 <48> 8b 80 08 05 00 00 5d
 RIP  [<ffffffff81314790>] blk_get_backing_dev_info+0x10/0x20
  RSP <ffff880009f5fe68>
 CR2: 0000000000000508
 ---[ end trace 2487336ceb3de62d ]---

The crash is easily reproducible by running the following command, if an
msleep(100) is inserted before the call to func() in iterate_devs():

 while :; do head -c1 /dev/nullb0; done > /dev/null & while :; do sync; done

Fix it by holding the bd_mutex across the func() call and only calling
func() if the bdev is opened.

Fixes: 5c0d6b6 ("vfs: Create function for iterating over block devices")
Reported-and-tested-by: Wei Fang <[email protected]>
Signed-off-by: Rabin Vincent <[email protected]>
Signed-off-by: Jan Kara <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
fuzeman pushed a commit to fuzeman/linux-ubuntu-zesty that referenced this pull request Apr 6, 2017
commit af30922 upstream.

If a block device is closed while iterate_bdevs() is handling it, the
following NULL pointer dereference occurs because bdev->b_disk is NULL
in bdev_get_queue(), which is called from blk_get_backing_dev_info() (in
turn called by the mapping_cap_writeback_dirty() call in
__filemap_fdatawrite_range()):

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000508
 IP: [<ffffffff81314790>] blk_get_backing_dev_info+0x10/0x20
 PGD 9e62067 PUD 9ee8067 PMD 0
 Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
 Modules linked in:
 CPU: 1 PID: 2422 Comm: sync Not tainted 4.5.0-rc7+ torvalds#400
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
 task: ffff880009f4d700 ti: ffff880009f5c000 task.ti: ffff880009f5c000
 RIP: 0010:[<ffffffff81314790>]  [<ffffffff81314790>] blk_get_backing_dev_info+0x10/0x20
 RSP: 0018:ffff880009f5fe68  EFLAGS: 00010246
 RAX: 0000000000000000 RBX: ffff88000ec17a38 RCX: ffffffff81a4e940
 RDX: 7fffffffffffffff RSI: 0000000000000000 RDI: ffff88000ec176c0
 RBP: ffff880009f5fe68 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000001 R11: 0000000000000000 R12: ffff88000ec17860
 R13: ffffffff811b25c0 R14: ffff88000ec178e0 R15: ffff88000ec17a38
 FS:  00007faee505d700(0000) GS:ffff88000fb00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: 0000000000000508 CR3: 0000000009e8a000 CR4: 00000000000006e0
 Stack:
  ffff880009f5feb8 ffffffff8112e7f5 0000000000000000 7fffffffffffffff
  0000000000000000 0000000000000000 7fffffffffffffff 0000000000000001
  ffff88000ec178e0 ffff88000ec17860 ffff880009f5fec8 ffffffff8112e81f
 Call Trace:
  [<ffffffff8112e7f5>] __filemap_fdatawrite_range+0x85/0x90
  [<ffffffff8112e81f>] filemap_fdatawrite+0x1f/0x30
  [<ffffffff811b25d6>] fdatawrite_one_bdev+0x16/0x20
  [<ffffffff811bc402>] iterate_bdevs+0xf2/0x130
  [<ffffffff811b2763>] sys_sync+0x63/0x90
  [<ffffffff815d4272>] entry_SYSCALL_64_fastpath+0x12/0x76
 Code: 0f 1f 44 00 00 48 8b 87 f0 00 00 00 55 48 89 e5 <48> 8b 80 08 05 00 00 5d
 RIP  [<ffffffff81314790>] blk_get_backing_dev_info+0x10/0x20
  RSP <ffff880009f5fe68>
 CR2: 0000000000000508
 ---[ end trace 2487336ceb3de62d ]---

The crash is easily reproducible by running the following command, if an
msleep(100) is inserted before the call to func() in iterate_devs():

 while :; do head -c1 /dev/nullb0; done > /dev/null & while :; do sync; done

Fix it by holding the bd_mutex across the func() call and only calling
func() if the bdev is opened.

Fixes: 5c0d6b6 ("vfs: Create function for iterating over block devices")
Reported-and-tested-by: Wei Fang <[email protected]>
Signed-off-by: Rabin Vincent <[email protected]>
Signed-off-by: Jan Kara <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
jhofstee pushed a commit to victronenergy/linux that referenced this pull request Apr 18, 2017
[ Upstream commit af30922 ]

If a block device is closed while iterate_bdevs() is handling it, the
following NULL pointer dereference occurs because bdev->b_disk is NULL
in bdev_get_queue(), which is called from blk_get_backing_dev_info() (in
turn called by the mapping_cap_writeback_dirty() call in
__filemap_fdatawrite_range()):

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000508
 IP: [<ffffffff81314790>] blk_get_backing_dev_info+0x10/0x20
 PGD 9e62067 PUD 9ee8067 PMD 0
 Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
 Modules linked in:
 CPU: 1 PID: 2422 Comm: sync Not tainted 4.5.0-rc7+ torvalds#400
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
 task: ffff880009f4d700 ti: ffff880009f5c000 task.ti: ffff880009f5c000
 RIP: 0010:[<ffffffff81314790>]  [<ffffffff81314790>] blk_get_backing_dev_info+0x10/0x20
 RSP: 0018:ffff880009f5fe68  EFLAGS: 00010246
 RAX: 0000000000000000 RBX: ffff88000ec17a38 RCX: ffffffff81a4e940
 RDX: 7fffffffffffffff RSI: 0000000000000000 RDI: ffff88000ec176c0
 RBP: ffff880009f5fe68 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000001 R11: 0000000000000000 R12: ffff88000ec17860
 R13: ffffffff811b25c0 R14: ffff88000ec178e0 R15: ffff88000ec17a38
 FS:  00007faee505d700(0000) GS:ffff88000fb00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: 0000000000000508 CR3: 0000000009e8a000 CR4: 00000000000006e0
 Stack:
  ffff880009f5feb8 ffffffff8112e7f5 0000000000000000 7fffffffffffffff
  0000000000000000 0000000000000000 7fffffffffffffff 0000000000000001
  ffff88000ec178e0 ffff88000ec17860 ffff880009f5fec8 ffffffff8112e81f
 Call Trace:
  [<ffffffff8112e7f5>] __filemap_fdatawrite_range+0x85/0x90
  [<ffffffff8112e81f>] filemap_fdatawrite+0x1f/0x30
  [<ffffffff811b25d6>] fdatawrite_one_bdev+0x16/0x20
  [<ffffffff811bc402>] iterate_bdevs+0xf2/0x130
  [<ffffffff811b2763>] sys_sync+0x63/0x90
  [<ffffffff815d4272>] entry_SYSCALL_64_fastpath+0x12/0x76
 Code: 0f 1f 44 00 00 48 8b 87 f0 00 00 00 55 48 89 e5 <48> 8b 80 08 05 00 00 5d
 RIP  [<ffffffff81314790>] blk_get_backing_dev_info+0x10/0x20
  RSP <ffff880009f5fe68>
 CR2: 0000000000000508
 ---[ end trace 2487336ceb3de62d ]---

The crash is easily reproducible by running the following command, if an
msleep(100) is inserted before the call to func() in iterate_devs():

 while :; do head -c1 /dev/nullb0; done > /dev/null & while :; do sync; done

Fix it by holding the bd_mutex across the func() call and only calling
func() if the bdev is opened.

Cc: [email protected]
Fixes: 5c0d6b6 ("vfs: Create function for iterating over block devices")
Reported-and-tested-by: Wei Fang <[email protected]>
Signed-off-by: Rabin Vincent <[email protected]>
Signed-off-by: Jan Kara <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
lategoodbye pushed a commit to chargebyte/linux that referenced this pull request May 8, 2017
commit af30922 upstream.

If a block device is closed while iterate_bdevs() is handling it, the
following NULL pointer dereference occurs because bdev->b_disk is NULL
in bdev_get_queue(), which is called from blk_get_backing_dev_info() (in
turn called by the mapping_cap_writeback_dirty() call in
__filemap_fdatawrite_range()):

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000508
 IP: [<ffffffff81314790>] blk_get_backing_dev_info+0x10/0x20
 PGD 9e62067 PUD 9ee8067 PMD 0
 Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
 Modules linked in:
 CPU: 1 PID: 2422 Comm: sync Not tainted 4.5.0-rc7+ torvalds#400
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
 task: ffff880009f4d700 ti: ffff880009f5c000 task.ti: ffff880009f5c000
 RIP: 0010:[<ffffffff81314790>]  [<ffffffff81314790>] blk_get_backing_dev_info+0x10/0x20
 RSP: 0018:ffff880009f5fe68  EFLAGS: 00010246
 RAX: 0000000000000000 RBX: ffff88000ec17a38 RCX: ffffffff81a4e940
 RDX: 7fffffffffffffff RSI: 0000000000000000 RDI: ffff88000ec176c0
 RBP: ffff880009f5fe68 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000001 R11: 0000000000000000 R12: ffff88000ec17860
 R13: ffffffff811b25c0 R14: ffff88000ec178e0 R15: ffff88000ec17a38
 FS:  00007faee505d700(0000) GS:ffff88000fb00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: 0000000000000508 CR3: 0000000009e8a000 CR4: 00000000000006e0
 Stack:
  ffff880009f5feb8 ffffffff8112e7f5 0000000000000000 7fffffffffffffff
  0000000000000000 0000000000000000 7fffffffffffffff 0000000000000001
  ffff88000ec178e0 ffff88000ec17860 ffff880009f5fec8 ffffffff8112e81f
 Call Trace:
  [<ffffffff8112e7f5>] __filemap_fdatawrite_range+0x85/0x90
  [<ffffffff8112e81f>] filemap_fdatawrite+0x1f/0x30
  [<ffffffff811b25d6>] fdatawrite_one_bdev+0x16/0x20
  [<ffffffff811bc402>] iterate_bdevs+0xf2/0x130
  [<ffffffff811b2763>] sys_sync+0x63/0x90
  [<ffffffff815d4272>] entry_SYSCALL_64_fastpath+0x12/0x76
 Code: 0f 1f 44 00 00 48 8b 87 f0 00 00 00 55 48 89 e5 <48> 8b 80 08 05 00 00 5d
 RIP  [<ffffffff81314790>] blk_get_backing_dev_info+0x10/0x20
  RSP <ffff880009f5fe68>
 CR2: 0000000000000508
 ---[ end trace 2487336ceb3de62d ]---

The crash is easily reproducible by running the following command, if an
msleep(100) is inserted before the call to func() in iterate_devs():

 while :; do head -c1 /dev/nullb0; done > /dev/null & while :; do sync; done

Fix it by holding the bd_mutex across the func() call and only calling
func() if the bdev is opened.

Fixes: 5c0d6b6 ("vfs: Create function for iterating over block devices")
Reported-and-tested-by: Wei Fang <[email protected]>
Signed-off-by: Rabin Vincent <[email protected]>
Signed-off-by: Jan Kara <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
bgly pushed a commit to powervm/ibmvscsis that referenced this pull request May 8, 2017
BugLink: http://bugs.launchpad.net/bugs/1655057

commit af30922 upstream.

If a block device is closed while iterate_bdevs() is handling it, the
following NULL pointer dereference occurs because bdev->b_disk is NULL
in bdev_get_queue(), which is called from blk_get_backing_dev_info() (in
turn called by the mapping_cap_writeback_dirty() call in
__filemap_fdatawrite_range()):

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000508
 IP: [<ffffffff81314790>] blk_get_backing_dev_info+0x10/0x20
 PGD 9e62067 PUD 9ee8067 PMD 0
 Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
 Modules linked in:
 CPU: 1 PID: 2422 Comm: sync Not tainted 4.5.0-rc7+ torvalds#400
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
 task: ffff880009f4d700 ti: ffff880009f5c000 task.ti: ffff880009f5c000
 RIP: 0010:[<ffffffff81314790>]  [<ffffffff81314790>] blk_get_backing_dev_info+0x10/0x20
 RSP: 0018:ffff880009f5fe68  EFLAGS: 00010246
 RAX: 0000000000000000 RBX: ffff88000ec17a38 RCX: ffffffff81a4e940
 RDX: 7fffffffffffffff RSI: 0000000000000000 RDI: ffff88000ec176c0
 RBP: ffff880009f5fe68 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000001 R11: 0000000000000000 R12: ffff88000ec17860
 R13: ffffffff811b25c0 R14: ffff88000ec178e0 R15: ffff88000ec17a38
 FS:  00007faee505d700(0000) GS:ffff88000fb00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: 0000000000000508 CR3: 0000000009e8a000 CR4: 00000000000006e0
 Stack:
  ffff880009f5feb8 ffffffff8112e7f5 0000000000000000 7fffffffffffffff
  0000000000000000 0000000000000000 7fffffffffffffff 0000000000000001
  ffff88000ec178e0 ffff88000ec17860 ffff880009f5fec8 ffffffff8112e81f
 Call Trace:
  [<ffffffff8112e7f5>] __filemap_fdatawrite_range+0x85/0x90
  [<ffffffff8112e81f>] filemap_fdatawrite+0x1f/0x30
  [<ffffffff811b25d6>] fdatawrite_one_bdev+0x16/0x20
  [<ffffffff811bc402>] iterate_bdevs+0xf2/0x130
  [<ffffffff811b2763>] sys_sync+0x63/0x90
  [<ffffffff815d4272>] entry_SYSCALL_64_fastpath+0x12/0x76
 Code: 0f 1f 44 00 00 48 8b 87 f0 00 00 00 55 48 89 e5 <48> 8b 80 08 05 00 00 5d
 RIP  [<ffffffff81314790>] blk_get_backing_dev_info+0x10/0x20
  RSP <ffff880009f5fe68>
 CR2: 0000000000000508
 ---[ end trace 2487336ceb3de62d ]---

The crash is easily reproducible by running the following command, if an
msleep(100) is inserted before the call to func() in iterate_devs():

 while :; do head -c1 /dev/nullb0; done > /dev/null & while :; do sync; done

Fix it by holding the bd_mutex across the func() call and only calling
func() if the bdev is opened.

Fixes: 5c0d6b6 ("vfs: Create function for iterating over block devices")
Reported-and-tested-by: Wei Fang <[email protected]>
Signed-off-by: Rabin Vincent <[email protected]>
Signed-off-by: Jan Kara <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

Signed-off-by: Tim Gardner <[email protected]>
Signed-off-by: Luis Henriques <[email protected]>
Noltari pushed a commit to Noltari/linux that referenced this pull request Jun 15, 2017
commit af30922 upstream.

If a block device is closed while iterate_bdevs() is handling it, the
following NULL pointer dereference occurs because bdev->b_disk is NULL
in bdev_get_queue(), which is called from blk_get_backing_dev_info() (in
turn called by the mapping_cap_writeback_dirty() call in
__filemap_fdatawrite_range()):

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000508
 IP: [<ffffffff81314790>] blk_get_backing_dev_info+0x10/0x20
 PGD 9e62067 PUD 9ee8067 PMD 0
 Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
 Modules linked in:
 CPU: 1 PID: 2422 Comm: sync Not tainted 4.5.0-rc7+ torvalds#400
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
 task: ffff880009f4d700 ti: ffff880009f5c000 task.ti: ffff880009f5c000
 RIP: 0010:[<ffffffff81314790>]  [<ffffffff81314790>] blk_get_backing_dev_info+0x10/0x20
 RSP: 0018:ffff880009f5fe68  EFLAGS: 00010246
 RAX: 0000000000000000 RBX: ffff88000ec17a38 RCX: ffffffff81a4e940
 RDX: 7fffffffffffffff RSI: 0000000000000000 RDI: ffff88000ec176c0
 RBP: ffff880009f5fe68 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000001 R11: 0000000000000000 R12: ffff88000ec17860
 R13: ffffffff811b25c0 R14: ffff88000ec178e0 R15: ffff88000ec17a38
 FS:  00007faee505d700(0000) GS:ffff88000fb00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: 0000000000000508 CR3: 0000000009e8a000 CR4: 00000000000006e0
 Stack:
  ffff880009f5feb8 ffffffff8112e7f5 0000000000000000 7fffffffffffffff
  0000000000000000 0000000000000000 7fffffffffffffff 0000000000000001
  ffff88000ec178e0 ffff88000ec17860 ffff880009f5fec8 ffffffff8112e81f
 Call Trace:
  [<ffffffff8112e7f5>] __filemap_fdatawrite_range+0x85/0x90
  [<ffffffff8112e81f>] filemap_fdatawrite+0x1f/0x30
  [<ffffffff811b25d6>] fdatawrite_one_bdev+0x16/0x20
  [<ffffffff811bc402>] iterate_bdevs+0xf2/0x130
  [<ffffffff811b2763>] sys_sync+0x63/0x90
  [<ffffffff815d4272>] entry_SYSCALL_64_fastpath+0x12/0x76
 Code: 0f 1f 44 00 00 48 8b 87 f0 00 00 00 55 48 89 e5 <48> 8b 80 08 05 00 00 5d
 RIP  [<ffffffff81314790>] blk_get_backing_dev_info+0x10/0x20
  RSP <ffff880009f5fe68>
 CR2: 0000000000000508
 ---[ end trace 2487336ceb3de62d ]---

The crash is easily reproducible by running the following command, if an
msleep(100) is inserted before the call to func() in iterate_devs():

 while :; do head -c1 /dev/nullb0; done > /dev/null & while :; do sync; done

Fix it by holding the bd_mutex across the func() call and only calling
func() if the bdev is opened.

Fixes: 5c0d6b6 ("vfs: Create function for iterating over block devices")
Reported-and-tested-by: Wei Fang <[email protected]>
Signed-off-by: Rabin Vincent <[email protected]>
Signed-off-by: Jan Kara <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Willy Tarreau <[email protected]>
fengguang pushed a commit to 0day-ci/linux that referenced this pull request Nov 5, 2017
When asix_suspend() is called dev->driver_priv might not have been
assigned a value, so we need to check that it's not NULL.

Found by syzkaller.

kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] PREEMPT SMP KASAN
Modules linked in:
CPU: 0 PID: 24 Comm: kworker/0:1 Not tainted 4.14.0-rc4-43422-geccacdd69a8c torvalds#400
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Workqueue: usb_hub_wq hub_event
task: ffff88006bb36300 task.stack: ffff88006bba8000
RIP: 0010:asix_suspend+0x76/0xc0 drivers/net/usb/asix_devices.c:629
RSP: 0018:ffff88006bbae718 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: ffff880061ba3b80 RCX: 1ffff1000c34d644
RDX: 0000000000000001 RSI: 0000000000000402 RDI: 0000000000000008
RBP: ffff88006bbae738 R08: 1ffff1000d775cad R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800630a8b40
R13: 0000000000000000 R14: 0000000000000402 R15: ffff880061ba3b80
FS:  0000000000000000(0000) GS:ffff88006c600000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ff33cf89000 CR3: 0000000061c0a000 CR4: 00000000000006f0
Call Trace:
 usb_suspend_interface drivers/usb/core/driver.c:1209
 usb_suspend_both+0x27f/0x7e0 drivers/usb/core/driver.c:1314
 usb_runtime_suspend+0x41/0x120 drivers/usb/core/driver.c:1852
 __rpm_callback+0x339/0xb60 drivers/base/power/runtime.c:334
 rpm_callback+0x106/0x220 drivers/base/power/runtime.c:461
 rpm_suspend+0x465/0x1980 drivers/base/power/runtime.c:596
 __pm_runtime_suspend+0x11e/0x230 drivers/base/power/runtime.c:1009
 pm_runtime_put_sync_autosuspend ./include/linux/pm_runtime.h:251
 usb_new_device+0xa37/0x1020 drivers/usb/core/hub.c:2487
 hub_port_connect drivers/usb/core/hub.c:4903
 hub_port_connect_change drivers/usb/core/hub.c:5009
 port_event drivers/usb/core/hub.c:5115
 hub_event+0x194d/0x3740 drivers/usb/core/hub.c:5195
 process_one_work+0xc7f/0x1db0 kernel/workqueue.c:2119
 worker_thread+0x221/0x1850 kernel/workqueue.c:2253
 kthread+0x3a1/0x470 kernel/kthread.c:231
 ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:431
Code: 8d 7c 24 20 48 89 fa 48 c1 ea 03 80 3c 02 00 75 5b 48 b8 00 00
00 00 00 fc ff df 4d 8b 6c 24 20 49 8d 7d 08 48 89 fa 48 c1 ea 03 <80>
3c 02 00 75 34 4d 8b 6d 08 4d 85 ed 74 0b e8 26 2b 51 fd 4c
RIP: asix_suspend+0x76/0xc0 RSP: ffff88006bbae718
---[ end trace dfc4f5649284342c ]---

Signed-off-by: Andrey Konovalov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Noltari pushed a commit to Noltari/linux that referenced this pull request Nov 9, 2017
When asix_suspend() is called dev->driver_priv might not have been
assigned a value, so we need to check that it's not NULL.

Similar issue is present in asix_resume(), this patch fixes it as well.

Found by syzkaller.

kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] PREEMPT SMP KASAN
Modules linked in:
CPU: 0 PID: 24 Comm: kworker/0:1 Not tainted 4.14.0-rc4-43422-geccacdd69a8c torvalds#400
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Workqueue: usb_hub_wq hub_event
task: ffff88006bb36300 task.stack: ffff88006bba8000
RIP: 0010:asix_suspend+0x76/0xc0 drivers/net/usb/asix_devices.c:629
RSP: 0018:ffff88006bbae718 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: ffff880061ba3b80 RCX: 1ffff1000c34d644
RDX: 0000000000000001 RSI: 0000000000000402 RDI: 0000000000000008
RBP: ffff88006bbae738 R08: 1ffff1000d775cad R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800630a8b40
R13: 0000000000000000 R14: 0000000000000402 R15: ffff880061ba3b80
FS:  0000000000000000(0000) GS:ffff88006c600000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ff33cf89000 CR3: 0000000061c0a000 CR4: 00000000000006f0
Call Trace:
 usb_suspend_interface drivers/usb/core/driver.c:1209
 usb_suspend_both+0x27f/0x7e0 drivers/usb/core/driver.c:1314
 usb_runtime_suspend+0x41/0x120 drivers/usb/core/driver.c:1852
 __rpm_callback+0x339/0xb60 drivers/base/power/runtime.c:334
 rpm_callback+0x106/0x220 drivers/base/power/runtime.c:461
 rpm_suspend+0x465/0x1980 drivers/base/power/runtime.c:596
 __pm_runtime_suspend+0x11e/0x230 drivers/base/power/runtime.c:1009
 pm_runtime_put_sync_autosuspend ./include/linux/pm_runtime.h:251
 usb_new_device+0xa37/0x1020 drivers/usb/core/hub.c:2487
 hub_port_connect drivers/usb/core/hub.c:4903
 hub_port_connect_change drivers/usb/core/hub.c:5009
 port_event drivers/usb/core/hub.c:5115
 hub_event+0x194d/0x3740 drivers/usb/core/hub.c:5195
 process_one_work+0xc7f/0x1db0 kernel/workqueue.c:2119
 worker_thread+0x221/0x1850 kernel/workqueue.c:2253
 kthread+0x3a1/0x470 kernel/kthread.c:231
 ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:431
Code: 8d 7c 24 20 48 89 fa 48 c1 ea 03 80 3c 02 00 75 5b 48 b8 00 00
00 00 00 fc ff df 4d 8b 6c 24 20 49 8d 7d 08 48 89 fa 48 c1 ea 03 <80>
3c 02 00 75 34 4d 8b 6d 08 4d 85 ed 74 0b e8 26 2b 51 fd 4c
RIP: asix_suspend+0x76/0xc0 RSP: ffff88006bbae718
---[ end trace dfc4f5649284342c ]---

Signed-off-by: Andrey Konovalov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
ddstreet pushed a commit to ddstreet/linux that referenced this pull request Nov 15, 2017
GIT 5cf2360ba6eceee13ac2e72f4c21fdd31613f59b

commit 085c17ff6bacccb45201098e71f95675e08f0e17
Author: Maciej W. Rozycki <[email protected]>
Date:   Fri Nov 10 20:05:24 2017 +0000

    .mailmap: Add Maciej W. Rozycki's Imagination e-mail address
    
    Following my recent transition from Imagination Technologies to the=20
    reincarnated MIPS company add a .mailmap mapping for my work address,
    so that `scripts/get_maintainer.pl' gets it right for past commits.
    
    Signed-off-by: Maciej W. Rozycki <[email protected]>
    Signed-off-by: Linus Torvalds <[email protected]>

commit ea0ee33988778fb73e4f45e7c73fb735787e2f32
Author: Linus Torvalds <[email protected]>
Date:   Fri Nov 10 11:19:11 2017 -0800

    Revert "x86: CPU: Fix up "cpu MHz" in /proc/cpuinfo"
    
    This reverts commit 941f5f0f6ef5338814145cf2b813cf1f98873e2f.
    
    Sadly, it turns out that we really can't just do the cross-CPU IPI to
    all CPU's to get their proper frequencies, because it's much too
    expensive on systems with lots of cores.
    
    So we'll have to revert this for now, and revisit it using a smarter
    model (probably doing one system-wide IPI at open time, and doing all
    the frequency calculations in parallel).
    
    Reported-by: WANG Chao <[email protected]>
    Reported-by: Ingo Molnar <[email protected]>
    Cc: Rafael J Wysocki <[email protected]>
    Cc: [email protected]
    Signed-off-by: Linus Torvalds <[email protected]>

commit 60fdb44a23cbc5d8db224577e14ea667d7c53478
Author: Jarkko Sakkinen <[email protected]>
Date:   Thu Nov 9 13:38:21 2017 -0800

    MAINTAINERS: update TPM driver infrastructure changes
    
    [[email protected]: alpha-sort CREDITS, per Randy]
    Link: http://lkml.kernel.org/r/[email protected]
    Signed-off-by: Jarkko Sakkinen <[email protected]>
    Cc: Marcel Selhorst <[email protected]>
    Cc: Ashley Lai <[email protected]>
    Cc: Mimi Zohar <[email protected]>
    Cc: Jarkko Sakkinen <[email protected]>
    Cc: Boris Brezillon <[email protected]>
    Cc: Borislav Petkov <[email protected]>
    Cc: Håvard Skinnemoen <[email protected]>
    Cc: Martin Kepplinger <[email protected]>
    Cc: Pavel Machek <[email protected]>
    Cc: Geert Uytterhoeven <[email protected]>
    Cc: Hans-Christian Noren Egtvedt <[email protected]>
    Cc: Arnaldo Carvalho de Melo <[email protected]>
    Cc: Gertjan van Wingerde <[email protected]>
    Cc: Greg Kroah-Hartman <[email protected]>
    Cc: David S. Miller <[email protected]>
    Cc: Mauro Carvalho Chehab <[email protected]>
    Cc: Randy Dunlap <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Linus Torvalds <[email protected]>

commit e609a6b85182f8a479d265839d3ff9aad2e67b18
Author: Arnd Bergmann <[email protected]>
Date:   Thu Nov 9 13:38:18 2017 -0800

    sysctl: add register_sysctl() dummy helper
    
    register_sysctl() has been around for five years with commit
    fea478d4101a ("sysctl: Add register_sysctl for normal sysctl users") but
    now that arm64 started using it, I ran into a compile error:
    
      arch/arm64/kernel/armv8_deprecated.c: In function 'register_insn_emulation_sysctl':
      arch/arm64/kernel/armv8_deprecated.c:257:2: error: implicit declaration of function 'register_sysctl'
    
    This adds a inline function like we already have for
    register_sysctl_paths() and register_sysctl_table().
    
    Link: http://lkml.kernel.org/r/[email protected]
    Fixes: 38b9aeb32fa7 ("arm64: Port deprecated instruction emulation to new sysctl interface")
    Signed-off-by: Arnd Bergmann <[email protected]>
    Reviewed-by: Dave Martin <[email protected]>
    Acked-by: Kees Cook <[email protected]>
    Acked-by: Will Deacon <[email protected]>
    Cc: Catalin Marinas <[email protected]>
    Cc: "Luis R. Rodriguez" <[email protected]>
    Cc: Alex Benne <[email protected]>
    Cc: Arnd Bergmann <[email protected]>
    Cc: "Eric W. Biederman" <[email protected]>
    Cc: Thomas Gleixner <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Linus Torvalds <[email protected]>

commit 75ee94b20b46459e3d29f5ac2c3af3cebdeef777
Author: Hui Wang <[email protected]>
Date:   Thu Nov 9 08:48:08 2017 +0800

    ALSA: hda - fix headset mic problem for Dell machines with alc274
    
    Confirmed with Kailang of Realtek, the pin 0x19 is for Headset Mic, and
    the pin 0x1a is for Headphone Mic, he suggested to apply
    ALC269_FIXUP_DELL1_MIC_NO_PRESENCE to fix this problem. And we
    verified applying this FIXUP can fix this problem.
    
    Cc: <[email protected]>
    Cc: Kailang Yang <[email protected]>
    Signed-off-by: Hui Wang <[email protected]>
    Signed-off-by: Takashi Iwai <[email protected]>

commit 35c55fc156d85a396a975fc17636f560fc02fd65
Author: Cong Wang <[email protected]>
Date:   Mon Nov 6 13:47:30 2017 -0800

    cls_u32: use tcf_exts_get_net() before call_rcu()
    
    Hold netns refcnt before call_rcu() and release it after
    the tcf_exts_destroy() is done.
    
    Note, on ->destroy() path we have to respect the return value
    of tcf_exts_get_net(), on other paths it should always return
    true, so we don't need to care.
    
    Cc: Lucas Bates <[email protected]>
    Cc: Jamal Hadi Salim <[email protected]>
    Cc: Jiri Pirko <[email protected]>
    Signed-off-by: Cong Wang <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit f2b751053ee9314e82c178f6ca0fee7e160fac95
Author: Cong Wang <[email protected]>
Date:   Mon Nov 6 13:47:29 2017 -0800

    cls_tcindex: use tcf_exts_get_net() before call_rcu()
    
    Hold netns refcnt before call_rcu() and release it after
    the tcf_exts_destroy() is done.
    
    Note, on ->destroy() path we have to respect the return value
    of tcf_exts_get_net(), on other paths it should always return
    true, so we don't need to care.
    
    Cc: Lucas Bates <[email protected]>
    Cc: Jamal Hadi Salim <[email protected]>
    Cc: Jiri Pirko <[email protected]>
    Signed-off-by: Cong Wang <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit 96585063a27f0704dcf7a09f8b78edd6a8973965
Author: Cong Wang <[email protected]>
Date:   Mon Nov 6 13:47:28 2017 -0800

    cls_rsvp: use tcf_exts_get_net() before call_rcu()
    
    Hold netns refcnt before call_rcu() and release it after
    the tcf_exts_destroy() is done.
    
    Note, on ->destroy() path we have to respect the return value
    of tcf_exts_get_net(), on other paths it should always return
    true, so we don't need to care.
    
    Cc: Lucas Bates <[email protected]>
    Cc: Jamal Hadi Salim <[email protected]>
    Cc: Jiri Pirko <[email protected]>
    Signed-off-by: Cong Wang <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit 3fd51de5e3ba447624a08a8ba29f90d94f0fe909
Author: Cong Wang <[email protected]>
Date:   Mon Nov 6 13:47:27 2017 -0800

    cls_route: use tcf_exts_get_net() before call_rcu()
    
    Hold netns refcnt before call_rcu() and release it after
    the tcf_exts_destroy() is done.
    
    Note, on ->destroy() path we have to respect the return value
    of tcf_exts_get_net(), on other paths it should always return
    true, so we don't need to care.
    
    Cc: Lucas Bates <[email protected]>
    Cc: Jamal Hadi Salim <[email protected]>
    Cc: Jiri Pirko <[email protected]>
    Signed-off-by: Cong Wang <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit 57767e785321a427b8cdd282db2b8b33cd218ffa
Author: Cong Wang <[email protected]>
Date:   Mon Nov 6 13:47:26 2017 -0800

    cls_matchall: use tcf_exts_get_net() before call_rcu()
    
    Hold netns refcnt before call_rcu() and release it after
    the tcf_exts_destroy() is done.
    
    Cc: Lucas Bates <[email protected]>
    Cc: Jamal Hadi Salim <[email protected]>
    Cc: Jiri Pirko <[email protected]>
    Signed-off-by: Cong Wang <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit d5f984f5af1d926bc9c7a7f90e7a1e1e313a8ba7
Author: Cong Wang <[email protected]>
Date:   Mon Nov 6 13:47:25 2017 -0800

    cls_fw: use tcf_exts_get_net() before call_rcu()
    
    Hold netns refcnt before call_rcu() and release it after
    the tcf_exts_destroy() is done.
    
    Note, on ->destroy() path we have to respect the return value
    of tcf_exts_get_net(), on other paths it should always return
    true, so we don't need to care.
    
    Cc: Lucas Bates <[email protected]>
    Cc: Jamal Hadi Salim <[email protected]>
    Cc: Jiri Pirko <[email protected]>
    Signed-off-by: Cong Wang <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit 0dadc117ac8bc78d8144e862ac8ad23f342f9ea8
Author: Cong Wang <[email protected]>
Date:   Mon Nov 6 13:47:24 2017 -0800

    cls_flower: use tcf_exts_get_net() before call_rcu()
    
    Hold netns refcnt before call_rcu() and release it after
    the tcf_exts_destroy() is done.
    
    Note, on ->destroy() path we have to respect the return value
    of tcf_exts_get_net(), on other paths it should always return
    true, so we don't need to care.
    
    Cc: Lucas Bates <[email protected]>
    Cc: Jamal Hadi Salim <[email protected]>
    Cc: Jiri Pirko <[email protected]>
    Signed-off-by: Cong Wang <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit 22f7cec93f0af86c4b66bf34a977da9d7cef076e
Author: Cong Wang <[email protected]>
Date:   Mon Nov 6 13:47:23 2017 -0800

    cls_flow: use tcf_exts_get_net() before call_rcu()
    
    Hold netns refcnt before call_rcu() and release it after
    the tcf_exts_destroy() is done.
    
    Note, on ->destroy() path we have to respect the return value
    of tcf_exts_get_net(), on other paths it should always return
    true, so we don't need to care.
    
    Cc: Lucas Bates <[email protected]>
    Cc: Jamal Hadi Salim <[email protected]>
    Cc: Jiri Pirko <[email protected]>
    Signed-off-by: Cong Wang <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit ed1481681414e4d4264ad46864d5c1da5ff6ccb1
Author: Cong Wang <[email protected]>
Date:   Mon Nov 6 13:47:22 2017 -0800

    cls_cgroup: use tcf_exts_get_net() before call_rcu()
    
    Hold netns refcnt before call_rcu() and release it after
    the tcf_exts_destroy() is done.
    
    Note, on ->destroy() path we have to respect the return value
    of tcf_exts_get_net(), on other paths it should always return
    true, so we don't need to care.
    
    Cc: Lucas Bates <[email protected]>
    Cc: Jamal Hadi Salim <[email protected]>
    Cc: Jiri Pirko <[email protected]>
    Signed-off-by: Cong Wang <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit aae2c35ec89252639a97769fa72dbbf8f1cc3107
Author: Cong Wang <[email protected]>
Date:   Mon Nov 6 13:47:21 2017 -0800

    cls_bpf: use tcf_exts_get_net() before call_rcu()
    
    Hold netns refcnt before call_rcu() and release it after
    the tcf_exts_destroy() is done.
    
    Note, on ->destroy() path we have to respect the return value
    of tcf_exts_get_net(), on other paths it should always return
    true, so we don't need to care.
    
    Cc: Lucas Bates <[email protected]>
    Cc: Jamal Hadi Salim <[email protected]>
    Cc: Jiri Pirko <[email protected]>
    Signed-off-by: Cong Wang <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit 0b2a59894b7657fab46b50f176bd772aa495044f
Author: Cong Wang <[email protected]>
Date:   Mon Nov 6 13:47:20 2017 -0800

    cls_basic: use tcf_exts_get_net() before call_rcu()
    
    Hold netns refcnt before call_rcu() and release it after
    the tcf_exts_destroy() is done.
    
    Note, on ->destroy() path we have to respect the return value
    of tcf_exts_get_net(), on other paths it should always return
    true, so we don't need to care.
    
    Cc: Lucas Bates <[email protected]>
    Cc: Jamal Hadi Salim <[email protected]>
    Cc: Jiri Pirko <[email protected]>
    Signed-off-by: Cong Wang <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit e4b95c41df36befcfd117210900cd790bc2cd048
Author: Cong Wang <[email protected]>
Date:   Mon Nov 6 13:47:19 2017 -0800

    net_sched: introduce tcf_exts_get_net() and tcf_exts_put_net()
    
    Instead of holding netns refcnt in tc actions, we can minimize
    the holding time by saving it in struct tcf_exts instead. This
    means we can just hold netns refcnt right before call_rcu() and
    release it after tcf_exts_destroy() is done.
    
    However, because on netns cleanup path we call tcf_proto_destroy()
    too, obviously we can not hold netns for a zero refcnt, in this
    case we have to do cleanup synchronously. It is fine for RCU too,
    the caller cleanup_net() already waits for a grace period.
    
    For other cases, refcnt is non-zero and we can safely grab it as
    normal and release it after we are done.
    
    This patch provides two new API for each filter to use:
    tcf_exts_get_net() and tcf_exts_put_net(). And all filters now can
    use the following pattern:
    
    void __destroy_filter() {
      tcf_exts_destroy();
      tcf_exts_put_net();  // <== release netns refcnt
      kfree();
    }
    void some_work() {
      rtnl_lock();
      __destroy_filter();
      rtnl_unlock();
    }
    void some_rcu_callback() {
      tcf_queue_work(some_work);
    }
    
    if (tcf_exts_get_net())  // <== hold netns refcnt
      call_rcu(some_rcu_callback);
    else
      __destroy_filter();
    
    Cc: Lucas Bates <[email protected]>
    Cc: Jamal Hadi Salim <[email protected]>
    Cc: Jiri Pirko <[email protected]>
    Signed-off-by: Cong Wang <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit c7e460ce55724d4e4e22d3126e5c47273819c53a
Author: Cong Wang <[email protected]>
Date:   Mon Nov 6 13:47:18 2017 -0800

    Revert "net_sched: hold netns refcnt for each action"
    
    This reverts commit ceffcc5e254b450e6159f173e4538215cebf1b59.
    If we hold that refcnt, the netns can never be destroyed until
    all actions are destroyed by user, this breaks our netns design
    which we expect all actions are destroyed when we destroy the
    whole netns.
    
    Cc: Lucas Bates <[email protected]>
    Cc: Jamal Hadi Salim <[email protected]>
    Cc: Jiri Pirko <[email protected]>
    Signed-off-by: Cong Wang <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit 8f5624629105589bcc23d0e51cc01bd8103d09a5
Author: Andrey Konovalov <[email protected]>
Date:   Mon Nov 6 13:26:46 2017 +0100

    net: usb: asix: fill null-ptr-deref in asix_suspend
    
    When asix_suspend() is called dev->driver_priv might not have been
    assigned a value, so we need to check that it's not NULL.
    
    Similar issue is present in asix_resume(), this patch fixes it as well.
    
    Found by syzkaller.
    
    kasan: CONFIG_KASAN_INLINE enabled
    kasan: GPF could be caused by NULL-ptr deref or user memory access
    general protection fault: 0000 [#1] PREEMPT SMP KASAN
    Modules linked in:
    CPU: 0 PID: 24 Comm: kworker/0:1 Not tainted 4.14.0-rc4-43422-geccacdd69a8c #400
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
    Workqueue: usb_hub_wq hub_event
    task: ffff88006bb36300 task.stack: ffff88006bba8000
    RIP: 0010:asix_suspend+0x76/0xc0 drivers/net/usb/asix_devices.c:629
    RSP: 0018:ffff88006bbae718 EFLAGS: 00010202
    RAX: dffffc0000000000 RBX: ffff880061ba3b80 RCX: 1ffff1000c34d644
    RDX: 0000000000000001 RSI: 0000000000000402 RDI: 0000000000000008
    RBP: ffff88006bbae738 R08: 1ffff1000d775cad R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800630a8b40
    R13: 0000000000000000 R14: 0000000000000402 R15: ffff880061ba3b80
    FS:  0000000000000000(0000) GS:ffff88006c600000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007ff33cf89000 CR3: 0000000061c0a000 CR4: 00000000000006f0
    Call Trace:
     usb_suspend_interface drivers/usb/core/driver.c:1209
     usb_suspend_both+0x27f/0x7e0 drivers/usb/core/driver.c:1314
     usb_runtime_suspend+0x41/0x120 drivers/usb/core/driver.c:1852
     __rpm_callback+0x339/0xb60 drivers/base/power/runtime.c:334
     rpm_callback+0x106/0x220 drivers/base/power/runtime.c:461
     rpm_suspend+0x465/0x1980 drivers/base/power/runtime.c:596
     __pm_runtime_suspend+0x11e/0x230 drivers/base/power/runtime.c:1009
     pm_runtime_put_sync_autosuspend ./include/linux/pm_runtime.h:251
     usb_new_device+0xa37/0x1020 drivers/usb/core/hub.c:2487
     hub_port_connect drivers/usb/core/hub.c:4903
     hub_port_connect_change drivers/usb/core/hub.c:5009
     port_event drivers/usb/core/hub.c:5115
     hub_event+0x194d/0x3740 drivers/usb/core/hub.c:5195
     process_one_work+0xc7f/0x1db0 kernel/workqueue.c:2119
     worker_thread+0x221/0x1850 kernel/workqueue.c:2253
     kthread+0x3a1/0x470 kernel/kthread.c:231
     ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:431
    Code: 8d 7c 24 20 48 89 fa 48 c1 ea 03 80 3c 02 00 75 5b 48 b8 00 00
    00 00 00 fc ff df 4d 8b 6c 24 20 49 8d 7d 08 48 89 fa 48 c1 ea 03 <80>
    3c 02 00 75 34 4d 8b 6d 08 4d 85 ed 74 0b e8 26 2b 51 fd 4c
    RIP: asix_suspend+0x76/0xc0 RSP: ffff88006bbae718
    ---[ end trace dfc4f5649284342c ]---
    
    Signed-off-by: Andrey Konovalov <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit 1a8e6b48fbf534028ce4031d0d035e7e72779cef
Author: David S. Miller <[email protected]>
Date:   Thu Nov 9 09:21:44 2017 +0900

    Revert "net: usb: asix: fill null-ptr-deref in asix_suspend"
    
    This reverts commit baedf68a068ca29624f241426843635920f16e1d.
    
    There is an updated version of this fix which covers
    the problem more thoroughly.
    
    Signed-off-by: David S. Miller <[email protected]>

commit 87df26175e67c26ccdd3a002fbbb8cde78e28a19
Author: Jiri Kosina <[email protected]>
Date:   Wed Nov 8 21:18:18 2017 +0100

    x86/mm: Unbreak modules that rely on external PAGE_KERNEL availability
    
    Commit 7744ccdbc16f0 ("x86/mm: Add Secure Memory Encryption (SME)
    support") as a side-effect made PAGE_KERNEL all of a sudden unavailable
    to modules which can't make use of EXPORT_SYMBOL_GPL() symbols.
    
    This is because once SME is enabled, sme_me_mask (which is introduced as
    EXPORT_SYMBOL_GPL) makes its way to PAGE_KERNEL through _PAGE_ENC,
    causing imminent build failure for all the modules which make use of all
    the EXPORT-SYMBOL()-exported API (such as vmap(), __vmalloc(),
    remap_pfn_range(), ...).
    
    Exporting (as EXPORT_SYMBOL()) interfaces (and having done so for ages)
    that take pgprot_t argument, while making it impossible to -- all of a
    sudden -- pass PAGE_KERNEL to it, feels rather incosistent.
    
    Restore the original behavior and make it possible to pass PAGE_KERNEL
    to all its EXPORT_SYMBOL() consumers.
    
    [ This is all so not wonderful. We shouldn't need that "sme_me_mask"
      access at all in all those places that really don't care about that
      level of detail, and just want _PAGE_KERNEL or whatever.
    
      We have some similar issues with _PAGE_CACHE_WP and _PAGE_NOCACHE,
      both of which hide a "cachemode2protval()" call, and which also ends
      up using another EXPORT_SYMBOL(), but at least that only triggers for
      the much more rare cases.
    
      Maybe we could move these dynamic page table bits to be generated much
      deeper down in the VM layer, instead of hiding them in the macros that
      everybody uses.
    
      So this all would merit some cleanup. But not today.   - Linus ]
    
    Cc: Tom Lendacky <[email protected]>
    Signed-off-by: Jiri Kosina <[email protected]>
    Despised-by: Thomas Gleixner <[email protected]>
    Signed-off-by: Linus Torvalds <[email protected]>

commit f7dc4c9a855a13dbb33294c9fc94f17af03f6291
Author: John Johansen <[email protected]>
Date:   Wed Nov 8 08:09:52 2017 -0800

    apparmor: fix off-by-one comparison on MAXMAPPED_SIG
    
    This came in yesterday, and I have verified our regression tests
    were missing this and it can cause an oops. Please apply.
    
    There is a an off-by-one comparision on sig against MAXMAPPED_SIG
    that can lead to a read outside the sig_map array if sig
    is MAXMAPPED_SIG. Fix this.
    
    Verified that the check is an out of bounds case that can cause an oops.
    
    Revised: add comparison fix to second case
    Fixes: cd1dbf76b23d ("apparmor: add the ability to mediate signals")
    Signed-off-by: Colin Ian King <[email protected]>
    Signed-off-by: John Johansen <[email protected]>
    Signed-off-by: Linus Torvalds <[email protected]>

commit 423a8a942e95493b73228ba6a3f176dcc7f35fa9
Author: Chris Wilson <[email protected]>
Date:   Mon Nov 6 21:11:28 2017 +0000

    drm/i915: Deconstruct struct sgt_dma initialiser
    
    gcc-4.4 complains about:
    
            struct sgt_dma iter = {
                    .sg = vma->pages->sgl,
                    .dma = sg_dma_address(iter.sg),
                    .max = iter.dma + iter.sg->length,
            };
    
    drivers/gpu/drm/i915/i915_gem_gtt.c: In function ‘gen8_ppgtt_insert_4lvl’:
    drivers/gpu/drm/i915/i915_gem_gtt.c:938: error: ‘iter.sg’ is used uninitialized in this function
    drivers/gpu/drm/i915/i915_gem_gtt.c:939: error: ‘iter.dma’ is used uninitialized in this function
    
    and worse generates invalid code that triggers a GPF:
    
    BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
    IP: gen8_ppgtt_insert_4lvl+0x1b/0x1e0 [i915]
    PGD 0
    Oops: 0000 [#1] SMP
    Modules linked in: snd_aloop nf_conntrack_ipv6 nf_defrag_ipv6 nf_log_ipv6 ip6table_filter ip6_tables ctr ccm xt_state nf_log_ipv4
    nf_log_common xt_LOG xt_limit xt_recent xt_owner xt_addrtype iptable_filter ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat
    nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c ip_tables dm_mod vhost_net macvtap macvlan vhost tun kvm_intel kvm
    irqbypass uas usb_storage hid_multitouch btusb btrtl uvcvideo videobuf2_v4l2 videobuf2_core videodev media videobuf2_vmalloc videobuf2_memops
    sg ppdev dell_wmi sparse_keymap mei_wdt sd_mod iTCO_wdt iTCO_vendor_support rtsx_pci_ms memstick rtsx_pci_sdmmc mmc_core dell_smm_hwmon hwmon
    dell_laptop dell_smbios dcdbas joydev input_leds hci_uart btintel btqca btbcm bluetooth parport_pc parport i2c_hid
      intel_lpss_acpi intel_lpss pcspkr wmi int3400_thermal acpi_thermal_rel dell_rbtn mei_me mei snd_hda_codec_hdmi snd_hda_codec_realtek
    snd_hda_codec_generic ahci libahci acpi_pad xhci_pci xhci_hcd snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device
    snd_pcm snd_timer snd soundcore int3403_thermal arc4 e1000e ptp pps_core i2c_i801 iwlmvm mac80211 rtsx_pci iwlwifi cfg80211 rfkill
    intel_pch_thermal processor_thermal_device int340x_thermal_zone intel_soc_dts_iosf i915 video fjes
    CPU: 2 PID: 2408 Comm: X Not tainted 4.10.0-rc5+ #1
    Hardware name: Dell Inc. Latitude E7470/0T6HHJ, BIOS 1.11.3 11/09/2016
    task: ffff880219fe4740 task.stack: ffffc90005f98000
    RIP: 0010:gen8_ppgtt_insert_4lvl+0x1b/0x1e0 [i915]
    RSP: 0018:ffffc90005f9b8c8 EFLAGS: 00010246
    RAX: 0000000000000000 RBX: ffff8802167d8000 RCX: 0000000000000001
    RDX: 00000000ffff7000 RSI: ffff880219f94140 RDI: ffff880228444000
    RBP: ffffc90005f9b948 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000080
    R13: 0000000000000001 R14: ffffc90005f9bcd7 R15: ffff88020c9a83c0
    FS:  00007fb53e1ee920(0000) GS:ffff88024dd00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000010 CR3: 000000022ef95000 CR4: 00000000003406e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
      ppgtt_bind_vma+0x40/0x50 [i915]
      i915_vma_bind+0xcb/0x1c0 [i915]
      __i915_vma_do_pin+0x6e/0xd0 [i915]
      i915_gem_execbuffer_reserve_vma+0x162/0x1d0 [i915]
      i915_gem_execbuffer_reserve+0x4fc/0x510 [i915]
      ? __kmalloc+0x134/0x250
      ? i915_gem_wait_for_error+0x25/0x100 [i915]
      ? i915_gem_wait_for_error+0x25/0x100 [i915]
      i915_gem_do_execbuffer+0x2df/0xa00 [i915]
      ? drm_malloc_gfp.clone.0+0x42/0x80 [i915]
      ? path_put+0x22/0x30
      ? __check_object_size+0x62/0x1f0
      ? terminate_walk+0x44/0x90
      i915_gem_execbuffer2+0x95/0x1e0 [i915]
      drm_ioctl+0x243/0x490
      ? handle_pte_fault+0x1d7/0x220
      ? i915_gem_do_execbuffer+0xa00/0xa00 [i915]
      ? handle_mm_fault+0x10d/0x2a0
      vfs_ioctl+0x18/0x30
      do_vfs_ioctl+0x14b/0x3f0
      SyS_ioctl+0x92/0xa0
      entry_SYSCALL_64_fastpath+0x1a/0xa9
    RIP: 0033:0x7fb53b4fcb77
    RSP: 002b:00007ffe0c572898 EFLAGS: 00003246 ORIG_RAX: 0000000000000010
    RAX: ffffffffffffffda RBX: 00007fb53e17c038 RCX: 00007fb53b4fcb77
    RDX: 00007ffe0c572900 RSI: 0000000040406469 RDI: 000000000000000b
    RBP: 00007fb5376d67e0 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000028 R11: 0000000000003246 R12: 0000000000000000
    R13: 0000000000000000 R14: 000055eecb314d00 R15: 000055eecb315460
    Code: 0f 84 5d ff ff ff eb a2 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec 58 0f 1f 44 00 00 31 c0 89 4d b0 <4c>
    8b 60 10 44 8b 70 0c 48 89 d0 4c 8b 2e 48 c1 e8 27 25 ff 01
    RIP: gen8_ppgtt_insert_4lvl+0x1b/0x1e0 [i915] RSP: ffffc90005f9b8c8
    CR2: 0000000000000010
    
    Recent gccs, such as 4.9, 6.3 or 7.2, do not generate the warning nor do
    they explode on use. If we manually create the struct using locals from
    the stack, this should eliminate this issue, and does not alter code
    generation with gcc-7.2.
    
    Fixes: 894ccebee2b0 ("drm/i915: Micro-optimise gen8_ppgtt_insert_entries()")
    Reported-by: Kelly French <[email protected]>
    Signed-off-by: Chris Wilson <[email protected]>
    Cc: Kelly French <[email protected]>
    Cc: Mika Kuoppala <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Tested-by: Kelly French <[email protected]>
    Reviewed-by: Tvrtko Ursulin <[email protected]>
    (cherry picked from commit 5684514ba9dc6d7aa932cc53d97d866b2386221f)
    Signed-off-by: Rodrigo Vivi <[email protected]>

commit 40a4884512d5bf9ed3a9cb162b8a7ec3306b6cfc
Author: Tvrtko Ursulin <[email protected]>
Date:   Tue Oct 31 10:23:25 2017 +0000

    drm/i915: Reject unknown syncobj flags
    
    We have to reject unknown flags for uAPI considerations, and also
    because the curent implementation limits their i915 storage space
    to two bits.
    
    v2: (Chris Wilson)
     * Fix fail in ABI check.
     * Added unknown flags and BUILD_BUG_ON.
    
    v3:
     * Use ARCH_KMALLOC_MINALIGN instead of alignof. (Chris Wilson)
    
    Signed-off-by: Tvrtko Ursulin <[email protected]>
    Fixes: cf6e7bac6357 ("drm/i915: Add support for drm syncobjs")
    Cc: Jason Ekstrand <[email protected]>
    Cc: Chris Wilson <[email protected]>
    Cc: Jani Nikula <[email protected]>
    Cc: Joonas Lahtinen <[email protected]>
    Cc: Rodrigo Vivi <[email protected]>
    Cc: David Airlie <[email protected]>
    Cc: [email protected]
    Cc: [email protected]
    Reviewed-by: Chris Wilson <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    (cherry picked from commit ebcaa1ff8b59097805d548fe7a676f194625c033)
    Signed-off-by: Rodrigo Vivi <[email protected]>

commit b084116f8587b222a2c5ef6dcd846f40f24b9420
Author: Oswald Buddenhagen <[email protected]>
Date:   Sun Oct 29 16:27:20 2017 +0100

    MIPS: AR7: Ensure that serial ports are properly set up
    
    Without UPF_FIXED_TYPE, the data from the PORT_AR7 uart_config entry is
    never copied, resulting in a dead port.
    
    Fixes: 154615d55459 ("MIPS: AR7: Use correct UART port type")
    Signed-off-by: Oswald Buddenhagen <[email protected]>
    [jonas.gorski: add Fixes tag]
    Signed-off-by: Jonas Gorski <[email protected]>
    Reviewed-by: Florian Fainelli <[email protected]>
    Cc: Ralf Baechle <[email protected]>
    Cc: Greg Kroah-Hartman <[email protected]>
    Cc: Yoshihiro YUNOMAE <[email protected]>
    Cc: Nicolas Schichan <[email protected]>
    Cc: Oswald Buddenhagen <[email protected]>
    Cc: [email protected]
    Cc: [email protected]
    Cc: <[email protected]>
    Patchwork: https://patchwork.linux-mips.org/patch/17543/
    Signed-off-by: James Hogan <[email protected]>

commit 6b7be529634bbfbf6395f23217a66d731fbed0a0
Author: Bjorn Helgaas <[email protected]>
Date:   Wed Nov 8 08:49:49 2017 -0600

    MAINTAINERS: Add Lorenzo Pieralisi for PCI host bridge drivers
    
    Add Lorenzo Pieralisi as maintainer for PCI native host bridge drivers and
    the endpoint driver framework.
    
    Signed-off-by: Bjorn Helgaas <[email protected]>
    Acked-by: Lorenzo Pieralisi <[email protected]>

commit 624f5ab8720b3371367327a822c267699c1823b8
Author: Eric Biggers <[email protected]>
Date:   Tue Nov 7 22:29:02 2017 +0000

    KEYS: fix NULL pointer dereference during ASN.1 parsing [ver #2]
    
    syzkaller reported a NULL pointer dereference in asn1_ber_decoder().  It
    can be reproduced by the following command, assuming
    CONFIG_PKCS7_TEST_KEY=y:
    
            keyctl add pkcs7_test desc '' @s
    
    The bug is that if the data buffer is empty, an integer underflow occurs
    in the following check:
    
            if (unlikely(dp >= datalen - 1))
                    goto data_overrun_error;
    
    This results in the NULL data pointer being dereferenced.
    
    Fix it by checking for 'datalen - dp < 2' instead.
    
    Also fix the similar check for 'dp >= datalen - n' later in the same
    function.  That one possibly could result in a buffer overread.
    
    The NULL pointer dereference was reproducible using the "pkcs7_test" key
    type but not the "asymmetric" key type because the "asymmetric" key type
    checks for a 0-length payload before calling into the ASN.1 decoder but
    the "pkcs7_test" key type does not.
    
    The bug report was:
    
        BUG: unable to handle kernel NULL pointer dereference at           (null)
        IP: asn1_ber_decoder+0x17f/0xe60 lib/asn1_decoder.c:233
        PGD 7b708067 P4D 7b708067 PUD 7b6ee067 PMD 0
        Oops: 0000 [#1] SMP
        Modules linked in:
        CPU: 0 PID: 522 Comm: syz-executor1 Not tainted 4.14.0-rc8 #7
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.3-20171021_125229-anatol 04/01/2014
        task: ffff9b6b3798c040 task.stack: ffff9b6b37970000
        RIP: 0010:asn1_ber_decoder+0x17f/0xe60 lib/asn1_decoder.c:233
        RSP: 0018:ffff9b6b37973c78 EFLAGS: 00010216
        RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000000021c
        RDX: ffffffff814a04ed RSI: ffffb1524066e000 RDI: ffffffff910759e0
        RBP: ffff9b6b37973d60 R08: 0000000000000001 R09: ffff9b6b3caa4180
        R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002
        R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
        FS:  00007f10ed1f2700(0000) GS:ffff9b6b3ea00000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 0000000000000000 CR3: 000000007b6f3000 CR4: 00000000000006f0
        Call Trace:
         pkcs7_parse_message+0xee/0x240 crypto/asymmetric_keys/pkcs7_parser.c:139
         verify_pkcs7_signature+0x33/0x180 certs/system_keyring.c:216
         pkcs7_preparse+0x41/0x70 crypto/asymmetric_keys/pkcs7_key_type.c:63
         key_create_or_update+0x180/0x530 security/keys/key.c:855
         SYSC_add_key security/keys/keyctl.c:122 [inline]
         SyS_add_key+0xbf/0x250 security/keys/keyctl.c:62
         entry_SYSCALL_64_fastpath+0x1f/0xbe
        RIP: 0033:0x4585c9
        RSP: 002b:00007f10ed1f1bd8 EFLAGS: 00000216 ORIG_RAX: 00000000000000f8
        RAX: ffffffffffffffda RBX: 00007f10ed1f2700 RCX: 00000000004585c9
        RDX: 0000000020000000 RSI: 0000000020008ffb RDI: 0000000020008000
        RBP: 0000000000000000 R08: ffffffffffffffff R09: 0000000000000000
        R10: 0000000000000000 R11: 0000000000000216 R12: 00007fff1b2260ae
        R13: 00007fff1b2260af R14: 00007f10ed1f2700 R15: 0000000000000000
        Code: dd ca ff 48 8b 45 88 48 83 e8 01 4c 39 f0 0f 86 a8 07 00 00 e8 53 dd ca ff 49 8d 46 01 48 89 85 58 ff ff ff 48 8b 85 60 ff ff ff <42> 0f b6 0c 30 89 c8 88 8d 75 ff ff ff 83 e0 1f 89 8d 28 ff ff
        RIP: asn1_ber_decoder+0x17f/0xe60 lib/asn1_decoder.c:233 RSP: ffff9b6b37973c78
        CR2: 0000000000000000
    
    Fixes: 42d5ec27f873 ("X.509: Add an ASN.1 decoder")
    Reported-by: syzbot <[email protected]>
    Cc: <[email protected]> # v3.7+
    Signed-off-by: Eric Biggers <[email protected]>
    Signed-off-by: David Howells <[email protected]>
    Signed-off-by: James Morris <[email protected]>

commit e6b03ab63b4d270e0249f96536fde632409dc1dc
Author: Jonas Gorski <[email protected]>
Date:   Sun Oct 29 16:27:19 2017 +0100

    MIPS: AR7: Defer registration of GPIO
    
    When called from prom init code, ar7_gpio_init() will fail as it will
    call gpiochip_add() which relies on a working kmalloc() to alloc
    the gpio_desc array and kmalloc is not useable yet at prom init time.
    
    Move ar7_gpio_init() to ar7_register_devices() (a device_initcall)
    where kmalloc works.
    
    Fixes: 14e85c0e69d5 ("gpio: remove gpio_descs global array")
    Signed-off-by: Jonas Gorski <[email protected]>
    Reviewed-by: Florian Fainelli <[email protected]>
    Cc: Ralf Baechle <[email protected]>
    Cc: Greg Kroah-Hartman <[email protected]>
    Cc: Yoshihiro YUNOMAE <[email protected]>
    Cc: Nicolas Schichan <[email protected]>
    Cc: [email protected]
    Cc: [email protected]
    Cc: <[email protected]> # 3.19+
    Patchwork: https://patchwork.linux-mips.org/patch/17542/
    Signed-off-by: James Hogan <[email protected]>

commit 0de0add10e587effa880c741c9413c874f16be91
Author: Kristian Evensen <[email protected]>
Date:   Tue Nov 7 13:47:56 2017 +0100

    qmi_wwan: Add missing skb_reset_mac_header-call
    
    When we receive a packet on a QMI device in raw IP mode, we should call
    skb_reset_mac_header() to ensure that skb->mac_header contains a valid
    offset in the packet. While it shouldn't really matter, the packets have
    no MAC header and the interface is configured as-such, it seems certain
    parts of the network stack expects a "good" value in skb->mac_header.
    
    Without the skb_reset_mac_header() call added in this patch, for example
    shaping traffic (using tc) triggers the following oops on the first
    received packet:
    
    [  303.642957] skbuff: skb_under_panic: text:8f137918 len:177 put:67 head:8e4b0f00 data:8e4b0eff tail:0x8e4b0fb0 end:0x8e4b1520 dev:wwan0
    [  303.655045] Kernel bug detected[#1]:
    [  303.658622] CPU: 1 PID: 1002 Comm: logd Not tainted 4.9.58 #0
    [  303.664339] task: 8fdf05e0 task.stack: 8f15c000
    [  303.668844] $ 0   : 00000000 00000001 0000007a 00000000
    [  303.674062] $ 4   : 8149a2fc 8149a2fc 8149ce20 00000000
    [  303.679284] $ 8   : 00000030 3878303a 31623465 20303235
    [  303.684510] $12   : ded731e3 2626a277 00000000 03bd0000
    [  303.689747] $16   : 8ef62b40 00000043 8f137918 804db5fc
    [  303.694978] $20   : 00000001 00000004 8fc13800 00000003
    [  303.700215] $24   : 00000001 8024ab10
    [  303.705442] $28   : 8f15c000 8fc19cf0 00000043 802cc920
    [  303.710664] Hi    : 00000000
    [  303.713533] Lo    : 74e58000
    [  303.716436] epc   : 802cc920 skb_panic+0x58/0x5c
    [  303.721046] ra    : 802cc920 skb_panic+0x58/0x5c
    [  303.725639] Status: 11007c03 KERNEL EXL IE
    [  303.729823] Cause : 50800024 (ExcCode 09)
    [  303.733817] PrId  : 0001992f (MIPS 1004Kc)
    [  303.737892] Modules linked in: rt2800pci rt2800mmio rt2800lib qcserial ppp_async option usb_wwan rt2x00pci rt2x00mmio rt2x00lib rndis_host qmi_wwan ppp_generic nf_nat_pptp nf_conntrack_pptp nf_conntrack_ipv6 mt76x2i
    Process logd (pid: 1002, threadinfo=8f15c000, task=8fdf05e0, tls=77b3eee4)
    [  303.962509] Stack : 00000000 80408990 8f137918 000000b1 00000043 8e4b0f00 8e4b0eff 8e4b0fb0
    [  303.970871]         8e4b1520 8fec1800 00000043 802cd2a4 6e000045 00000043 00000000 8ef62000
    [  303.979219]         8eef5d00 8ef62b40 8fea7300 8f137918 00000000 00000000 0002bb01 793e5664
    [  303.987568]         8ef08884 00000001 8fea7300 00000002 8fc19e80 8eef5d00 00000006 00000003
    [  303.995934]         00000000 8030ba90 00000003 77ab3fd0 8149dc80 8004d1bc 8f15c000 8f383700
    [  304.004324]         ...
    [  304.006767] Call Trace:
    [  304.009241] [<802cc920>] skb_panic+0x58/0x5c
    [  304.013504] [<802cd2a4>] skb_push+0x78/0x90
    [  304.017783] [<8f137918>] 0x8f137918
    [  304.021269] Code: 00602825  0c02a3b4  24842888 <000c000d> 8c870060  8c8200a0  0007382b  00070336  8c88005c
    [  304.031034]
    [  304.032805] ---[ end trace b778c482b3f0bda9 ]---
    [  304.041384] Kernel panic - not syncing: Fatal exception in interrupt
    [  304.051975] Rebooting in 3 seconds..
    
    While the oops is for a 4.9-kernel, I was able to trigger the same oops with
    net-next as of yesterday.
    
    Fixes: 32f7adf633b9 ("net: qmi_wwan: support "raw IP" mode")
    Signed-off-by: Kristian Evensen <[email protected]>
    Acked-by: Bjørn Mork <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit 055db6957e4735b16cd2fa94a5bbfb754c9b8023
Author: Jay Vosburgh <[email protected]>
Date:   Tue Nov 7 19:50:07 2017 +0900

    bonding: fix slave stuck in BOND_LINK_FAIL state
    
    The bonding miimon logic has a flaw, in that a failure of the
    rtnl_trylock can cause a slave to become permanently stuck in
    BOND_LINK_FAIL state.
    
            The sequence of events to cause this is as follows:
    
            1) bond_miimon_inspect finds that a slave's link is down, and so
    calls bond_propose_link_state, setting slave->new_link_state to
    BOND_LINK_FAIL, then sets slave->new_link to BOND_LINK_DOWN and returns
    non-zero.
    
            2) In bond_mii_monitor, the rtnl_trylock fails, and the timer is
    rescheduled.  No change is committed.
    
            3) bond_miimon_inspect is called again, but this time the slave
    from step 1 has recovered.  slave->new_link is reset to NOCHANGE, and, as
    slave->link was never changed, the switch enters the BOND_LINK_UP case,
    and does nothing.  The pending BOND_LINK_FAIL state from step 1 remains
    pending, as new_link_state is not reset.
    
            4) The state from step 3 persists until another slave changes link
    state and causes bond_miimon_inspect to return non-zero.  At this point,
    the BOND_LINK_FAIL state change on the slave from steps 1-3 is committed,
    and the slave will remain stuck in BOND_LINK_FAIL state even though it
    is actually link up.
    
            The remedy for this is to initialize new_link_state on each entry
    to bond_miimon_inspect, as is already done with new_link.
    
    Fixes: fb9eb899a6dc ("bonding: handle link transition from FAIL to UP correctly")
    Reported-by: Alex Sidorenko <[email protected]>
    Reviewed-by: Jarod Wilson <[email protected]>
    Signed-off-by: Jay Vosburgh <[email protected]>
    Acked-by: Mahesh Bandewar <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit b7e732fa3171318418524b776b841b4024933b2b
Author: Bjorn Andersson <[email protected]>
Date:   Mon Nov 6 20:50:35 2017 -0800

    qrtr: Move to postcore_initcall
    
    Registering qrtr with module_init makes the ability of typical platform
    code to create AF_QIPCRTR socket during probe a matter of link order
    luck. Moving qrtr to postcore_initcall() avoids this.
    
    Signed-off-by: Bjorn Andersson <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit 7fd078337201cf7468f53c3d9ef81ff78cb6df3b
Author: Bjørn Mork <[email protected]>
Date:   Mon Nov 6 15:32:18 2017 +0100

    net: qmi_wwan: fix divide by 0 on bad descriptors
    
    A CDC Ethernet functional descriptor with wMaxSegmentSize = 0 will
    cause a divide error in usbnet_probe:
    
    divide error: 0000 [#1] PREEMPT SMP KASAN
    Modules linked in:
    CPU: 0 PID: 24 Comm: kworker/0:1 Not tainted 4.14.0-rc8-44453-g1fdc1a82c34f #56
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
    Workqueue: usb_hub_wq hub_event
    task: ffff88006bef5c00 task.stack: ffff88006bf60000
    RIP: 0010:usbnet_update_max_qlen+0x24d/0x390 drivers/net/usb/usbnet.c:355
    RSP: 0018:ffff88006bf67508 EFLAGS: 00010246
    RAX: 00000000000163c8 RBX: ffff8800621fce40 RCX: ffff8800621fcf34
    RDX: 0000000000000000 RSI: ffffffff837ecb7a RDI: ffff8800621fcf34
    RBP: ffff88006bf67520 R08: ffff88006bef5c00 R09: ffffed000c43f881
    R10: ffffed000c43f880 R11: ffff8800621fc406 R12: 0000000000000003
    R13: ffffffff85c71de0 R14: 0000000000000000 R15: 0000000000000000
    FS:  0000000000000000(0000) GS:ffff88006ca00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007ffe9c0d6dac CR3: 00000000614f4000 CR4: 00000000000006f0
    Call Trace:
     usbnet_probe+0x18b5/0x2790 drivers/net/usb/usbnet.c:1783
     qmi_wwan_probe+0x133/0x220 drivers/net/usb/qmi_wwan.c:1338
     usb_probe_interface+0x324/0x940 drivers/usb/core/driver.c:361
     really_probe drivers/base/dd.c:413
     driver_probe_device+0x522/0x740 drivers/base/dd.c:557
    
    Fix by simply ignoring the bogus descriptor, as it is optional
    for QMI devices anyway.
    
    Fixes: 423ce8caab7e ("net: usb: qmi_wwan: New driver for Huawei QMI based WWAN devices")
    Reported-by: Andrey Konovalov <[email protected]>
    Signed-off-by: Bjørn Mork <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit 2cb80187ba065d7decad7c6614e35e07aec8a974
Author: Bjørn Mork <[email protected]>
Date:   Mon Nov 6 15:37:22 2017 +0100

    net: cdc_ether: fix divide by 0 on bad descriptors
    
    Setting dev->hard_mtu to 0 will cause a divide error in
    usbnet_probe. Protect against devices with bogus CDC Ethernet
    functional descriptors by ignoring a zero wMaxSegmentSize.
    
    Signed-off-by: Bjørn Mork <[email protected]>
    Acked-by: Oliver Neukum <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit 38c53af853069adf87181684370d7b8866d6387b
Author: Paul Mackerras <[email protected]>
Date:   Wed Nov 8 14:44:04 2017 +1100

    KVM: PPC: Book3S HV: Fix exclusion between HPT resizing and other HPT updates
    
    Commit 5e9859699aba ("KVM: PPC: Book3S HV: Outline of KVM-HV HPT resizing
    implementation", 2016-12-20) added code that tries to exclude any use
    or update of the hashed page table (HPT) while the HPT resizing code
    is iterating through all the entries in the HPT.  It does this by
    taking the kvm->lock mutex, clearing the kvm->arch.hpte_setup_done
    flag and then sending an IPI to all CPUs in the host.  The idea is
    that any VCPU task that tries to enter the guest will see that the
    hpte_setup_done flag is clear and therefore call kvmppc_hv_setup_htab_rma,
    which also takes the kvm->lock mutex and will therefore block until
    we release kvm->lock.
    
    However, any VCPU that is already in the guest, or is handling a
    hypervisor page fault or hypercall, can re-enter the guest without
    rechecking the hpte_setup_done flag.  The IPI will cause a guest exit
    of any VCPUs that are currently in the guest, but does not prevent
    those VCPU tasks from immediately re-entering the guest.
    
    The result is that after resize_hpt_rehash_hpte() has made a HPTE
    absent, a hypervisor page fault can occur and make that HPTE present
    again.  This includes updating the rmap array for the guest real page,
    meaning that we now have a pointer in the rmap array which connects
    with pointers in the old rev array but not the new rev array.  In
    fact, if the HPT is being reduced in size, the pointer in the rmap
    array could point outside the bounds of the new rev array.  If that
    happens, we can get a host crash later on such as this one:
    
    [91652.628516] Unable to handle kernel paging request for data at address 0xd0000000157fb10c
    [91652.628668] Faulting instruction address: 0xc0000000000e2640
    [91652.628736] Oops: Kernel access of bad area, sig: 11 [#1]
    [91652.628789] LE SMP NR_CPUS=1024 NUMA PowerNV
    [91652.628847] Modules linked in: binfmt_misc vhost_net vhost tap xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables ses enclosure scsi_transport_sas i2c_opal ipmi_powernv ipmi_devintf i2c_core ipmi_msghandler powernv_op_panel nfsd auth_rpcgss oid_registry nfs_acl lockd grace sunrpc kvm_hv kvm_pr kvm scsi_dh_alua dm_service_time dm_multipath tg3 ptp pps_core [last unloaded: stap_552b612747aec2da355051e464fa72a1_14259]
    [91652.629566] CPU: 136 PID: 41315 Comm: CPU 21/KVM Tainted: G           O    4.14.0-1.rc4.dev.gitb27fc5c.el7.centos.ppc64le #1
    [91652.629684] task: c0000007a419e400 task.stack: c0000000028d8000
    [91652.629750] NIP:  c0000000000e2640 LR: d00000000c36e498 CTR: c0000000000e25f0
    [91652.629829] REGS: c0000000028db5d0 TRAP: 0300   Tainted: G           O     (4.14.0-1.rc4.dev.gitb27fc5c.el7.centos.ppc64le)
    [91652.629932] MSR:  900000010280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]>  CR: 44022422  XER: 00000000
    [91652.630034] CFAR: d00000000c373f84 DAR: d0000000157fb10c DSISR: 40000000 SOFTE: 1
    [91652.630034] GPR00: d00000000c36e498 c0000000028db850 c000000001403900 c0000007b7960000
    [91652.630034] GPR04: d0000000117fb100 d000000007ab00d8 000000000033bb10 0000000000000000
    [91652.630034] GPR08: fffffffffffffe7f 801001810073bb10 d00000000e440000 d00000000c373f70
    [91652.630034] GPR12: c0000000000e25f0 c00000000fdb9400 f000000003b24680 0000000000000000
    [91652.630034] GPR16: 00000000000004fb 00007ff7081a0000 00000000000ec91a 000000000033bb10
    [91652.630034] GPR20: 0000000000010000 00000000001b1190 0000000000000001 0000000000010000
    [91652.630034] GPR24: c0000007b7ab8038 d0000000117fb100 0000000ec91a1190 c000001e6a000000
    [91652.630034] GPR28: 00000000033bb100 000000000073bb10 c0000007b7960000 d0000000157fb100
    [91652.630735] NIP [c0000000000e2640] kvmppc_add_revmap_chain+0x50/0x120
    [91652.630806] LR [d00000000c36e498] kvmppc_book3s_hv_page_fault+0xbb8/0xc40 [kvm_hv]
    [91652.630884] Call Trace:
    [91652.630913] [c0000000028db850] [c0000000028db8b0] 0xc0000000028db8b0 (unreliable)
    [91652.630996] [c0000000028db8b0] [d00000000c36e498] kvmppc_book3s_hv_page_fault+0xbb8/0xc40 [kvm_hv]
    [91652.631091] [c0000000028db9e0] [d00000000c36a078] kvmppc_vcpu_run_hv+0xdf8/0x1300 [kvm_hv]
    [91652.631179] [c0000000028dbb30] [d00000000c2248c4] kvmppc_vcpu_run+0x34/0x50 [kvm]
    [91652.631266] [c0000000028dbb50] [d00000000c220d54] kvm_arch_vcpu_ioctl_run+0x114/0x2a0 [kvm]
    [91652.631351] [c0000000028dbbd0] [d00000000c2139d8] kvm_vcpu_ioctl+0x598/0x7a0 [kvm]
    [91652.631433] [c0000000028dbd40] [c0000000003832e0] do_vfs_ioctl+0xd0/0x8c0
    [91652.631501] [c0000000028dbde0] [c000000000383ba4] SyS_ioctl+0xd4/0x130
    [91652.631569] [c0000000028dbe30] [c00000000000b8e0] system_call+0x58/0x6c
    [91652.631635] Instruction dump:
    [91652.631676] fba1ffe8 fbc1fff0 fbe1fff8 f8010010 f821ffa1 2fa70000 793d0020 e9432110
    [91652.631814] 7bbf26e4 7c7e1b78 7feafa14 409e0094 <807f000c> 786326e4 7c6a1a14 93a40008
    [91652.631959] ---[ end trace ac85ba6db72e5b2e ]---
    
    To fix this, we tighten up the way that the hpte_setup_done flag is
    checked to ensure that it does provide the guarantee that the resizing
    code needs.  In kvmppc_run_core(), we check the hpte_setup_done flag
    after disabling interrupts and refuse to enter the guest if it is
    clear (for a HPT guest).  The code that checks hpte_setup_done and
    calls kvmppc_hv_setup_htab_rma() is moved from kvmppc_vcpu_run_hv()
    to a point inside the main loop in kvmppc_run_vcpu(), ensuring that
    we don't just spin endlessly calling kvmppc_run_core() while
    hpte_setup_done is clear, but instead have a chance to block on the
    kvm->lock mutex.
    
    Finally we also check hpte_setup_done inside the region in
    kvmppc_book3s_hv_page_fault() where the HPTE is locked and we are about
    to update the HPTE, and bail out if it is clear.  If another CPU is
    inside kvm_vm_ioctl_resize_hpt_commit) and has cleared hpte_setup_done,
    then we know that either we are looking at a HPTE
    that resize_hpt_rehash_hpte() has not yet processed, which is OK,
    or else we will see hpte_setup_done clear and refuse to update it,
    because of the full barrier formed by the unlock of the HPTE in
    resize_hpt_rehash_hpte() combined with the locking of the HPTE
    in kvmppc_book3s_hv_page_fault().
    
    Fixes: 5e9859699aba ("KVM: PPC: Book3S HV: Outline of KVM-HV HPT resizing implementation")
    Cc: [email protected] # v4.10+
    Reported-by: Satheesh Rajendran <[email protected]>
    Signed-off-by: Paul Mackerras <[email protected]>

commit b5f862180d7011d9575d0499fa37f0f25b423b12
Author: Hangbin Liu <[email protected]>
Date:   Mon Nov 6 09:01:57 2017 +0800

    bonding: discard lowest hash bit for 802.3ad layer3+4
    
    After commit 07f4c90062f8 ("tcp/dccp: try to not exhaust ip_local_port_range
    in connect()"), we will try to use even ports for connect(). Then if an
    application (seen clearly with iperf) opens multiple streams to the same
    destination IP and port, each stream will be given an even source port.
    
    So the bonding driver's simple xmit_hash_policy based on layer3+4 addressing
    will always hash all these streams to the same interface. And the total
    throughput will limited to a single slave.
    
    Change the tcp code will impact the whole tcp behavior, only for bonding
    usage. Paolo Abeni suggested fix this by changing the bonding code only,
    which should be more reasonable, and less impact.
    
    Fix this by discarding the lowest hash bit because it contains little entropy.
    After the fix we can re-balance between slaves.
    
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Hangbin Liu <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit 39a4b86f0de4ce5024985a56fc39b16194b04313
Author: Gustavo A. R. Silva <[email protected]>
Date:   Sat Nov 4 22:54:53 2017 -0500

    net/mlx5e/core/en_fs: fix pointer dereference after free in mlx5e_execute_l2_action
    
    hn is being kfree'd in mlx5e_del_l2_from_hash and then dereferenced
    by accessing hn->ai.addr
    
    Fix this by copying the MAC address into a local variable for its safe use
    in all possible execution paths within function mlx5e_execute_l2_action.
    
    Addresses-Coverity-ID: 1417789
    Fixes: eeb66cdb6826 ("net/mlx5: Separate between E-Switch and MPFS")
    Signed-off-by: Gustavo A. R. Silva <[email protected]>
    Acked-by: Saeed Mahameed <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit 13c249a94f525fe4c757d28854049780b25605c4
Author: Marc Zyngier <[email protected]>
Date:   Sat Nov 4 12:33:47 2017 +0000

    net: mvpp2: Prevent userspace from changing TX affinities
    
    The mvpp2 driver can't cope at all with the TX affinities being
    changed from userspace, and spit an endless stream of
    
    [   91.779920] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing
    [   91.779930] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing
    [   91.780402] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing
    [   91.780406] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing
    [   91.780415] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing
    [   91.780418] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing
    
    rendering the box completely useless (I've measured around 600k
    interrupts/s on a 8040 box) once irqbalance kicks in and start
    doing its job.
    
    Obviously, the driver was never designed with this in mind. So let's
    work around the problem by preventing userspace from interacting
    with these interrupts altogether.
    
    Signed-off-by: Marc Zyngier <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit e806fa6bad186d7d353509b5a909abcf3a1e968b
Author: Gabriele Paoloni <[email protected]>
Date:   Tue Nov 7 18:37:54 2017 -0600

    MAINTAINERS: Remove Gabriele Paoloni as HiSilicon PCI maintainer
    
    Gabriele is now moving to a different role, so remove him as HiSilicon PCI
    maintainer.
    
    Signed-off-by: Gabriele Paoloni <[email protected]>
    [bhelgaas: Thanks for all your help, Gabriele, and best wishes!]
    Signed-off-by: Bjorn Helgaas <[email protected]>
    Acked-by: Zhou Wang <[email protected]>

commit c18475a41a13504928fc57e423f42e264844c5ad
Author: Sebastian Andrzej Siewior <[email protected]>
Date:   Tue Nov 7 18:37:38 2017 -0600

    MAINTAINERS: Remove Stephen Bates as Microsemi Switchtec maintainer
    
    Just sent an email there and received an autoreply because he no longer
    works there.
    
    Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
    Signed-off-by: Bjorn Helgaas <[email protected]>

commit ea4b3afe1eac8f88bb453798a084fba47a1f155a
Author: Jaedon Shin <[email protected]>
Date:   Fri Jun 16 20:03:01 2017 +0900

    MIPS: BMIPS: Fix missing cbr address
    
    Fix NULL pointer access in BMIPS3300 RAC flush.
    
    Fixes: 738a3f79027b ("MIPS: BMIPS: Add early CPU initialization code")
    Signed-off-by: Jaedon Shin <[email protected]>
    Reviewed-by: Florian Fainelli <[email protected]>
    Cc: Kevin Cernekee <[email protected]>
    Cc: [email protected]
    Cc: <[email protected]> # 4.7+
    Patchwork: https://patchwork.linux-mips.org/patch/16423/
    Signed-off-by: James Hogan <[email protected]>

commit fbc3edf7d7731d7a22c483c679700589bab936a3
Author: Borislav Petkov <[email protected]>
Date:   Tue Nov 7 17:37:24 2017 +0100

    drivers/ide-cd: Handle missing driver data during status check gracefully
    
    The 0day bot reports the below failure which happens occasionally, with
    their randconfig testing (once every ~100 boots).  The Code points at
    the private pointer ->driver_data being NULL, which hints at a race of
    sorts where the private driver_data descriptor has disappeared by the
    time we get to run the workqueue.
    
    So let's check that pointer before we continue with issuing the command
    to the drive.
    
    This fix is of the brown paper bag nature but considering that IDE is
    long deprecated, let's do that so that random testing which happens to
    enable CONFIG_IDE during randconfig builds, doesn't fail because of
    this.
    
    Besides, failing the TEST_UNIT_READY command because the drive private
    data is gone is something which we could simply do anyway, to denote
    that there was a problem communicating with the device.
    
      BUG: unable to handle kernel NULL pointer dereference at 000001c0
      IP: cdrom_check_status
      *pde = 00000000
      Oops: 0000 [#1] SMP
      CPU: 1 PID: 155 Comm: kworker/1:2 Not tainted 4.14.0-rc8 #127
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
      Workqueue: events_freezable_power_ disk_events_workfn
      task: 4fe90980 task.stack: 507ac000
      EIP: cdrom_check_status+0x2c/0x90
      EFLAGS: 00210246 CPU: 1
      EAX: 00000000 EBX: 4fefec00 ECX: 00000000 EDX: 00000000
      ESI: 00000003 EDI: ffffffff EBP: 467a9340 ESP: 507aded0
       DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
      CR0: 80050033 CR2: 000001c0 CR3: 06e0f000 CR4: 00000690
      Call Trace:
       ? ide_cdrom_check_events_real
       ? cdrom_check_events
       ? disk_check_events
       ? process_one_work
       ? process_one_work
       ? worker_thread
       ? kthread
       ? process_one_work
       ? __kthread_create_on_node
       ? ret_from_fork
      Code: 53 83 ec 14 89 c3 89 d1 be 03 00 00 00 65 a1 14 00 00 00 89 44 24 10 31 c0 8b 43 18 c7 44 24 04 00 00 00 00 c7 04 24 00 00 00 00 <8a> 80 c0 01 00 00 c7 44 24 08 00 00 00 00 83 e0 03 c7 44 24 0c
      EIP: cdrom_check_status+0x2c/0x90 SS:ESP: 0068:507aded0
      CR2: 00000000000001c0
      ---[ end trace 2410e586dd8f88b2 ]---
    
    Reported-and-tested-by: Fengguang Wu <[email protected]>
    Signed-off-by: Borislav Petkov <[email protected]>
    Cc: "David S. Miller" <[email protected]>
    Cc: Jens Axboe <[email protected]>
    Cc: Bart Van Assche <[email protected]>
    Signed-off-by: Linus Torvalds <[email protected]>

commit a817e73fe693f0718b6210f4b959478877fb2e2f
Author: Linus Torvalds <[email protected]>
Date:   Tue Nov 7 09:04:32 2017 -0800

    Revert "scsi: make 'state' device attribute pollable"
    
    This reverts commit 8a97712e5314aefe16b3ffb4583a34deaa49de04.
    
    This commit added a call to sysfs_notify() from within
    scsi_device_set_state(), which in turn turns out to make libata very
    unhappy, because ata_eh_detach_dev() does
    
            spin_lock_irqsave(ap->lock, flags);
            ..
            if (ata_scsi_offline_dev(dev)) {
                    dev->flags |= ATA_DFLAG_DETACHED;
                    ap->pflags |= ATA_PFLAG_SCSI_HOTPLUG;
            }
    
    and ata_scsi_offline_dev() then does that scsi_device_set_state() to set
    it offline.
    
    So now we called sysfs_notify() from within a spinlocked region, which
    really doesn't work.  The 0day robot reported this as:
    
       BUG: sleeping function called from invalid context at kernel/locking/mutex.c:238
    
    because sysfs_notify() ends up calling kernfs_find_and_get_ns() which
    then does mutex_lock(&kernfs_mutex)..
    
    The pollability of the device state isn't critical, so revert this all
    for now, and maybe we'll do it differently in the future.
    
    Reported-by: Fengguang Wu <[email protected]>
    Acked-by: Tejun Heo <[email protected]>
    Acked-by: Martin K. Petersen <[email protected]>
    Acked-by: Hannes Reinecke <[email protected]>
    Signed-off-by: Linus Torvalds <[email protected]>

commit 132d358b183ac6ad8b3fea32ad5e0663456d18d1
Author: Takashi Iwai <[email protected]>
Date:   Tue Nov 7 16:05:24 2017 +0100

    ALSA: seq: Fix OSS sysex delivery in OSS emulation
    
    The SYSEX event delivery in OSS sequencer emulation assumed that the
    event is encoded in the variable-length data with the straight
    buffering.  This was the normal behavior in the past, but during the
    development, the chained buffers were introduced for carrying more
    data, while the OSS code was left intact.  As a result, when a SYSEX
    event with the chained buffer data is passed to OSS sequencer port,
    it may end up with the wrong memory access, as if it were having a too
    large buffer.
    
    This patch addresses the bug, by applying the buffer data expansion by
    the generic snd_seq_dump_var_event() helper function.
    
    Reported-by: syzbot <[email protected]>
    Reported-by: Mark Salyzyn <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Takashi Iwai <[email protected]>

commit 71630b7a832f699d6a6764ae75797e4e743ae348
Author: Rafael J. Wysocki <[email protected]>
Date:   Mon Nov 6 23:56:57 2017 +0100

    ACPI / PM: Blacklist Low Power S0 Idle _DSM for Dell XPS13 9360
    
    At least one Dell XPS13 9360 is reported to have serious issues with
    the Low Power S0 Idle _DSM interface and since this machine model
    generally can do ACPI S3 just fine, add a blacklist entry to disable
    that interface for Dell XPS13 9360.
    
    Fixes: 8110dd281e15 (ACPI / sleep: EC-based wakeup from suspend-to-idle on recent systems)
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=196907
    Reported-by: Paul Menzel <[email protected]>
    Tested-by: Paul Menzel <[email protected]>
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Cc: 4.13+ <[email protected]> # 4.13+

commit 136fc5c41f349296db1910677bb7402b0eeff376
Author: Tobin C. Harding <[email protected]>
Date:   Mon Nov 6 16:19:27 2017 +1100

    scripts: add leaking_addresses.pl
    
    Currently we are leaking addresses from the kernel to user space. This
    script is an attempt to find some of those leakages. Script parses
    `dmesg` output and /proc and /sys files for hex strings that look like
    kernel addresses.
    
    Only works for 64 bit kernels, the reason being that kernel addresses on
    64 bit kernels have 'ffff' as the leading bit pattern making greping
    possible. On 32 kernels we don't have this luxury.
    
    Scripts is _slightly_ smarter than a straight grep, we check for false
    positives (all 0's or all 1's, and vsyscall start/finish addresses).
    
    [ I think there is a lot of room for improvement here, but it's already
      useful, so I'm merging it as-is. The whole "hash %p format" series is
      expected to go into 4.15, but will not fix %x users, and will not
      incentivize people to look at what they are leaking.     - Linus ]
    
    Signed-off-by: Tobin C. Harding <[email protected]>
    Signed-off-by: Linus Torvalds <[email protected]>

commit 3510c7aa069aa83a2de6dab2b41401a198317bdc
Author: Takashi Iwai <[email protected]>
Date:   Mon Nov 6 20:16:50 2017 +0100

    ALSA: seq: Avoid invalid lockdep class warning
    
    The recent fix for adding rwsem nesting annotation was using the given
    "hop" argument as the lock subclass key.  Although the idea itself
    works, it may trigger a kernel warning like:
      BUG: looking up invalid subclass: 8
      ....
    since the lockdep has a smaller number of subclasses (8) than we
    currently allow for the hops there (10).
    
    The current definition is merely a sanity check for avoiding the too
    deep delivery paths, and the 8 hops are already enough.  So, as a
    quick fix, just follow the max hops as same as the max lockdep
    subclasses.
    
    Fixes: 1f20f9ff57ca ("ALSA: seq: Fix nested rwsem annotation for lockdep splat")
    Reported-by: syzbot <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Takashi Iwai <[email protected]>

commit b9dd05c7002ee0ca8b676428b2268c26399b5e31
Author: Mark Rutland <[email protected]>
Date:   Thu Nov 2 18:44:28 2017 +0100

    ARM: 8720/1: ensure dump_instr() checks addr_limit
    
    When CONFIG_DEBUG_USER is enabled, it's possible for a user to
    deliberately trigger dump_instr() with a chosen kernel address.
    
    Let's avoid problems resulting from this by using get_user() rather than
    __get_user(), ensuring that we don't erroneously access kernel memory.
    
    So that we can use the same code to dump user instructions and kernel
    instructions, the common dumping code is factored out to __dump_instr(),
    with the fs manipulated appropriately in dump_instr() around calls to
    this.
    
    Signed-off-by: Mark Rutland <[email protected]>
    Cc: [email protected]
    Signed-off-by: Russell King <[email protected]>

commit 9b7d869ee5a77ed4a462372bb89af622e705bfb8
Author: Takashi Iwai <[email protected]>
Date:   Sun Nov 5 10:07:43 2017 +0100

    ALSA: timer: Limit max instances per timer
    
    Currently we allow unlimited number of timer instances, and it may
    bring the system hogging way too much CPU when too many timer
    instances are o…
Noltari pushed a commit to Noltari/linux that referenced this pull request Nov 24, 2017
[ Upstream commit 8f56246 ]

When asix_suspend() is called dev->driver_priv might not have been
assigned a value, so we need to check that it's not NULL.

Similar issue is present in asix_resume(), this patch fixes it as well.

Found by syzkaller.

kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] PREEMPT SMP KASAN
Modules linked in:
CPU: 0 PID: 24 Comm: kworker/0:1 Not tainted 4.14.0-rc4-43422-geccacdd69a8c torvalds#400
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Workqueue: usb_hub_wq hub_event
task: ffff88006bb36300 task.stack: ffff88006bba8000
RIP: 0010:asix_suspend+0x76/0xc0 drivers/net/usb/asix_devices.c:629
RSP: 0018:ffff88006bbae718 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: ffff880061ba3b80 RCX: 1ffff1000c34d644
RDX: 0000000000000001 RSI: 0000000000000402 RDI: 0000000000000008
RBP: ffff88006bbae738 R08: 1ffff1000d775cad R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800630a8b40
R13: 0000000000000000 R14: 0000000000000402 R15: ffff880061ba3b80
FS:  0000000000000000(0000) GS:ffff88006c600000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ff33cf89000 CR3: 0000000061c0a000 CR4: 00000000000006f0
Call Trace:
 usb_suspend_interface drivers/usb/core/driver.c:1209
 usb_suspend_both+0x27f/0x7e0 drivers/usb/core/driver.c:1314
 usb_runtime_suspend+0x41/0x120 drivers/usb/core/driver.c:1852
 __rpm_callback+0x339/0xb60 drivers/base/power/runtime.c:334
 rpm_callback+0x106/0x220 drivers/base/power/runtime.c:461
 rpm_suspend+0x465/0x1980 drivers/base/power/runtime.c:596
 __pm_runtime_suspend+0x11e/0x230 drivers/base/power/runtime.c:1009
 pm_runtime_put_sync_autosuspend ./include/linux/pm_runtime.h:251
 usb_new_device+0xa37/0x1020 drivers/usb/core/hub.c:2487
 hub_port_connect drivers/usb/core/hub.c:4903
 hub_port_connect_change drivers/usb/core/hub.c:5009
 port_event drivers/usb/core/hub.c:5115
 hub_event+0x194d/0x3740 drivers/usb/core/hub.c:5195
 process_one_work+0xc7f/0x1db0 kernel/workqueue.c:2119
 worker_thread+0x221/0x1850 kernel/workqueue.c:2253
 kthread+0x3a1/0x470 kernel/kthread.c:231
 ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:431
Code: 8d 7c 24 20 48 89 fa 48 c1 ea 03 80 3c 02 00 75 5b 48 b8 00 00
00 00 00 fc ff df 4d 8b 6c 24 20 49 8d 7d 08 48 89 fa 48 c1 ea 03 <80>
3c 02 00 75 34 4d 8b 6d 08 4d 85 ed 74 0b e8 26 2b 51 fd 4c
RIP: asix_suspend+0x76/0xc0 RSP: ffff88006bbae718
---[ end trace dfc4f5649284342c ]---

Signed-off-by: Andrey Konovalov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
damentz pushed a commit to zen-kernel/zen-kernel that referenced this pull request Nov 24, 2017
[ Upstream commit 8f56246 ]

When asix_suspend() is called dev->driver_priv might not have been
assigned a value, so we need to check that it's not NULL.

Similar issue is present in asix_resume(), this patch fixes it as well.

Found by syzkaller.

kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] PREEMPT SMP KASAN
Modules linked in:
CPU: 0 PID: 24 Comm: kworker/0:1 Not tainted 4.14.0-rc4-43422-geccacdd69a8c torvalds#400
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Workqueue: usb_hub_wq hub_event
task: ffff88006bb36300 task.stack: ffff88006bba8000
RIP: 0010:asix_suspend+0x76/0xc0 drivers/net/usb/asix_devices.c:629
RSP: 0018:ffff88006bbae718 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: ffff880061ba3b80 RCX: 1ffff1000c34d644
RDX: 0000000000000001 RSI: 0000000000000402 RDI: 0000000000000008
RBP: ffff88006bbae738 R08: 1ffff1000d775cad R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800630a8b40
R13: 0000000000000000 R14: 0000000000000402 R15: ffff880061ba3b80
FS:  0000000000000000(0000) GS:ffff88006c600000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ff33cf89000 CR3: 0000000061c0a000 CR4: 00000000000006f0
Call Trace:
 usb_suspend_interface drivers/usb/core/driver.c:1209
 usb_suspend_both+0x27f/0x7e0 drivers/usb/core/driver.c:1314
 usb_runtime_suspend+0x41/0x120 drivers/usb/core/driver.c:1852
 __rpm_callback+0x339/0xb60 drivers/base/power/runtime.c:334
 rpm_callback+0x106/0x220 drivers/base/power/runtime.c:461
 rpm_suspend+0x465/0x1980 drivers/base/power/runtime.c:596
 __pm_runtime_suspend+0x11e/0x230 drivers/base/power/runtime.c:1009
 pm_runtime_put_sync_autosuspend ./include/linux/pm_runtime.h:251
 usb_new_device+0xa37/0x1020 drivers/usb/core/hub.c:2487
 hub_port_connect drivers/usb/core/hub.c:4903
 hub_port_connect_change drivers/usb/core/hub.c:5009
 port_event drivers/usb/core/hub.c:5115
 hub_event+0x194d/0x3740 drivers/usb/core/hub.c:5195
 process_one_work+0xc7f/0x1db0 kernel/workqueue.c:2119
 worker_thread+0x221/0x1850 kernel/workqueue.c:2253
 kthread+0x3a1/0x470 kernel/kthread.c:231
 ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:431
Code: 8d 7c 24 20 48 89 fa 48 c1 ea 03 80 3c 02 00 75 5b 48 b8 00 00
00 00 00 fc ff df 4d 8b 6c 24 20 49 8d 7d 08 48 89 fa 48 c1 ea 03 <80>
3c 02 00 75 34 4d 8b 6d 08 4d 85 ed 74 0b e8 26 2b 51 fd 4c
RIP: asix_suspend+0x76/0xc0 RSP: ffff88006bbae718
---[ end trace dfc4f5649284342c ]---

Signed-off-by: Andrey Konovalov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
torvalds pushed a commit that referenced this pull request Aug 1, 2018
When building the kernel as Thumb-2 with binutils 2.29 or newer, if the
assembler has seen the .type directive (via ENDPROC()) for a symbol, it
automatically handles the setting of the lowest bit when the symbol is
used with ADR.  The badr macro on the other hand handles this lowest bit
manually.  This leads to a jump to a wrong address in the wrong state
in the syscall return path:

 Internal error: Oops - undefined instruction: 0 [#2] SMP THUMB2
 Modules linked in:
 CPU: 0 PID: 652 Comm: modprobe Tainted: G      D           4.18.0-rc3+ #8
 PC is at ret_fast_syscall+0x4/0x62
 LR is at sys_brk+0x109/0x128
 pc : [<80101004>]    lr : [<801c8a35>]    psr: 60000013
 Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
 Control: 50c5387d  Table: 9e82006a  DAC: 00000051
 Process modprobe (pid: 652, stack limit = 0x(ptrval))

 80101000 <ret_fast_syscall>:
 80101000:       b672            cpsid   i
 80101002:       f8d9 2008       ldr.w   r2, [r9, #8]
 80101006:       f1b2 4ffe       cmp.w   r2, #2130706432 ; 0x7f000000

 80101184 <local_restart>:
 80101184:       f8d9 a000       ldr.w   sl, [r9]
 80101188:       e92d 0030       stmdb   sp!, {r4, r5}
 8010118c:       f01a 0ff0       tst.w   sl, #240        ; 0xf0
 80101190:       d117            bne.n   801011c2 <__sys_trace>
 80101192:       46ba            mov     sl, r7
 80101194:       f5ba 7fc8       cmp.w   sl, #400        ; 0x190
 80101198:       bf28            it      cs
 8010119a:       f04f 0a00       movcs.w sl, #0
 8010119e:       f3af 8014       nop.w   {20}
 801011a2:       f2af 1ea2       subw    lr, pc, #418    ; 0x1a2

To fix this, add a new symbol name which doesn't have ENDPROC used on it
and use that with badr.  We can't remove the badr usage since that would
would cause breakage with older binutils.

Signed-off-by: Vincent Whitchurch <[email protected]>
Signed-off-by: Russell King <[email protected]>
leitao pushed a commit to leitao/linux that referenced this pull request Aug 3, 2018
When building the kernel as Thumb-2 with binutils 2.29 or newer, if the
assembler has seen the .type directive (via ENDPROC()) for a symbol, it
automatically handles the setting of the lowest bit when the symbol is
used with ADR.  The badr macro on the other hand handles this lowest bit
manually.  This leads to a jump to a wrong address in the wrong state
in the syscall return path:

 Internal error: Oops - undefined instruction: 0 [#2] SMP THUMB2
 Modules linked in:
 CPU: 0 PID: 652 Comm: modprobe Tainted: G      D           4.18.0-rc3+ torvalds#8
 PC is at ret_fast_syscall+0x4/0x62
 LR is at sys_brk+0x109/0x128
 pc : [<80101004>]    lr : [<801c8a35>]    psr: 60000013
 Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
 Control: 50c5387d  Table: 9e82006a  DAC: 00000051
 Process modprobe (pid: 652, stack limit = 0x(ptrval))

 80101000 <ret_fast_syscall>:
 80101000:       b672            cpsid   i
 80101002:       f8d9 2008       ldr.w   r2, [r9, torvalds#8]
 80101006:       f1b2 4ffe       cmp.w   r2, #2130706432 ; 0x7f000000

 80101184 <local_restart>:
 80101184:       f8d9 a000       ldr.w   sl, [r9]
 80101188:       e92d 0030       stmdb   sp!, {r4, r5}
 8010118c:       f01a 0ff0       tst.w   sl, torvalds#240        ; 0xf0
 80101190:       d117            bne.n   801011c2 <__sys_trace>
 80101192:       46ba            mov     sl, r7
 80101194:       f5ba 7fc8       cmp.w   sl, torvalds#400        ; 0x190
 80101198:       bf28            it      cs
 8010119a:       f04f 0a00       movcs.w sl, #0
 8010119e:       f3af 8014       nop.w   {20}
 801011a2:       f2af 1ea2       subw    lr, pc, torvalds#418    ; 0x1a2

To fix this, add a new symbol name which doesn't have ENDPROC used on it
and use that with badr.  We can't remove the badr usage since that would
would cause breakage with older binutils.

Signed-off-by: Vincent Whitchurch <[email protected]>
Signed-off-by: Russell King <[email protected]>
Mic92 pushed a commit to Mic92/linux that referenced this pull request Feb 4, 2019
Noltari pushed a commit to Noltari/linux that referenced this pull request Mar 13, 2019
[ Upstream commit afc9f65 ]

When building the kernel as Thumb-2 with binutils 2.29 or newer, if the
assembler has seen the .type directive (via ENDPROC()) for a symbol, it
automatically handles the setting of the lowest bit when the symbol is
used with ADR.  The badr macro on the other hand handles this lowest bit
manually.  This leads to a jump to a wrong address in the wrong state
in the syscall return path:

 Internal error: Oops - undefined instruction: 0 [#2] SMP THUMB2
 Modules linked in:
 CPU: 0 PID: 652 Comm: modprobe Tainted: G      D           4.18.0-rc3+ torvalds#8
 PC is at ret_fast_syscall+0x4/0x62
 LR is at sys_brk+0x109/0x128
 pc : [<80101004>]    lr : [<801c8a35>]    psr: 60000013
 Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
 Control: 50c5387d  Table: 9e82006a  DAC: 00000051
 Process modprobe (pid: 652, stack limit = 0x(ptrval))

 80101000 <ret_fast_syscall>:
 80101000:       b672            cpsid   i
 80101002:       f8d9 2008       ldr.w   r2, [r9, torvalds#8]
 80101006:       f1b2 4ffe       cmp.w   r2, #2130706432 ; 0x7f000000

 80101184 <local_restart>:
 80101184:       f8d9 a000       ldr.w   sl, [r9]
 80101188:       e92d 0030       stmdb   sp!, {r4, r5}
 8010118c:       f01a 0ff0       tst.w   sl, torvalds#240        ; 0xf0
 80101190:       d117            bne.n   801011c2 <__sys_trace>
 80101192:       46ba            mov     sl, r7
 80101194:       f5ba 7fc8       cmp.w   sl, torvalds#400        ; 0x190
 80101198:       bf28            it      cs
 8010119a:       f04f 0a00       movcs.w sl, #0
 8010119e:       f3af 8014       nop.w   {20}
 801011a2:       f2af 1ea2       subw    lr, pc, torvalds#418    ; 0x1a2

To fix this, add a new symbol name which doesn't have ENDPROC used on it
and use that with badr.  We can't remove the badr usage since that would
would cause breakage with older binutils.

Signed-off-by: Vincent Whitchurch <[email protected]>
Signed-off-by: Russell King <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
repojohnray pushed a commit to repojohnray/linux-sunxi-4.7.y that referenced this pull request Mar 14, 2019
[ Upstream commit afc9f65 ]

When building the kernel as Thumb-2 with binutils 2.29 or newer, if the
assembler has seen the .type directive (via ENDPROC()) for a symbol, it
automatically handles the setting of the lowest bit when the symbol is
used with ADR.  The badr macro on the other hand handles this lowest bit
manually.  This leads to a jump to a wrong address in the wrong state
in the syscall return path:

 Internal error: Oops - undefined instruction: 0 [jwrdegoede#2] SMP THUMB2
 Modules linked in:
 CPU: 0 PID: 652 Comm: modprobe Tainted: G      D           4.18.0-rc3+ linux-sunxi#8
 PC is at ret_fast_syscall+0x4/0x62
 LR is at sys_brk+0x109/0x128
 pc : [<80101004>]    lr : [<801c8a35>]    psr: 60000013
 Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
 Control: 50c5387d  Table: 9e82006a  DAC: 00000051
 Process modprobe (pid: 652, stack limit = 0x(ptrval))

 80101000 <ret_fast_syscall>:
 80101000:       b672            cpsid   i
 80101002:       f8d9 2008       ldr.w   r2, [r9, linux-sunxi#8]
 80101006:       f1b2 4ffe       cmp.w   r2, #2130706432 ; 0x7f000000

 80101184 <local_restart>:
 80101184:       f8d9 a000       ldr.w   sl, [r9]
 80101188:       e92d 0030       stmdb   sp!, {r4, r5}
 8010118c:       f01a 0ff0       tst.w   sl, linux-sunxi#240        ; 0xf0
 80101190:       d117            bne.n   801011c2 <__sys_trace>
 80101192:       46ba            mov     sl, r7
 80101194:       f5ba 7fc8       cmp.w   sl, torvalds#400        ; 0x190
 80101198:       bf28            it      cs
 8010119a:       f04f 0a00       movcs.w sl, #0
 8010119e:       f3af 8014       nop.w   {20}
 801011a2:       f2af 1ea2       subw    lr, pc, torvalds#418    ; 0x1a2

To fix this, add a new symbol name which doesn't have ENDPROC used on it
and use that with badr.  We can't remove the badr usage since that would
would cause breakage with older binutils.

Signed-off-by: Vincent Whitchurch <[email protected]>
Signed-off-by: Russell King <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
tobetter pushed a commit to tobetter/linux that referenced this pull request Apr 17, 2019
[ Upstream commit afc9f65 ]

When building the kernel as Thumb-2 with binutils 2.29 or newer, if the
assembler has seen the .type directive (via ENDPROC()) for a symbol, it
automatically handles the setting of the lowest bit when the symbol is
used with ADR.  The badr macro on the other hand handles this lowest bit
manually.  This leads to a jump to a wrong address in the wrong state
in the syscall return path:

 Internal error: Oops - undefined instruction: 0 [#2] SMP THUMB2
 Modules linked in:
 CPU: 0 PID: 652 Comm: modprobe Tainted: G      D           4.18.0-rc3+ #8
 PC is at ret_fast_syscall+0x4/0x62
 LR is at sys_brk+0x109/0x128
 pc : [<80101004>]    lr : [<801c8a35>]    psr: 60000013
 Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
 Control: 50c5387d  Table: 9e82006a  DAC: 00000051
 Process modprobe (pid: 652, stack limit = 0x(ptrval))

 80101000 <ret_fast_syscall>:
 80101000:       b672            cpsid   i
 80101002:       f8d9 2008       ldr.w   r2, [r9, #8]
 80101006:       f1b2 4ffe       cmp.w   r2, #2130706432 ; 0x7f000000

 80101184 <local_restart>:
 80101184:       f8d9 a000       ldr.w   sl, [r9]
 80101188:       e92d 0030       stmdb   sp!, {r4, r5}
 8010118c:       f01a 0ff0       tst.w   sl, torvalds#240        ; 0xf0
 80101190:       d117            bne.n   801011c2 <__sys_trace>
 80101192:       46ba            mov     sl, r7
 80101194:       f5ba 7fc8       cmp.w   sl, torvalds#400        ; 0x190
 80101198:       bf28            it      cs
 8010119a:       f04f 0a00       movcs.w sl, #0
 8010119e:       f3af 8014       nop.w   {20}
 801011a2:       f2af 1ea2       subw    lr, pc, torvalds#418    ; 0x1a2

To fix this, add a new symbol name which doesn't have ENDPROC used on it
and use that with badr.  We can't remove the badr usage since that would
would cause breakage with older binutils.

Signed-off-by: Vincent Whitchurch <[email protected]>
Signed-off-by: Russell King <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
fengguang pushed a commit to 0day-ci/linux that referenced this pull request May 22, 2019
Because of both sides doing L2CAP disconnection at the same time, it
was possible to receive L2CAP Disconnection Response with CID that was
already freed. That caused problems if CID was already reused and L2CAP
Connection Request with same CID was sent out. Before this patch kernel
deleted channel context regardless of the state of the channel.

Example where leftover Disconnection Response (frame torvalds#402) causes local
device to delete L2CAP channel which was not yet connected. This in
turn confuses remote device's stack because same CID is re-used without
properly disconnecting.

Btmon capture before patch:
** snip **
> ACL Data RX: Handle 43 flags 0x02 dlen 8                torvalds#394 [hci1] 10.748949
      Channel: 65 len 4 [PSM 3 mode 0] {chan 2}
      RFCOMM: Disconnect (DISC) (0x43)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x53 poll/final 1
         Length: 0
         FCS: 0xfd
< ACL Data TX: Handle 43 flags 0x00 dlen 8                torvalds#395 [hci1] 10.749062
      Channel: 65 len 4 [PSM 3 mode 0] {chan 2}
      RFCOMM: Unnumbered Ack (UA) (0x63)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x73 poll/final 1
         Length: 0
         FCS: 0xd7
< ACL Data TX: Handle 43 flags 0x00 dlen 12               torvalds#396 [hci1] 10.749073
      L2CAP: Disconnection Request (0x06) ident 17 len 4
        Destination CID: 65
        Source CID: 65
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#397 [hci1] 10.752391
        Num handles: 1
        Handle: 43
        Count: 1
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#398 [hci1] 10.753394
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12               torvalds#399 [hci1] 10.756499
      L2CAP: Disconnection Request (0x06) ident 26 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12               torvalds#400 [hci1] 10.756548
      L2CAP: Disconnection Response (0x07) ident 26 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12               torvalds#401 [hci1] 10.757459
      L2CAP: Connection Request (0x02) ident 18 len 4
        PSM: 1 (0x0001)
        Source CID: 65
> ACL Data RX: Handle 43 flags 0x02 dlen 12               torvalds#402 [hci1] 10.759148
      L2CAP: Disconnection Response (0x07) ident 17 len 4
        Destination CID: 65
        Source CID: 65
= bluetoothd: 00:1E:AB:4C:56:54: error updating services: Input/o..   10.759447
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#403 [hci1] 10.759386
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12               torvalds#404 [hci1] 10.760397
      L2CAP: Connection Request (0x02) ident 27 len 4
        PSM: 3 (0x0003)
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 16               torvalds#405 [hci1] 10.760441
      L2CAP: Connection Response (0x03) ident 27 len 8
        Destination CID: 65
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
< ACL Data TX: Handle 43 flags 0x00 dlen 27               torvalds#406 [hci1] 10.760449
      L2CAP: Configure Request (0x04) ident 19 len 19
        Destination CID: 65
        Flags: 0x0000
        Option: Maximum Transmission Unit (0x01) [mandatory]
          MTU: 1013
        Option: Retransmission and Flow Control (0x04) [mandatory]
          Mode: Basic (0x00)
          TX window size: 0
          Max transmit: 0
          Retransmission timeout: 0
          Monitor timeout: 0
          Maximum PDU size: 0
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#407 [hci1] 10.761399
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 16               torvalds#408 [hci1] 10.762942
      L2CAP: Connection Response (0x03) ident 18 len 8
        Destination CID: 66
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
*snip*

Similar case after the patch:
*snip*
> ACL Data RX: Handle 43 flags 0x02 dlen 8            #22702 [hci0] 1664.411056
      Channel: 65 len 4 [PSM 3 mode 0] {chan 3}
      RFCOMM: Disconnect (DISC) (0x43)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x53 poll/final 1
         Length: 0
         FCS: 0xfd
< ACL Data TX: Handle 43 flags 0x00 dlen 8            #22703 [hci0] 1664.411136
      Channel: 65 len 4 [PSM 3 mode 0] {chan 3}
      RFCOMM: Unnumbered Ack (UA) (0x63)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x73 poll/final 1
         Length: 0
         FCS: 0xd7
< ACL Data TX: Handle 43 flags 0x00 dlen 12           #22704 [hci0] 1664.411143
      L2CAP: Disconnection Request (0x06) ident 11 len 4
        Destination CID: 65
        Source CID: 65
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22705 [hci0] 1664.414009
        Num handles: 1
        Handle: 43
        Count: 1
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22706 [hci0] 1664.415007
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12           #22707 [hci0] 1664.418674
      L2CAP: Disconnection Request (0x06) ident 17 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12           #22708 [hci0] 1664.418762
      L2CAP: Disconnection Response (0x07) ident 17 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12           #22709 [hci0] 1664.421073
      L2CAP: Connection Request (0x02) ident 12 len 4
        PSM: 1 (0x0001)
        Source CID: 65
> ACL Data RX: Handle 43 flags 0x02 dlen 12           #22710 [hci0] 1664.421371
      L2CAP: Disconnection Response (0x07) ident 11 len 4
        Destination CID: 65
        Source CID: 65
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22711 [hci0] 1664.424082
        Num handles: 1
        Handle: 43
        Count: 1
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22712 [hci0] 1664.425040
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12           #22713 [hci0] 1664.426103
      L2CAP: Connection Request (0x02) ident 18 len 4
        PSM: 3 (0x0003)
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 16           #22714 [hci0] 1664.426186
      L2CAP: Connection Response (0x03) ident 18 len 8
        Destination CID: 66
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
< ACL Data TX: Handle 43 flags 0x00 dlen 27           #22715 [hci0] 1664.426196
      L2CAP: Configure Request (0x04) ident 13 len 19
        Destination CID: 65
        Flags: 0x0000
        Option: Maximum Transmission Unit (0x01) [mandatory]
          MTU: 1013
        Option: Retransmission and Flow Control (0x04) [mandatory]
          Mode: Basic (0x00)
          TX window size: 0
          Max transmit: 0
          Retransmission timeout: 0
          Monitor timeout: 0
          Maximum PDU size: 0
> ACL Data RX: Handle 43 flags 0x02 dlen 16           #22716 [hci0] 1664.428804
      L2CAP: Connection Response (0x03) ident 12 len 8
        Destination CID: 66
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
*snip*

Fix is to check that channel is in state BT_DISCONN before deleting the
channel.

This bug was found while fuzzing Bluez's OBEX implementation using
Synopsys Defensics.

Reported-by: Matti Kamunen <[email protected]>
Reported-by: Ari Timonen <[email protected]>
Signed-off-by: Matias Karhumaa <[email protected]>
alaahl pushed a commit to alaahl/linux that referenced this pull request Jul 8, 2019
Because of both sides doing L2CAP disconnection at the same time, it
was possible to receive L2CAP Disconnection Response with CID that was
already freed. That caused problems if CID was already reused and L2CAP
Connection Request with same CID was sent out. Before this patch kernel
deleted channel context regardless of the state of the channel.

Example where leftover Disconnection Response (frame torvalds#402) causes local
device to delete L2CAP channel which was not yet connected. This in
turn confuses remote device's stack because same CID is re-used without
properly disconnecting.

Btmon capture before patch:
** snip **
> ACL Data RX: Handle 43 flags 0x02 dlen 8                torvalds#394 [hci1] 10.748949
      Channel: 65 len 4 [PSM 3 mode 0] {chan 2}
      RFCOMM: Disconnect (DISC) (0x43)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x53 poll/final 1
         Length: 0
         FCS: 0xfd
< ACL Data TX: Handle 43 flags 0x00 dlen 8                torvalds#395 [hci1] 10.749062
      Channel: 65 len 4 [PSM 3 mode 0] {chan 2}
      RFCOMM: Unnumbered Ack (UA) (0x63)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x73 poll/final 1
         Length: 0
         FCS: 0xd7
< ACL Data TX: Handle 43 flags 0x00 dlen 12               torvalds#396 [hci1] 10.749073
      L2CAP: Disconnection Request (0x06) ident 17 len 4
        Destination CID: 65
        Source CID: 65
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#397 [hci1] 10.752391
        Num handles: 1
        Handle: 43
        Count: 1
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#398 [hci1] 10.753394
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12               torvalds#399 [hci1] 10.756499
      L2CAP: Disconnection Request (0x06) ident 26 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12               torvalds#400 [hci1] 10.756548
      L2CAP: Disconnection Response (0x07) ident 26 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12               torvalds#401 [hci1] 10.757459
      L2CAP: Connection Request (0x02) ident 18 len 4
        PSM: 1 (0x0001)
        Source CID: 65
> ACL Data RX: Handle 43 flags 0x02 dlen 12               torvalds#402 [hci1] 10.759148
      L2CAP: Disconnection Response (0x07) ident 17 len 4
        Destination CID: 65
        Source CID: 65
= bluetoothd: 00:1E:AB:4C:56:54: error updating services: Input/o..   10.759447
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#403 [hci1] 10.759386
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12               torvalds#404 [hci1] 10.760397
      L2CAP: Connection Request (0x02) ident 27 len 4
        PSM: 3 (0x0003)
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 16               torvalds#405 [hci1] 10.760441
      L2CAP: Connection Response (0x03) ident 27 len 8
        Destination CID: 65
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
< ACL Data TX: Handle 43 flags 0x00 dlen 27               torvalds#406 [hci1] 10.760449
      L2CAP: Configure Request (0x04) ident 19 len 19
        Destination CID: 65
        Flags: 0x0000
        Option: Maximum Transmission Unit (0x01) [mandatory]
          MTU: 1013
        Option: Retransmission and Flow Control (0x04) [mandatory]
          Mode: Basic (0x00)
          TX window size: 0
          Max transmit: 0
          Retransmission timeout: 0
          Monitor timeout: 0
          Maximum PDU size: 0
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#407 [hci1] 10.761399
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 16               torvalds#408 [hci1] 10.762942
      L2CAP: Connection Response (0x03) ident 18 len 8
        Destination CID: 66
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
*snip*

Similar case after the patch:
*snip*
> ACL Data RX: Handle 43 flags 0x02 dlen 8            #22702 [hci0] 1664.411056
      Channel: 65 len 4 [PSM 3 mode 0] {chan 3}
      RFCOMM: Disconnect (DISC) (0x43)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x53 poll/final 1
         Length: 0
         FCS: 0xfd
< ACL Data TX: Handle 43 flags 0x00 dlen 8            #22703 [hci0] 1664.411136
      Channel: 65 len 4 [PSM 3 mode 0] {chan 3}
      RFCOMM: Unnumbered Ack (UA) (0x63)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x73 poll/final 1
         Length: 0
         FCS: 0xd7
< ACL Data TX: Handle 43 flags 0x00 dlen 12           #22704 [hci0] 1664.411143
      L2CAP: Disconnection Request (0x06) ident 11 len 4
        Destination CID: 65
        Source CID: 65
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22705 [hci0] 1664.414009
        Num handles: 1
        Handle: 43
        Count: 1
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22706 [hci0] 1664.415007
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12           #22707 [hci0] 1664.418674
      L2CAP: Disconnection Request (0x06) ident 17 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12           #22708 [hci0] 1664.418762
      L2CAP: Disconnection Response (0x07) ident 17 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12           #22709 [hci0] 1664.421073
      L2CAP: Connection Request (0x02) ident 12 len 4
        PSM: 1 (0x0001)
        Source CID: 65
> ACL Data RX: Handle 43 flags 0x02 dlen 12           #22710 [hci0] 1664.421371
      L2CAP: Disconnection Response (0x07) ident 11 len 4
        Destination CID: 65
        Source CID: 65
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22711 [hci0] 1664.424082
        Num handles: 1
        Handle: 43
        Count: 1
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22712 [hci0] 1664.425040
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12           #22713 [hci0] 1664.426103
      L2CAP: Connection Request (0x02) ident 18 len 4
        PSM: 3 (0x0003)
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 16           #22714 [hci0] 1664.426186
      L2CAP: Connection Response (0x03) ident 18 len 8
        Destination CID: 66
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
< ACL Data TX: Handle 43 flags 0x00 dlen 27           #22715 [hci0] 1664.426196
      L2CAP: Configure Request (0x04) ident 13 len 19
        Destination CID: 65
        Flags: 0x0000
        Option: Maximum Transmission Unit (0x01) [mandatory]
          MTU: 1013
        Option: Retransmission and Flow Control (0x04) [mandatory]
          Mode: Basic (0x00)
          TX window size: 0
          Max transmit: 0
          Retransmission timeout: 0
          Monitor timeout: 0
          Maximum PDU size: 0
> ACL Data RX: Handle 43 flags 0x02 dlen 16           #22716 [hci0] 1664.428804
      L2CAP: Connection Response (0x03) ident 12 len 8
        Destination CID: 66
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
*snip*

Fix is to check that channel is in state BT_DISCONN before deleting the
channel.

This bug was found while fuzzing Bluez's OBEX implementation using
Synopsys Defensics.

Reported-by: Matti Kamunen <[email protected]>
Reported-by: Ari Timonen <[email protected]>
Signed-off-by: Matias Karhumaa <[email protected]>
Signed-off-by: Marcel Holtmann <[email protected]>
mrchapp pushed a commit to mrchapp/linux that referenced this pull request Jul 24, 2019
[ Upstream commit 28261da ]

Because of both sides doing L2CAP disconnection at the same time, it
was possible to receive L2CAP Disconnection Response with CID that was
already freed. That caused problems if CID was already reused and L2CAP
Connection Request with same CID was sent out. Before this patch kernel
deleted channel context regardless of the state of the channel.

Example where leftover Disconnection Response (frame torvalds#402) causes local
device to delete L2CAP channel which was not yet connected. This in
turn confuses remote device's stack because same CID is re-used without
properly disconnecting.

Btmon capture before patch:
** snip **
> ACL Data RX: Handle 43 flags 0x02 dlen 8                torvalds#394 [hci1] 10.748949
      Channel: 65 len 4 [PSM 3 mode 0] {chan 2}
      RFCOMM: Disconnect (DISC) (0x43)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x53 poll/final 1
         Length: 0
         FCS: 0xfd
< ACL Data TX: Handle 43 flags 0x00 dlen 8                torvalds#395 [hci1] 10.749062
      Channel: 65 len 4 [PSM 3 mode 0] {chan 2}
      RFCOMM: Unnumbered Ack (UA) (0x63)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x73 poll/final 1
         Length: 0
         FCS: 0xd7
< ACL Data TX: Handle 43 flags 0x00 dlen 12               torvalds#396 [hci1] 10.749073
      L2CAP: Disconnection Request (0x06) ident 17 len 4
        Destination CID: 65
        Source CID: 65
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#397 [hci1] 10.752391
        Num handles: 1
        Handle: 43
        Count: 1
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#398 [hci1] 10.753394
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12               torvalds#399 [hci1] 10.756499
      L2CAP: Disconnection Request (0x06) ident 26 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12               torvalds#400 [hci1] 10.756548
      L2CAP: Disconnection Response (0x07) ident 26 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12               torvalds#401 [hci1] 10.757459
      L2CAP: Connection Request (0x02) ident 18 len 4
        PSM: 1 (0x0001)
        Source CID: 65
> ACL Data RX: Handle 43 flags 0x02 dlen 12               torvalds#402 [hci1] 10.759148
      L2CAP: Disconnection Response (0x07) ident 17 len 4
        Destination CID: 65
        Source CID: 65
= bluetoothd: 00:1E:AB:4C:56:54: error updating services: Input/o..   10.759447
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#403 [hci1] 10.759386
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12               torvalds#404 [hci1] 10.760397
      L2CAP: Connection Request (0x02) ident 27 len 4
        PSM: 3 (0x0003)
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 16               torvalds#405 [hci1] 10.760441
      L2CAP: Connection Response (0x03) ident 27 len 8
        Destination CID: 65
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
< ACL Data TX: Handle 43 flags 0x00 dlen 27               torvalds#406 [hci1] 10.760449
      L2CAP: Configure Request (0x04) ident 19 len 19
        Destination CID: 65
        Flags: 0x0000
        Option: Maximum Transmission Unit (0x01) [mandatory]
          MTU: 1013
        Option: Retransmission and Flow Control (0x04) [mandatory]
          Mode: Basic (0x00)
          TX window size: 0
          Max transmit: 0
          Retransmission timeout: 0
          Monitor timeout: 0
          Maximum PDU size: 0
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#407 [hci1] 10.761399
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 16               torvalds#408 [hci1] 10.762942
      L2CAP: Connection Response (0x03) ident 18 len 8
        Destination CID: 66
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
*snip*

Similar case after the patch:
*snip*
> ACL Data RX: Handle 43 flags 0x02 dlen 8            #22702 [hci0] 1664.411056
      Channel: 65 len 4 [PSM 3 mode 0] {chan 3}
      RFCOMM: Disconnect (DISC) (0x43)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x53 poll/final 1
         Length: 0
         FCS: 0xfd
< ACL Data TX: Handle 43 flags 0x00 dlen 8            #22703 [hci0] 1664.411136
      Channel: 65 len 4 [PSM 3 mode 0] {chan 3}
      RFCOMM: Unnumbered Ack (UA) (0x63)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x73 poll/final 1
         Length: 0
         FCS: 0xd7
< ACL Data TX: Handle 43 flags 0x00 dlen 12           #22704 [hci0] 1664.411143
      L2CAP: Disconnection Request (0x06) ident 11 len 4
        Destination CID: 65
        Source CID: 65
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22705 [hci0] 1664.414009
        Num handles: 1
        Handle: 43
        Count: 1
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22706 [hci0] 1664.415007
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12           #22707 [hci0] 1664.418674
      L2CAP: Disconnection Request (0x06) ident 17 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12           #22708 [hci0] 1664.418762
      L2CAP: Disconnection Response (0x07) ident 17 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12           #22709 [hci0] 1664.421073
      L2CAP: Connection Request (0x02) ident 12 len 4
        PSM: 1 (0x0001)
        Source CID: 65
> ACL Data RX: Handle 43 flags 0x02 dlen 12           #22710 [hci0] 1664.421371
      L2CAP: Disconnection Response (0x07) ident 11 len 4
        Destination CID: 65
        Source CID: 65
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22711 [hci0] 1664.424082
        Num handles: 1
        Handle: 43
        Count: 1
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22712 [hci0] 1664.425040
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12           #22713 [hci0] 1664.426103
      L2CAP: Connection Request (0x02) ident 18 len 4
        PSM: 3 (0x0003)
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 16           #22714 [hci0] 1664.426186
      L2CAP: Connection Response (0x03) ident 18 len 8
        Destination CID: 66
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
< ACL Data TX: Handle 43 flags 0x00 dlen 27           #22715 [hci0] 1664.426196
      L2CAP: Configure Request (0x04) ident 13 len 19
        Destination CID: 65
        Flags: 0x0000
        Option: Maximum Transmission Unit (0x01) [mandatory]
          MTU: 1013
        Option: Retransmission and Flow Control (0x04) [mandatory]
          Mode: Basic (0x00)
          TX window size: 0
          Max transmit: 0
          Retransmission timeout: 0
          Monitor timeout: 0
          Maximum PDU size: 0
> ACL Data RX: Handle 43 flags 0x02 dlen 16           #22716 [hci0] 1664.428804
      L2CAP: Connection Response (0x03) ident 12 len 8
        Destination CID: 66
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
*snip*

Fix is to check that channel is in state BT_DISCONN before deleting the
channel.

This bug was found while fuzzing Bluez's OBEX implementation using
Synopsys Defensics.

Reported-by: Matti Kamunen <[email protected]>
Reported-by: Ari Timonen <[email protected]>
Signed-off-by: Matias Karhumaa <[email protected]>
Signed-off-by: Marcel Holtmann <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
mrchapp pushed a commit to mrchapp/linux that referenced this pull request Jul 24, 2019
[ Upstream commit 28261da ]

Because of both sides doing L2CAP disconnection at the same time, it
was possible to receive L2CAP Disconnection Response with CID that was
already freed. That caused problems if CID was already reused and L2CAP
Connection Request with same CID was sent out. Before this patch kernel
deleted channel context regardless of the state of the channel.

Example where leftover Disconnection Response (frame torvalds#402) causes local
device to delete L2CAP channel which was not yet connected. This in
turn confuses remote device's stack because same CID is re-used without
properly disconnecting.

Btmon capture before patch:
** snip **
> ACL Data RX: Handle 43 flags 0x02 dlen 8                torvalds#394 [hci1] 10.748949
      Channel: 65 len 4 [PSM 3 mode 0] {chan 2}
      RFCOMM: Disconnect (DISC) (0x43)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x53 poll/final 1
         Length: 0
         FCS: 0xfd
< ACL Data TX: Handle 43 flags 0x00 dlen 8                torvalds#395 [hci1] 10.749062
      Channel: 65 len 4 [PSM 3 mode 0] {chan 2}
      RFCOMM: Unnumbered Ack (UA) (0x63)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x73 poll/final 1
         Length: 0
         FCS: 0xd7
< ACL Data TX: Handle 43 flags 0x00 dlen 12               torvalds#396 [hci1] 10.749073
      L2CAP: Disconnection Request (0x06) ident 17 len 4
        Destination CID: 65
        Source CID: 65
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#397 [hci1] 10.752391
        Num handles: 1
        Handle: 43
        Count: 1
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#398 [hci1] 10.753394
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12               torvalds#399 [hci1] 10.756499
      L2CAP: Disconnection Request (0x06) ident 26 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12               torvalds#400 [hci1] 10.756548
      L2CAP: Disconnection Response (0x07) ident 26 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12               torvalds#401 [hci1] 10.757459
      L2CAP: Connection Request (0x02) ident 18 len 4
        PSM: 1 (0x0001)
        Source CID: 65
> ACL Data RX: Handle 43 flags 0x02 dlen 12               torvalds#402 [hci1] 10.759148
      L2CAP: Disconnection Response (0x07) ident 17 len 4
        Destination CID: 65
        Source CID: 65
= bluetoothd: 00:1E:AB:4C:56:54: error updating services: Input/o..   10.759447
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#403 [hci1] 10.759386
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12               torvalds#404 [hci1] 10.760397
      L2CAP: Connection Request (0x02) ident 27 len 4
        PSM: 3 (0x0003)
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 16               torvalds#405 [hci1] 10.760441
      L2CAP: Connection Response (0x03) ident 27 len 8
        Destination CID: 65
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
< ACL Data TX: Handle 43 flags 0x00 dlen 27               torvalds#406 [hci1] 10.760449
      L2CAP: Configure Request (0x04) ident 19 len 19
        Destination CID: 65
        Flags: 0x0000
        Option: Maximum Transmission Unit (0x01) [mandatory]
          MTU: 1013
        Option: Retransmission and Flow Control (0x04) [mandatory]
          Mode: Basic (0x00)
          TX window size: 0
          Max transmit: 0
          Retransmission timeout: 0
          Monitor timeout: 0
          Maximum PDU size: 0
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#407 [hci1] 10.761399
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 16               torvalds#408 [hci1] 10.762942
      L2CAP: Connection Response (0x03) ident 18 len 8
        Destination CID: 66
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
*snip*

Similar case after the patch:
*snip*
> ACL Data RX: Handle 43 flags 0x02 dlen 8            #22702 [hci0] 1664.411056
      Channel: 65 len 4 [PSM 3 mode 0] {chan 3}
      RFCOMM: Disconnect (DISC) (0x43)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x53 poll/final 1
         Length: 0
         FCS: 0xfd
< ACL Data TX: Handle 43 flags 0x00 dlen 8            #22703 [hci0] 1664.411136
      Channel: 65 len 4 [PSM 3 mode 0] {chan 3}
      RFCOMM: Unnumbered Ack (UA) (0x63)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x73 poll/final 1
         Length: 0
         FCS: 0xd7
< ACL Data TX: Handle 43 flags 0x00 dlen 12           #22704 [hci0] 1664.411143
      L2CAP: Disconnection Request (0x06) ident 11 len 4
        Destination CID: 65
        Source CID: 65
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22705 [hci0] 1664.414009
        Num handles: 1
        Handle: 43
        Count: 1
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22706 [hci0] 1664.415007
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12           #22707 [hci0] 1664.418674
      L2CAP: Disconnection Request (0x06) ident 17 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12           #22708 [hci0] 1664.418762
      L2CAP: Disconnection Response (0x07) ident 17 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12           #22709 [hci0] 1664.421073
      L2CAP: Connection Request (0x02) ident 12 len 4
        PSM: 1 (0x0001)
        Source CID: 65
> ACL Data RX: Handle 43 flags 0x02 dlen 12           #22710 [hci0] 1664.421371
      L2CAP: Disconnection Response (0x07) ident 11 len 4
        Destination CID: 65
        Source CID: 65
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22711 [hci0] 1664.424082
        Num handles: 1
        Handle: 43
        Count: 1
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22712 [hci0] 1664.425040
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12           #22713 [hci0] 1664.426103
      L2CAP: Connection Request (0x02) ident 18 len 4
        PSM: 3 (0x0003)
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 16           #22714 [hci0] 1664.426186
      L2CAP: Connection Response (0x03) ident 18 len 8
        Destination CID: 66
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
< ACL Data TX: Handle 43 flags 0x00 dlen 27           #22715 [hci0] 1664.426196
      L2CAP: Configure Request (0x04) ident 13 len 19
        Destination CID: 65
        Flags: 0x0000
        Option: Maximum Transmission Unit (0x01) [mandatory]
          MTU: 1013
        Option: Retransmission and Flow Control (0x04) [mandatory]
          Mode: Basic (0x00)
          TX window size: 0
          Max transmit: 0
          Retransmission timeout: 0
          Monitor timeout: 0
          Maximum PDU size: 0
> ACL Data RX: Handle 43 flags 0x02 dlen 16           #22716 [hci0] 1664.428804
      L2CAP: Connection Response (0x03) ident 12 len 8
        Destination CID: 66
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
*snip*

Fix is to check that channel is in state BT_DISCONN before deleting the
channel.

This bug was found while fuzzing Bluez's OBEX implementation using
Synopsys Defensics.

Reported-by: Matti Kamunen <[email protected]>
Reported-by: Ari Timonen <[email protected]>
Signed-off-by: Matias Karhumaa <[email protected]>
Signed-off-by: Marcel Holtmann <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
mrchapp pushed a commit to mrchapp/linux that referenced this pull request Jul 30, 2019
[ Upstream commit 28261da ]

Because of both sides doing L2CAP disconnection at the same time, it
was possible to receive L2CAP Disconnection Response with CID that was
already freed. That caused problems if CID was already reused and L2CAP
Connection Request with same CID was sent out. Before this patch kernel
deleted channel context regardless of the state of the channel.

Example where leftover Disconnection Response (frame torvalds#402) causes local
device to delete L2CAP channel which was not yet connected. This in
turn confuses remote device's stack because same CID is re-used without
properly disconnecting.

Btmon capture before patch:
** snip **
> ACL Data RX: Handle 43 flags 0x02 dlen 8                torvalds#394 [hci1] 10.748949
      Channel: 65 len 4 [PSM 3 mode 0] {chan 2}
      RFCOMM: Disconnect (DISC) (0x43)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x53 poll/final 1
         Length: 0
         FCS: 0xfd
< ACL Data TX: Handle 43 flags 0x00 dlen 8                torvalds#395 [hci1] 10.749062
      Channel: 65 len 4 [PSM 3 mode 0] {chan 2}
      RFCOMM: Unnumbered Ack (UA) (0x63)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x73 poll/final 1
         Length: 0
         FCS: 0xd7
< ACL Data TX: Handle 43 flags 0x00 dlen 12               torvalds#396 [hci1] 10.749073
      L2CAP: Disconnection Request (0x06) ident 17 len 4
        Destination CID: 65
        Source CID: 65
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#397 [hci1] 10.752391
        Num handles: 1
        Handle: 43
        Count: 1
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#398 [hci1] 10.753394
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12               torvalds#399 [hci1] 10.756499
      L2CAP: Disconnection Request (0x06) ident 26 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12               torvalds#400 [hci1] 10.756548
      L2CAP: Disconnection Response (0x07) ident 26 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12               torvalds#401 [hci1] 10.757459
      L2CAP: Connection Request (0x02) ident 18 len 4
        PSM: 1 (0x0001)
        Source CID: 65
> ACL Data RX: Handle 43 flags 0x02 dlen 12               torvalds#402 [hci1] 10.759148
      L2CAP: Disconnection Response (0x07) ident 17 len 4
        Destination CID: 65
        Source CID: 65
= bluetoothd: 00:1E:AB:4C:56:54: error updating services: Input/o..   10.759447
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#403 [hci1] 10.759386
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12               torvalds#404 [hci1] 10.760397
      L2CAP: Connection Request (0x02) ident 27 len 4
        PSM: 3 (0x0003)
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 16               torvalds#405 [hci1] 10.760441
      L2CAP: Connection Response (0x03) ident 27 len 8
        Destination CID: 65
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
< ACL Data TX: Handle 43 flags 0x00 dlen 27               torvalds#406 [hci1] 10.760449
      L2CAP: Configure Request (0x04) ident 19 len 19
        Destination CID: 65
        Flags: 0x0000
        Option: Maximum Transmission Unit (0x01) [mandatory]
          MTU: 1013
        Option: Retransmission and Flow Control (0x04) [mandatory]
          Mode: Basic (0x00)
          TX window size: 0
          Max transmit: 0
          Retransmission timeout: 0
          Monitor timeout: 0
          Maximum PDU size: 0
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#407 [hci1] 10.761399
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 16               torvalds#408 [hci1] 10.762942
      L2CAP: Connection Response (0x03) ident 18 len 8
        Destination CID: 66
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
*snip*

Similar case after the patch:
*snip*
> ACL Data RX: Handle 43 flags 0x02 dlen 8            #22702 [hci0] 1664.411056
      Channel: 65 len 4 [PSM 3 mode 0] {chan 3}
      RFCOMM: Disconnect (DISC) (0x43)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x53 poll/final 1
         Length: 0
         FCS: 0xfd
< ACL Data TX: Handle 43 flags 0x00 dlen 8            #22703 [hci0] 1664.411136
      Channel: 65 len 4 [PSM 3 mode 0] {chan 3}
      RFCOMM: Unnumbered Ack (UA) (0x63)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x73 poll/final 1
         Length: 0
         FCS: 0xd7
< ACL Data TX: Handle 43 flags 0x00 dlen 12           #22704 [hci0] 1664.411143
      L2CAP: Disconnection Request (0x06) ident 11 len 4
        Destination CID: 65
        Source CID: 65
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22705 [hci0] 1664.414009
        Num handles: 1
        Handle: 43
        Count: 1
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22706 [hci0] 1664.415007
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12           #22707 [hci0] 1664.418674
      L2CAP: Disconnection Request (0x06) ident 17 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12           #22708 [hci0] 1664.418762
      L2CAP: Disconnection Response (0x07) ident 17 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12           #22709 [hci0] 1664.421073
      L2CAP: Connection Request (0x02) ident 12 len 4
        PSM: 1 (0x0001)
        Source CID: 65
> ACL Data RX: Handle 43 flags 0x02 dlen 12           #22710 [hci0] 1664.421371
      L2CAP: Disconnection Response (0x07) ident 11 len 4
        Destination CID: 65
        Source CID: 65
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22711 [hci0] 1664.424082
        Num handles: 1
        Handle: 43
        Count: 1
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22712 [hci0] 1664.425040
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12           #22713 [hci0] 1664.426103
      L2CAP: Connection Request (0x02) ident 18 len 4
        PSM: 3 (0x0003)
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 16           #22714 [hci0] 1664.426186
      L2CAP: Connection Response (0x03) ident 18 len 8
        Destination CID: 66
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
< ACL Data TX: Handle 43 flags 0x00 dlen 27           #22715 [hci0] 1664.426196
      L2CAP: Configure Request (0x04) ident 13 len 19
        Destination CID: 65
        Flags: 0x0000
        Option: Maximum Transmission Unit (0x01) [mandatory]
          MTU: 1013
        Option: Retransmission and Flow Control (0x04) [mandatory]
          Mode: Basic (0x00)
          TX window size: 0
          Max transmit: 0
          Retransmission timeout: 0
          Monitor timeout: 0
          Maximum PDU size: 0
> ACL Data RX: Handle 43 flags 0x02 dlen 16           #22716 [hci0] 1664.428804
      L2CAP: Connection Response (0x03) ident 12 len 8
        Destination CID: 66
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
*snip*

Fix is to check that channel is in state BT_DISCONN before deleting the
channel.

This bug was found while fuzzing Bluez's OBEX implementation using
Synopsys Defensics.

Reported-by: Matti Kamunen <[email protected]>
Reported-by: Ari Timonen <[email protected]>
Signed-off-by: Matias Karhumaa <[email protected]>
Signed-off-by: Marcel Holtmann <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
mrchapp pushed a commit to mrchapp/linux that referenced this pull request Jul 31, 2019
[ Upstream commit 28261da ]

Because of both sides doing L2CAP disconnection at the same time, it
was possible to receive L2CAP Disconnection Response with CID that was
already freed. That caused problems if CID was already reused and L2CAP
Connection Request with same CID was sent out. Before this patch kernel
deleted channel context regardless of the state of the channel.

Example where leftover Disconnection Response (frame torvalds#402) causes local
device to delete L2CAP channel which was not yet connected. This in
turn confuses remote device's stack because same CID is re-used without
properly disconnecting.

Btmon capture before patch:
** snip **
> ACL Data RX: Handle 43 flags 0x02 dlen 8                torvalds#394 [hci1] 10.748949
      Channel: 65 len 4 [PSM 3 mode 0] {chan 2}
      RFCOMM: Disconnect (DISC) (0x43)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x53 poll/final 1
         Length: 0
         FCS: 0xfd
< ACL Data TX: Handle 43 flags 0x00 dlen 8                torvalds#395 [hci1] 10.749062
      Channel: 65 len 4 [PSM 3 mode 0] {chan 2}
      RFCOMM: Unnumbered Ack (UA) (0x63)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x73 poll/final 1
         Length: 0
         FCS: 0xd7
< ACL Data TX: Handle 43 flags 0x00 dlen 12               torvalds#396 [hci1] 10.749073
      L2CAP: Disconnection Request (0x06) ident 17 len 4
        Destination CID: 65
        Source CID: 65
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#397 [hci1] 10.752391
        Num handles: 1
        Handle: 43
        Count: 1
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#398 [hci1] 10.753394
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12               torvalds#399 [hci1] 10.756499
      L2CAP: Disconnection Request (0x06) ident 26 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12               torvalds#400 [hci1] 10.756548
      L2CAP: Disconnection Response (0x07) ident 26 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12               torvalds#401 [hci1] 10.757459
      L2CAP: Connection Request (0x02) ident 18 len 4
        PSM: 1 (0x0001)
        Source CID: 65
> ACL Data RX: Handle 43 flags 0x02 dlen 12               torvalds#402 [hci1] 10.759148
      L2CAP: Disconnection Response (0x07) ident 17 len 4
        Destination CID: 65
        Source CID: 65
= bluetoothd: 00:1E:AB:4C:56:54: error updating services: Input/o..   10.759447
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#403 [hci1] 10.759386
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12               torvalds#404 [hci1] 10.760397
      L2CAP: Connection Request (0x02) ident 27 len 4
        PSM: 3 (0x0003)
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 16               torvalds#405 [hci1] 10.760441
      L2CAP: Connection Response (0x03) ident 27 len 8
        Destination CID: 65
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
< ACL Data TX: Handle 43 flags 0x00 dlen 27               torvalds#406 [hci1] 10.760449
      L2CAP: Configure Request (0x04) ident 19 len 19
        Destination CID: 65
        Flags: 0x0000
        Option: Maximum Transmission Unit (0x01) [mandatory]
          MTU: 1013
        Option: Retransmission and Flow Control (0x04) [mandatory]
          Mode: Basic (0x00)
          TX window size: 0
          Max transmit: 0
          Retransmission timeout: 0
          Monitor timeout: 0
          Maximum PDU size: 0
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#407 [hci1] 10.761399
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 16               torvalds#408 [hci1] 10.762942
      L2CAP: Connection Response (0x03) ident 18 len 8
        Destination CID: 66
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
*snip*

Similar case after the patch:
*snip*
> ACL Data RX: Handle 43 flags 0x02 dlen 8            #22702 [hci0] 1664.411056
      Channel: 65 len 4 [PSM 3 mode 0] {chan 3}
      RFCOMM: Disconnect (DISC) (0x43)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x53 poll/final 1
         Length: 0
         FCS: 0xfd
< ACL Data TX: Handle 43 flags 0x00 dlen 8            #22703 [hci0] 1664.411136
      Channel: 65 len 4 [PSM 3 mode 0] {chan 3}
      RFCOMM: Unnumbered Ack (UA) (0x63)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x73 poll/final 1
         Length: 0
         FCS: 0xd7
< ACL Data TX: Handle 43 flags 0x00 dlen 12           #22704 [hci0] 1664.411143
      L2CAP: Disconnection Request (0x06) ident 11 len 4
        Destination CID: 65
        Source CID: 65
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22705 [hci0] 1664.414009
        Num handles: 1
        Handle: 43
        Count: 1
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22706 [hci0] 1664.415007
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12           #22707 [hci0] 1664.418674
      L2CAP: Disconnection Request (0x06) ident 17 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12           #22708 [hci0] 1664.418762
      L2CAP: Disconnection Response (0x07) ident 17 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12           #22709 [hci0] 1664.421073
      L2CAP: Connection Request (0x02) ident 12 len 4
        PSM: 1 (0x0001)
        Source CID: 65
> ACL Data RX: Handle 43 flags 0x02 dlen 12           #22710 [hci0] 1664.421371
      L2CAP: Disconnection Response (0x07) ident 11 len 4
        Destination CID: 65
        Source CID: 65
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22711 [hci0] 1664.424082
        Num handles: 1
        Handle: 43
        Count: 1
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22712 [hci0] 1664.425040
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12           #22713 [hci0] 1664.426103
      L2CAP: Connection Request (0x02) ident 18 len 4
        PSM: 3 (0x0003)
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 16           #22714 [hci0] 1664.426186
      L2CAP: Connection Response (0x03) ident 18 len 8
        Destination CID: 66
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
< ACL Data TX: Handle 43 flags 0x00 dlen 27           #22715 [hci0] 1664.426196
      L2CAP: Configure Request (0x04) ident 13 len 19
        Destination CID: 65
        Flags: 0x0000
        Option: Maximum Transmission Unit (0x01) [mandatory]
          MTU: 1013
        Option: Retransmission and Flow Control (0x04) [mandatory]
          Mode: Basic (0x00)
          TX window size: 0
          Max transmit: 0
          Retransmission timeout: 0
          Monitor timeout: 0
          Maximum PDU size: 0
> ACL Data RX: Handle 43 flags 0x02 dlen 16           #22716 [hci0] 1664.428804
      L2CAP: Connection Response (0x03) ident 12 len 8
        Destination CID: 66
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
*snip*

Fix is to check that channel is in state BT_DISCONN before deleting the
channel.

This bug was found while fuzzing Bluez's OBEX implementation using
Synopsys Defensics.

Reported-by: Matti Kamunen <[email protected]>
Reported-by: Ari Timonen <[email protected]>
Signed-off-by: Matias Karhumaa <[email protected]>
Signed-off-by: Marcel Holtmann <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
mrchapp pushed a commit to mrchapp/linux that referenced this pull request Aug 1, 2019
[ Upstream commit 28261da ]

Because of both sides doing L2CAP disconnection at the same time, it
was possible to receive L2CAP Disconnection Response with CID that was
already freed. That caused problems if CID was already reused and L2CAP
Connection Request with same CID was sent out. Before this patch kernel
deleted channel context regardless of the state of the channel.

Example where leftover Disconnection Response (frame torvalds#402) causes local
device to delete L2CAP channel which was not yet connected. This in
turn confuses remote device's stack because same CID is re-used without
properly disconnecting.

Btmon capture before patch:
** snip **
> ACL Data RX: Handle 43 flags 0x02 dlen 8                torvalds#394 [hci1] 10.748949
      Channel: 65 len 4 [PSM 3 mode 0] {chan 2}
      RFCOMM: Disconnect (DISC) (0x43)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x53 poll/final 1
         Length: 0
         FCS: 0xfd
< ACL Data TX: Handle 43 flags 0x00 dlen 8                torvalds#395 [hci1] 10.749062
      Channel: 65 len 4 [PSM 3 mode 0] {chan 2}
      RFCOMM: Unnumbered Ack (UA) (0x63)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x73 poll/final 1
         Length: 0
         FCS: 0xd7
< ACL Data TX: Handle 43 flags 0x00 dlen 12               torvalds#396 [hci1] 10.749073
      L2CAP: Disconnection Request (0x06) ident 17 len 4
        Destination CID: 65
        Source CID: 65
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#397 [hci1] 10.752391
        Num handles: 1
        Handle: 43
        Count: 1
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#398 [hci1] 10.753394
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12               torvalds#399 [hci1] 10.756499
      L2CAP: Disconnection Request (0x06) ident 26 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12               torvalds#400 [hci1] 10.756548
      L2CAP: Disconnection Response (0x07) ident 26 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12               torvalds#401 [hci1] 10.757459
      L2CAP: Connection Request (0x02) ident 18 len 4
        PSM: 1 (0x0001)
        Source CID: 65
> ACL Data RX: Handle 43 flags 0x02 dlen 12               torvalds#402 [hci1] 10.759148
      L2CAP: Disconnection Response (0x07) ident 17 len 4
        Destination CID: 65
        Source CID: 65
= bluetoothd: 00:1E:AB:4C:56:54: error updating services: Input/o..   10.759447
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#403 [hci1] 10.759386
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12               torvalds#404 [hci1] 10.760397
      L2CAP: Connection Request (0x02) ident 27 len 4
        PSM: 3 (0x0003)
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 16               torvalds#405 [hci1] 10.760441
      L2CAP: Connection Response (0x03) ident 27 len 8
        Destination CID: 65
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
< ACL Data TX: Handle 43 flags 0x00 dlen 27               torvalds#406 [hci1] 10.760449
      L2CAP: Configure Request (0x04) ident 19 len 19
        Destination CID: 65
        Flags: 0x0000
        Option: Maximum Transmission Unit (0x01) [mandatory]
          MTU: 1013
        Option: Retransmission and Flow Control (0x04) [mandatory]
          Mode: Basic (0x00)
          TX window size: 0
          Max transmit: 0
          Retransmission timeout: 0
          Monitor timeout: 0
          Maximum PDU size: 0
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#407 [hci1] 10.761399
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 16               torvalds#408 [hci1] 10.762942
      L2CAP: Connection Response (0x03) ident 18 len 8
        Destination CID: 66
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
*snip*

Similar case after the patch:
*snip*
> ACL Data RX: Handle 43 flags 0x02 dlen 8            #22702 [hci0] 1664.411056
      Channel: 65 len 4 [PSM 3 mode 0] {chan 3}
      RFCOMM: Disconnect (DISC) (0x43)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x53 poll/final 1
         Length: 0
         FCS: 0xfd
< ACL Data TX: Handle 43 flags 0x00 dlen 8            #22703 [hci0] 1664.411136
      Channel: 65 len 4 [PSM 3 mode 0] {chan 3}
      RFCOMM: Unnumbered Ack (UA) (0x63)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x73 poll/final 1
         Length: 0
         FCS: 0xd7
< ACL Data TX: Handle 43 flags 0x00 dlen 12           #22704 [hci0] 1664.411143
      L2CAP: Disconnection Request (0x06) ident 11 len 4
        Destination CID: 65
        Source CID: 65
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22705 [hci0] 1664.414009
        Num handles: 1
        Handle: 43
        Count: 1
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22706 [hci0] 1664.415007
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12           #22707 [hci0] 1664.418674
      L2CAP: Disconnection Request (0x06) ident 17 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12           #22708 [hci0] 1664.418762
      L2CAP: Disconnection Response (0x07) ident 17 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12           #22709 [hci0] 1664.421073
      L2CAP: Connection Request (0x02) ident 12 len 4
        PSM: 1 (0x0001)
        Source CID: 65
> ACL Data RX: Handle 43 flags 0x02 dlen 12           #22710 [hci0] 1664.421371
      L2CAP: Disconnection Response (0x07) ident 11 len 4
        Destination CID: 65
        Source CID: 65
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22711 [hci0] 1664.424082
        Num handles: 1
        Handle: 43
        Count: 1
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22712 [hci0] 1664.425040
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12           #22713 [hci0] 1664.426103
      L2CAP: Connection Request (0x02) ident 18 len 4
        PSM: 3 (0x0003)
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 16           #22714 [hci0] 1664.426186
      L2CAP: Connection Response (0x03) ident 18 len 8
        Destination CID: 66
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
< ACL Data TX: Handle 43 flags 0x00 dlen 27           #22715 [hci0] 1664.426196
      L2CAP: Configure Request (0x04) ident 13 len 19
        Destination CID: 65
        Flags: 0x0000
        Option: Maximum Transmission Unit (0x01) [mandatory]
          MTU: 1013
        Option: Retransmission and Flow Control (0x04) [mandatory]
          Mode: Basic (0x00)
          TX window size: 0
          Max transmit: 0
          Retransmission timeout: 0
          Monitor timeout: 0
          Maximum PDU size: 0
> ACL Data RX: Handle 43 flags 0x02 dlen 16           #22716 [hci0] 1664.428804
      L2CAP: Connection Response (0x03) ident 12 len 8
        Destination CID: 66
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
*snip*

Fix is to check that channel is in state BT_DISCONN before deleting the
channel.

This bug was found while fuzzing Bluez's OBEX implementation using
Synopsys Defensics.

Reported-by: Matti Kamunen <[email protected]>
Reported-by: Ari Timonen <[email protected]>
Signed-off-by: Matias Karhumaa <[email protected]>
Signed-off-by: Marcel Holtmann <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
mrchapp pushed a commit to mrchapp/linux that referenced this pull request Aug 1, 2019
[ Upstream commit 28261da ]

Because of both sides doing L2CAP disconnection at the same time, it
was possible to receive L2CAP Disconnection Response with CID that was
already freed. That caused problems if CID was already reused and L2CAP
Connection Request with same CID was sent out. Before this patch kernel
deleted channel context regardless of the state of the channel.

Example where leftover Disconnection Response (frame torvalds#402) causes local
device to delete L2CAP channel which was not yet connected. This in
turn confuses remote device's stack because same CID is re-used without
properly disconnecting.

Btmon capture before patch:
** snip **
> ACL Data RX: Handle 43 flags 0x02 dlen 8                torvalds#394 [hci1] 10.748949
      Channel: 65 len 4 [PSM 3 mode 0] {chan 2}
      RFCOMM: Disconnect (DISC) (0x43)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x53 poll/final 1
         Length: 0
         FCS: 0xfd
< ACL Data TX: Handle 43 flags 0x00 dlen 8                torvalds#395 [hci1] 10.749062
      Channel: 65 len 4 [PSM 3 mode 0] {chan 2}
      RFCOMM: Unnumbered Ack (UA) (0x63)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x73 poll/final 1
         Length: 0
         FCS: 0xd7
< ACL Data TX: Handle 43 flags 0x00 dlen 12               torvalds#396 [hci1] 10.749073
      L2CAP: Disconnection Request (0x06) ident 17 len 4
        Destination CID: 65
        Source CID: 65
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#397 [hci1] 10.752391
        Num handles: 1
        Handle: 43
        Count: 1
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#398 [hci1] 10.753394
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12               torvalds#399 [hci1] 10.756499
      L2CAP: Disconnection Request (0x06) ident 26 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12               torvalds#400 [hci1] 10.756548
      L2CAP: Disconnection Response (0x07) ident 26 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12               torvalds#401 [hci1] 10.757459
      L2CAP: Connection Request (0x02) ident 18 len 4
        PSM: 1 (0x0001)
        Source CID: 65
> ACL Data RX: Handle 43 flags 0x02 dlen 12               torvalds#402 [hci1] 10.759148
      L2CAP: Disconnection Response (0x07) ident 17 len 4
        Destination CID: 65
        Source CID: 65
= bluetoothd: 00:1E:AB:4C:56:54: error updating services: Input/o..   10.759447
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#403 [hci1] 10.759386
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12               torvalds#404 [hci1] 10.760397
      L2CAP: Connection Request (0x02) ident 27 len 4
        PSM: 3 (0x0003)
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 16               torvalds#405 [hci1] 10.760441
      L2CAP: Connection Response (0x03) ident 27 len 8
        Destination CID: 65
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
< ACL Data TX: Handle 43 flags 0x00 dlen 27               torvalds#406 [hci1] 10.760449
      L2CAP: Configure Request (0x04) ident 19 len 19
        Destination CID: 65
        Flags: 0x0000
        Option: Maximum Transmission Unit (0x01) [mandatory]
          MTU: 1013
        Option: Retransmission and Flow Control (0x04) [mandatory]
          Mode: Basic (0x00)
          TX window size: 0
          Max transmit: 0
          Retransmission timeout: 0
          Monitor timeout: 0
          Maximum PDU size: 0
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#407 [hci1] 10.761399
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 16               torvalds#408 [hci1] 10.762942
      L2CAP: Connection Response (0x03) ident 18 len 8
        Destination CID: 66
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
*snip*

Similar case after the patch:
*snip*
> ACL Data RX: Handle 43 flags 0x02 dlen 8            #22702 [hci0] 1664.411056
      Channel: 65 len 4 [PSM 3 mode 0] {chan 3}
      RFCOMM: Disconnect (DISC) (0x43)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x53 poll/final 1
         Length: 0
         FCS: 0xfd
< ACL Data TX: Handle 43 flags 0x00 dlen 8            #22703 [hci0] 1664.411136
      Channel: 65 len 4 [PSM 3 mode 0] {chan 3}
      RFCOMM: Unnumbered Ack (UA) (0x63)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x73 poll/final 1
         Length: 0
         FCS: 0xd7
< ACL Data TX: Handle 43 flags 0x00 dlen 12           #22704 [hci0] 1664.411143
      L2CAP: Disconnection Request (0x06) ident 11 len 4
        Destination CID: 65
        Source CID: 65
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22705 [hci0] 1664.414009
        Num handles: 1
        Handle: 43
        Count: 1
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22706 [hci0] 1664.415007
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12           #22707 [hci0] 1664.418674
      L2CAP: Disconnection Request (0x06) ident 17 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12           #22708 [hci0] 1664.418762
      L2CAP: Disconnection Response (0x07) ident 17 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12           #22709 [hci0] 1664.421073
      L2CAP: Connection Request (0x02) ident 12 len 4
        PSM: 1 (0x0001)
        Source CID: 65
> ACL Data RX: Handle 43 flags 0x02 dlen 12           #22710 [hci0] 1664.421371
      L2CAP: Disconnection Response (0x07) ident 11 len 4
        Destination CID: 65
        Source CID: 65
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22711 [hci0] 1664.424082
        Num handles: 1
        Handle: 43
        Count: 1
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22712 [hci0] 1664.425040
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12           #22713 [hci0] 1664.426103
      L2CAP: Connection Request (0x02) ident 18 len 4
        PSM: 3 (0x0003)
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 16           #22714 [hci0] 1664.426186
      L2CAP: Connection Response (0x03) ident 18 len 8
        Destination CID: 66
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
< ACL Data TX: Handle 43 flags 0x00 dlen 27           #22715 [hci0] 1664.426196
      L2CAP: Configure Request (0x04) ident 13 len 19
        Destination CID: 65
        Flags: 0x0000
        Option: Maximum Transmission Unit (0x01) [mandatory]
          MTU: 1013
        Option: Retransmission and Flow Control (0x04) [mandatory]
          Mode: Basic (0x00)
          TX window size: 0
          Max transmit: 0
          Retransmission timeout: 0
          Monitor timeout: 0
          Maximum PDU size: 0
> ACL Data RX: Handle 43 flags 0x02 dlen 16           #22716 [hci0] 1664.428804
      L2CAP: Connection Response (0x03) ident 12 len 8
        Destination CID: 66
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
*snip*

Fix is to check that channel is in state BT_DISCONN before deleting the
channel.

This bug was found while fuzzing Bluez's OBEX implementation using
Synopsys Defensics.

Reported-by: Matti Kamunen <[email protected]>
Reported-by: Ari Timonen <[email protected]>
Signed-off-by: Matias Karhumaa <[email protected]>
Signed-off-by: Marcel Holtmann <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
mrchapp pushed a commit to mrchapp/linux that referenced this pull request Aug 2, 2019
[ Upstream commit 28261da ]

Because of both sides doing L2CAP disconnection at the same time, it
was possible to receive L2CAP Disconnection Response with CID that was
already freed. That caused problems if CID was already reused and L2CAP
Connection Request with same CID was sent out. Before this patch kernel
deleted channel context regardless of the state of the channel.

Example where leftover Disconnection Response (frame torvalds#402) causes local
device to delete L2CAP channel which was not yet connected. This in
turn confuses remote device's stack because same CID is re-used without
properly disconnecting.

Btmon capture before patch:
** snip **
> ACL Data RX: Handle 43 flags 0x02 dlen 8                torvalds#394 [hci1] 10.748949
      Channel: 65 len 4 [PSM 3 mode 0] {chan 2}
      RFCOMM: Disconnect (DISC) (0x43)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x53 poll/final 1
         Length: 0
         FCS: 0xfd
< ACL Data TX: Handle 43 flags 0x00 dlen 8                torvalds#395 [hci1] 10.749062
      Channel: 65 len 4 [PSM 3 mode 0] {chan 2}
      RFCOMM: Unnumbered Ack (UA) (0x63)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x73 poll/final 1
         Length: 0
         FCS: 0xd7
< ACL Data TX: Handle 43 flags 0x00 dlen 12               torvalds#396 [hci1] 10.749073
      L2CAP: Disconnection Request (0x06) ident 17 len 4
        Destination CID: 65
        Source CID: 65
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#397 [hci1] 10.752391
        Num handles: 1
        Handle: 43
        Count: 1
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#398 [hci1] 10.753394
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12               torvalds#399 [hci1] 10.756499
      L2CAP: Disconnection Request (0x06) ident 26 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12               torvalds#400 [hci1] 10.756548
      L2CAP: Disconnection Response (0x07) ident 26 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12               torvalds#401 [hci1] 10.757459
      L2CAP: Connection Request (0x02) ident 18 len 4
        PSM: 1 (0x0001)
        Source CID: 65
> ACL Data RX: Handle 43 flags 0x02 dlen 12               torvalds#402 [hci1] 10.759148
      L2CAP: Disconnection Response (0x07) ident 17 len 4
        Destination CID: 65
        Source CID: 65
= bluetoothd: 00:1E:AB:4C:56:54: error updating services: Input/o..   10.759447
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#403 [hci1] 10.759386
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12               torvalds#404 [hci1] 10.760397
      L2CAP: Connection Request (0x02) ident 27 len 4
        PSM: 3 (0x0003)
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 16               torvalds#405 [hci1] 10.760441
      L2CAP: Connection Response (0x03) ident 27 len 8
        Destination CID: 65
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
< ACL Data TX: Handle 43 flags 0x00 dlen 27               torvalds#406 [hci1] 10.760449
      L2CAP: Configure Request (0x04) ident 19 len 19
        Destination CID: 65
        Flags: 0x0000
        Option: Maximum Transmission Unit (0x01) [mandatory]
          MTU: 1013
        Option: Retransmission and Flow Control (0x04) [mandatory]
          Mode: Basic (0x00)
          TX window size: 0
          Max transmit: 0
          Retransmission timeout: 0
          Monitor timeout: 0
          Maximum PDU size: 0
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#407 [hci1] 10.761399
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 16               torvalds#408 [hci1] 10.762942
      L2CAP: Connection Response (0x03) ident 18 len 8
        Destination CID: 66
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
*snip*

Similar case after the patch:
*snip*
> ACL Data RX: Handle 43 flags 0x02 dlen 8            #22702 [hci0] 1664.411056
      Channel: 65 len 4 [PSM 3 mode 0] {chan 3}
      RFCOMM: Disconnect (DISC) (0x43)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x53 poll/final 1
         Length: 0
         FCS: 0xfd
< ACL Data TX: Handle 43 flags 0x00 dlen 8            #22703 [hci0] 1664.411136
      Channel: 65 len 4 [PSM 3 mode 0] {chan 3}
      RFCOMM: Unnumbered Ack (UA) (0x63)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x73 poll/final 1
         Length: 0
         FCS: 0xd7
< ACL Data TX: Handle 43 flags 0x00 dlen 12           #22704 [hci0] 1664.411143
      L2CAP: Disconnection Request (0x06) ident 11 len 4
        Destination CID: 65
        Source CID: 65
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22705 [hci0] 1664.414009
        Num handles: 1
        Handle: 43
        Count: 1
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22706 [hci0] 1664.415007
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12           #22707 [hci0] 1664.418674
      L2CAP: Disconnection Request (0x06) ident 17 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12           #22708 [hci0] 1664.418762
      L2CAP: Disconnection Response (0x07) ident 17 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12           #22709 [hci0] 1664.421073
      L2CAP: Connection Request (0x02) ident 12 len 4
        PSM: 1 (0x0001)
        Source CID: 65
> ACL Data RX: Handle 43 flags 0x02 dlen 12           #22710 [hci0] 1664.421371
      L2CAP: Disconnection Response (0x07) ident 11 len 4
        Destination CID: 65
        Source CID: 65
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22711 [hci0] 1664.424082
        Num handles: 1
        Handle: 43
        Count: 1
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22712 [hci0] 1664.425040
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12           #22713 [hci0] 1664.426103
      L2CAP: Connection Request (0x02) ident 18 len 4
        PSM: 3 (0x0003)
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 16           #22714 [hci0] 1664.426186
      L2CAP: Connection Response (0x03) ident 18 len 8
        Destination CID: 66
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
< ACL Data TX: Handle 43 flags 0x00 dlen 27           #22715 [hci0] 1664.426196
      L2CAP: Configure Request (0x04) ident 13 len 19
        Destination CID: 65
        Flags: 0x0000
        Option: Maximum Transmission Unit (0x01) [mandatory]
          MTU: 1013
        Option: Retransmission and Flow Control (0x04) [mandatory]
          Mode: Basic (0x00)
          TX window size: 0
          Max transmit: 0
          Retransmission timeout: 0
          Monitor timeout: 0
          Maximum PDU size: 0
> ACL Data RX: Handle 43 flags 0x02 dlen 16           #22716 [hci0] 1664.428804
      L2CAP: Connection Response (0x03) ident 12 len 8
        Destination CID: 66
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
*snip*

Fix is to check that channel is in state BT_DISCONN before deleting the
channel.

This bug was found while fuzzing Bluez's OBEX implementation using
Synopsys Defensics.

Reported-by: Matti Kamunen <[email protected]>
Reported-by: Ari Timonen <[email protected]>
Signed-off-by: Matias Karhumaa <[email protected]>
Signed-off-by: Marcel Holtmann <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
mrchapp pushed a commit to mrchapp/linux that referenced this pull request Aug 2, 2019
[ Upstream commit 28261da ]

Because of both sides doing L2CAP disconnection at the same time, it
was possible to receive L2CAP Disconnection Response with CID that was
already freed. That caused problems if CID was already reused and L2CAP
Connection Request with same CID was sent out. Before this patch kernel
deleted channel context regardless of the state of the channel.

Example where leftover Disconnection Response (frame torvalds#402) causes local
device to delete L2CAP channel which was not yet connected. This in
turn confuses remote device's stack because same CID is re-used without
properly disconnecting.

Btmon capture before patch:
** snip **
> ACL Data RX: Handle 43 flags 0x02 dlen 8                torvalds#394 [hci1] 10.748949
      Channel: 65 len 4 [PSM 3 mode 0] {chan 2}
      RFCOMM: Disconnect (DISC) (0x43)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x53 poll/final 1
         Length: 0
         FCS: 0xfd
< ACL Data TX: Handle 43 flags 0x00 dlen 8                torvalds#395 [hci1] 10.749062
      Channel: 65 len 4 [PSM 3 mode 0] {chan 2}
      RFCOMM: Unnumbered Ack (UA) (0x63)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x73 poll/final 1
         Length: 0
         FCS: 0xd7
< ACL Data TX: Handle 43 flags 0x00 dlen 12               torvalds#396 [hci1] 10.749073
      L2CAP: Disconnection Request (0x06) ident 17 len 4
        Destination CID: 65
        Source CID: 65
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#397 [hci1] 10.752391
        Num handles: 1
        Handle: 43
        Count: 1
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#398 [hci1] 10.753394
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12               torvalds#399 [hci1] 10.756499
      L2CAP: Disconnection Request (0x06) ident 26 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12               torvalds#400 [hci1] 10.756548
      L2CAP: Disconnection Response (0x07) ident 26 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12               torvalds#401 [hci1] 10.757459
      L2CAP: Connection Request (0x02) ident 18 len 4
        PSM: 1 (0x0001)
        Source CID: 65
> ACL Data RX: Handle 43 flags 0x02 dlen 12               torvalds#402 [hci1] 10.759148
      L2CAP: Disconnection Response (0x07) ident 17 len 4
        Destination CID: 65
        Source CID: 65
= bluetoothd: 00:1E:AB:4C:56:54: error updating services: Input/o..   10.759447
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#403 [hci1] 10.759386
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12               torvalds#404 [hci1] 10.760397
      L2CAP: Connection Request (0x02) ident 27 len 4
        PSM: 3 (0x0003)
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 16               torvalds#405 [hci1] 10.760441
      L2CAP: Connection Response (0x03) ident 27 len 8
        Destination CID: 65
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
< ACL Data TX: Handle 43 flags 0x00 dlen 27               torvalds#406 [hci1] 10.760449
      L2CAP: Configure Request (0x04) ident 19 len 19
        Destination CID: 65
        Flags: 0x0000
        Option: Maximum Transmission Unit (0x01) [mandatory]
          MTU: 1013
        Option: Retransmission and Flow Control (0x04) [mandatory]
          Mode: Basic (0x00)
          TX window size: 0
          Max transmit: 0
          Retransmission timeout: 0
          Monitor timeout: 0
          Maximum PDU size: 0
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#407 [hci1] 10.761399
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 16               torvalds#408 [hci1] 10.762942
      L2CAP: Connection Response (0x03) ident 18 len 8
        Destination CID: 66
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
*snip*

Similar case after the patch:
*snip*
> ACL Data RX: Handle 43 flags 0x02 dlen 8            #22702 [hci0] 1664.411056
      Channel: 65 len 4 [PSM 3 mode 0] {chan 3}
      RFCOMM: Disconnect (DISC) (0x43)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x53 poll/final 1
         Length: 0
         FCS: 0xfd
< ACL Data TX: Handle 43 flags 0x00 dlen 8            #22703 [hci0] 1664.411136
      Channel: 65 len 4 [PSM 3 mode 0] {chan 3}
      RFCOMM: Unnumbered Ack (UA) (0x63)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x73 poll/final 1
         Length: 0
         FCS: 0xd7
< ACL Data TX: Handle 43 flags 0x00 dlen 12           #22704 [hci0] 1664.411143
      L2CAP: Disconnection Request (0x06) ident 11 len 4
        Destination CID: 65
        Source CID: 65
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22705 [hci0] 1664.414009
        Num handles: 1
        Handle: 43
        Count: 1
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22706 [hci0] 1664.415007
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12           #22707 [hci0] 1664.418674
      L2CAP: Disconnection Request (0x06) ident 17 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12           #22708 [hci0] 1664.418762
      L2CAP: Disconnection Response (0x07) ident 17 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12           #22709 [hci0] 1664.421073
      L2CAP: Connection Request (0x02) ident 12 len 4
        PSM: 1 (0x0001)
        Source CID: 65
> ACL Data RX: Handle 43 flags 0x02 dlen 12           #22710 [hci0] 1664.421371
      L2CAP: Disconnection Response (0x07) ident 11 len 4
        Destination CID: 65
        Source CID: 65
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22711 [hci0] 1664.424082
        Num handles: 1
        Handle: 43
        Count: 1
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22712 [hci0] 1664.425040
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12           #22713 [hci0] 1664.426103
      L2CAP: Connection Request (0x02) ident 18 len 4
        PSM: 3 (0x0003)
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 16           #22714 [hci0] 1664.426186
      L2CAP: Connection Response (0x03) ident 18 len 8
        Destination CID: 66
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
< ACL Data TX: Handle 43 flags 0x00 dlen 27           #22715 [hci0] 1664.426196
      L2CAP: Configure Request (0x04) ident 13 len 19
        Destination CID: 65
        Flags: 0x0000
        Option: Maximum Transmission Unit (0x01) [mandatory]
          MTU: 1013
        Option: Retransmission and Flow Control (0x04) [mandatory]
          Mode: Basic (0x00)
          TX window size: 0
          Max transmit: 0
          Retransmission timeout: 0
          Monitor timeout: 0
          Maximum PDU size: 0
> ACL Data RX: Handle 43 flags 0x02 dlen 16           #22716 [hci0] 1664.428804
      L2CAP: Connection Response (0x03) ident 12 len 8
        Destination CID: 66
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
*snip*

Fix is to check that channel is in state BT_DISCONN before deleting the
channel.

This bug was found while fuzzing Bluez's OBEX implementation using
Synopsys Defensics.

Reported-by: Matti Kamunen <[email protected]>
Reported-by: Ari Timonen <[email protected]>
Signed-off-by: Matias Karhumaa <[email protected]>
Signed-off-by: Marcel Holtmann <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Noltari pushed a commit to Noltari/linux that referenced this pull request Aug 4, 2019
[ Upstream commit 28261da ]

Because of both sides doing L2CAP disconnection at the same time, it
was possible to receive L2CAP Disconnection Response with CID that was
already freed. That caused problems if CID was already reused and L2CAP
Connection Request with same CID was sent out. Before this patch kernel
deleted channel context regardless of the state of the channel.

Example where leftover Disconnection Response (frame torvalds#402) causes local
device to delete L2CAP channel which was not yet connected. This in
turn confuses remote device's stack because same CID is re-used without
properly disconnecting.

Btmon capture before patch:
** snip **
> ACL Data RX: Handle 43 flags 0x02 dlen 8                torvalds#394 [hci1] 10.748949
      Channel: 65 len 4 [PSM 3 mode 0] {chan 2}
      RFCOMM: Disconnect (DISC) (0x43)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x53 poll/final 1
         Length: 0
         FCS: 0xfd
< ACL Data TX: Handle 43 flags 0x00 dlen 8                torvalds#395 [hci1] 10.749062
      Channel: 65 len 4 [PSM 3 mode 0] {chan 2}
      RFCOMM: Unnumbered Ack (UA) (0x63)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x73 poll/final 1
         Length: 0
         FCS: 0xd7
< ACL Data TX: Handle 43 flags 0x00 dlen 12               torvalds#396 [hci1] 10.749073
      L2CAP: Disconnection Request (0x06) ident 17 len 4
        Destination CID: 65
        Source CID: 65
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#397 [hci1] 10.752391
        Num handles: 1
        Handle: 43
        Count: 1
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#398 [hci1] 10.753394
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12               torvalds#399 [hci1] 10.756499
      L2CAP: Disconnection Request (0x06) ident 26 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12               torvalds#400 [hci1] 10.756548
      L2CAP: Disconnection Response (0x07) ident 26 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12               torvalds#401 [hci1] 10.757459
      L2CAP: Connection Request (0x02) ident 18 len 4
        PSM: 1 (0x0001)
        Source CID: 65
> ACL Data RX: Handle 43 flags 0x02 dlen 12               torvalds#402 [hci1] 10.759148
      L2CAP: Disconnection Response (0x07) ident 17 len 4
        Destination CID: 65
        Source CID: 65
= bluetoothd: 00:1E:AB:4C:56:54: error updating services: Input/o..   10.759447
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#403 [hci1] 10.759386
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12               torvalds#404 [hci1] 10.760397
      L2CAP: Connection Request (0x02) ident 27 len 4
        PSM: 3 (0x0003)
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 16               torvalds#405 [hci1] 10.760441
      L2CAP: Connection Response (0x03) ident 27 len 8
        Destination CID: 65
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
< ACL Data TX: Handle 43 flags 0x00 dlen 27               torvalds#406 [hci1] 10.760449
      L2CAP: Configure Request (0x04) ident 19 len 19
        Destination CID: 65
        Flags: 0x0000
        Option: Maximum Transmission Unit (0x01) [mandatory]
          MTU: 1013
        Option: Retransmission and Flow Control (0x04) [mandatory]
          Mode: Basic (0x00)
          TX window size: 0
          Max transmit: 0
          Retransmission timeout: 0
          Monitor timeout: 0
          Maximum PDU size: 0
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#407 [hci1] 10.761399
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 16               torvalds#408 [hci1] 10.762942
      L2CAP: Connection Response (0x03) ident 18 len 8
        Destination CID: 66
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
*snip*

Similar case after the patch:
*snip*
> ACL Data RX: Handle 43 flags 0x02 dlen 8            #22702 [hci0] 1664.411056
      Channel: 65 len 4 [PSM 3 mode 0] {chan 3}
      RFCOMM: Disconnect (DISC) (0x43)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x53 poll/final 1
         Length: 0
         FCS: 0xfd
< ACL Data TX: Handle 43 flags 0x00 dlen 8            #22703 [hci0] 1664.411136
      Channel: 65 len 4 [PSM 3 mode 0] {chan 3}
      RFCOMM: Unnumbered Ack (UA) (0x63)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x73 poll/final 1
         Length: 0
         FCS: 0xd7
< ACL Data TX: Handle 43 flags 0x00 dlen 12           #22704 [hci0] 1664.411143
      L2CAP: Disconnection Request (0x06) ident 11 len 4
        Destination CID: 65
        Source CID: 65
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22705 [hci0] 1664.414009
        Num handles: 1
        Handle: 43
        Count: 1
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22706 [hci0] 1664.415007
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12           #22707 [hci0] 1664.418674
      L2CAP: Disconnection Request (0x06) ident 17 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12           #22708 [hci0] 1664.418762
      L2CAP: Disconnection Response (0x07) ident 17 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12           #22709 [hci0] 1664.421073
      L2CAP: Connection Request (0x02) ident 12 len 4
        PSM: 1 (0x0001)
        Source CID: 65
> ACL Data RX: Handle 43 flags 0x02 dlen 12           #22710 [hci0] 1664.421371
      L2CAP: Disconnection Response (0x07) ident 11 len 4
        Destination CID: 65
        Source CID: 65
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22711 [hci0] 1664.424082
        Num handles: 1
        Handle: 43
        Count: 1
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22712 [hci0] 1664.425040
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12           #22713 [hci0] 1664.426103
      L2CAP: Connection Request (0x02) ident 18 len 4
        PSM: 3 (0x0003)
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 16           #22714 [hci0] 1664.426186
      L2CAP: Connection Response (0x03) ident 18 len 8
        Destination CID: 66
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
< ACL Data TX: Handle 43 flags 0x00 dlen 27           #22715 [hci0] 1664.426196
      L2CAP: Configure Request (0x04) ident 13 len 19
        Destination CID: 65
        Flags: 0x0000
        Option: Maximum Transmission Unit (0x01) [mandatory]
          MTU: 1013
        Option: Retransmission and Flow Control (0x04) [mandatory]
          Mode: Basic (0x00)
          TX window size: 0
          Max transmit: 0
          Retransmission timeout: 0
          Monitor timeout: 0
          Maximum PDU size: 0
> ACL Data RX: Handle 43 flags 0x02 dlen 16           #22716 [hci0] 1664.428804
      L2CAP: Connection Response (0x03) ident 12 len 8
        Destination CID: 66
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
*snip*

Fix is to check that channel is in state BT_DISCONN before deleting the
channel.

This bug was found while fuzzing Bluez's OBEX implementation using
Synopsys Defensics.

Reported-by: Matti Kamunen <[email protected]>
Reported-by: Ari Timonen <[email protected]>
Signed-off-by: Matias Karhumaa <[email protected]>
Signed-off-by: Marcel Holtmann <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Noltari pushed a commit to Noltari/linux that referenced this pull request Aug 4, 2019
[ Upstream commit 28261da ]

Because of both sides doing L2CAP disconnection at the same time, it
was possible to receive L2CAP Disconnection Response with CID that was
already freed. That caused problems if CID was already reused and L2CAP
Connection Request with same CID was sent out. Before this patch kernel
deleted channel context regardless of the state of the channel.

Example where leftover Disconnection Response (frame torvalds#402) causes local
device to delete L2CAP channel which was not yet connected. This in
turn confuses remote device's stack because same CID is re-used without
properly disconnecting.

Btmon capture before patch:
** snip **
> ACL Data RX: Handle 43 flags 0x02 dlen 8                torvalds#394 [hci1] 10.748949
      Channel: 65 len 4 [PSM 3 mode 0] {chan 2}
      RFCOMM: Disconnect (DISC) (0x43)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x53 poll/final 1
         Length: 0
         FCS: 0xfd
< ACL Data TX: Handle 43 flags 0x00 dlen 8                torvalds#395 [hci1] 10.749062
      Channel: 65 len 4 [PSM 3 mode 0] {chan 2}
      RFCOMM: Unnumbered Ack (UA) (0x63)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x73 poll/final 1
         Length: 0
         FCS: 0xd7
< ACL Data TX: Handle 43 flags 0x00 dlen 12               torvalds#396 [hci1] 10.749073
      L2CAP: Disconnection Request (0x06) ident 17 len 4
        Destination CID: 65
        Source CID: 65
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#397 [hci1] 10.752391
        Num handles: 1
        Handle: 43
        Count: 1
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#398 [hci1] 10.753394
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12               torvalds#399 [hci1] 10.756499
      L2CAP: Disconnection Request (0x06) ident 26 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12               torvalds#400 [hci1] 10.756548
      L2CAP: Disconnection Response (0x07) ident 26 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12               torvalds#401 [hci1] 10.757459
      L2CAP: Connection Request (0x02) ident 18 len 4
        PSM: 1 (0x0001)
        Source CID: 65
> ACL Data RX: Handle 43 flags 0x02 dlen 12               torvalds#402 [hci1] 10.759148
      L2CAP: Disconnection Response (0x07) ident 17 len 4
        Destination CID: 65
        Source CID: 65
= bluetoothd: 00:1E:AB:4C:56:54: error updating services: Input/o..   10.759447
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#403 [hci1] 10.759386
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12               torvalds#404 [hci1] 10.760397
      L2CAP: Connection Request (0x02) ident 27 len 4
        PSM: 3 (0x0003)
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 16               torvalds#405 [hci1] 10.760441
      L2CAP: Connection Response (0x03) ident 27 len 8
        Destination CID: 65
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
< ACL Data TX: Handle 43 flags 0x00 dlen 27               torvalds#406 [hci1] 10.760449
      L2CAP: Configure Request (0x04) ident 19 len 19
        Destination CID: 65
        Flags: 0x0000
        Option: Maximum Transmission Unit (0x01) [mandatory]
          MTU: 1013
        Option: Retransmission and Flow Control (0x04) [mandatory]
          Mode: Basic (0x00)
          TX window size: 0
          Max transmit: 0
          Retransmission timeout: 0
          Monitor timeout: 0
          Maximum PDU size: 0
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#407 [hci1] 10.761399
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 16               torvalds#408 [hci1] 10.762942
      L2CAP: Connection Response (0x03) ident 18 len 8
        Destination CID: 66
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
*snip*

Similar case after the patch:
*snip*
> ACL Data RX: Handle 43 flags 0x02 dlen 8            #22702 [hci0] 1664.411056
      Channel: 65 len 4 [PSM 3 mode 0] {chan 3}
      RFCOMM: Disconnect (DISC) (0x43)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x53 poll/final 1
         Length: 0
         FCS: 0xfd
< ACL Data TX: Handle 43 flags 0x00 dlen 8            #22703 [hci0] 1664.411136
      Channel: 65 len 4 [PSM 3 mode 0] {chan 3}
      RFCOMM: Unnumbered Ack (UA) (0x63)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x73 poll/final 1
         Length: 0
         FCS: 0xd7
< ACL Data TX: Handle 43 flags 0x00 dlen 12           #22704 [hci0] 1664.411143
      L2CAP: Disconnection Request (0x06) ident 11 len 4
        Destination CID: 65
        Source CID: 65
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22705 [hci0] 1664.414009
        Num handles: 1
        Handle: 43
        Count: 1
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22706 [hci0] 1664.415007
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12           #22707 [hci0] 1664.418674
      L2CAP: Disconnection Request (0x06) ident 17 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12           #22708 [hci0] 1664.418762
      L2CAP: Disconnection Response (0x07) ident 17 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12           #22709 [hci0] 1664.421073
      L2CAP: Connection Request (0x02) ident 12 len 4
        PSM: 1 (0x0001)
        Source CID: 65
> ACL Data RX: Handle 43 flags 0x02 dlen 12           #22710 [hci0] 1664.421371
      L2CAP: Disconnection Response (0x07) ident 11 len 4
        Destination CID: 65
        Source CID: 65
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22711 [hci0] 1664.424082
        Num handles: 1
        Handle: 43
        Count: 1
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22712 [hci0] 1664.425040
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12           #22713 [hci0] 1664.426103
      L2CAP: Connection Request (0x02) ident 18 len 4
        PSM: 3 (0x0003)
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 16           #22714 [hci0] 1664.426186
      L2CAP: Connection Response (0x03) ident 18 len 8
        Destination CID: 66
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
< ACL Data TX: Handle 43 flags 0x00 dlen 27           #22715 [hci0] 1664.426196
      L2CAP: Configure Request (0x04) ident 13 len 19
        Destination CID: 65
        Flags: 0x0000
        Option: Maximum Transmission Unit (0x01) [mandatory]
          MTU: 1013
        Option: Retransmission and Flow Control (0x04) [mandatory]
          Mode: Basic (0x00)
          TX window size: 0
          Max transmit: 0
          Retransmission timeout: 0
          Monitor timeout: 0
          Maximum PDU size: 0
> ACL Data RX: Handle 43 flags 0x02 dlen 16           #22716 [hci0] 1664.428804
      L2CAP: Connection Response (0x03) ident 12 len 8
        Destination CID: 66
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
*snip*

Fix is to check that channel is in state BT_DISCONN before deleting the
channel.

This bug was found while fuzzing Bluez's OBEX implementation using
Synopsys Defensics.

Reported-by: Matti Kamunen <[email protected]>
Reported-by: Ari Timonen <[email protected]>
Signed-off-by: Matias Karhumaa <[email protected]>
Signed-off-by: Marcel Holtmann <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
chewitt pushed a commit to chewitt/linux that referenced this pull request Dec 2, 2019
[ Upstream commit 28261da ]

Because of both sides doing L2CAP disconnection at the same time, it
was possible to receive L2CAP Disconnection Response with CID that was
already freed. That caused problems if CID was already reused and L2CAP
Connection Request with same CID was sent out. Before this patch kernel
deleted channel context regardless of the state of the channel.

Example where leftover Disconnection Response (frame torvalds#402) causes local
device to delete L2CAP channel which was not yet connected. This in
turn confuses remote device's stack because same CID is re-used without
properly disconnecting.

Btmon capture before patch:
** snip **
> ACL Data RX: Handle 43 flags 0x02 dlen 8                torvalds#394 [hci1] 10.748949
      Channel: 65 len 4 [PSM 3 mode 0] {chan 2}
      RFCOMM: Disconnect (DISC) (0x43)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x53 poll/final 1
         Length: 0
         FCS: 0xfd
< ACL Data TX: Handle 43 flags 0x00 dlen 8                torvalds#395 [hci1] 10.749062
      Channel: 65 len 4 [PSM 3 mode 0] {chan 2}
      RFCOMM: Unnumbered Ack (UA) (0x63)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x73 poll/final 1
         Length: 0
         FCS: 0xd7
< ACL Data TX: Handle 43 flags 0x00 dlen 12               torvalds#396 [hci1] 10.749073
      L2CAP: Disconnection Request (0x06) ident 17 len 4
        Destination CID: 65
        Source CID: 65
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#397 [hci1] 10.752391
        Num handles: 1
        Handle: 43
        Count: 1
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#398 [hci1] 10.753394
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12               torvalds#399 [hci1] 10.756499
      L2CAP: Disconnection Request (0x06) ident 26 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12               torvalds#400 [hci1] 10.756548
      L2CAP: Disconnection Response (0x07) ident 26 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12               torvalds#401 [hci1] 10.757459
      L2CAP: Connection Request (0x02) ident 18 len 4
        PSM: 1 (0x0001)
        Source CID: 65
> ACL Data RX: Handle 43 flags 0x02 dlen 12               torvalds#402 [hci1] 10.759148
      L2CAP: Disconnection Response (0x07) ident 17 len 4
        Destination CID: 65
        Source CID: 65
= bluetoothd: 00:1E:AB:4C:56:54: error updating services: Input/o..   10.759447
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#403 [hci1] 10.759386
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12               torvalds#404 [hci1] 10.760397
      L2CAP: Connection Request (0x02) ident 27 len 4
        PSM: 3 (0x0003)
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 16               torvalds#405 [hci1] 10.760441
      L2CAP: Connection Response (0x03) ident 27 len 8
        Destination CID: 65
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
< ACL Data TX: Handle 43 flags 0x00 dlen 27               torvalds#406 [hci1] 10.760449
      L2CAP: Configure Request (0x04) ident 19 len 19
        Destination CID: 65
        Flags: 0x0000
        Option: Maximum Transmission Unit (0x01) [mandatory]
          MTU: 1013
        Option: Retransmission and Flow Control (0x04) [mandatory]
          Mode: Basic (0x00)
          TX window size: 0
          Max transmit: 0
          Retransmission timeout: 0
          Monitor timeout: 0
          Maximum PDU size: 0
> HCI Event: Number of Completed Packets (0x13) plen 5    torvalds#407 [hci1] 10.761399
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 16               torvalds#408 [hci1] 10.762942
      L2CAP: Connection Response (0x03) ident 18 len 8
        Destination CID: 66
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
*snip*

Similar case after the patch:
*snip*
> ACL Data RX: Handle 43 flags 0x02 dlen 8            #22702 [hci0] 1664.411056
      Channel: 65 len 4 [PSM 3 mode 0] {chan 3}
      RFCOMM: Disconnect (DISC) (0x43)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x53 poll/final 1
         Length: 0
         FCS: 0xfd
< ACL Data TX: Handle 43 flags 0x00 dlen 8            #22703 [hci0] 1664.411136
      Channel: 65 len 4 [PSM 3 mode 0] {chan 3}
      RFCOMM: Unnumbered Ack (UA) (0x63)
         Address: 0x03 cr 1 dlci 0x00
         Control: 0x73 poll/final 1
         Length: 0
         FCS: 0xd7
< ACL Data TX: Handle 43 flags 0x00 dlen 12           #22704 [hci0] 1664.411143
      L2CAP: Disconnection Request (0x06) ident 11 len 4
        Destination CID: 65
        Source CID: 65
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22705 [hci0] 1664.414009
        Num handles: 1
        Handle: 43
        Count: 1
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22706 [hci0] 1664.415007
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12           #22707 [hci0] 1664.418674
      L2CAP: Disconnection Request (0x06) ident 17 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12           #22708 [hci0] 1664.418762
      L2CAP: Disconnection Response (0x07) ident 17 len 4
        Destination CID: 65
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 12           #22709 [hci0] 1664.421073
      L2CAP: Connection Request (0x02) ident 12 len 4
        PSM: 1 (0x0001)
        Source CID: 65
> ACL Data RX: Handle 43 flags 0x02 dlen 12           #22710 [hci0] 1664.421371
      L2CAP: Disconnection Response (0x07) ident 11 len 4
        Destination CID: 65
        Source CID: 65
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22711 [hci0] 1664.424082
        Num handles: 1
        Handle: 43
        Count: 1
> HCI Event: Number of Completed Pac.. (0x13) plen 5  #22712 [hci0] 1664.425040
        Num handles: 1
        Handle: 43
        Count: 1
> ACL Data RX: Handle 43 flags 0x02 dlen 12           #22713 [hci0] 1664.426103
      L2CAP: Connection Request (0x02) ident 18 len 4
        PSM: 3 (0x0003)
        Source CID: 65
< ACL Data TX: Handle 43 flags 0x00 dlen 16           #22714 [hci0] 1664.426186
      L2CAP: Connection Response (0x03) ident 18 len 8
        Destination CID: 66
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
< ACL Data TX: Handle 43 flags 0x00 dlen 27           #22715 [hci0] 1664.426196
      L2CAP: Configure Request (0x04) ident 13 len 19
        Destination CID: 65
        Flags: 0x0000
        Option: Maximum Transmission Unit (0x01) [mandatory]
          MTU: 1013
        Option: Retransmission and Flow Control (0x04) [mandatory]
          Mode: Basic (0x00)
          TX window size: 0
          Max transmit: 0
          Retransmission timeout: 0
          Monitor timeout: 0
          Maximum PDU size: 0
> ACL Data RX: Handle 43 flags 0x02 dlen 16           #22716 [hci0] 1664.428804
      L2CAP: Connection Response (0x03) ident 12 len 8
        Destination CID: 66
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
*snip*

Fix is to check that channel is in state BT_DISCONN before deleting the
channel.

This bug was found while fuzzing Bluez's OBEX implementation using
Synopsys Defensics.

Reported-by: Matti Kamunen <[email protected]>
Reported-by: Ari Timonen <[email protected]>
Signed-off-by: Matias Karhumaa <[email protected]>
Signed-off-by: Marcel Holtmann <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
fengguang pushed a commit to 0day-ci/linux that referenced this pull request Dec 8, 2019
Some meson pll registers can be initialized with 0 as N value, introducing
the following division by 0 when computing rate :

  UBSAN: Undefined behaviour in drivers/clk/meson/clk-pll.c:75:9
  division by zero
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.4.0-rc3-608075-g86c9af8630e1-dirty torvalds#400
  Call trace:
   dump_backtrace+0x0/0x1c0
   show_stack+0x14/0x20
   dump_stack+0xc4/0x100
   ubsan_epilogue+0x14/0x68
   __ubsan_handle_divrem_overflow+0x98/0xb8
   __pll_params_to_rate+0xdc/0x140
   meson_clk_pll_recalc_rate+0x278/0x3a0
   __clk_register+0x7c8/0xbb0
   devm_clk_hw_register+0x54/0xc0
   meson_eeclkc_probe+0xf4/0x1a0
   platform_drv_probe+0x54/0xd8
   really_probe+0x16c/0x438
   driver_probe_device+0xb0/0xf0
   device_driver_attach+0x94/0xa0
   __driver_attach+0x70/0x108
   bus_for_each_dev+0xd8/0x128
   driver_attach+0x30/0x40
   bus_add_driver+0x1b0/0x2d8
   driver_register+0xbc/0x1d0
   __platform_driver_register+0x78/0x88
   axg_driver_init+0x18/0x20
   do_one_initcall+0xc8/0x24c
   kernel_init_freeable+0x2b0/0x344
   kernel_init+0x10/0x128
   ret_from_fork+0x10/0x18

This checks if N is null before doing the division.

Fixes: 8289aaf ("clk: meson: improve pll driver results with frac")
Signed-off-by: Remi Pommarel <[email protected]>
jbrun3t pushed a commit to BayLibre/clk-meson that referenced this pull request Dec 16, 2019
Some meson pll registers can be initialized with 0 as N value, introducing
the following division by 0 when computing rate :

  UBSAN: Undefined behaviour in drivers/clk/meson/clk-pll.c:75:9
  division by zero
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.4.0-rc3-608075-g86c9af8630e1-dirty torvalds#400
  Call trace:
   dump_backtrace+0x0/0x1c0
   show_stack+0x14/0x20
   dump_stack+0xc4/0x100
   ubsan_epilogue+0x14/0x68
   __ubsan_handle_divrem_overflow+0x98/0xb8
   __pll_params_to_rate+0xdc/0x140
   meson_clk_pll_recalc_rate+0x278/0x3a0
   __clk_register+0x7c8/0xbb0
   devm_clk_hw_register+0x54/0xc0
   meson_eeclkc_probe+0xf4/0x1a0
   platform_drv_probe+0x54/0xd8
   really_probe+0x16c/0x438
   driver_probe_device+0xb0/0xf0
   device_driver_attach+0x94/0xa0
   __driver_attach+0x70/0x108
   bus_for_each_dev+0xd8/0x128
   driver_attach+0x30/0x40
   bus_add_driver+0x1b0/0x2d8
   driver_register+0xbc/0x1d0
   __platform_driver_register+0x78/0x88
   axg_driver_init+0x18/0x20
   do_one_initcall+0xc8/0x24c
   kernel_init_freeable+0x2b0/0x344
   kernel_init+0x10/0x128
   ret_from_fork+0x10/0x18

This checks if N is null before doing the division.

Fixes: 7a29a86 ("clk: meson: Add support for Meson clock controller")
Reviewed-by: Martin Blumenstingl <[email protected]>
Signed-off-by: Remi Pommarel <[email protected]>
[[email protected]: update the comment in above the fix]
Signed-off-by: Jerome Brunet <[email protected]>
chewitt pushed a commit to chewitt/linux that referenced this pull request Dec 18, 2019
Some meson pll registers can be initialized with 0 as N value, introducing
the following division by 0 when computing rate :

  UBSAN: Undefined behaviour in drivers/clk/meson/clk-pll.c:75:9
  division by zero
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.4.0-rc3-608075-g86c9af8630e1-dirty torvalds#400
  Call trace:
   dump_backtrace+0x0/0x1c0
   show_stack+0x14/0x20
   dump_stack+0xc4/0x100
   ubsan_epilogue+0x14/0x68
   __ubsan_handle_divrem_overflow+0x98/0xb8
   __pll_params_to_rate+0xdc/0x140
   meson_clk_pll_recalc_rate+0x278/0x3a0
   __clk_register+0x7c8/0xbb0
   devm_clk_hw_register+0x54/0xc0
   meson_eeclkc_probe+0xf4/0x1a0
   platform_drv_probe+0x54/0xd8
   really_probe+0x16c/0x438
   driver_probe_device+0xb0/0xf0
   device_driver_attach+0x94/0xa0
   __driver_attach+0x70/0x108
   bus_for_each_dev+0xd8/0x128
   driver_attach+0x30/0x40
   bus_add_driver+0x1b0/0x2d8
   driver_register+0xbc/0x1d0
   __platform_driver_register+0x78/0x88
   axg_driver_init+0x18/0x20
   do_one_initcall+0xc8/0x24c
   kernel_init_freeable+0x2b0/0x344
   kernel_init+0x10/0x128
   ret_from_fork+0x10/0x18

This checks if N is null before doing the division.

Fixes: 7a29a86 ("clk: meson: Add support for Meson clock controller")
Reviewed-by: Martin Blumenstingl <[email protected]>
Signed-off-by: Remi Pommarel <[email protected]>
[[email protected]: update the comment in above the fix]
Signed-off-by: Jerome Brunet <[email protected]>
fengguang pushed a commit to 0day-ci/linux that referenced this pull request Dec 18, 2019
Some meson pll registers can be initialized with 0 as N value, introducing
the following division by 0 when computing rate :

  UBSAN: Undefined behaviour in drivers/clk/meson/clk-pll.c:75:9
  division by zero
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.4.0-rc3-608075-g86c9af8630e1-dirty torvalds#400
  Call trace:
   dump_backtrace+0x0/0x1c0
   show_stack+0x14/0x20
   dump_stack+0xc4/0x100
   ubsan_epilogue+0x14/0x68
   __ubsan_handle_divrem_overflow+0x98/0xb8
   __pll_params_to_rate+0xdc/0x140
   meson_clk_pll_recalc_rate+0x278/0x3a0
   __clk_register+0x7c8/0xbb0
   devm_clk_hw_register+0x54/0xc0
   meson_eeclkc_probe+0xf4/0x1a0
   platform_drv_probe+0x54/0xd8
   really_probe+0x16c/0x438
   driver_probe_device+0xb0/0xf0
   device_driver_attach+0x94/0xa0
   __driver_attach+0x70/0x108
   bus_for_each_dev+0xd8/0x128
   driver_attach+0x30/0x40
   bus_add_driver+0x1b0/0x2d8
   driver_register+0xbc/0x1d0
   __platform_driver_register+0x78/0x88
   axg_driver_init+0x18/0x20
   do_one_initcall+0xc8/0x24c
   kernel_init_freeable+0x2b0/0x344
   kernel_init+0x10/0x128
   ret_from_fork+0x10/0x18

This checks if N is null before doing the division.

Fixes: 7a29a86 ("clk: meson: Add support for Meson clock controller")
Signed-off-by: Remi Pommarel <[email protected]>
chewitt pushed a commit to chewitt/linux that referenced this pull request Dec 19, 2019
Some meson pll registers can be initialized with 0 as N value, introducing
the following division by 0 when computing rate :

  UBSAN: Undefined behaviour in drivers/clk/meson/clk-pll.c:75:9
  division by zero
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.4.0-rc3-608075-g86c9af8630e1-dirty torvalds#400
  Call trace:
   dump_backtrace+0x0/0x1c0
   show_stack+0x14/0x20
   dump_stack+0xc4/0x100
   ubsan_epilogue+0x14/0x68
   __ubsan_handle_divrem_overflow+0x98/0xb8
   __pll_params_to_rate+0xdc/0x140
   meson_clk_pll_recalc_rate+0x278/0x3a0
   __clk_register+0x7c8/0xbb0
   devm_clk_hw_register+0x54/0xc0
   meson_eeclkc_probe+0xf4/0x1a0
   platform_drv_probe+0x54/0xd8
   really_probe+0x16c/0x438
   driver_probe_device+0xb0/0xf0
   device_driver_attach+0x94/0xa0
   __driver_attach+0x70/0x108
   bus_for_each_dev+0xd8/0x128
   driver_attach+0x30/0x40
   bus_add_driver+0x1b0/0x2d8
   driver_register+0xbc/0x1d0
   __platform_driver_register+0x78/0x88
   axg_driver_init+0x18/0x20
   do_one_initcall+0xc8/0x24c
   kernel_init_freeable+0x2b0/0x344
   kernel_init+0x10/0x128
   ret_from_fork+0x10/0x18

This checks if N is null before doing the division.

Fixes: 7a29a86 ("clk: meson: Add support for Meson clock controller")
Reviewed-by: Martin Blumenstingl <[email protected]>
Signed-off-by: Remi Pommarel <[email protected]>
[[email protected]: update the comment in above the fix]
Signed-off-by: Jerome Brunet <[email protected]>
chewitt pushed a commit to chewitt/linux that referenced this pull request Dec 24, 2019
Some meson pll registers can be initialized with 0 as N value, introducing
the following division by 0 when computing rate :

  UBSAN: Undefined behaviour in drivers/clk/meson/clk-pll.c:75:9
  division by zero
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.4.0-rc3-608075-g86c9af8630e1-dirty torvalds#400
  Call trace:
   dump_backtrace+0x0/0x1c0
   show_stack+0x14/0x20
   dump_stack+0xc4/0x100
   ubsan_epilogue+0x14/0x68
   __ubsan_handle_divrem_overflow+0x98/0xb8
   __pll_params_to_rate+0xdc/0x140
   meson_clk_pll_recalc_rate+0x278/0x3a0
   __clk_register+0x7c8/0xbb0
   devm_clk_hw_register+0x54/0xc0
   meson_eeclkc_probe+0xf4/0x1a0
   platform_drv_probe+0x54/0xd8
   really_probe+0x16c/0x438
   driver_probe_device+0xb0/0xf0
   device_driver_attach+0x94/0xa0
   __driver_attach+0x70/0x108
   bus_for_each_dev+0xd8/0x128
   driver_attach+0x30/0x40
   bus_add_driver+0x1b0/0x2d8
   driver_register+0xbc/0x1d0
   __platform_driver_register+0x78/0x88
   axg_driver_init+0x18/0x20
   do_one_initcall+0xc8/0x24c
   kernel_init_freeable+0x2b0/0x344
   kernel_init+0x10/0x128
   ret_from_fork+0x10/0x18

This checks if N is null before doing the division.

Fixes: 7a29a86 ("clk: meson: Add support for Meson clock controller")
Reviewed-by: Martin Blumenstingl <[email protected]>
Signed-off-by: Remi Pommarel <[email protected]>
[[email protected]: update the comment in above the fix]
Signed-off-by: Jerome Brunet <[email protected]>
chewitt pushed a commit to chewitt/linux that referenced this pull request Jan 7, 2020
Some meson pll registers can be initialized with 0 as N value, introducing
the following division by 0 when computing rate :

  UBSAN: Undefined behaviour in drivers/clk/meson/clk-pll.c:75:9
  division by zero
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.4.0-rc3-608075-g86c9af8630e1-dirty torvalds#400
  Call trace:
   dump_backtrace+0x0/0x1c0
   show_stack+0x14/0x20
   dump_stack+0xc4/0x100
   ubsan_epilogue+0x14/0x68
   __ubsan_handle_divrem_overflow+0x98/0xb8
   __pll_params_to_rate+0xdc/0x140
   meson_clk_pll_recalc_rate+0x278/0x3a0
   __clk_register+0x7c8/0xbb0
   devm_clk_hw_register+0x54/0xc0
   meson_eeclkc_probe+0xf4/0x1a0
   platform_drv_probe+0x54/0xd8
   really_probe+0x16c/0x438
   driver_probe_device+0xb0/0xf0
   device_driver_attach+0x94/0xa0
   __driver_attach+0x70/0x108
   bus_for_each_dev+0xd8/0x128
   driver_attach+0x30/0x40
   bus_add_driver+0x1b0/0x2d8
   driver_register+0xbc/0x1d0
   __platform_driver_register+0x78/0x88
   axg_driver_init+0x18/0x20
   do_one_initcall+0xc8/0x24c
   kernel_init_freeable+0x2b0/0x344
   kernel_init+0x10/0x128
   ret_from_fork+0x10/0x18

This checks if N is null before doing the division.

Fixes: 7a29a86 ("clk: meson: Add support for Meson clock controller")
Reviewed-by: Martin Blumenstingl <[email protected]>
Signed-off-by: Remi Pommarel <[email protected]>
[[email protected]: update the comment in above the fix]
Signed-off-by: Jerome Brunet <[email protected]>
chewitt pushed a commit to chewitt/linux that referenced this pull request Jan 13, 2020
Some meson pll registers can be initialized with 0 as N value, introducing
the following division by 0 when computing rate :

  UBSAN: Undefined behaviour in drivers/clk/meson/clk-pll.c:75:9
  division by zero
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.4.0-rc3-608075-g86c9af8630e1-dirty torvalds#400
  Call trace:
   dump_backtrace+0x0/0x1c0
   show_stack+0x14/0x20
   dump_stack+0xc4/0x100
   ubsan_epilogue+0x14/0x68
   __ubsan_handle_divrem_overflow+0x98/0xb8
   __pll_params_to_rate+0xdc/0x140
   meson_clk_pll_recalc_rate+0x278/0x3a0
   __clk_register+0x7c8/0xbb0
   devm_clk_hw_register+0x54/0xc0
   meson_eeclkc_probe+0xf4/0x1a0
   platform_drv_probe+0x54/0xd8
   really_probe+0x16c/0x438
   driver_probe_device+0xb0/0xf0
   device_driver_attach+0x94/0xa0
   __driver_attach+0x70/0x108
   bus_for_each_dev+0xd8/0x128
   driver_attach+0x30/0x40
   bus_add_driver+0x1b0/0x2d8
   driver_register+0xbc/0x1d0
   __platform_driver_register+0x78/0x88
   axg_driver_init+0x18/0x20
   do_one_initcall+0xc8/0x24c
   kernel_init_freeable+0x2b0/0x344
   kernel_init+0x10/0x128
   ret_from_fork+0x10/0x18

This checks if N is null before doing the division.

Fixes: 7a29a86 ("clk: meson: Add support for Meson clock controller")
Reviewed-by: Martin Blumenstingl <[email protected]>
Signed-off-by: Remi Pommarel <[email protected]>
[[email protected]: update the comment in above the fix]
Signed-off-by: Jerome Brunet <[email protected]>
chewitt pushed a commit to chewitt/linux that referenced this pull request Jan 28, 2020
Some meson pll registers can be initialized with 0 as N value, introducing
the following division by 0 when computing rate :

  UBSAN: Undefined behaviour in drivers/clk/meson/clk-pll.c:75:9
  division by zero
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.4.0-rc3-608075-g86c9af8630e1-dirty torvalds#400
  Call trace:
   dump_backtrace+0x0/0x1c0
   show_stack+0x14/0x20
   dump_stack+0xc4/0x100
   ubsan_epilogue+0x14/0x68
   __ubsan_handle_divrem_overflow+0x98/0xb8
   __pll_params_to_rate+0xdc/0x140
   meson_clk_pll_recalc_rate+0x278/0x3a0
   __clk_register+0x7c8/0xbb0
   devm_clk_hw_register+0x54/0xc0
   meson_eeclkc_probe+0xf4/0x1a0
   platform_drv_probe+0x54/0xd8
   really_probe+0x16c/0x438
   driver_probe_device+0xb0/0xf0
   device_driver_attach+0x94/0xa0
   __driver_attach+0x70/0x108
   bus_for_each_dev+0xd8/0x128
   driver_attach+0x30/0x40
   bus_add_driver+0x1b0/0x2d8
   driver_register+0xbc/0x1d0
   __platform_driver_register+0x78/0x88
   axg_driver_init+0x18/0x20
   do_one_initcall+0xc8/0x24c
   kernel_init_freeable+0x2b0/0x344
   kernel_init+0x10/0x128
   ret_from_fork+0x10/0x18

This checks if N is null before doing the division.

Fixes: 7a29a86 ("clk: meson: Add support for Meson clock controller")
Reviewed-by: Martin Blumenstingl <[email protected]>
Signed-off-by: Remi Pommarel <[email protected]>
[[email protected]: update the comment in above the fix]
Signed-off-by: Jerome Brunet <[email protected]>
tombriden pushed a commit to tombriden/linux that referenced this pull request Feb 24, 2020
[ Upstream commit d8488a4 ]

Some meson pll registers can be initialized with 0 as N value, introducing
the following division by 0 when computing rate :

  UBSAN: Undefined behaviour in drivers/clk/meson/clk-pll.c:75:9
  division by zero
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.4.0-rc3-608075-g86c9af8630e1-dirty torvalds#400
  Call trace:
   dump_backtrace+0x0/0x1c0
   show_stack+0x14/0x20
   dump_stack+0xc4/0x100
   ubsan_epilogue+0x14/0x68
   __ubsan_handle_divrem_overflow+0x98/0xb8
   __pll_params_to_rate+0xdc/0x140
   meson_clk_pll_recalc_rate+0x278/0x3a0
   __clk_register+0x7c8/0xbb0
   devm_clk_hw_register+0x54/0xc0
   meson_eeclkc_probe+0xf4/0x1a0
   platform_drv_probe+0x54/0xd8
   really_probe+0x16c/0x438
   driver_probe_device+0xb0/0xf0
   device_driver_attach+0x94/0xa0
   __driver_attach+0x70/0x108
   bus_for_each_dev+0xd8/0x128
   driver_attach+0x30/0x40
   bus_add_driver+0x1b0/0x2d8
   driver_register+0xbc/0x1d0
   __platform_driver_register+0x78/0x88
   axg_driver_init+0x18/0x20
   do_one_initcall+0xc8/0x24c
   kernel_init_freeable+0x2b0/0x344
   kernel_init+0x10/0x128
   ret_from_fork+0x10/0x18

This checks if N is null before doing the division.

Fixes: 7a29a86 ("clk: meson: Add support for Meson clock controller")
Reviewed-by: Martin Blumenstingl <[email protected]>
Signed-off-by: Remi Pommarel <[email protected]>
[[email protected]: update the comment in above the fix]
Signed-off-by: Jerome Brunet <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
kakra pushed a commit to kakra/linux that referenced this pull request Feb 25, 2020
[ Upstream commit d8488a4 ]

Some meson pll registers can be initialized with 0 as N value, introducing
the following division by 0 when computing rate :

  UBSAN: Undefined behaviour in drivers/clk/meson/clk-pll.c:75:9
  division by zero
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.4.0-rc3-608075-g86c9af8630e1-dirty torvalds#400
  Call trace:
   dump_backtrace+0x0/0x1c0
   show_stack+0x14/0x20
   dump_stack+0xc4/0x100
   ubsan_epilogue+0x14/0x68
   __ubsan_handle_divrem_overflow+0x98/0xb8
   __pll_params_to_rate+0xdc/0x140
   meson_clk_pll_recalc_rate+0x278/0x3a0
   __clk_register+0x7c8/0xbb0
   devm_clk_hw_register+0x54/0xc0
   meson_eeclkc_probe+0xf4/0x1a0
   platform_drv_probe+0x54/0xd8
   really_probe+0x16c/0x438
   driver_probe_device+0xb0/0xf0
   device_driver_attach+0x94/0xa0
   __driver_attach+0x70/0x108
   bus_for_each_dev+0xd8/0x128
   driver_attach+0x30/0x40
   bus_add_driver+0x1b0/0x2d8
   driver_register+0xbc/0x1d0
   __platform_driver_register+0x78/0x88
   axg_driver_init+0x18/0x20
   do_one_initcall+0xc8/0x24c
   kernel_init_freeable+0x2b0/0x344
   kernel_init+0x10/0x128
   ret_from_fork+0x10/0x18

This checks if N is null before doing the division.

Fixes: 7a29a86 ("clk: meson: Add support for Meson clock controller")
Reviewed-by: Martin Blumenstingl <[email protected]>
Signed-off-by: Remi Pommarel <[email protected]>
[[email protected]: update the comment in above the fix]
Signed-off-by: Jerome Brunet <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
jackpot51 pushed a commit to pop-os/linux that referenced this pull request Apr 13, 2020
BugLink: https://bugs.launchpad.net/bugs/1868324

[ Upstream commit d8488a4 ]

Some meson pll registers can be initialized with 0 as N value, introducing
the following division by 0 when computing rate :

  UBSAN: Undefined behaviour in drivers/clk/meson/clk-pll.c:75:9
  division by zero
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.4.0-rc3-608075-g86c9af8630e1-dirty torvalds#400
  Call trace:
   dump_backtrace+0x0/0x1c0
   show_stack+0x14/0x20
   dump_stack+0xc4/0x100
   ubsan_epilogue+0x14/0x68
   __ubsan_handle_divrem_overflow+0x98/0xb8
   __pll_params_to_rate+0xdc/0x140
   meson_clk_pll_recalc_rate+0x278/0x3a0
   __clk_register+0x7c8/0xbb0
   devm_clk_hw_register+0x54/0xc0
   meson_eeclkc_probe+0xf4/0x1a0
   platform_drv_probe+0x54/0xd8
   really_probe+0x16c/0x438
   driver_probe_device+0xb0/0xf0
   device_driver_attach+0x94/0xa0
   __driver_attach+0x70/0x108
   bus_for_each_dev+0xd8/0x128
   driver_attach+0x30/0x40
   bus_add_driver+0x1b0/0x2d8
   driver_register+0xbc/0x1d0
   __platform_driver_register+0x78/0x88
   axg_driver_init+0x18/0x20
   do_one_initcall+0xc8/0x24c
   kernel_init_freeable+0x2b0/0x344
   kernel_init+0x10/0x128
   ret_from_fork+0x10/0x18

This checks if N is null before doing the division.

Fixes: 7a29a86 ("clk: meson: Add support for Meson clock controller")
Reviewed-by: Martin Blumenstingl <[email protected]>
Signed-off-by: Remi Pommarel <[email protected]>
[[email protected]: update the comment in above the fix]
Signed-off-by: Jerome Brunet <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Kamal Mostafa <[email protected]>
Signed-off-by: Khalid Elmously <[email protected]>
samueldr pushed a commit to samueldr/linux that referenced this pull request Jun 28, 2020
[ Upstream commit af30922 ]

If a block device is closed while iterate_bdevs() is handling it, the
following NULL pointer dereference occurs because bdev->b_disk is NULL
in bdev_get_queue(), which is called from blk_get_backing_dev_info() (in
turn called by the mapping_cap_writeback_dirty() call in
__filemap_fdatawrite_range()):

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000508
 IP: [<ffffffff81314790>] blk_get_backing_dev_info+0x10/0x20
 PGD 9e62067 PUD 9ee8067 PMD 0
 Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
 Modules linked in:
 CPU: 1 PID: 2422 Comm: sync Not tainted 4.5.0-rc7+ torvalds#400
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
 task: ffff880009f4d700 ti: ffff880009f5c000 task.ti: ffff880009f5c000
 RIP: 0010:[<ffffffff81314790>]  [<ffffffff81314790>] blk_get_backing_dev_info+0x10/0x20
 RSP: 0018:ffff880009f5fe68  EFLAGS: 00010246
 RAX: 0000000000000000 RBX: ffff88000ec17a38 RCX: ffffffff81a4e940
 RDX: 7fffffffffffffff RSI: 0000000000000000 RDI: ffff88000ec176c0
 RBP: ffff880009f5fe68 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000001 R11: 0000000000000000 R12: ffff88000ec17860
 R13: ffffffff811b25c0 R14: ffff88000ec178e0 R15: ffff88000ec17a38
 FS:  00007faee505d700(0000) GS:ffff88000fb00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: 0000000000000508 CR3: 0000000009e8a000 CR4: 00000000000006e0
 Stack:
  ffff880009f5feb8 ffffffff8112e7f5 0000000000000000 7fffffffffffffff
  0000000000000000 0000000000000000 7fffffffffffffff 0000000000000001
  ffff88000ec178e0 ffff88000ec17860 ffff880009f5fec8 ffffffff8112e81f
 Call Trace:
  [<ffffffff8112e7f5>] __filemap_fdatawrite_range+0x85/0x90
  [<ffffffff8112e81f>] filemap_fdatawrite+0x1f/0x30
  [<ffffffff811b25d6>] fdatawrite_one_bdev+0x16/0x20
  [<ffffffff811bc402>] iterate_bdevs+0xf2/0x130
  [<ffffffff811b2763>] sys_sync+0x63/0x90
  [<ffffffff815d4272>] entry_SYSCALL_64_fastpath+0x12/0x76
 Code: 0f 1f 44 00 00 48 8b 87 f0 00 00 00 55 48 89 e5 <48> 8b 80 08 05 00 00 5d
 RIP  [<ffffffff81314790>] blk_get_backing_dev_info+0x10/0x20
  RSP <ffff880009f5fe68>
 CR2: 0000000000000508
 ---[ end trace 2487336ceb3de62d ]---

The crash is easily reproducible by running the following command, if an
msleep(100) is inserted before the call to func() in iterate_devs():

 while :; do head -c1 /dev/nullb0; done > /dev/null & while :; do sync; done

Fix it by holding the bd_mutex across the func() call and only calling
func() if the bdev is opened.

Cc: [email protected]
Fixes: 5c0d6b6 ("vfs: Create function for iterating over block devices")
Reported-and-tested-by: Wei Fang <[email protected]>
Signed-off-by: Rabin Vincent <[email protected]>
Signed-off-by: Jan Kara <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
fengguang pushed a commit to 0day-ci/linux that referenced this pull request Jun 29, 2020
GFP_KERNEL flag specifies a normal kernel allocation in which executing
in process context without any locks and can sleep.
mmio_diff takes sometime to finish all the diff compare and it has
locks, continue using GFP_KERNEL will output below trace if LOCKDEP
enabled.

Use GFP_ATOMIC instead.

V2: Rebase.

=====================================================
WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
5.7.0-rc2 torvalds#400 Not tainted
-----------------------------------------------------
is trying to acquire:
ffffffffb47bea20 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_acquire.part.0+0x0/0x30

               and this task is already holding:
ffff88845b85cc90 (&gvt->scheduler.mmio_context_lock){+.-.}-{2:2}, at: vgpu_mmio_diff_show+0xcf/0x2e0
which would create a new lock dependency:
 (&gvt->scheduler.mmio_context_lock){+.-.}-{2:2} -> (fs_reclaim){+.+.}-{0:0}

               but this new dependency connects a SOFTIRQ-irq-safe lock:
 (&gvt->scheduler.mmio_context_lock){+.-.}-{2:2}

               ... which became SOFTIRQ-irq-safe at:
  lock_acquire+0x175/0x4e0
  _raw_spin_lock_irqsave+0x2b/0x40
  shadow_context_status_change+0xfe/0x2f0
  notifier_call_chain+0x6a/0xa0
  __atomic_notifier_call_chain+0x5f/0xf0
  execlists_schedule_out+0x42a/0x820
  process_csb+0xe7/0x3e0
  execlists_submission_tasklet+0x5c/0x1d0
  tasklet_action_common.isra.0+0xeb/0x260
  __do_softirq+0x11d/0x56f
  irq_exit+0xf6/0x100
  do_IRQ+0x7f/0x160
  ret_from_intr+0x0/0x2a
  cpuidle_enter_state+0xcd/0x5b0
  cpuidle_enter+0x37/0x60
  do_idle+0x337/0x3f0
  cpu_startup_entry+0x14/0x20
  start_kernel+0x58b/0x5c5
  secondary_startup_64+0xa4/0xb0

               to a SOFTIRQ-irq-unsafe lock:
 (fs_reclaim){+.+.}-{0:0}

               ... which became SOFTIRQ-irq-unsafe at:
...
  lock_acquire+0x175/0x4e0
  fs_reclaim_acquire.part.0+0x20/0x30
  kmem_cache_alloc_node_trace+0x2e/0x290
  alloc_worker+0x2b/0xb0
  init_rescuer.part.0+0x17/0xe0
  workqueue_init+0x293/0x3bb
  kernel_init_freeable+0x149/0x325
  kernel_init+0x8/0x116
  ret_from_fork+0x3a/0x50

               other info that might help us debug this:

 Possible interrupt unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(fs_reclaim);
                               local_irq_disable();
                               lock(&gvt->scheduler.mmio_context_lock);
                               lock(fs_reclaim);
  <Interrupt>
    lock(&gvt->scheduler.mmio_context_lock);

                *** DEADLOCK ***

3 locks held by cat/1439:
 #0: ffff888444a23698 (&p->lock){+.+.}-{3:3}, at: seq_read+0x49/0x680
 #1: ffff88845b858068 (&gvt->lock){+.+.}-{3:3}, at: vgpu_mmio_diff_show+0xc7/0x2e0
 #2: ffff88845b85cc90 (&gvt->scheduler.mmio_context_lock){+.-.}-{2:2}, at: vgpu_mmio_diff_show+0xcf/0x2e0

               the dependencies between SOFTIRQ-irq-safe lock and the holding lock:
-> (&gvt->scheduler.mmio_context_lock){+.-.}-{2:2} ops: 31 {
   HARDIRQ-ON-W at:
                    lock_acquire+0x175/0x4e0
                    _raw_spin_lock_bh+0x2f/0x40
                    vgpu_mmio_diff_show+0xcf/0x2e0
                    seq_read+0x242/0x680
                    full_proxy_read+0x95/0xc0
                    vfs_read+0xc2/0x1b0
                    ksys_read+0xc4/0x160
                    do_syscall_64+0x63/0x290
                    entry_SYSCALL_64_after_hwframe+0x49/0xb3
   IN-SOFTIRQ-W at:
                    lock_acquire+0x175/0x4e0
                    _raw_spin_lock_irqsave+0x2b/0x40
                    shadow_context_status_change+0xfe/0x2f0
                    notifier_call_chain+0x6a/0xa0
                    __atomic_notifier_call_chain+0x5f/0xf0
                    execlists_schedule_out+0x42a/0x820
                    process_csb+0xe7/0x3e0
                    execlists_submission_tasklet+0x5c/0x1d0
                    tasklet_action_common.isra.0+0xeb/0x260
                    __do_softirq+0x11d/0x56f
                    irq_exit+0xf6/0x100
                    do_IRQ+0x7f/0x160
                    ret_from_intr+0x0/0x2a
                    cpuidle_enter_state+0xcd/0x5b0
                    cpuidle_enter+0x37/0x60
                    do_idle+0x337/0x3f0
                    cpu_startup_entry+0x14/0x20
                    start_kernel+0x58b/0x5c5
                    secondary_startup_64+0xa4/0xb0
   INITIAL USE at:
                   lock_acquire+0x175/0x4e0
                   _raw_spin_lock_irqsave+0x2b/0x40
                   shadow_context_status_change+0xfe/0x2f0
                   notifier_call_chain+0x6a/0xa0
                   __atomic_notifier_call_chain+0x5f/0xf0
                   execlists_schedule_in+0x2c8/0x690
                   __execlists_submission_tasklet+0x1303/0x1930
                   execlists_submit_request+0x1e7/0x230
                   submit_notify+0x105/0x2a4
                   __i915_sw_fence_complete+0xaa/0x380
                   __engine_park+0x313/0x5a0
                   ____intel_wakeref_put_last+0x3e/0x90
                   intel_gt_resume+0x41e/0x440
                   intel_gt_init+0x283/0xbc0
                   i915_gem_init+0x197/0x240
                   i915_driver_probe+0xc2d/0x12e0
                   i915_pci_probe+0xa2/0x1e0
                   local_pci_probe+0x6f/0xb0
                   pci_device_probe+0x171/0x230
                   really_probe+0x17a/0x380
                   driver_probe_device+0x70/0xf0
                   device_driver_attach+0x82/0x90
                   __driver_attach+0x60/0x100
                   bus_for_each_dev+0xe4/0x140
                   bus_add_driver+0x257/0x2a0
                   driver_register+0xd3/0x150
                   i915_init+0x6d/0x80
                   do_one_initcall+0xb8/0x3a0
                   kernel_init_freeable+0x2b4/0x325
                   kernel_init+0x8/0x116
                   ret_from_fork+0x3a/0x50
 }
__key.77812+0x0/0x40
 ... acquired at:
   lock_acquire+0x175/0x4e0
   fs_reclaim_acquire.part.0+0x20/0x30
   kmem_cache_alloc_trace+0x2e/0x260
   mmio_diff_handler+0xc0/0x150
   intel_gvt_for_each_tracked_mmio+0x7b/0x140
   vgpu_mmio_diff_show+0x111/0x2e0
   seq_read+0x242/0x680
   full_proxy_read+0x95/0xc0
   vfs_read+0xc2/0x1b0
   ksys_read+0xc4/0x160
   do_syscall_64+0x63/0x290
   entry_SYSCALL_64_after_hwframe+0x49/0xb3

               the dependencies between the lock to be acquired
 and SOFTIRQ-irq-unsafe lock:
-> (fs_reclaim){+.+.}-{0:0} ops: 1999031 {
   HARDIRQ-ON-W at:
                    lock_acquire+0x175/0x4e0
                    fs_reclaim_acquire.part.0+0x20/0x30
                    kmem_cache_alloc_node_trace+0x2e/0x290
                    alloc_worker+0x2b/0xb0
                    init_rescuer.part.0+0x17/0xe0
                    workqueue_init+0x293/0x3bb
                    kernel_init_freeable+0x149/0x325
                    kernel_init+0x8/0x116
                    ret_from_fork+0x3a/0x50
   SOFTIRQ-ON-W at:
                    lock_acquire+0x175/0x4e0
                    fs_reclaim_acquire.part.0+0x20/0x30
                    kmem_cache_alloc_node_trace+0x2e/0x290
                    alloc_worker+0x2b/0xb0
                    init_rescuer.part.0+0x17/0xe0
                    workqueue_init+0x293/0x3bb
                    kernel_init_freeable+0x149/0x325
                    kernel_init+0x8/0x116
                    ret_from_fork+0x3a/0x50
   INITIAL USE at:
                   lock_acquire+0x175/0x4e0
                   fs_reclaim_acquire.part.0+0x20/0x30
                   kmem_cache_alloc_node_trace+0x2e/0x290
                   alloc_worker+0x2b/0xb0
                   init_rescuer.part.0+0x17/0xe0
                   workqueue_init+0x293/0x3bb
                   kernel_init_freeable+0x149/0x325
                   kernel_init+0x8/0x116
                   ret_from_fork+0x3a/0x50
 }
__fs_reclaim_map+0x0/0x60
 ... acquired at:
   lock_acquire+0x175/0x4e0
   fs_reclaim_acquire.part.0+0x20/0x30
   kmem_cache_alloc_trace+0x2e/0x260
   mmio_diff_handler+0xc0/0x150
   intel_gvt_for_each_tracked_mmio+0x7b/0x140
   vgpu_mmio_diff_show+0x111/0x2e0
   seq_read+0x242/0x680
   full_proxy_read+0x95/0xc0
   vfs_read+0xc2/0x1b0
   ksys_read+0xc4/0x160
   do_syscall_64+0x63/0x290
   entry_SYSCALL_64_after_hwframe+0x49/0xb3

               stack backtrace:
CPU: 5 PID: 1439 Comm: cat Not tainted 5.7.0-rc2 torvalds#400
Hardware name: Intel(R) Client Systems NUC8i7BEH/NUC8BEB, BIOS BECFL357.86A.0056.2018.1128.1717 11/28/2018
Call Trace:
 dump_stack+0x97/0xe0
 check_irq_usage.cold+0x428/0x434
 ? check_usage_forwards+0x2c0/0x2c0
 ? class_equal+0x11/0x20
 ? __bfs+0xd2/0x2d0
 ? in_any_class_list+0xa0/0xa0
 ? check_path+0x22/0x40
 ? check_noncircular+0x150/0x2b0
 ? print_circular_bug.isra.0+0x1b0/0x1b0
 ? mark_lock+0x13d/0xc50
 ? __lock_acquire+0x1e32/0x39b0
 __lock_acquire+0x1e32/0x39b0
 ? timerqueue_add+0xc1/0x130
 ? register_lock_class+0xa60/0xa60
 ? mark_lock+0x13d/0xc50
 lock_acquire+0x175/0x4e0
 ? __zone_pcp_update+0x80/0x80
 ? check_flags.part.0+0x210/0x210
 ? mark_held_locks+0x65/0x90
 ? _raw_spin_unlock_irqrestore+0x32/0x40
 ? lockdep_hardirqs_on+0x190/0x290
 ? fwtable_read32+0x163/0x480
 ? mmio_diff_handler+0xc0/0x150
 fs_reclaim_acquire.part.0+0x20/0x30
 ? __zone_pcp_update+0x80/0x80
 kmem_cache_alloc_trace+0x2e/0x260
 mmio_diff_handler+0xc0/0x150
 ? vgpu_mmio_diff_open+0x30/0x30
 intel_gvt_for_each_tracked_mmio+0x7b/0x140
 vgpu_mmio_diff_show+0x111/0x2e0
 ? mmio_diff_handler+0x150/0x150
 ? rcu_read_lock_sched_held+0xa0/0xb0
 ? rcu_read_lock_bh_held+0xc0/0xc0
 ? kasan_unpoison_shadow+0x33/0x40
 ? __kasan_kmalloc.constprop.0+0xc2/0xd0
 seq_read+0x242/0x680
 ? debugfs_locked_down.isra.0+0x70/0x70
 full_proxy_read+0x95/0xc0
 vfs_read+0xc2/0x1b0
 ksys_read+0xc4/0x160
 ? kernel_write+0xb0/0xb0
 ? mark_held_locks+0x24/0x90
 do_syscall_64+0x63/0x290
 entry_SYSCALL_64_after_hwframe+0x49/0xb3
RIP: 0033:0x7ffbe3e6efb2
Code: c0 e9 c2 fe ff ff 50 48 8d 3d ca cb 0a 00 e8 f5 19 02 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24
RSP: 002b:00007ffd021c08a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007ffbe3e6efb2
RDX: 0000000000020000 RSI: 00007ffbe34cd000 RDI: 0000000000000003
RBP: 00007ffbe34cd000 R08: 00007ffbe34cc010 R09: 0000000000000000
R10: 0000000000000022 R11: 0000000000000246 R12: 0000562b6f0a11f0
R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000020000
------------[ cut here ]------------

Acked-by: Zhenyu Wang <[email protected]>
Signed-off-by: Colin Xu <[email protected]>
Signed-off-by: Zhenyu Wang <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
fifteenhex pushed a commit to fifteenhex/linux that referenced this pull request Jul 25, 2020
GFP_KERNEL flag specifies a normal kernel allocation in which executing
in process context without any locks and can sleep.
mmio_diff takes sometime to finish all the diff compare and it has
locks, continue using GFP_KERNEL will output below trace if LOCKDEP
enabled.

Use GFP_ATOMIC instead.

V2: Rebase.

=====================================================
WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
5.7.0-rc2 torvalds#400 Not tainted
-----------------------------------------------------
is trying to acquire:
ffffffffb47bea20 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_acquire.part.0+0x0/0x30

               and this task is already holding:
ffff88845b85cc90 (&gvt->scheduler.mmio_context_lock){+.-.}-{2:2}, at: vgpu_mmio_diff_show+0xcf/0x2e0
which would create a new lock dependency:
 (&gvt->scheduler.mmio_context_lock){+.-.}-{2:2} -> (fs_reclaim){+.+.}-{0:0}

               but this new dependency connects a SOFTIRQ-irq-safe lock:
 (&gvt->scheduler.mmio_context_lock){+.-.}-{2:2}

               ... which became SOFTIRQ-irq-safe at:
  lock_acquire+0x175/0x4e0
  _raw_spin_lock_irqsave+0x2b/0x40
  shadow_context_status_change+0xfe/0x2f0
  notifier_call_chain+0x6a/0xa0
  __atomic_notifier_call_chain+0x5f/0xf0
  execlists_schedule_out+0x42a/0x820
  process_csb+0xe7/0x3e0
  execlists_submission_tasklet+0x5c/0x1d0
  tasklet_action_common.isra.0+0xeb/0x260
  __do_softirq+0x11d/0x56f
  irq_exit+0xf6/0x100
  do_IRQ+0x7f/0x160
  ret_from_intr+0x0/0x2a
  cpuidle_enter_state+0xcd/0x5b0
  cpuidle_enter+0x37/0x60
  do_idle+0x337/0x3f0
  cpu_startup_entry+0x14/0x20
  start_kernel+0x58b/0x5c5
  secondary_startup_64+0xa4/0xb0

               to a SOFTIRQ-irq-unsafe lock:
 (fs_reclaim){+.+.}-{0:0}

               ... which became SOFTIRQ-irq-unsafe at:
...
  lock_acquire+0x175/0x4e0
  fs_reclaim_acquire.part.0+0x20/0x30
  kmem_cache_alloc_node_trace+0x2e/0x290
  alloc_worker+0x2b/0xb0
  init_rescuer.part.0+0x17/0xe0
  workqueue_init+0x293/0x3bb
  kernel_init_freeable+0x149/0x325
  kernel_init+0x8/0x116
  ret_from_fork+0x3a/0x50

               other info that might help us debug this:

 Possible interrupt unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(fs_reclaim);
                               local_irq_disable();
                               lock(&gvt->scheduler.mmio_context_lock);
                               lock(fs_reclaim);
  <Interrupt>
    lock(&gvt->scheduler.mmio_context_lock);

                *** DEADLOCK ***

3 locks held by cat/1439:
 #0: ffff888444a23698 (&p->lock){+.+.}-{3:3}, at: seq_read+0x49/0x680
 #1: ffff88845b858068 (&gvt->lock){+.+.}-{3:3}, at: vgpu_mmio_diff_show+0xc7/0x2e0
 #2: ffff88845b85cc90 (&gvt->scheduler.mmio_context_lock){+.-.}-{2:2}, at: vgpu_mmio_diff_show+0xcf/0x2e0

               the dependencies between SOFTIRQ-irq-safe lock and the holding lock:
-> (&gvt->scheduler.mmio_context_lock){+.-.}-{2:2} ops: 31 {
   HARDIRQ-ON-W at:
                    lock_acquire+0x175/0x4e0
                    _raw_spin_lock_bh+0x2f/0x40
                    vgpu_mmio_diff_show+0xcf/0x2e0
                    seq_read+0x242/0x680
                    full_proxy_read+0x95/0xc0
                    vfs_read+0xc2/0x1b0
                    ksys_read+0xc4/0x160
                    do_syscall_64+0x63/0x290
                    entry_SYSCALL_64_after_hwframe+0x49/0xb3
   IN-SOFTIRQ-W at:
                    lock_acquire+0x175/0x4e0
                    _raw_spin_lock_irqsave+0x2b/0x40
                    shadow_context_status_change+0xfe/0x2f0
                    notifier_call_chain+0x6a/0xa0
                    __atomic_notifier_call_chain+0x5f/0xf0
                    execlists_schedule_out+0x42a/0x820
                    process_csb+0xe7/0x3e0
                    execlists_submission_tasklet+0x5c/0x1d0
                    tasklet_action_common.isra.0+0xeb/0x260
                    __do_softirq+0x11d/0x56f
                    irq_exit+0xf6/0x100
                    do_IRQ+0x7f/0x160
                    ret_from_intr+0x0/0x2a
                    cpuidle_enter_state+0xcd/0x5b0
                    cpuidle_enter+0x37/0x60
                    do_idle+0x337/0x3f0
                    cpu_startup_entry+0x14/0x20
                    start_kernel+0x58b/0x5c5
                    secondary_startup_64+0xa4/0xb0
   INITIAL USE at:
                   lock_acquire+0x175/0x4e0
                   _raw_spin_lock_irqsave+0x2b/0x40
                   shadow_context_status_change+0xfe/0x2f0
                   notifier_call_chain+0x6a/0xa0
                   __atomic_notifier_call_chain+0x5f/0xf0
                   execlists_schedule_in+0x2c8/0x690
                   __execlists_submission_tasklet+0x1303/0x1930
                   execlists_submit_request+0x1e7/0x230
                   submit_notify+0x105/0x2a4
                   __i915_sw_fence_complete+0xaa/0x380
                   __engine_park+0x313/0x5a0
                   ____intel_wakeref_put_last+0x3e/0x90
                   intel_gt_resume+0x41e/0x440
                   intel_gt_init+0x283/0xbc0
                   i915_gem_init+0x197/0x240
                   i915_driver_probe+0xc2d/0x12e0
                   i915_pci_probe+0xa2/0x1e0
                   local_pci_probe+0x6f/0xb0
                   pci_device_probe+0x171/0x230
                   really_probe+0x17a/0x380
                   driver_probe_device+0x70/0xf0
                   device_driver_attach+0x82/0x90
                   __driver_attach+0x60/0x100
                   bus_for_each_dev+0xe4/0x140
                   bus_add_driver+0x257/0x2a0
                   driver_register+0xd3/0x150
                   i915_init+0x6d/0x80
                   do_one_initcall+0xb8/0x3a0
                   kernel_init_freeable+0x2b4/0x325
                   kernel_init+0x8/0x116
                   ret_from_fork+0x3a/0x50
 }
__key.77812+0x0/0x40
 ... acquired at:
   lock_acquire+0x175/0x4e0
   fs_reclaim_acquire.part.0+0x20/0x30
   kmem_cache_alloc_trace+0x2e/0x260
   mmio_diff_handler+0xc0/0x150
   intel_gvt_for_each_tracked_mmio+0x7b/0x140
   vgpu_mmio_diff_show+0x111/0x2e0
   seq_read+0x242/0x680
   full_proxy_read+0x95/0xc0
   vfs_read+0xc2/0x1b0
   ksys_read+0xc4/0x160
   do_syscall_64+0x63/0x290
   entry_SYSCALL_64_after_hwframe+0x49/0xb3

               the dependencies between the lock to be acquired
 and SOFTIRQ-irq-unsafe lock:
-> (fs_reclaim){+.+.}-{0:0} ops: 1999031 {
   HARDIRQ-ON-W at:
                    lock_acquire+0x175/0x4e0
                    fs_reclaim_acquire.part.0+0x20/0x30
                    kmem_cache_alloc_node_trace+0x2e/0x290
                    alloc_worker+0x2b/0xb0
                    init_rescuer.part.0+0x17/0xe0
                    workqueue_init+0x293/0x3bb
                    kernel_init_freeable+0x149/0x325
                    kernel_init+0x8/0x116
                    ret_from_fork+0x3a/0x50
   SOFTIRQ-ON-W at:
                    lock_acquire+0x175/0x4e0
                    fs_reclaim_acquire.part.0+0x20/0x30
                    kmem_cache_alloc_node_trace+0x2e/0x290
                    alloc_worker+0x2b/0xb0
                    init_rescuer.part.0+0x17/0xe0
                    workqueue_init+0x293/0x3bb
                    kernel_init_freeable+0x149/0x325
                    kernel_init+0x8/0x116
                    ret_from_fork+0x3a/0x50
   INITIAL USE at:
                   lock_acquire+0x175/0x4e0
                   fs_reclaim_acquire.part.0+0x20/0x30
                   kmem_cache_alloc_node_trace+0x2e/0x290
                   alloc_worker+0x2b/0xb0
                   init_rescuer.part.0+0x17/0xe0
                   workqueue_init+0x293/0x3bb
                   kernel_init_freeable+0x149/0x325
                   kernel_init+0x8/0x116
                   ret_from_fork+0x3a/0x50
 }
__fs_reclaim_map+0x0/0x60
 ... acquired at:
   lock_acquire+0x175/0x4e0
   fs_reclaim_acquire.part.0+0x20/0x30
   kmem_cache_alloc_trace+0x2e/0x260
   mmio_diff_handler+0xc0/0x150
   intel_gvt_for_each_tracked_mmio+0x7b/0x140
   vgpu_mmio_diff_show+0x111/0x2e0
   seq_read+0x242/0x680
   full_proxy_read+0x95/0xc0
   vfs_read+0xc2/0x1b0
   ksys_read+0xc4/0x160
   do_syscall_64+0x63/0x290
   entry_SYSCALL_64_after_hwframe+0x49/0xb3

               stack backtrace:
CPU: 5 PID: 1439 Comm: cat Not tainted 5.7.0-rc2 torvalds#400
Hardware name: Intel(R) Client Systems NUC8i7BEH/NUC8BEB, BIOS BECFL357.86A.0056.2018.1128.1717 11/28/2018
Call Trace:
 dump_stack+0x97/0xe0
 check_irq_usage.cold+0x428/0x434
 ? check_usage_forwards+0x2c0/0x2c0
 ? class_equal+0x11/0x20
 ? __bfs+0xd2/0x2d0
 ? in_any_class_list+0xa0/0xa0
 ? check_path+0x22/0x40
 ? check_noncircular+0x150/0x2b0
 ? print_circular_bug.isra.0+0x1b0/0x1b0
 ? mark_lock+0x13d/0xc50
 ? __lock_acquire+0x1e32/0x39b0
 __lock_acquire+0x1e32/0x39b0
 ? timerqueue_add+0xc1/0x130
 ? register_lock_class+0xa60/0xa60
 ? mark_lock+0x13d/0xc50
 lock_acquire+0x175/0x4e0
 ? __zone_pcp_update+0x80/0x80
 ? check_flags.part.0+0x210/0x210
 ? mark_held_locks+0x65/0x90
 ? _raw_spin_unlock_irqrestore+0x32/0x40
 ? lockdep_hardirqs_on+0x190/0x290
 ? fwtable_read32+0x163/0x480
 ? mmio_diff_handler+0xc0/0x150
 fs_reclaim_acquire.part.0+0x20/0x30
 ? __zone_pcp_update+0x80/0x80
 kmem_cache_alloc_trace+0x2e/0x260
 mmio_diff_handler+0xc0/0x150
 ? vgpu_mmio_diff_open+0x30/0x30
 intel_gvt_for_each_tracked_mmio+0x7b/0x140
 vgpu_mmio_diff_show+0x111/0x2e0
 ? mmio_diff_handler+0x150/0x150
 ? rcu_read_lock_sched_held+0xa0/0xb0
 ? rcu_read_lock_bh_held+0xc0/0xc0
 ? kasan_unpoison_shadow+0x33/0x40
 ? __kasan_kmalloc.constprop.0+0xc2/0xd0
 seq_read+0x242/0x680
 ? debugfs_locked_down.isra.0+0x70/0x70
 full_proxy_read+0x95/0xc0
 vfs_read+0xc2/0x1b0
 ksys_read+0xc4/0x160
 ? kernel_write+0xb0/0xb0
 ? mark_held_locks+0x24/0x90
 do_syscall_64+0x63/0x290
 entry_SYSCALL_64_after_hwframe+0x49/0xb3
RIP: 0033:0x7ffbe3e6efb2
Code: c0 e9 c2 fe ff ff 50 48 8d 3d ca cb 0a 00 e8 f5 19 02 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24
RSP: 002b:00007ffd021c08a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007ffbe3e6efb2
RDX: 0000000000020000 RSI: 00007ffbe34cd000 RDI: 0000000000000003
RBP: 00007ffbe34cd000 R08: 00007ffbe34cc010 R09: 0000000000000000
R10: 0000000000000022 R11: 0000000000000246 R12: 0000562b6f0a11f0
R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000020000
------------[ cut here ]------------

Acked-by: Zhenyu Wang <[email protected]>
Signed-off-by: Colin Xu <[email protected]>
Signed-off-by: Zhenyu Wang <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
fifteenhex pushed a commit to fifteenhex/linux that referenced this pull request Jul 28, 2020
GFP_KERNEL flag specifies a normal kernel allocation in which executing
in process context without any locks and can sleep.
mmio_diff takes sometime to finish all the diff compare and it has
locks, continue using GFP_KERNEL will output below trace if LOCKDEP
enabled.

Use GFP_ATOMIC instead.

V2: Rebase.

=====================================================
WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
5.7.0-rc2 torvalds#400 Not tainted
-----------------------------------------------------
is trying to acquire:
ffffffffb47bea20 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_acquire.part.0+0x0/0x30

               and this task is already holding:
ffff88845b85cc90 (&gvt->scheduler.mmio_context_lock){+.-.}-{2:2}, at: vgpu_mmio_diff_show+0xcf/0x2e0
which would create a new lock dependency:
 (&gvt->scheduler.mmio_context_lock){+.-.}-{2:2} -> (fs_reclaim){+.+.}-{0:0}

               but this new dependency connects a SOFTIRQ-irq-safe lock:
 (&gvt->scheduler.mmio_context_lock){+.-.}-{2:2}

               ... which became SOFTIRQ-irq-safe at:
  lock_acquire+0x175/0x4e0
  _raw_spin_lock_irqsave+0x2b/0x40
  shadow_context_status_change+0xfe/0x2f0
  notifier_call_chain+0x6a/0xa0
  __atomic_notifier_call_chain+0x5f/0xf0
  execlists_schedule_out+0x42a/0x820
  process_csb+0xe7/0x3e0
  execlists_submission_tasklet+0x5c/0x1d0
  tasklet_action_common.isra.0+0xeb/0x260
  __do_softirq+0x11d/0x56f
  irq_exit+0xf6/0x100
  do_IRQ+0x7f/0x160
  ret_from_intr+0x0/0x2a
  cpuidle_enter_state+0xcd/0x5b0
  cpuidle_enter+0x37/0x60
  do_idle+0x337/0x3f0
  cpu_startup_entry+0x14/0x20
  start_kernel+0x58b/0x5c5
  secondary_startup_64+0xa4/0xb0

               to a SOFTIRQ-irq-unsafe lock:
 (fs_reclaim){+.+.}-{0:0}

               ... which became SOFTIRQ-irq-unsafe at:
...
  lock_acquire+0x175/0x4e0
  fs_reclaim_acquire.part.0+0x20/0x30
  kmem_cache_alloc_node_trace+0x2e/0x290
  alloc_worker+0x2b/0xb0
  init_rescuer.part.0+0x17/0xe0
  workqueue_init+0x293/0x3bb
  kernel_init_freeable+0x149/0x325
  kernel_init+0x8/0x116
  ret_from_fork+0x3a/0x50

               other info that might help us debug this:

 Possible interrupt unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(fs_reclaim);
                               local_irq_disable();
                               lock(&gvt->scheduler.mmio_context_lock);
                               lock(fs_reclaim);
  <Interrupt>
    lock(&gvt->scheduler.mmio_context_lock);

                *** DEADLOCK ***

3 locks held by cat/1439:
 #0: ffff888444a23698 (&p->lock){+.+.}-{3:3}, at: seq_read+0x49/0x680
 #1: ffff88845b858068 (&gvt->lock){+.+.}-{3:3}, at: vgpu_mmio_diff_show+0xc7/0x2e0
 #2: ffff88845b85cc90 (&gvt->scheduler.mmio_context_lock){+.-.}-{2:2}, at: vgpu_mmio_diff_show+0xcf/0x2e0

               the dependencies between SOFTIRQ-irq-safe lock and the holding lock:
-> (&gvt->scheduler.mmio_context_lock){+.-.}-{2:2} ops: 31 {
   HARDIRQ-ON-W at:
                    lock_acquire+0x175/0x4e0
                    _raw_spin_lock_bh+0x2f/0x40
                    vgpu_mmio_diff_show+0xcf/0x2e0
                    seq_read+0x242/0x680
                    full_proxy_read+0x95/0xc0
                    vfs_read+0xc2/0x1b0
                    ksys_read+0xc4/0x160
                    do_syscall_64+0x63/0x290
                    entry_SYSCALL_64_after_hwframe+0x49/0xb3
   IN-SOFTIRQ-W at:
                    lock_acquire+0x175/0x4e0
                    _raw_spin_lock_irqsave+0x2b/0x40
                    shadow_context_status_change+0xfe/0x2f0
                    notifier_call_chain+0x6a/0xa0
                    __atomic_notifier_call_chain+0x5f/0xf0
                    execlists_schedule_out+0x42a/0x820
                    process_csb+0xe7/0x3e0
                    execlists_submission_tasklet+0x5c/0x1d0
                    tasklet_action_common.isra.0+0xeb/0x260
                    __do_softirq+0x11d/0x56f
                    irq_exit+0xf6/0x100
                    do_IRQ+0x7f/0x160
                    ret_from_intr+0x0/0x2a
                    cpuidle_enter_state+0xcd/0x5b0
                    cpuidle_enter+0x37/0x60
                    do_idle+0x337/0x3f0
                    cpu_startup_entry+0x14/0x20
                    start_kernel+0x58b/0x5c5
                    secondary_startup_64+0xa4/0xb0
   INITIAL USE at:
                   lock_acquire+0x175/0x4e0
                   _raw_spin_lock_irqsave+0x2b/0x40
                   shadow_context_status_change+0xfe/0x2f0
                   notifier_call_chain+0x6a/0xa0
                   __atomic_notifier_call_chain+0x5f/0xf0
                   execlists_schedule_in+0x2c8/0x690
                   __execlists_submission_tasklet+0x1303/0x1930
                   execlists_submit_request+0x1e7/0x230
                   submit_notify+0x105/0x2a4
                   __i915_sw_fence_complete+0xaa/0x380
                   __engine_park+0x313/0x5a0
                   ____intel_wakeref_put_last+0x3e/0x90
                   intel_gt_resume+0x41e/0x440
                   intel_gt_init+0x283/0xbc0
                   i915_gem_init+0x197/0x240
                   i915_driver_probe+0xc2d/0x12e0
                   i915_pci_probe+0xa2/0x1e0
                   local_pci_probe+0x6f/0xb0
                   pci_device_probe+0x171/0x230
                   really_probe+0x17a/0x380
                   driver_probe_device+0x70/0xf0
                   device_driver_attach+0x82/0x90
                   __driver_attach+0x60/0x100
                   bus_for_each_dev+0xe4/0x140
                   bus_add_driver+0x257/0x2a0
                   driver_register+0xd3/0x150
                   i915_init+0x6d/0x80
                   do_one_initcall+0xb8/0x3a0
                   kernel_init_freeable+0x2b4/0x325
                   kernel_init+0x8/0x116
                   ret_from_fork+0x3a/0x50
 }
__key.77812+0x0/0x40
 ... acquired at:
   lock_acquire+0x175/0x4e0
   fs_reclaim_acquire.part.0+0x20/0x30
   kmem_cache_alloc_trace+0x2e/0x260
   mmio_diff_handler+0xc0/0x150
   intel_gvt_for_each_tracked_mmio+0x7b/0x140
   vgpu_mmio_diff_show+0x111/0x2e0
   seq_read+0x242/0x680
   full_proxy_read+0x95/0xc0
   vfs_read+0xc2/0x1b0
   ksys_read+0xc4/0x160
   do_syscall_64+0x63/0x290
   entry_SYSCALL_64_after_hwframe+0x49/0xb3

               the dependencies between the lock to be acquired
 and SOFTIRQ-irq-unsafe lock:
-> (fs_reclaim){+.+.}-{0:0} ops: 1999031 {
   HARDIRQ-ON-W at:
                    lock_acquire+0x175/0x4e0
                    fs_reclaim_acquire.part.0+0x20/0x30
                    kmem_cache_alloc_node_trace+0x2e/0x290
                    alloc_worker+0x2b/0xb0
                    init_rescuer.part.0+0x17/0xe0
                    workqueue_init+0x293/0x3bb
                    kernel_init_freeable+0x149/0x325
                    kernel_init+0x8/0x116
                    ret_from_fork+0x3a/0x50
   SOFTIRQ-ON-W at:
                    lock_acquire+0x175/0x4e0
                    fs_reclaim_acquire.part.0+0x20/0x30
                    kmem_cache_alloc_node_trace+0x2e/0x290
                    alloc_worker+0x2b/0xb0
                    init_rescuer.part.0+0x17/0xe0
                    workqueue_init+0x293/0x3bb
                    kernel_init_freeable+0x149/0x325
                    kernel_init+0x8/0x116
                    ret_from_fork+0x3a/0x50
   INITIAL USE at:
                   lock_acquire+0x175/0x4e0
                   fs_reclaim_acquire.part.0+0x20/0x30
                   kmem_cache_alloc_node_trace+0x2e/0x290
                   alloc_worker+0x2b/0xb0
                   init_rescuer.part.0+0x17/0xe0
                   workqueue_init+0x293/0x3bb
                   kernel_init_freeable+0x149/0x325
                   kernel_init+0x8/0x116
                   ret_from_fork+0x3a/0x50
 }
__fs_reclaim_map+0x0/0x60
 ... acquired at:
   lock_acquire+0x175/0x4e0
   fs_reclaim_acquire.part.0+0x20/0x30
   kmem_cache_alloc_trace+0x2e/0x260
   mmio_diff_handler+0xc0/0x150
   intel_gvt_for_each_tracked_mmio+0x7b/0x140
   vgpu_mmio_diff_show+0x111/0x2e0
   seq_read+0x242/0x680
   full_proxy_read+0x95/0xc0
   vfs_read+0xc2/0x1b0
   ksys_read+0xc4/0x160
   do_syscall_64+0x63/0x290
   entry_SYSCALL_64_after_hwframe+0x49/0xb3

               stack backtrace:
CPU: 5 PID: 1439 Comm: cat Not tainted 5.7.0-rc2 torvalds#400
Hardware name: Intel(R) Client Systems NUC8i7BEH/NUC8BEB, BIOS BECFL357.86A.0056.2018.1128.1717 11/28/2018
Call Trace:
 dump_stack+0x97/0xe0
 check_irq_usage.cold+0x428/0x434
 ? check_usage_forwards+0x2c0/0x2c0
 ? class_equal+0x11/0x20
 ? __bfs+0xd2/0x2d0
 ? in_any_class_list+0xa0/0xa0
 ? check_path+0x22/0x40
 ? check_noncircular+0x150/0x2b0
 ? print_circular_bug.isra.0+0x1b0/0x1b0
 ? mark_lock+0x13d/0xc50
 ? __lock_acquire+0x1e32/0x39b0
 __lock_acquire+0x1e32/0x39b0
 ? timerqueue_add+0xc1/0x130
 ? register_lock_class+0xa60/0xa60
 ? mark_lock+0x13d/0xc50
 lock_acquire+0x175/0x4e0
 ? __zone_pcp_update+0x80/0x80
 ? check_flags.part.0+0x210/0x210
 ? mark_held_locks+0x65/0x90
 ? _raw_spin_unlock_irqrestore+0x32/0x40
 ? lockdep_hardirqs_on+0x190/0x290
 ? fwtable_read32+0x163/0x480
 ? mmio_diff_handler+0xc0/0x150
 fs_reclaim_acquire.part.0+0x20/0x30
 ? __zone_pcp_update+0x80/0x80
 kmem_cache_alloc_trace+0x2e/0x260
 mmio_diff_handler+0xc0/0x150
 ? vgpu_mmio_diff_open+0x30/0x30
 intel_gvt_for_each_tracked_mmio+0x7b/0x140
 vgpu_mmio_diff_show+0x111/0x2e0
 ? mmio_diff_handler+0x150/0x150
 ? rcu_read_lock_sched_held+0xa0/0xb0
 ? rcu_read_lock_bh_held+0xc0/0xc0
 ? kasan_unpoison_shadow+0x33/0x40
 ? __kasan_kmalloc.constprop.0+0xc2/0xd0
 seq_read+0x242/0x680
 ? debugfs_locked_down.isra.0+0x70/0x70
 full_proxy_read+0x95/0xc0
 vfs_read+0xc2/0x1b0
 ksys_read+0xc4/0x160
 ? kernel_write+0xb0/0xb0
 ? mark_held_locks+0x24/0x90
 do_syscall_64+0x63/0x290
 entry_SYSCALL_64_after_hwframe+0x49/0xb3
RIP: 0033:0x7ffbe3e6efb2
Code: c0 e9 c2 fe ff ff 50 48 8d 3d ca cb 0a 00 e8 f5 19 02 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24
RSP: 002b:00007ffd021c08a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007ffbe3e6efb2
RDX: 0000000000020000 RSI: 00007ffbe34cd000 RDI: 0000000000000003
RBP: 00007ffbe34cd000 R08: 00007ffbe34cc010 R09: 0000000000000000
R10: 0000000000000022 R11: 0000000000000246 R12: 0000562b6f0a11f0
R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000020000
------------[ cut here ]------------

Acked-by: Zhenyu Wang <[email protected]>
Signed-off-by: Colin Xu <[email protected]>
Signed-off-by: Zhenyu Wang <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
fifteenhex pushed a commit to fifteenhex/linux that referenced this pull request Jul 28, 2020
GFP_KERNEL flag specifies a normal kernel allocation in which executing
in process context without any locks and can sleep.
mmio_diff takes sometime to finish all the diff compare and it has
locks, continue using GFP_KERNEL will output below trace if LOCKDEP
enabled.

Use GFP_ATOMIC instead.

V2: Rebase.

=====================================================
WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
5.7.0-rc2 torvalds#400 Not tainted
-----------------------------------------------------
is trying to acquire:
ffffffffb47bea20 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_acquire.part.0+0x0/0x30

               and this task is already holding:
ffff88845b85cc90 (&gvt->scheduler.mmio_context_lock){+.-.}-{2:2}, at: vgpu_mmio_diff_show+0xcf/0x2e0
which would create a new lock dependency:
 (&gvt->scheduler.mmio_context_lock){+.-.}-{2:2} -> (fs_reclaim){+.+.}-{0:0}

               but this new dependency connects a SOFTIRQ-irq-safe lock:
 (&gvt->scheduler.mmio_context_lock){+.-.}-{2:2}

               ... which became SOFTIRQ-irq-safe at:
  lock_acquire+0x175/0x4e0
  _raw_spin_lock_irqsave+0x2b/0x40
  shadow_context_status_change+0xfe/0x2f0
  notifier_call_chain+0x6a/0xa0
  __atomic_notifier_call_chain+0x5f/0xf0
  execlists_schedule_out+0x42a/0x820
  process_csb+0xe7/0x3e0
  execlists_submission_tasklet+0x5c/0x1d0
  tasklet_action_common.isra.0+0xeb/0x260
  __do_softirq+0x11d/0x56f
  irq_exit+0xf6/0x100
  do_IRQ+0x7f/0x160
  ret_from_intr+0x0/0x2a
  cpuidle_enter_state+0xcd/0x5b0
  cpuidle_enter+0x37/0x60
  do_idle+0x337/0x3f0
  cpu_startup_entry+0x14/0x20
  start_kernel+0x58b/0x5c5
  secondary_startup_64+0xa4/0xb0

               to a SOFTIRQ-irq-unsafe lock:
 (fs_reclaim){+.+.}-{0:0}

               ... which became SOFTIRQ-irq-unsafe at:
...
  lock_acquire+0x175/0x4e0
  fs_reclaim_acquire.part.0+0x20/0x30
  kmem_cache_alloc_node_trace+0x2e/0x290
  alloc_worker+0x2b/0xb0
  init_rescuer.part.0+0x17/0xe0
  workqueue_init+0x293/0x3bb
  kernel_init_freeable+0x149/0x325
  kernel_init+0x8/0x116
  ret_from_fork+0x3a/0x50

               other info that might help us debug this:

 Possible interrupt unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(fs_reclaim);
                               local_irq_disable();
                               lock(&gvt->scheduler.mmio_context_lock);
                               lock(fs_reclaim);
  <Interrupt>
    lock(&gvt->scheduler.mmio_context_lock);

                *** DEADLOCK ***

3 locks held by cat/1439:
 #0: ffff888444a23698 (&p->lock){+.+.}-{3:3}, at: seq_read+0x49/0x680
 #1: ffff88845b858068 (&gvt->lock){+.+.}-{3:3}, at: vgpu_mmio_diff_show+0xc7/0x2e0
 #2: ffff88845b85cc90 (&gvt->scheduler.mmio_context_lock){+.-.}-{2:2}, at: vgpu_mmio_diff_show+0xcf/0x2e0

               the dependencies between SOFTIRQ-irq-safe lock and the holding lock:
-> (&gvt->scheduler.mmio_context_lock){+.-.}-{2:2} ops: 31 {
   HARDIRQ-ON-W at:
                    lock_acquire+0x175/0x4e0
                    _raw_spin_lock_bh+0x2f/0x40
                    vgpu_mmio_diff_show+0xcf/0x2e0
                    seq_read+0x242/0x680
                    full_proxy_read+0x95/0xc0
                    vfs_read+0xc2/0x1b0
                    ksys_read+0xc4/0x160
                    do_syscall_64+0x63/0x290
                    entry_SYSCALL_64_after_hwframe+0x49/0xb3
   IN-SOFTIRQ-W at:
                    lock_acquire+0x175/0x4e0
                    _raw_spin_lock_irqsave+0x2b/0x40
                    shadow_context_status_change+0xfe/0x2f0
                    notifier_call_chain+0x6a/0xa0
                    __atomic_notifier_call_chain+0x5f/0xf0
                    execlists_schedule_out+0x42a/0x820
                    process_csb+0xe7/0x3e0
                    execlists_submission_tasklet+0x5c/0x1d0
                    tasklet_action_common.isra.0+0xeb/0x260
                    __do_softirq+0x11d/0x56f
                    irq_exit+0xf6/0x100
                    do_IRQ+0x7f/0x160
                    ret_from_intr+0x0/0x2a
                    cpuidle_enter_state+0xcd/0x5b0
                    cpuidle_enter+0x37/0x60
                    do_idle+0x337/0x3f0
                    cpu_startup_entry+0x14/0x20
                    start_kernel+0x58b/0x5c5
                    secondary_startup_64+0xa4/0xb0
   INITIAL USE at:
                   lock_acquire+0x175/0x4e0
                   _raw_spin_lock_irqsave+0x2b/0x40
                   shadow_context_status_change+0xfe/0x2f0
                   notifier_call_chain+0x6a/0xa0
                   __atomic_notifier_call_chain+0x5f/0xf0
                   execlists_schedule_in+0x2c8/0x690
                   __execlists_submission_tasklet+0x1303/0x1930
                   execlists_submit_request+0x1e7/0x230
                   submit_notify+0x105/0x2a4
                   __i915_sw_fence_complete+0xaa/0x380
                   __engine_park+0x313/0x5a0
                   ____intel_wakeref_put_last+0x3e/0x90
                   intel_gt_resume+0x41e/0x440
                   intel_gt_init+0x283/0xbc0
                   i915_gem_init+0x197/0x240
                   i915_driver_probe+0xc2d/0x12e0
                   i915_pci_probe+0xa2/0x1e0
                   local_pci_probe+0x6f/0xb0
                   pci_device_probe+0x171/0x230
                   really_probe+0x17a/0x380
                   driver_probe_device+0x70/0xf0
                   device_driver_attach+0x82/0x90
                   __driver_attach+0x60/0x100
                   bus_for_each_dev+0xe4/0x140
                   bus_add_driver+0x257/0x2a0
                   driver_register+0xd3/0x150
                   i915_init+0x6d/0x80
                   do_one_initcall+0xb8/0x3a0
                   kernel_init_freeable+0x2b4/0x325
                   kernel_init+0x8/0x116
                   ret_from_fork+0x3a/0x50
 }
__key.77812+0x0/0x40
 ... acquired at:
   lock_acquire+0x175/0x4e0
   fs_reclaim_acquire.part.0+0x20/0x30
   kmem_cache_alloc_trace+0x2e/0x260
   mmio_diff_handler+0xc0/0x150
   intel_gvt_for_each_tracked_mmio+0x7b/0x140
   vgpu_mmio_diff_show+0x111/0x2e0
   seq_read+0x242/0x680
   full_proxy_read+0x95/0xc0
   vfs_read+0xc2/0x1b0
   ksys_read+0xc4/0x160
   do_syscall_64+0x63/0x290
   entry_SYSCALL_64_after_hwframe+0x49/0xb3

               the dependencies between the lock to be acquired
 and SOFTIRQ-irq-unsafe lock:
-> (fs_reclaim){+.+.}-{0:0} ops: 1999031 {
   HARDIRQ-ON-W at:
                    lock_acquire+0x175/0x4e0
                    fs_reclaim_acquire.part.0+0x20/0x30
                    kmem_cache_alloc_node_trace+0x2e/0x290
                    alloc_worker+0x2b/0xb0
                    init_rescuer.part.0+0x17/0xe0
                    workqueue_init+0x293/0x3bb
                    kernel_init_freeable+0x149/0x325
                    kernel_init+0x8/0x116
                    ret_from_fork+0x3a/0x50
   SOFTIRQ-ON-W at:
                    lock_acquire+0x175/0x4e0
                    fs_reclaim_acquire.part.0+0x20/0x30
                    kmem_cache_alloc_node_trace+0x2e/0x290
                    alloc_worker+0x2b/0xb0
                    init_rescuer.part.0+0x17/0xe0
                    workqueue_init+0x293/0x3bb
                    kernel_init_freeable+0x149/0x325
                    kernel_init+0x8/0x116
                    ret_from_fork+0x3a/0x50
   INITIAL USE at:
                   lock_acquire+0x175/0x4e0
                   fs_reclaim_acquire.part.0+0x20/0x30
                   kmem_cache_alloc_node_trace+0x2e/0x290
                   alloc_worker+0x2b/0xb0
                   init_rescuer.part.0+0x17/0xe0
                   workqueue_init+0x293/0x3bb
                   kernel_init_freeable+0x149/0x325
                   kernel_init+0x8/0x116
                   ret_from_fork+0x3a/0x50
 }
__fs_reclaim_map+0x0/0x60
 ... acquired at:
   lock_acquire+0x175/0x4e0
   fs_reclaim_acquire.part.0+0x20/0x30
   kmem_cache_alloc_trace+0x2e/0x260
   mmio_diff_handler+0xc0/0x150
   intel_gvt_for_each_tracked_mmio+0x7b/0x140
   vgpu_mmio_diff_show+0x111/0x2e0
   seq_read+0x242/0x680
   full_proxy_read+0x95/0xc0
   vfs_read+0xc2/0x1b0
   ksys_read+0xc4/0x160
   do_syscall_64+0x63/0x290
   entry_SYSCALL_64_after_hwframe+0x49/0xb3

               stack backtrace:
CPU: 5 PID: 1439 Comm: cat Not tainted 5.7.0-rc2 torvalds#400
Hardware name: Intel(R) Client Systems NUC8i7BEH/NUC8BEB, BIOS BECFL357.86A.0056.2018.1128.1717 11/28/2018
Call Trace:
 dump_stack+0x97/0xe0
 check_irq_usage.cold+0x428/0x434
 ? check_usage_forwards+0x2c0/0x2c0
 ? class_equal+0x11/0x20
 ? __bfs+0xd2/0x2d0
 ? in_any_class_list+0xa0/0xa0
 ? check_path+0x22/0x40
 ? check_noncircular+0x150/0x2b0
 ? print_circular_bug.isra.0+0x1b0/0x1b0
 ? mark_lock+0x13d/0xc50
 ? __lock_acquire+0x1e32/0x39b0
 __lock_acquire+0x1e32/0x39b0
 ? timerqueue_add+0xc1/0x130
 ? register_lock_class+0xa60/0xa60
 ? mark_lock+0x13d/0xc50
 lock_acquire+0x175/0x4e0
 ? __zone_pcp_update+0x80/0x80
 ? check_flags.part.0+0x210/0x210
 ? mark_held_locks+0x65/0x90
 ? _raw_spin_unlock_irqrestore+0x32/0x40
 ? lockdep_hardirqs_on+0x190/0x290
 ? fwtable_read32+0x163/0x480
 ? mmio_diff_handler+0xc0/0x150
 fs_reclaim_acquire.part.0+0x20/0x30
 ? __zone_pcp_update+0x80/0x80
 kmem_cache_alloc_trace+0x2e/0x260
 mmio_diff_handler+0xc0/0x150
 ? vgpu_mmio_diff_open+0x30/0x30
 intel_gvt_for_each_tracked_mmio+0x7b/0x140
 vgpu_mmio_diff_show+0x111/0x2e0
 ? mmio_diff_handler+0x150/0x150
 ? rcu_read_lock_sched_held+0xa0/0xb0
 ? rcu_read_lock_bh_held+0xc0/0xc0
 ? kasan_unpoison_shadow+0x33/0x40
 ? __kasan_kmalloc.constprop.0+0xc2/0xd0
 seq_read+0x242/0x680
 ? debugfs_locked_down.isra.0+0x70/0x70
 full_proxy_read+0x95/0xc0
 vfs_read+0xc2/0x1b0
 ksys_read+0xc4/0x160
 ? kernel_write+0xb0/0xb0
 ? mark_held_locks+0x24/0x90
 do_syscall_64+0x63/0x290
 entry_SYSCALL_64_after_hwframe+0x49/0xb3
RIP: 0033:0x7ffbe3e6efb2
Code: c0 e9 c2 fe ff ff 50 48 8d 3d ca cb 0a 00 e8 f5 19 02 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24
RSP: 002b:00007ffd021c08a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007ffbe3e6efb2
RDX: 0000000000020000 RSI: 00007ffbe34cd000 RDI: 0000000000000003
RBP: 00007ffbe34cd000 R08: 00007ffbe34cc010 R09: 0000000000000000
R10: 0000000000000022 R11: 0000000000000246 R12: 0000562b6f0a11f0
R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000020000
------------[ cut here ]------------

Acked-by: Zhenyu Wang <[email protected]>
Signed-off-by: Colin Xu <[email protected]>
Signed-off-by: Zhenyu Wang <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
fifteenhex pushed a commit to fifteenhex/linux that referenced this pull request Jul 29, 2020
GFP_KERNEL flag specifies a normal kernel allocation in which executing
in process context without any locks and can sleep.
mmio_diff takes sometime to finish all the diff compare and it has
locks, continue using GFP_KERNEL will output below trace if LOCKDEP
enabled.

Use GFP_ATOMIC instead.

V2: Rebase.

=====================================================
WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
5.7.0-rc2 torvalds#400 Not tainted
-----------------------------------------------------
is trying to acquire:
ffffffffb47bea20 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_acquire.part.0+0x0/0x30

               and this task is already holding:
ffff88845b85cc90 (&gvt->scheduler.mmio_context_lock){+.-.}-{2:2}, at: vgpu_mmio_diff_show+0xcf/0x2e0
which would create a new lock dependency:
 (&gvt->scheduler.mmio_context_lock){+.-.}-{2:2} -> (fs_reclaim){+.+.}-{0:0}

               but this new dependency connects a SOFTIRQ-irq-safe lock:
 (&gvt->scheduler.mmio_context_lock){+.-.}-{2:2}

               ... which became SOFTIRQ-irq-safe at:
  lock_acquire+0x175/0x4e0
  _raw_spin_lock_irqsave+0x2b/0x40
  shadow_context_status_change+0xfe/0x2f0
  notifier_call_chain+0x6a/0xa0
  __atomic_notifier_call_chain+0x5f/0xf0
  execlists_schedule_out+0x42a/0x820
  process_csb+0xe7/0x3e0
  execlists_submission_tasklet+0x5c/0x1d0
  tasklet_action_common.isra.0+0xeb/0x260
  __do_softirq+0x11d/0x56f
  irq_exit+0xf6/0x100
  do_IRQ+0x7f/0x160
  ret_from_intr+0x0/0x2a
  cpuidle_enter_state+0xcd/0x5b0
  cpuidle_enter+0x37/0x60
  do_idle+0x337/0x3f0
  cpu_startup_entry+0x14/0x20
  start_kernel+0x58b/0x5c5
  secondary_startup_64+0xa4/0xb0

               to a SOFTIRQ-irq-unsafe lock:
 (fs_reclaim){+.+.}-{0:0}

               ... which became SOFTIRQ-irq-unsafe at:
...
  lock_acquire+0x175/0x4e0
  fs_reclaim_acquire.part.0+0x20/0x30
  kmem_cache_alloc_node_trace+0x2e/0x290
  alloc_worker+0x2b/0xb0
  init_rescuer.part.0+0x17/0xe0
  workqueue_init+0x293/0x3bb
  kernel_init_freeable+0x149/0x325
  kernel_init+0x8/0x116
  ret_from_fork+0x3a/0x50

               other info that might help us debug this:

 Possible interrupt unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(fs_reclaim);
                               local_irq_disable();
                               lock(&gvt->scheduler.mmio_context_lock);
                               lock(fs_reclaim);
  <Interrupt>
    lock(&gvt->scheduler.mmio_context_lock);

                *** DEADLOCK ***

3 locks held by cat/1439:
 #0: ffff888444a23698 (&p->lock){+.+.}-{3:3}, at: seq_read+0x49/0x680
 #1: ffff88845b858068 (&gvt->lock){+.+.}-{3:3}, at: vgpu_mmio_diff_show+0xc7/0x2e0
 #2: ffff88845b85cc90 (&gvt->scheduler.mmio_context_lock){+.-.}-{2:2}, at: vgpu_mmio_diff_show+0xcf/0x2e0

               the dependencies between SOFTIRQ-irq-safe lock and the holding lock:
-> (&gvt->scheduler.mmio_context_lock){+.-.}-{2:2} ops: 31 {
   HARDIRQ-ON-W at:
                    lock_acquire+0x175/0x4e0
                    _raw_spin_lock_bh+0x2f/0x40
                    vgpu_mmio_diff_show+0xcf/0x2e0
                    seq_read+0x242/0x680
                    full_proxy_read+0x95/0xc0
                    vfs_read+0xc2/0x1b0
                    ksys_read+0xc4/0x160
                    do_syscall_64+0x63/0x290
                    entry_SYSCALL_64_after_hwframe+0x49/0xb3
   IN-SOFTIRQ-W at:
                    lock_acquire+0x175/0x4e0
                    _raw_spin_lock_irqsave+0x2b/0x40
                    shadow_context_status_change+0xfe/0x2f0
                    notifier_call_chain+0x6a/0xa0
                    __atomic_notifier_call_chain+0x5f/0xf0
                    execlists_schedule_out+0x42a/0x820
                    process_csb+0xe7/0x3e0
                    execlists_submission_tasklet+0x5c/0x1d0
                    tasklet_action_common.isra.0+0xeb/0x260
                    __do_softirq+0x11d/0x56f
                    irq_exit+0xf6/0x100
                    do_IRQ+0x7f/0x160
                    ret_from_intr+0x0/0x2a
                    cpuidle_enter_state+0xcd/0x5b0
                    cpuidle_enter+0x37/0x60
                    do_idle+0x337/0x3f0
                    cpu_startup_entry+0x14/0x20
                    start_kernel+0x58b/0x5c5
                    secondary_startup_64+0xa4/0xb0
   INITIAL USE at:
                   lock_acquire+0x175/0x4e0
                   _raw_spin_lock_irqsave+0x2b/0x40
                   shadow_context_status_change+0xfe/0x2f0
                   notifier_call_chain+0x6a/0xa0
                   __atomic_notifier_call_chain+0x5f/0xf0
                   execlists_schedule_in+0x2c8/0x690
                   __execlists_submission_tasklet+0x1303/0x1930
                   execlists_submit_request+0x1e7/0x230
                   submit_notify+0x105/0x2a4
                   __i915_sw_fence_complete+0xaa/0x380
                   __engine_park+0x313/0x5a0
                   ____intel_wakeref_put_last+0x3e/0x90
                   intel_gt_resume+0x41e/0x440
                   intel_gt_init+0x283/0xbc0
                   i915_gem_init+0x197/0x240
                   i915_driver_probe+0xc2d/0x12e0
                   i915_pci_probe+0xa2/0x1e0
                   local_pci_probe+0x6f/0xb0
                   pci_device_probe+0x171/0x230
                   really_probe+0x17a/0x380
                   driver_probe_device+0x70/0xf0
                   device_driver_attach+0x82/0x90
                   __driver_attach+0x60/0x100
                   bus_for_each_dev+0xe4/0x140
                   bus_add_driver+0x257/0x2a0
                   driver_register+0xd3/0x150
                   i915_init+0x6d/0x80
                   do_one_initcall+0xb8/0x3a0
                   kernel_init_freeable+0x2b4/0x325
                   kernel_init+0x8/0x116
                   ret_from_fork+0x3a/0x50
 }
__key.77812+0x0/0x40
 ... acquired at:
   lock_acquire+0x175/0x4e0
   fs_reclaim_acquire.part.0+0x20/0x30
   kmem_cache_alloc_trace+0x2e/0x260
   mmio_diff_handler+0xc0/0x150
   intel_gvt_for_each_tracked_mmio+0x7b/0x140
   vgpu_mmio_diff_show+0x111/0x2e0
   seq_read+0x242/0x680
   full_proxy_read+0x95/0xc0
   vfs_read+0xc2/0x1b0
   ksys_read+0xc4/0x160
   do_syscall_64+0x63/0x290
   entry_SYSCALL_64_after_hwframe+0x49/0xb3

               the dependencies between the lock to be acquired
 and SOFTIRQ-irq-unsafe lock:
-> (fs_reclaim){+.+.}-{0:0} ops: 1999031 {
   HARDIRQ-ON-W at:
                    lock_acquire+0x175/0x4e0
                    fs_reclaim_acquire.part.0+0x20/0x30
                    kmem_cache_alloc_node_trace+0x2e/0x290
                    alloc_worker+0x2b/0xb0
                    init_rescuer.part.0+0x17/0xe0
                    workqueue_init+0x293/0x3bb
                    kernel_init_freeable+0x149/0x325
                    kernel_init+0x8/0x116
                    ret_from_fork+0x3a/0x50
   SOFTIRQ-ON-W at:
                    lock_acquire+0x175/0x4e0
                    fs_reclaim_acquire.part.0+0x20/0x30
                    kmem_cache_alloc_node_trace+0x2e/0x290
                    alloc_worker+0x2b/0xb0
                    init_rescuer.part.0+0x17/0xe0
                    workqueue_init+0x293/0x3bb
                    kernel_init_freeable+0x149/0x325
                    kernel_init+0x8/0x116
                    ret_from_fork+0x3a/0x50
   INITIAL USE at:
                   lock_acquire+0x175/0x4e0
                   fs_reclaim_acquire.part.0+0x20/0x30
                   kmem_cache_alloc_node_trace+0x2e/0x290
                   alloc_worker+0x2b/0xb0
                   init_rescuer.part.0+0x17/0xe0
                   workqueue_init+0x293/0x3bb
                   kernel_init_freeable+0x149/0x325
                   kernel_init+0x8/0x116
                   ret_from_fork+0x3a/0x50
 }
__fs_reclaim_map+0x0/0x60
 ... acquired at:
   lock_acquire+0x175/0x4e0
   fs_reclaim_acquire.part.0+0x20/0x30
   kmem_cache_alloc_trace+0x2e/0x260
   mmio_diff_handler+0xc0/0x150
   intel_gvt_for_each_tracked_mmio+0x7b/0x140
   vgpu_mmio_diff_show+0x111/0x2e0
   seq_read+0x242/0x680
   full_proxy_read+0x95/0xc0
   vfs_read+0xc2/0x1b0
   ksys_read+0xc4/0x160
   do_syscall_64+0x63/0x290
   entry_SYSCALL_64_after_hwframe+0x49/0xb3

               stack backtrace:
CPU: 5 PID: 1439 Comm: cat Not tainted 5.7.0-rc2 torvalds#400
Hardware name: Intel(R) Client Systems NUC8i7BEH/NUC8BEB, BIOS BECFL357.86A.0056.2018.1128.1717 11/28/2018
Call Trace:
 dump_stack+0x97/0xe0
 check_irq_usage.cold+0x428/0x434
 ? check_usage_forwards+0x2c0/0x2c0
 ? class_equal+0x11/0x20
 ? __bfs+0xd2/0x2d0
 ? in_any_class_list+0xa0/0xa0
 ? check_path+0x22/0x40
 ? check_noncircular+0x150/0x2b0
 ? print_circular_bug.isra.0+0x1b0/0x1b0
 ? mark_lock+0x13d/0xc50
 ? __lock_acquire+0x1e32/0x39b0
 __lock_acquire+0x1e32/0x39b0
 ? timerqueue_add+0xc1/0x130
 ? register_lock_class+0xa60/0xa60
 ? mark_lock+0x13d/0xc50
 lock_acquire+0x175/0x4e0
 ? __zone_pcp_update+0x80/0x80
 ? check_flags.part.0+0x210/0x210
 ? mark_held_locks+0x65/0x90
 ? _raw_spin_unlock_irqrestore+0x32/0x40
 ? lockdep_hardirqs_on+0x190/0x290
 ? fwtable_read32+0x163/0x480
 ? mmio_diff_handler+0xc0/0x150
 fs_reclaim_acquire.part.0+0x20/0x30
 ? __zone_pcp_update+0x80/0x80
 kmem_cache_alloc_trace+0x2e/0x260
 mmio_diff_handler+0xc0/0x150
 ? vgpu_mmio_diff_open+0x30/0x30
 intel_gvt_for_each_tracked_mmio+0x7b/0x140
 vgpu_mmio_diff_show+0x111/0x2e0
 ? mmio_diff_handler+0x150/0x150
 ? rcu_read_lock_sched_held+0xa0/0xb0
 ? rcu_read_lock_bh_held+0xc0/0xc0
 ? kasan_unpoison_shadow+0x33/0x40
 ? __kasan_kmalloc.constprop.0+0xc2/0xd0
 seq_read+0x242/0x680
 ? debugfs_locked_down.isra.0+0x70/0x70
 full_proxy_read+0x95/0xc0
 vfs_read+0xc2/0x1b0
 ksys_read+0xc4/0x160
 ? kernel_write+0xb0/0xb0
 ? mark_held_locks+0x24/0x90
 do_syscall_64+0x63/0x290
 entry_SYSCALL_64_after_hwframe+0x49/0xb3
RIP: 0033:0x7ffbe3e6efb2
Code: c0 e9 c2 fe ff ff 50 48 8d 3d ca cb 0a 00 e8 f5 19 02 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24
RSP: 002b:00007ffd021c08a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007ffbe3e6efb2
RDX: 0000000000020000 RSI: 00007ffbe34cd000 RDI: 0000000000000003
RBP: 00007ffbe34cd000 R08: 00007ffbe34cc010 R09: 0000000000000000
R10: 0000000000000022 R11: 0000000000000246 R12: 0000562b6f0a11f0
R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000020000
------------[ cut here ]------------

Acked-by: Zhenyu Wang <[email protected]>
Signed-off-by: Colin Xu <[email protected]>
Signed-off-by: Zhenyu Wang <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
fifteenhex pushed a commit to fifteenhex/linux that referenced this pull request Aug 1, 2020
GFP_KERNEL flag specifies a normal kernel allocation in which executing
in process context without any locks and can sleep.
mmio_diff takes sometime to finish all the diff compare and it has
locks, continue using GFP_KERNEL will output below trace if LOCKDEP
enabled.

Use GFP_ATOMIC instead.

V2: Rebase.

=====================================================
WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
5.7.0-rc2 torvalds#400 Not tainted
-----------------------------------------------------
is trying to acquire:
ffffffffb47bea20 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_acquire.part.0+0x0/0x30

               and this task is already holding:
ffff88845b85cc90 (&gvt->scheduler.mmio_context_lock){+.-.}-{2:2}, at: vgpu_mmio_diff_show+0xcf/0x2e0
which would create a new lock dependency:
 (&gvt->scheduler.mmio_context_lock){+.-.}-{2:2} -> (fs_reclaim){+.+.}-{0:0}

               but this new dependency connects a SOFTIRQ-irq-safe lock:
 (&gvt->scheduler.mmio_context_lock){+.-.}-{2:2}

               ... which became SOFTIRQ-irq-safe at:
  lock_acquire+0x175/0x4e0
  _raw_spin_lock_irqsave+0x2b/0x40
  shadow_context_status_change+0xfe/0x2f0
  notifier_call_chain+0x6a/0xa0
  __atomic_notifier_call_chain+0x5f/0xf0
  execlists_schedule_out+0x42a/0x820
  process_csb+0xe7/0x3e0
  execlists_submission_tasklet+0x5c/0x1d0
  tasklet_action_common.isra.0+0xeb/0x260
  __do_softirq+0x11d/0x56f
  irq_exit+0xf6/0x100
  do_IRQ+0x7f/0x160
  ret_from_intr+0x0/0x2a
  cpuidle_enter_state+0xcd/0x5b0
  cpuidle_enter+0x37/0x60
  do_idle+0x337/0x3f0
  cpu_startup_entry+0x14/0x20
  start_kernel+0x58b/0x5c5
  secondary_startup_64+0xa4/0xb0

               to a SOFTIRQ-irq-unsafe lock:
 (fs_reclaim){+.+.}-{0:0}

               ... which became SOFTIRQ-irq-unsafe at:
...
  lock_acquire+0x175/0x4e0
  fs_reclaim_acquire.part.0+0x20/0x30
  kmem_cache_alloc_node_trace+0x2e/0x290
  alloc_worker+0x2b/0xb0
  init_rescuer.part.0+0x17/0xe0
  workqueue_init+0x293/0x3bb
  kernel_init_freeable+0x149/0x325
  kernel_init+0x8/0x116
  ret_from_fork+0x3a/0x50

               other info that might help us debug this:

 Possible interrupt unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(fs_reclaim);
                               local_irq_disable();
                               lock(&gvt->scheduler.mmio_context_lock);
                               lock(fs_reclaim);
  <Interrupt>
    lock(&gvt->scheduler.mmio_context_lock);

                *** DEADLOCK ***

3 locks held by cat/1439:
 #0: ffff888444a23698 (&p->lock){+.+.}-{3:3}, at: seq_read+0x49/0x680
 #1: ffff88845b858068 (&gvt->lock){+.+.}-{3:3}, at: vgpu_mmio_diff_show+0xc7/0x2e0
 #2: ffff88845b85cc90 (&gvt->scheduler.mmio_context_lock){+.-.}-{2:2}, at: vgpu_mmio_diff_show+0xcf/0x2e0

               the dependencies between SOFTIRQ-irq-safe lock and the holding lock:
-> (&gvt->scheduler.mmio_context_lock){+.-.}-{2:2} ops: 31 {
   HARDIRQ-ON-W at:
                    lock_acquire+0x175/0x4e0
                    _raw_spin_lock_bh+0x2f/0x40
                    vgpu_mmio_diff_show+0xcf/0x2e0
                    seq_read+0x242/0x680
                    full_proxy_read+0x95/0xc0
                    vfs_read+0xc2/0x1b0
                    ksys_read+0xc4/0x160
                    do_syscall_64+0x63/0x290
                    entry_SYSCALL_64_after_hwframe+0x49/0xb3
   IN-SOFTIRQ-W at:
                    lock_acquire+0x175/0x4e0
                    _raw_spin_lock_irqsave+0x2b/0x40
                    shadow_context_status_change+0xfe/0x2f0
                    notifier_call_chain+0x6a/0xa0
                    __atomic_notifier_call_chain+0x5f/0xf0
                    execlists_schedule_out+0x42a/0x820
                    process_csb+0xe7/0x3e0
                    execlists_submission_tasklet+0x5c/0x1d0
                    tasklet_action_common.isra.0+0xeb/0x260
                    __do_softirq+0x11d/0x56f
                    irq_exit+0xf6/0x100
                    do_IRQ+0x7f/0x160
                    ret_from_intr+0x0/0x2a
                    cpuidle_enter_state+0xcd/0x5b0
                    cpuidle_enter+0x37/0x60
                    do_idle+0x337/0x3f0
                    cpu_startup_entry+0x14/0x20
                    start_kernel+0x58b/0x5c5
                    secondary_startup_64+0xa4/0xb0
   INITIAL USE at:
                   lock_acquire+0x175/0x4e0
                   _raw_spin_lock_irqsave+0x2b/0x40
                   shadow_context_status_change+0xfe/0x2f0
                   notifier_call_chain+0x6a/0xa0
                   __atomic_notifier_call_chain+0x5f/0xf0
                   execlists_schedule_in+0x2c8/0x690
                   __execlists_submission_tasklet+0x1303/0x1930
                   execlists_submit_request+0x1e7/0x230
                   submit_notify+0x105/0x2a4
                   __i915_sw_fence_complete+0xaa/0x380
                   __engine_park+0x313/0x5a0
                   ____intel_wakeref_put_last+0x3e/0x90
                   intel_gt_resume+0x41e/0x440
                   intel_gt_init+0x283/0xbc0
                   i915_gem_init+0x197/0x240
                   i915_driver_probe+0xc2d/0x12e0
                   i915_pci_probe+0xa2/0x1e0
                   local_pci_probe+0x6f/0xb0
                   pci_device_probe+0x171/0x230
                   really_probe+0x17a/0x380
                   driver_probe_device+0x70/0xf0
                   device_driver_attach+0x82/0x90
                   __driver_attach+0x60/0x100
                   bus_for_each_dev+0xe4/0x140
                   bus_add_driver+0x257/0x2a0
                   driver_register+0xd3/0x150
                   i915_init+0x6d/0x80
                   do_one_initcall+0xb8/0x3a0
                   kernel_init_freeable+0x2b4/0x325
                   kernel_init+0x8/0x116
                   ret_from_fork+0x3a/0x50
 }
__key.77812+0x0/0x40
 ... acquired at:
   lock_acquire+0x175/0x4e0
   fs_reclaim_acquire.part.0+0x20/0x30
   kmem_cache_alloc_trace+0x2e/0x260
   mmio_diff_handler+0xc0/0x150
   intel_gvt_for_each_tracked_mmio+0x7b/0x140
   vgpu_mmio_diff_show+0x111/0x2e0
   seq_read+0x242/0x680
   full_proxy_read+0x95/0xc0
   vfs_read+0xc2/0x1b0
   ksys_read+0xc4/0x160
   do_syscall_64+0x63/0x290
   entry_SYSCALL_64_after_hwframe+0x49/0xb3

               the dependencies between the lock to be acquired
 and SOFTIRQ-irq-unsafe lock:
-> (fs_reclaim){+.+.}-{0:0} ops: 1999031 {
   HARDIRQ-ON-W at:
                    lock_acquire+0x175/0x4e0
                    fs_reclaim_acquire.part.0+0x20/0x30
                    kmem_cache_alloc_node_trace+0x2e/0x290
                    alloc_worker+0x2b/0xb0
                    init_rescuer.part.0+0x17/0xe0
                    workqueue_init+0x293/0x3bb
                    kernel_init_freeable+0x149/0x325
                    kernel_init+0x8/0x116
                    ret_from_fork+0x3a/0x50
   SOFTIRQ-ON-W at:
                    lock_acquire+0x175/0x4e0
                    fs_reclaim_acquire.part.0+0x20/0x30
                    kmem_cache_alloc_node_trace+0x2e/0x290
                    alloc_worker+0x2b/0xb0
                    init_rescuer.part.0+0x17/0xe0
                    workqueue_init+0x293/0x3bb
                    kernel_init_freeable+0x149/0x325
                    kernel_init+0x8/0x116
                    ret_from_fork+0x3a/0x50
   INITIAL USE at:
                   lock_acquire+0x175/0x4e0
                   fs_reclaim_acquire.part.0+0x20/0x30
                   kmem_cache_alloc_node_trace+0x2e/0x290
                   alloc_worker+0x2b/0xb0
                   init_rescuer.part.0+0x17/0xe0
                   workqueue_init+0x293/0x3bb
                   kernel_init_freeable+0x149/0x325
                   kernel_init+0x8/0x116
                   ret_from_fork+0x3a/0x50
 }
__fs_reclaim_map+0x0/0x60
 ... acquired at:
   lock_acquire+0x175/0x4e0
   fs_reclaim_acquire.part.0+0x20/0x30
   kmem_cache_alloc_trace+0x2e/0x260
   mmio_diff_handler+0xc0/0x150
   intel_gvt_for_each_tracked_mmio+0x7b/0x140
   vgpu_mmio_diff_show+0x111/0x2e0
   seq_read+0x242/0x680
   full_proxy_read+0x95/0xc0
   vfs_read+0xc2/0x1b0
   ksys_read+0xc4/0x160
   do_syscall_64+0x63/0x290
   entry_SYSCALL_64_after_hwframe+0x49/0xb3

               stack backtrace:
CPU: 5 PID: 1439 Comm: cat Not tainted 5.7.0-rc2 torvalds#400
Hardware name: Intel(R) Client Systems NUC8i7BEH/NUC8BEB, BIOS BECFL357.86A.0056.2018.1128.1717 11/28/2018
Call Trace:
 dump_stack+0x97/0xe0
 check_irq_usage.cold+0x428/0x434
 ? check_usage_forwards+0x2c0/0x2c0
 ? class_equal+0x11/0x20
 ? __bfs+0xd2/0x2d0
 ? in_any_class_list+0xa0/0xa0
 ? check_path+0x22/0x40
 ? check_noncircular+0x150/0x2b0
 ? print_circular_bug.isra.0+0x1b0/0x1b0
 ? mark_lock+0x13d/0xc50
 ? __lock_acquire+0x1e32/0x39b0
 __lock_acquire+0x1e32/0x39b0
 ? timerqueue_add+0xc1/0x130
 ? register_lock_class+0xa60/0xa60
 ? mark_lock+0x13d/0xc50
 lock_acquire+0x175/0x4e0
 ? __zone_pcp_update+0x80/0x80
 ? check_flags.part.0+0x210/0x210
 ? mark_held_locks+0x65/0x90
 ? _raw_spin_unlock_irqrestore+0x32/0x40
 ? lockdep_hardirqs_on+0x190/0x290
 ? fwtable_read32+0x163/0x480
 ? mmio_diff_handler+0xc0/0x150
 fs_reclaim_acquire.part.0+0x20/0x30
 ? __zone_pcp_update+0x80/0x80
 kmem_cache_alloc_trace+0x2e/0x260
 mmio_diff_handler+0xc0/0x150
 ? vgpu_mmio_diff_open+0x30/0x30
 intel_gvt_for_each_tracked_mmio+0x7b/0x140
 vgpu_mmio_diff_show+0x111/0x2e0
 ? mmio_diff_handler+0x150/0x150
 ? rcu_read_lock_sched_held+0xa0/0xb0
 ? rcu_read_lock_bh_held+0xc0/0xc0
 ? kasan_unpoison_shadow+0x33/0x40
 ? __kasan_kmalloc.constprop.0+0xc2/0xd0
 seq_read+0x242/0x680
 ? debugfs_locked_down.isra.0+0x70/0x70
 full_proxy_read+0x95/0xc0
 vfs_read+0xc2/0x1b0
 ksys_read+0xc4/0x160
 ? kernel_write+0xb0/0xb0
 ? mark_held_locks+0x24/0x90
 do_syscall_64+0x63/0x290
 entry_SYSCALL_64_after_hwframe+0x49/0xb3
RIP: 0033:0x7ffbe3e6efb2
Code: c0 e9 c2 fe ff ff 50 48 8d 3d ca cb 0a 00 e8 f5 19 02 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24
RSP: 002b:00007ffd021c08a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007ffbe3e6efb2
RDX: 0000000000020000 RSI: 00007ffbe34cd000 RDI: 0000000000000003
RBP: 00007ffbe34cd000 R08: 00007ffbe34cc010 R09: 0000000000000000
R10: 0000000000000022 R11: 0000000000000246 R12: 0000562b6f0a11f0
R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000020000
------------[ cut here ]------------

Acked-by: Zhenyu Wang <[email protected]>
Signed-off-by: Colin Xu <[email protected]>
Signed-off-by: Zhenyu Wang <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
chewitt pushed a commit to chewitt/linux that referenced this pull request Oct 28, 2020
Some meson pll registers can be initialized with 0 as N value, introducing
the following division by 0 when computing rate :

  UBSAN: Undefined behaviour in drivers/clk/meson/clk-pll.c:75:9
  division by zero
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.4.0-rc3-608075-g86c9af8630e1-dirty torvalds#400
  Call trace:
   dump_backtrace+0x0/0x1c0
   show_stack+0x14/0x20
   dump_stack+0xc4/0x100
   ubsan_epilogue+0x14/0x68
   __ubsan_handle_divrem_overflow+0x98/0xb8
   __pll_params_to_rate+0xdc/0x140
   meson_clk_pll_recalc_rate+0x278/0x3a0
   __clk_register+0x7c8/0xbb0
   devm_clk_hw_register+0x54/0xc0
   meson_eeclkc_probe+0xf4/0x1a0
   platform_drv_probe+0x54/0xd8
   really_probe+0x16c/0x438
   driver_probe_device+0xb0/0xf0
   device_driver_attach+0x94/0xa0
   __driver_attach+0x70/0x108
   bus_for_each_dev+0xd8/0x128
   driver_attach+0x30/0x40
   bus_add_driver+0x1b0/0x2d8
   driver_register+0xbc/0x1d0
   __platform_driver_register+0x78/0x88
   axg_driver_init+0x18/0x20
   do_one_initcall+0xc8/0x24c
   kernel_init_freeable+0x2b0/0x344
   kernel_init+0x10/0x128
   ret_from_fork+0x10/0x18

This checks if N is null before doing the division.

Fixes: 7a29a86 ("clk: meson: Add support for Meson clock controller")
Reviewed-by: Martin Blumenstingl <[email protected]>
Signed-off-by: Remi Pommarel <[email protected]>
[[email protected]: update the comment in above the fix]
Signed-off-by: Jerome Brunet <[email protected]>
ojeda added a commit to ojeda/linux that referenced this pull request Jul 1, 2021
Add red-black tree implementation backed by the C version.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant