commit 93556fb211fa7f1e18f869bdce0c225c25594942 Author: Greg Kroah-Hartman Date: Wed Mar 18 07:14:26 2020 +0100 Linux 4.19.111 commit 8562759c8fca50c9bb682156eb7f617c86ec3cd0 Author: Sven Eckelmann Date: Thu Oct 3 17:02:01 2019 +0200 batman-adv: Avoid free/alloc race when handling OGM2 buffer commit a8d23cbbf6c9f515ed678204ad2962be7c336344 upstream. A B.A.T.M.A.N. V virtual interface has an OGM2 packet buffer which is initialized using data from the netdevice notifier and other rtnetlink related hooks. It is sent regularly via various slave interfaces of the batadv virtual interface and in this process also modified (realloced) to integrate additional state information via TVLV containers. It must be avoided that the worker item is executed without a common lock with the netdevice notifier/rtnetlink helpers. Otherwise it can either happen that half modified data is sent out or the functions modifying the OGM2 buffer try to access already freed memory regions. Fixes: 0da0035942d4 ("batman-adv: OGMv2 - add basic infrastructure") Signed-off-by: Sven Eckelmann Signed-off-by: Simon Wunderlich Signed-off-by: Greg Kroah-Hartman commit d47aae069e8fd17b1f21042699ae63518e38cac9 Author: Vladis Dronov Date: Sun Mar 8 09:08:55 2020 +0100 efi: Add a sanity check to efivar_store_raw() commit d6c066fda90d578aacdf19771a027ed484a79825 upstream. Add a sanity check to efivar_store_raw() the same way efivar_{attr,size,data}_read() and efivar_show_raw() have it. Signed-off-by: Vladis Dronov Signed-off-by: Ard Biesheuvel Signed-off-by: Ingo Molnar Cc: Link: https://lore.kernel.org/r/20200305084041.24053-3-vdronov@redhat.com Link: https://lore.kernel.org/r/20200308080859.21568-25-ardb@kernel.org Signed-off-by: Greg Kroah-Hartman commit 2f3121f6e2a560860bce0531721b22258ce265f3 Author: Karsten Graul Date: Tue Mar 10 09:33:30 2020 +0100 net/smc: cancel event worker during device removal commit ece0d7bd74615773268475b6b64d6f1ebbd4b4c6 upstream. During IB device removal, cancel the event worker before the device structure is freed. Fixes: a4cf0443c414 ("smc: introduce SMC as an IB-client") Reported-by: syzbot+b297c6825752e7a07272@syzkaller.appspotmail.com Signed-off-by: Karsten Graul Reviewed-by: Ursula Braun Reviewed-by: Leon Romanovsky Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit a22af440349cfa38243ef50c915e0a5a4de59702 Author: Karsten Graul Date: Wed Feb 26 17:52:46 2020 +0100 net/smc: check for valid ib_client_data commit a2f2ef4a54c0d97aa6a8386f4ff23f36ebb488cf upstream. In smc_ib_remove_dev() check if the provided ib device was actually initialized for SMC before. Reported-by: syzbot+84484ccebdd4e5451d91@syzkaller.appspotmail.com Fixes: a4cf0443c414 ("smc: introduce SMC as an IB-client") Signed-off-by: Karsten Graul Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit b5b9c644cfc2547af1bf2f382b9b547a7b33a614 Author: Eric Dumazet Date: Tue Feb 25 11:52:29 2020 -0800 ipv6: restrict IPV6_ADDRFORM operation commit b6f6118901d1e867ac9177bbff3b00b185bd4fdc upstream. IPV6_ADDRFORM is able to transform IPv6 socket to IPv4 one. While this operation sounds illogical, we have to support it. One of the things it does for TCP socket is to switch sk->sk_prot to tcp_prot. We now have other layers playing with sk->sk_prot, so we should make sure to not interfere with them. This patch makes sure sk_prot is the default pointer for TCP IPv6 socket. syzbot reported : BUG: kernel NULL pointer dereference, address: 0000000000000000 PGD a0113067 P4D a0113067 PUD a8771067 PMD 0 Oops: 0010 [#1] PREEMPT SMP KASAN CPU: 0 PID: 10686 Comm: syz-executor.0 Not tainted 5.6.0-rc2-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:0x0 Code: Bad RIP value. RSP: 0018:ffffc9000281fce0 EFLAGS: 00010246 RAX: 1ffffffff15f48ac RBX: ffffffff8afa4560 RCX: dffffc0000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8880a69a8f40 RBP: ffffc9000281fd10 R08: ffffffff86ed9b0c R09: ffffed1014d351f5 R10: ffffed1014d351f5 R11: 0000000000000000 R12: ffff8880920d3098 R13: 1ffff1101241a613 R14: ffff8880a69a8f40 R15: 0000000000000000 FS: 00007f2ae75db700(0000) GS:ffff8880aea00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffffffffd6 CR3: 00000000a3b85000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: inet_release+0x165/0x1c0 net/ipv4/af_inet.c:427 __sock_release net/socket.c:605 [inline] sock_close+0xe1/0x260 net/socket.c:1283 __fput+0x2e4/0x740 fs/file_table.c:280 ____fput+0x15/0x20 fs/file_table.c:313 task_work_run+0x176/0x1b0 kernel/task_work.c:113 tracehook_notify_resume include/linux/tracehook.h:188 [inline] exit_to_usermode_loop arch/x86/entry/common.c:164 [inline] prepare_exit_to_usermode+0x480/0x5b0 arch/x86/entry/common.c:195 syscall_return_slowpath+0x113/0x4a0 arch/x86/entry/common.c:278 do_syscall_64+0x11f/0x1c0 arch/x86/entry/common.c:304 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x45c429 Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f2ae75dac78 EFLAGS: 00000246 ORIG_RAX: 0000000000000036 RAX: 0000000000000000 RBX: 00007f2ae75db6d4 RCX: 000000000045c429 RDX: 0000000000000001 RSI: 000000000000011a RDI: 0000000000000004 RBP: 000000000076bf20 R08: 0000000000000038 R09: 0000000000000000 R10: 0000000020000180 R11: 0000000000000246 R12: 00000000ffffffff R13: 0000000000000a9d R14: 00000000004ccfb4 R15: 000000000076bf2c Modules linked in: CR2: 0000000000000000 ---[ end trace 82567b5207e87bae ]--- RIP: 0010:0x0 Code: Bad RIP value. RSP: 0018:ffffc9000281fce0 EFLAGS: 00010246 RAX: 1ffffffff15f48ac RBX: ffffffff8afa4560 RCX: dffffc0000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8880a69a8f40 RBP: ffffc9000281fd10 R08: ffffffff86ed9b0c R09: ffffed1014d351f5 R10: ffffed1014d351f5 R11: 0000000000000000 R12: ffff8880920d3098 R13: 1ffff1101241a613 R14: ffff8880a69a8f40 R15: 0000000000000000 FS: 00007f2ae75db700(0000) GS:ffff8880aea00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffffffffd6 CR3: 00000000a3b85000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface") Signed-off-by: Eric Dumazet Reported-by: syzbot+1938db17e275e85dc328@syzkaller.appspotmail.com Cc: Daniel Borkmann Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 9dda737464fc0a05d69d75a9af21f2df645a7bc0 Author: Wolfram Sang Date: Thu Mar 12 14:32:44 2020 +0100 i2c: acpi: put device when verifying client fails commit 8daee952b4389729358665fb91949460641659d4 upstream. i2c_verify_client() can fail, so we need to put the device when that happens. Fixes: 525e6fabeae2 ("i2c / ACPI: add support for ACPI reconfigure notifications") Reported-by: Geert Uytterhoeven Signed-off-by: Wolfram Sang Reviewed-by: Geert Uytterhoeven Reviewed-by: Andy Shevchenko Acked-by: Mika Westerberg Signed-off-by: Wolfram Sang Signed-off-by: Greg Kroah-Hartman commit 7967fef2c6aac905233eba29f047772f1392aefb Author: Daniel Drake Date: Thu Mar 12 14:09:55 2020 +0800 iommu/vt-d: Ignore devices with out-of-spec domain number commit da72a379b2ec0bad3eb265787f7008bead0b040c upstream. VMD subdevices are created with a PCI domain ID of 0x10000 or higher. These subdevices are also handled like all other PCI devices by dmar_pci_bus_notifier(). However, when dmar_alloc_pci_notify_info() take records of such devices, it will truncate the domain ID to a u16 value (in info->seg). The device at (e.g.) 10000:00:02.0 is then treated by the DMAR code as if it is 0000:00:02.0. In the unlucky event that a real device also exists at 0000:00:02.0 and also has a device-specific entry in the DMAR table, dmar_insert_dev_scope() will crash on:   BUG_ON(i >= devices_cnt); That's basically a sanity check that only one PCI device matches a single DMAR entry; in this case we seem to have two matching devices. Fix this by ignoring devices that have a domain number higher than what can be looked up in the DMAR table. This problem was carefully diagnosed by Jian-Hong Pan. Signed-off-by: Lu Baolu Signed-off-by: Daniel Drake Fixes: 59ce0515cdaf3 ("iommu/vt-d: Update DRHD/RMRR/ATSR device scope caches when PCI hotplug happens") Signed-off-by: Joerg Roedel Signed-off-by: Greg Kroah-Hartman commit d51c65f835a9136dd9c1aa060ea5d58d72d09fe5 Author: Zhenzhong Duan Date: Thu Mar 12 14:09:54 2020 +0800 iommu/vt-d: Fix the wrong printing in RHSA parsing commit b0bb0c22c4db623f2e7b1a471596fbf1c22c6dc5 upstream. When base address in RHSA structure doesn't match base address in each DRHD structure, the base address in last DRHD is printed out. This doesn't make sense when there are multiple DRHD units, fix it by printing the buggy RHSA's base address. Signed-off-by: Lu Baolu Signed-off-by: Zhenzhong Duan Fixes: fd0c8894893cb ("intel-iommu: Set a more specific taint flag for invalid BIOS DMAR tables") Signed-off-by: Joerg Roedel Signed-off-by: Greg Kroah-Hartman commit 5ae2daf9977a1fa4f153c20e1996ba28a54a66d1 Author: Jakub Kicinski Date: Mon Mar 2 21:08:33 2020 -0800 netfilter: nft_tunnel: add missing attribute validation for tunnels commit 88a637719a1570705c02cacb3297af164b1714e7 upstream. Add missing attribute validation for tunnel source and destination ports to the netlink policy. Fixes: af308b94a2a4 ("netfilter: nf_tables: add tunnel support") Signed-off-by: Jakub Kicinski Signed-off-by: Pablo Neira Ayuso Signed-off-by: Greg Kroah-Hartman commit 64d43185eba6d61467db53ca026fdeb66fe78646 Author: Jakub Kicinski Date: Mon Mar 2 21:08:32 2020 -0800 netfilter: nft_payload: add missing attribute validation for payload csum flags commit 9d6effb2f1523eb84516e44213c00f2fd9e6afff upstream. Add missing attribute validation for NFTA_PAYLOAD_CSUM_FLAGS to the netlink policy. Fixes: 1814096980bb ("netfilter: nft_payload: layer 4 checksum adjustment for pseudoheader fields") Signed-off-by: Jakub Kicinski Signed-off-by: Pablo Neira Ayuso Signed-off-by: Greg Kroah-Hartman commit 5b425d389ed2627aa04739a076b9da9a9adaad9e Author: Jakub Kicinski Date: Mon Mar 2 21:08:31 2020 -0800 netfilter: cthelper: add missing attribute validation for cthelper commit c049b3450072b8e3998053490e025839fecfef31 upstream. Add missing attribute validation for cthelper to the netlink policy. Fixes: 12f7a505331e ("netfilter: add user-space connection tracking helper infrastructure") Signed-off-by: Jakub Kicinski Signed-off-by: Pablo Neira Ayuso Signed-off-by: Greg Kroah-Hartman commit 5e8dff99645e574122aef5dec91b8803339a6362 Author: Tommi Rantala Date: Thu Mar 5 10:37:13 2020 +0200 perf bench futex-wake: Restore thread count default to online CPU count commit f649bd9dd5d5004543bbc3c50b829577b49f5d75 upstream. Since commit 3b2323c2c1c4 ("perf bench futex: Use cpumaps") the default number of threads the benchmark uses got changed from number of online CPUs to zero: $ perf bench futex wake # Running 'futex/wake' benchmark: Run summary [PID 15930]: blocking on 0 threads (at [private] futex 0x558b8ee4bfac), waking up 1 at a time. [Run 1]: Wokeup 0 of 0 threads in 0.0000 ms [...] [Run 10]: Wokeup 0 of 0 threads in 0.0000 ms Wokeup 0 of 0 threads in 0.0004 ms (+-40.82%) Restore the old behavior by grabbing the number of online CPUs via cpu->nr: $ perf bench futex wake # Running 'futex/wake' benchmark: Run summary [PID 18356]: blocking on 8 threads (at [private] futex 0xb3e62c), waking up 1 at a time. [Run 1]: Wokeup 8 of 8 threads in 0.0260 ms [...] [Run 10]: Wokeup 8 of 8 threads in 0.0270 ms Wokeup 8 of 8 threads in 0.0419 ms (+-24.35%) Fixes: 3b2323c2c1c4 ("perf bench futex: Use cpumaps") Signed-off-by: Tommi Rantala Tested-by: Arnaldo Carvalho de Melo Cc: Alexander Shishkin Cc: Darren Hart Cc: Davidlohr Bueso Cc: Jiri Olsa Cc: Mark Rutland Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lore.kernel.org/lkml/20200305083714.9381-3-tommi.t.rantala@nokia.com Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: Greg Kroah-Hartman commit 99c731e17d5e4f5143a845b9c7ad42d708e6085a Author: Jakub Kicinski Date: Mon Mar 2 21:10:58 2020 -0800 nl80211: add missing attribute validation for channel switch commit 5cde05c61cbe13cbb3fa66d52b9ae84f7975e5e6 upstream. Add missing attribute validation for NL80211_ATTR_OPER_CLASS to the netlink policy. Fixes: 1057d35ede5d ("cfg80211: introduce TDLS channel switch commands") Signed-off-by: Jakub Kicinski Link: https://lore.kernel.org/r/20200303051058.4089398-4-kuba@kernel.org Signed-off-by: Johannes Berg Signed-off-by: Greg Kroah-Hartman commit 4fe88c83221d5350b997e719e61afd682eddff89 Author: Jakub Kicinski Date: Mon Mar 2 21:10:57 2020 -0800 nl80211: add missing attribute validation for beacon report scanning commit 056e9375e1f3c4bf2fd49b70258c7daf788ecd9d upstream. Add missing attribute validation for beacon report scanning to the netlink policy. Fixes: 1d76250bd34a ("nl80211: support beacon report scanning") Signed-off-by: Jakub Kicinski Link: https://lore.kernel.org/r/20200303051058.4089398-3-kuba@kernel.org Signed-off-by: Johannes Berg Signed-off-by: Greg Kroah-Hartman commit 7ff441815924ea9df3b2947bf520899e8671c4ce Author: Jakub Kicinski Date: Mon Mar 2 21:10:56 2020 -0800 nl80211: add missing attribute validation for critical protocol indication commit 0e1a1d853ecedc99da9d27f9f5c376935547a0e2 upstream. Add missing attribute validation for critical protocol fields to the netlink policy. Fixes: 5de17984898c ("cfg80211: introduce critical protocol indication from user-space") Signed-off-by: Jakub Kicinski Link: https://lore.kernel.org/r/20200303051058.4089398-2-kuba@kernel.org Signed-off-by: Johannes Berg Signed-off-by: Greg Kroah-Hartman commit 70f5b36852347f8f653cace2f233afa145746ad1 Author: Hamish Martin Date: Tue Mar 10 10:16:18 2020 +1300 i2c: gpio: suppress error on probe defer commit 3747cd2efe7ecb9604972285ab3f60c96cb753a8 upstream. If a GPIO we are trying to use is not available and we are deferring the probe, don't output an error message. This seems to have been the intent of commit 05c74778858d ("i2c: gpio: Add support for named gpios in DT") but the error was still output due to not checking the updated 'retdesc'. Fixes: 05c74778858d ("i2c: gpio: Add support for named gpios in DT") Signed-off-by: Hamish Martin Acked-by: Linus Walleij Signed-off-by: Wolfram Sang Signed-off-by: Greg Kroah-Hartman commit cce0478d6a9e4c8170e5169b6bd5204b61f7e571 Author: Zhenyu Wang Date: Tue Mar 3 13:54:12 2020 +0800 drm/i915/gvt: Fix unnecessary schedule timer when no vGPU exits commit 04d6067f1f19e70a418f92fa3170cf7fe53b7fdf upstream. From commit f25a49ab8ab9 ("drm/i915/gvt: Use vgpu_lock to protect per vgpu access") the vgpu idr destroy is moved later than vgpu resource destroy, then it would fail to stop timer for schedule policy clean which to check vgpu idr for any left vGPU. So this trys to destroy vgpu idr earlier. Cc: Colin Xu Fixes: f25a49ab8ab9 ("drm/i915/gvt: Use vgpu_lock to protect per vgpu access") Acked-by: Colin Xu Signed-off-by: Zhenyu Wang Link: http://patchwork.freedesktop.org/patch/msgid/20200229055445.31481-1-zhenyuw@linux.intel.com Signed-off-by: Greg Kroah-Hartman commit c62016bdc2da958b22b61979f742edf094d5955a Author: Charles Keepax Date: Fri Feb 28 15:41:42 2020 +0000 pinctrl: core: Remove extra kref_get which blocks hogs being freed commit aafd56fc79041bf36f97712d4b35208cbe07db90 upstream. kref_init starts with the reference count at 1, which will be balanced by the pinctrl_put in pinctrl_unregister. The additional kref_get in pinctrl_claim_hogs will increase this count to 2 and cause the hogs to not get freed when pinctrl_unregister is called. Fixes: 6118714275f0 ("pinctrl: core: Fix pinctrl_register_and_init() with pinctrl_enable()") Signed-off-by: Charles Keepax Link: https://lore.kernel.org/r/20200228154142.13860-1-ckeepax@opensource.cirrus.com Signed-off-by: Linus Walleij Signed-off-by: Greg Kroah-Hartman commit e0e86d976a14940440b88737916887576c9cee10 Author: Nicolas Belin Date: Thu Feb 20 14:15:12 2020 +0100 pinctrl: meson-gxl: fix GPIOX sdio pins commit dc7a06b0dbbafac8623c2b7657e61362f2f479a7 upstream. In the gxl driver, the sdio cmd and clk pins are inverted. It has not caused any issue so far because devices using these pins always take both pins so the resulting configuration is OK. Fixes: 0f15f500ff2c ("pinctrl: meson: Add GXL pinctrl definitions") Reviewed-by: Jerome Brunet Signed-off-by: Nicolas Belin Link: https://lore.kernel.org/r/1582204512-7582-1-git-send-email-nbelin@baylibre.com Signed-off-by: Linus Walleij Signed-off-by: Greg Kroah-Hartman commit bf0ef794e1973ee6f7c482718a8433e3e1b9ed8c Author: Sven Eckelmann Date: Sun Feb 16 13:02:06 2020 +0100 batman-adv: Don't schedule OGM for disabled interface commit 8e8ce08198de193e3d21d42e96945216e3d9ac7f upstream. A transmission scheduling for an interface which is currently dropped by batadv_iv_ogm_iface_disable could still be in progress. The B.A.T.M.A.N. V is simply cancelling the workqueue item in an synchronous way but this is not possible with B.A.T.M.A.N. IV because the OGM submissions are intertwined. Instead it has to stop submitting the OGM when it detect that the buffer pointer is set to NULL. Reported-by: syzbot+a98f2016f40b9cd3818a@syzkaller.appspotmail.com Reported-by: syzbot+ac36b6a33c28a491e929@syzkaller.appspotmail.com Fixes: c6c8fea29769 ("net: Add batman-adv meshing protocol") Signed-off-by: Sven Eckelmann Cc: Hillf Danton Signed-off-by: Simon Wunderlich Signed-off-by: Greg Kroah-Hartman commit 1315f6e50e4e42c937627f2b1a3080f40d67f190 Author: Yonghyun Hwang Date: Wed Feb 26 12:30:06 2020 -0800 iommu/vt-d: Fix a bug in intel_iommu_iova_to_phys() for huge page commit 77a1bce84bba01f3f143d77127b72e872b573795 upstream. intel_iommu_iova_to_phys() has a bug when it translates an IOVA for a huge page onto its corresponding physical address. This commit fixes the bug by accomodating the level of page entry for the IOVA and adds IOVA's lower address to the physical address. Cc: Acked-by: Lu Baolu Reviewed-by: Moritz Fischer Signed-off-by: Yonghyun Hwang Fixes: 3871794642579 ("VT-d: Changes to support KVM") Signed-off-by: Joerg Roedel Signed-off-by: Greg Kroah-Hartman commit 9d9a8afd4c708aaacd643a833f0aaf72e11cdd7e Author: Hans de Goede Date: Mon Mar 9 15:01:37 2020 +0100 iommu/vt-d: dmar: replace WARN_TAINT with pr_warn + add_taint commit 59833696442c674acbbd297772ba89e7ad8c753d upstream. Quoting from the comment describing the WARN functions in include/asm-generic/bug.h: * WARN(), WARN_ON(), WARN_ON_ONCE, and so on can be used to report * significant kernel issues that need prompt attention if they should ever * appear at runtime. * * Do not use these macros when checking for invalid external inputs The (buggy) firmware tables which the dmar code was calling WARN_TAINT for really are invalid external inputs. They are not under the kernel's control and the issues in them cannot be fixed by a kernel update. So logging a backtrace, which invites bug reports to be filed about this, is not helpful. Some distros, e.g. Fedora, have tools watching for the kernel backtraces logged by the WARN macros and offer the user an option to file a bug for this when these are encountered. The WARN_TAINT in warn_invalid_dmar() + another iommu WARN_TAINT, addressed in another patch, have lead to over a 100 bugs being filed this way. This commit replaces the WARN_TAINT("...") calls, with pr_warn(FW_BUG "...") + add_taint(TAINT_FIRMWARE_WORKAROUND, ...) calls avoiding the backtrace and thus also avoiding bug-reports being filed about this against the kernel. Fixes: fd0c8894893c ("intel-iommu: Set a more specific taint flag for invalid BIOS DMAR tables") Fixes: e625b4a95d50 ("iommu/vt-d: Parse ANDD records") Signed-off-by: Hans de Goede Signed-off-by: Joerg Roedel Acked-by: Lu Baolu Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200309140138.3753-2-hdegoede@redhat.com BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1564895 Signed-off-by: Greg Kroah-Hartman commit 9407d33ff45364f68671c25c5e48083065667f67 Author: Marc Zyngier Date: Wed Mar 4 11:11:17 2020 +0000 iommu/dma: Fix MSI reservation allocation commit 65ac74f1de3334852fb7d9b1b430fa5a06524276 upstream. The way cookie_init_hw_msi_region() allocates the iommu_dma_msi_page structures doesn't match the way iommu_put_dma_cookie() frees them. The former performs a single allocation of all the required structures, while the latter tries to free them one at a time. It doesn't quite work for the main use case (the GICv3 ITS where the range is 64kB) when the base granule size is 4kB. This leads to a nice slab corruption on teardown, which is easily observable by simply creating a VF on a SRIOV-capable device, and tearing it down immediately (no need to even make use of it). Fortunately, this only affects systems where the ITS isn't translated by the SMMU, which are both rare and non-standard. Fix it by allocating iommu_dma_msi_page structures one at a time. Fixes: 7c1b058c8b5a3 ("iommu/dma: Handle IOMMU API reserved regions") Signed-off-by: Marc Zyngier Reviewed-by: Eric Auger Cc: Robin Murphy Cc: Joerg Roedel Cc: Will Deacon Cc: stable@vger.kernel.org Reviewed-by: Robin Murphy Signed-off-by: Joerg Roedel Signed-off-by: Greg Kroah-Hartman commit 119c200b41fc80cb30344befb96db1f68c85c0ab Author: Tony Luck Date: Tue Feb 25 17:17:37 2020 -0800 x86/mce: Fix logic and comments around MSR_PPIN_CTL commit 59b5809655bdafb0767d3fd00a3e41711aab07e6 upstream. There are two implemented bits in the PPIN_CTL MSR: Bit 0: LockOut (R/WO) Set 1 to prevent further writes to MSR_PPIN_CTL. Bit 1: Enable_PPIN (R/W) If 1, enables MSR_PPIN to be accessible using RDMSR. If 0, an attempt to read MSR_PPIN will cause #GP. So there are four defined values: 0: PPIN is disabled, PPIN_CTL may be updated 1: PPIN is disabled. PPIN_CTL is locked against updates 2: PPIN is enabled. PPIN_CTL may be updated 3: PPIN is enabled. PPIN_CTL is locked against updates Code would only enable the X86_FEATURE_INTEL_PPIN feature for case "2". When it should have done so for both case "2" and case "3". Fix the final test to just check for the enable bit. Also fix some of the other comments in this function. Fixes: 3f5a7896a509 ("x86/mce: Include the PPIN in MCE records when available") Signed-off-by: Tony Luck Signed-off-by: Borislav Petkov Cc: Link: https://lkml.kernel.org/r/20200226011737.9958-1-tony.luck@intel.com Signed-off-by: Greg Kroah-Hartman commit 319478cbd2be90995b011ca6adbd834121eb7acf Author: Felix Fietkau Date: Thu Feb 20 12:41:39 2020 +0100 mt76: fix array overflow on receiving too many fragments for a packet commit b102f0c522cf668c8382c56a4f771b37d011cda2 upstream. If the hardware receives an oversized packet with too many rx fragments, skb_shinfo(skb)->frags can overflow and corrupt memory of adjacent pages. This becomes especially visible if it corrupts the freelist pointer of a slab page. Cc: stable@vger.kernel.org Signed-off-by: Felix Fietkau Signed-off-by: Kalle Valo Signed-off-by: Greg Kroah-Hartman commit 53afdba2c35e6c4a9e5e2e05cf94e72d7ab51660 Author: Sai Praneeth Date: Tue Sep 11 12:15:21 2018 -0700 efi: Make efi_rts_work accessible to efi page fault handler commit 9dbbedaa6171247c4c7c40b83f05b200a117c2e0 upstream. After the kernel has booted, if any accesses by firmware causes a page fault, the efi page fault handler would freeze efi_rts_wq and schedules a new process. To do this, the efi page fault handler needs efi_rts_work. Hence, make it accessible. There will be no race conditions in accessing this structure, because all the calls to efi runtime services are already serialized. Tested-by: Bhupesh Sharma Suggested-by: Matt Fleming Based-on-code-from: Ricardo Neri Signed-off-by: Sai Praneeth Prakhya Signed-off-by: Ard Biesheuvel Fixes: 3eb420e70d87 (“efi: Use a work queue to invoke EFI Runtime Services”) Signed-off-by: Wen Yang Cc: Caspar Zhang Signed-off-by: Greg Kroah-Hartman commit 86896c1c578521b3a96c1178fafe7cdad858248c Author: Vladis Dronov Date: Sun Mar 8 09:08:54 2020 +0100 efi: Fix a race and a buffer overflow while reading efivars via sysfs commit 286d3250c9d6437340203fb64938bea344729a0e upstream. There is a race and a buffer overflow corrupting a kernel memory while reading an EFI variable with a size more than 1024 bytes via the older sysfs method. This happens because accessing struct efi_variable in efivar_{attr,size,data}_read() and friends is not protected from a concurrent access leading to a kernel memory corruption and, at best, to a crash. The race scenario is the following: CPU0: CPU1: efivar_attr_read() var->DataSize = 1024; efivar_entry_get(... &var->DataSize) down_interruptible(&efivars_lock) efivar_attr_read() // same EFI var var->DataSize = 1024; efivar_entry_get(... &var->DataSize) down_interruptible(&efivars_lock) virt_efi_get_variable() // returns EFI_BUFFER_TOO_SMALL but // var->DataSize is set to a real // var size more than 1024 bytes up(&efivars_lock) virt_efi_get_variable() // called with var->DataSize set // to a real var size, returns // successfully and overwrites // a 1024-bytes kernel buffer up(&efivars_lock) This can be reproduced by concurrent reading of an EFI variable which size is more than 1024 bytes: ts# for cpu in $(seq 0 $(nproc --ignore=1)); do ( taskset -c $cpu \ cat /sys/firmware/efi/vars/KEKDefault*/size & ) ; done Fix this by using a local variable for a var's data buffer size so it does not get overwritten. Fixes: e14ab23dde12b80d ("efivars: efivar_entry API") Reported-by: Bob Sanders and the LTP testsuite Signed-off-by: Vladis Dronov Signed-off-by: Ard Biesheuvel Signed-off-by: Ingo Molnar Cc: Link: https://lore.kernel.org/r/20200305084041.24053-2-vdronov@redhat.com Link: https://lore.kernel.org/r/20200308080859.21568-24-ardb@kernel.org Signed-off-by: Greg Kroah-Hartman commit 609314331540b3b40d75d2fd049222df5797737e Author: Wolfram Sang Date: Tue Mar 3 13:50:46 2020 +0100 macintosh: windfarm: fix MODINFO regression commit bcf3588d8ed3517e6ffaf083f034812aee9dc8e2 upstream. Commit af503716ac14 made sure OF devices get an OF style modalias with I2C events. It assumed all in-tree users were converted, yet it missed some Macintosh drivers. Add an OF module device table for all windfarm drivers to make them automatically load again. Fixes: af503716ac14 ("i2c: core: report OF style module alias for devices registered via OF") Link: https://bugzilla.kernel.org/show_bug.cgi?id=199471 Reported-by: Erhard Furtner Tested-by: Erhard Furtner Acked-by: Michael Ellerman (powerpc) Signed-off-by: Wolfram Sang Cc: stable@kernel.org # v4.17+ Signed-off-by: Greg Kroah-Hartman commit 2345c4d769edea6c7045647c1f3816b81ed4ec31 Author: Eugeniy Paltsev Date: Wed Mar 11 19:26:43 2020 +0300 ARC: define __ALIGN_STR and __ALIGN symbols for ARC commit 8d92e992a785f35d23f845206cf8c6cafbc264e0 upstream. The default defintions use fill pattern 0x90 for padding which for ARC generates unintended "ldh_s r12,[r0,0x20]" corresponding to opcode 0x9090 So use ".align 4" which insert a "nop_s" instruction instead. Cc: stable@vger.kernel.org Acked-by: Vineet Gupta Signed-off-by: Eugeniy Paltsev Signed-off-by: Vineet Gupta Signed-off-by: Greg Kroah-Hartman commit 331c88f5d0f1d9645d10da2e93e03e48de590f41 Author: Vitaly Kuznetsov Date: Tue Mar 3 15:33:15 2020 +0100 KVM: x86: clear stale x86_emulate_ctxt->intercept value commit 342993f96ab24d5864ab1216f46c0b199c2baf8e upstream. After commit 07721feee46b ("KVM: nVMX: Don't emulate instructions in guest mode") Hyper-V guests on KVM stopped booting with: kvm_nested_vmexit: rip fffff802987d6169 reason EPT_VIOLATION info1 181 info2 0 int_info 0 int_info_err 0 kvm_page_fault: address febd0000 error_code 181 kvm_emulate_insn: 0:fffff802987d6169: f3 a5 kvm_emulate_insn: 0:fffff802987d6169: f3 a5 FAIL kvm_inj_exception: #UD (0x0) "f3 a5" is a "rep movsw" instruction, which should not be intercepted at all. Commit c44b4c6ab80e ("KVM: emulate: clean up initializations in init_decode_cache") reduced the number of fields cleared by init_decode_cache() claiming that they are being cleared elsewhere, 'intercept', however, is left uncleared if the instruction does not have any of the "slow path" flags (NotImpl, Stack, Op3264, Sse, Mmx, CheckPerm, NearBranch, No16 and of course Intercept itself). Fixes: c44b4c6ab80e ("KVM: emulate: clean up initializations in init_decode_cache") Fixes: 07721feee46b ("KVM: nVMX: Don't emulate instructions in guest mode") Cc: stable@vger.kernel.org Suggested-by: Paolo Bonzini Signed-off-by: Vitaly Kuznetsov Reviewed-by: Sean Christopherson Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit 777179200cb89c0dce7448dbe34883ec132fe5db Author: Al Viro Date: Tue Mar 10 09:31:41 2020 -0400 gfs2_atomic_open(): fix O_EXCL|O_CREAT handling on cold dcache commit 21039132650281de06a169cbe8a0f7e5c578fd8b upstream. with the way fs/namei.c:do_last() had been done, ->atomic_open() instances needed to recognize the case when existing file got found with O_EXCL|O_CREAT, either by falling back to finish_no_open() or failing themselves. gfs2 one didn't. Fixes: 6d4ade986f9c (GFS2: Add atomic_open support) Cc: stable@kernel.org # v3.11 Signed-off-by: Al Viro Signed-off-by: Greg Kroah-Hartman commit a8ab0b70979059773f34171c8f05f340f60d47d9 Author: Al Viro Date: Thu Mar 12 18:25:20 2020 -0400 cifs_atomic_open(): fix double-put on late allocation failure commit d9a9f4849fe0c9d560851ab22a85a666cddfdd24 upstream. several iterations of ->atomic_open() calling conventions ago, we used to need fput() if ->atomic_open() failed at some point after successful finish_open(). Now (since 2016) it's not needed - struct file carries enough state to make fput() work regardless of the point in struct file lifecycle and discarding it on failure exits in open() got unified. Unfortunately, I'd missed the fact that we had an instance of ->atomic_open() (cifs one) that used to need that fput(), as well as the stale comment in finish_open() demanding such late failure handling. Trivially fixed... Fixes: fe9ec8291fca "do_last(): take fput() on error after opening to out:" Cc: stable@kernel.org # v4.7+ Signed-off-by: Al Viro Signed-off-by: Greg Kroah-Hartman commit a89327c1f7708173beb037386cea0b77357ed52c Author: Steven Rostedt (VMware) Date: Mon Mar 9 16:00:11 2020 -0400 ktest: Add timeout for ssh sync testing commit 4d00fc477a2ce8b6d2b09fb34ef9fe9918e7d434 upstream. Before rebooting the box, a "ssh sync" is called to the test machine to see if it is alive or not. But if the test machine is in a partial state, that ssh may never actually finish, and the ktest test hangs. Add a 10 second timeout to the sync test, which will fail after 10 seconds and then cause the test to reboot the test machine. Cc: stable@vger.kernel.org Fixes: 6474ace999edd ("ktest.pl: Powercycle the box on reboot if no connection can be made") Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Greg Kroah-Hartman commit bef7177cefad180b9776ed49290d42bfb873da1e Author: Colin Ian King Date: Fri Nov 8 14:45:27 2019 +0000 drm/amd/display: remove duplicated assignment to grph_obj_type commit d785476c608c621b345dd9396e8b21e90375cb0e upstream. Variable grph_obj_type is being assigned twice, one of these is redundant so remove it. Addresses-Coverity: ("Evaluation order violation") Signed-off-by: Colin Ian King Signed-off-by: Alex Deucher Cc: Signed-off-by: Greg Kroah-Hartman commit 3cd2a91a88bea3a42aed6d9fefbc3f91abe73a2a Author: Hillf Danton Date: Fri Jan 24 20:14:45 2020 -0500 workqueue: don't use wq_select_unbound_cpu() for bound works commit aa202f1f56960c60e7befaa0f49c72b8fa11b0a8 upstream. wq_select_unbound_cpu() is designed for unbound workqueues only, but it's wrongly called when using a bound workqueue too. Fixing this ensures work queued to a bound workqueue with cpu=WORK_CPU_UNBOUND always runs on the local CPU. Before, that would happen only if wq_unbound_cpumask happened to include it (likely almost always the case), or was empty, or we got lucky with forced round-robin placement. So restricting /sys/devices/virtual/workqueue/cpumask to a small subset of a machine's CPUs would cause some bound work items to run unexpectedly there. Fixes: ef557180447f ("workqueue: schedule WORK_CPU_UNBOUND work on wq_unbound_cpumask CPUs") Cc: stable@vger.kernel.org # v4.5+ Signed-off-by: Hillf Danton [dj: massage changelog] Signed-off-by: Daniel Jordan Cc: Tejun Heo Cc: Lai Jiangshan Cc: linux-kernel@vger.kernel.org Signed-off-by: Tejun Heo Signed-off-by: Greg Kroah-Hartman commit 80a12a6f03577de7077a04705b2eeb8078653816 Author: Vasily Averin Date: Tue Feb 25 10:07:12 2020 +0300 netfilter: x_tables: xt_mttg_seq_next should increase position index commit ee84f19cbbe9cf7cba2958acb03163fed3ecbb0f upstream. If .next function does not change position index, following .show function will repeat output related to current position index. Without patch: # dd if=/proc/net/ip_tables_matches # original file output conntrack conntrack conntrack recent recent icmp udplite udp tcp 0+1 records in 0+1 records out 65 bytes copied, 5.4074e-05 s, 1.2 MB/s # dd if=/proc/net/ip_tables_matches bs=62 skip=1 dd: /proc/net/ip_tables_matches: cannot skip to specified offset cp <<< end of last line tcp <<< and then unexpected whole last line once again 0+1 records in 0+1 records out 7 bytes copied, 0.000102447 s, 68.3 kB/s Cc: stable@vger.kernel.org Fixes: 1f4aace60b0e ("fs/seq_file.c: simplify seq_file iteration code ...") Link: https://bugzilla.kernel.org/show_bug.cgi?id=206283 Signed-off-by: Vasily Averin Signed-off-by: Pablo Neira Ayuso Signed-off-by: Greg Kroah-Hartman commit 6fb92c687fba5a534859c07773e3d168c5ae3075 Author: Vasily Averin Date: Tue Feb 25 10:06:29 2020 +0300 netfilter: xt_recent: recent_seq_next should increase position index commit db25517a550926f609c63054b12ea9ad515e1a10 upstream. If .next function does not change position index, following .show function will repeat output related to current position index. Without the patch: # dd if=/proc/net/xt_recent/SSH # original file outpt src=127.0.0.4 ttl: 0 last_seen: 6275444819 oldest_pkt: 1 6275444819 src=127.0.0.2 ttl: 0 last_seen: 6275438906 oldest_pkt: 1 6275438906 src=127.0.0.3 ttl: 0 last_seen: 6275441953 oldest_pkt: 1 6275441953 0+1 records in 0+1 records out 204 bytes copied, 6.1332e-05 s, 3.3 MB/s Read after lseek into middle of last line (offset 140 in example below) generates expected end of last line and then unexpected whole last line once again # dd if=/proc/net/xt_recent/SSH bs=140 skip=1 dd: /proc/net/xt_recent/SSH: cannot skip to specified offset 127.0.0.3 ttl: 0 last_seen: 6275441953 oldest_pkt: 1 6275441953 src=127.0.0.3 ttl: 0 last_seen: 6275441953 oldest_pkt: 1 6275441953 0+1 records in 0+1 records out 132 bytes copied, 6.2487e-05 s, 2.1 MB/s Cc: stable@vger.kernel.org Fixes: 1f4aace60b0e ("fs/seq_file.c: simplify seq_file iteration code ...") Link: https://bugzilla.kernel.org/show_bug.cgi?id=206283 Signed-off-by: Vasily Averin Signed-off-by: Pablo Neira Ayuso Signed-off-by: Greg Kroah-Hartman commit 4fbcbed7c45189b5f7f3309aaed8fc81c37e9bba Author: Vasily Averin Date: Tue Feb 25 10:05:59 2020 +0300 netfilter: synproxy: synproxy_cpu_seq_next should increase position index commit bb71f846a0002239f7058c84f1496648ff4a5c20 upstream. If .next function does not change position index, following .show function will repeat output related to current position index. Cc: stable@vger.kernel.org Fixes: 1f4aace60b0e ("fs/seq_file.c: simplify seq_file iteration code ...") Link: https://bugzilla.kernel.org/show_bug.cgi?id=206283 Signed-off-by: Vasily Averin Signed-off-by: Pablo Neira Ayuso Signed-off-by: Greg Kroah-Hartman commit 64cd059b794466f16610e0ed6b259f18e750b66c Author: Vasily Averin Date: Tue Feb 25 10:05:47 2020 +0300 netfilter: nf_conntrack: ct_cpu_seq_next should increase position index commit dc15af8e9dbd039ebb06336597d2c491ef46ab74 upstream. If .next function does not change position index, following .show function will repeat output related to current position index. Cc: stable@vger.kernel.org Fixes: 1f4aace60b0e ("fs/seq_file.c: simplify seq_file iteration code ...") Link: https://bugzilla.kernel.org/show_bug.cgi?id=206283 Signed-off-by: Vasily Averin Signed-off-by: Pablo Neira Ayuso Signed-off-by: Greg Kroah-Hartman commit ba21563b257af750562474cbb929d8dfb2be8270 Author: Hans de Goede Date: Mon Mar 9 19:25:10 2020 +0100 iommu/vt-d: quirk_ioat_snb_local_iommu: replace WARN_TAINT with pr_warn + add_taint commit 81ee85d0462410de8eeeec1b9761941fd6ed8c7b upstream. Quoting from the comment describing the WARN functions in include/asm-generic/bug.h: * WARN(), WARN_ON(), WARN_ON_ONCE, and so on can be used to report * significant kernel issues that need prompt attention if they should ever * appear at runtime. * * Do not use these macros when checking for invalid external inputs The (buggy) firmware tables which the dmar code was calling WARN_TAINT for really are invalid external inputs. They are not under the kernel's control and the issues in them cannot be fixed by a kernel update. So logging a backtrace, which invites bug reports to be filed about this, is not helpful. Fixes: 556ab45f9a77 ("ioat2: catch and recover from broken vtd configurations v6") Signed-off-by: Hans de Goede Acked-by: Lu Baolu Link: https://lore.kernel.org/r/20200309182510.373875-1-hdegoede@redhat.com BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=701847 Signed-off-by: Joerg Roedel Signed-off-by: Greg Kroah-Hartman commit 2d716cbdfd9fc3234140061c1707603b206fd5d3 Author: Halil Pasic Date: Thu Feb 13 13:37:27 2020 +0100 virtio-blk: fix hw_queue stopped on arbitrary error commit f5f6b95c72f7f8bb46eace8c5306c752d0133daa upstream. Since nobody else is going to restart our hw_queue for us, the blk_mq_start_stopped_hw_queues() is in virtblk_done() is not sufficient necessarily sufficient to ensure that the queue will get started again. In case of global resource outage (-ENOMEM because mapping failure, because of swiotlb full) our virtqueue may be empty and we can get stuck with a stopped hw_queue. Let us not stop the queue on arbitrary errors, but only on -EONSPC which indicates a full virtqueue, where the hw_queue is guaranteed to get started by virtblk_done() before when it makes sense to carry on submitting requests. Let us also remove a stale comment. Signed-off-by: Halil Pasic Cc: Jens Axboe Fixes: f7728002c1c7 ("virtio_ring: fix return code on DMA mapping fails") Link: https://lore.kernel.org/r/20200213123728.61216-2-pasic@linux.ibm.com Signed-off-by: Michael S. Tsirkin Reviewed-by: Stefan Hajnoczi Signed-off-by: Greg Kroah-Hartman commit 30fa84ae81ca3e182622a118003ad29899483535 Author: Dan Moulding Date: Tue Jan 28 02:31:07 2020 -0700 iwlwifi: mvm: Do not require PHY_SKU NVM section for 3168 devices commit a9149d243f259ad8f02b1e23dfe8ba06128f15e1 upstream. The logic for checking required NVM sections was recently fixed in commit b3f20e098293 ("iwlwifi: mvm: fix NVM check for 3168 devices"). However, with that fixed the else is now taken for 3168 devices and within the else clause there is a mandatory check for the PHY_SKU section. This causes the parsing to fail for 3168 devices. The PHY_SKU section is really only mandatory for the IWL_NVM_EXT layout (the phy_sku parameter of iwl_parse_nvm_data is only used when the NVM type is IWL_NVM_EXT). So this changes the PHY_SKU section check so that it's only mandatory for IWL_NVM_EXT. Fixes: b3f20e098293 ("iwlwifi: mvm: fix NVM check for 3168 devices") Signed-off-by: Dan Moulding Signed-off-by: Kalle Valo Signed-off-by: Greg Kroah-Hartman commit ab3e3b23d8d53c542cdfef19f5dbb2e13cc2b957 Author: Michal Koutný Date: Fri Jan 24 12:40:15 2020 +0100 cgroup: Iterate tasks that did not finish do_exit() commit 9c974c77246460fa6a92c18554c3311c8c83c160 upstream. PF_EXITING is set earlier than actual removal from css_set when a task is exitting. This can confuse cgroup.procs readers who see no PF_EXITING tasks, however, rmdir is checking against css_set membership so it can transitionally fail with EBUSY. Fix this by listing tasks that weren't unlinked from css_set active lists. It may happen that other users of the task iterator (without CSS_TASK_ITER_PROCS) spot a PF_EXITING task before cgroup_exit(). This is equal to the state before commit c03cd7738a83 ("cgroup: Include dying leaders with live threads in PROCS iterations") but it may be reviewed later. Reported-by: Suren Baghdasaryan Fixes: c03cd7738a83 ("cgroup: Include dying leaders with live threads in PROCS iterations") Signed-off-by: Michal Koutný Signed-off-by: Tejun Heo Signed-off-by: Greg Kroah-Hartman commit ff79a4a75ca34cb532f11a733cd647cb6832ace8 Author: Vasily Averin Date: Thu Jan 30 13:34:59 2020 +0300 cgroup: cgroup_procs_next should increase position index commit 2d4ecb030dcc90fb725ecbfc82ce5d6c37906e0e upstream. If seq_file .next fuction does not change position index, read after some lseek can generate unexpected output: 1) dd bs=1 skip output of each 2nd elements $ dd if=/sys/fs/cgroup/cgroup.procs bs=8 count=1 2 3 4 5 1+0 records in 1+0 records out 8 bytes copied, 0,000267297 s, 29,9 kB/s [test@localhost ~]$ dd if=/sys/fs/cgroup/cgroup.procs bs=1 count=8 2 4 <<< NB! 3 was skipped 6 <<< ... and 5 too 8 <<< ... and 7 8+0 records in 8+0 records out 8 bytes copied, 5,2123e-05 s, 153 kB/s This happen because __cgroup_procs_start() makes an extra extra cgroup_procs_next() call 2) read after lseek beyond end of file generates whole last line. 3) read after lseek into middle of last line generates expected rest of last line and unexpected whole line once again. Additionally patch removes an extra position index changes in __cgroup_procs_start() Cc: stable@vger.kernel.org https://bugzilla.kernel.org/show_bug.cgi?id=206283 Signed-off-by: Vasily Averin Signed-off-by: Tejun Heo Signed-off-by: Greg Kroah-Hartman commit 084dce9dc9d3aa8debd2e6aea6020be721be8908 Author: Mahesh Bandewar Date: Mon Mar 9 15:57:07 2020 -0700 macvlan: add cond_resched() during multicast processing [ Upstream commit ce9a4186f9ac475c415ffd20348176a4ea366670 ] The Rx bound multicast packets are deferred to a workqueue and macvlan can also suffer from the same attack that was discovered by Syzbot for IPvlan. This solution is not as effective as in IPvlan. IPvlan defers all (Tx and Rx) multicast packet processing to a workqueue while macvlan does this way only for the Rx. This fix should address the Rx codition to certain extent. Tx is still suseptible. Tx multicast processing happens when .ndo_start_xmit is called, hence we cannot add cond_resched(). However, it's not that severe since the user which is generating / flooding will be affected the most. Fixes: 412ca1550cbe ("macvlan: Move broadcasts into a work queue") Signed-off-by: Mahesh Bandewar Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 10ea9902ed4f8b4d08658820f47e61fecb630b33 Author: Jakub Kicinski Date: Tue Mar 10 20:36:16 2020 -0700 net: fec: validate the new settings in fec_enet_set_coalesce() [ Upstream commit ab14961d10d02d20767612c78ce148f6eb85bd58 ] fec_enet_set_coalesce() validates the previously set params and if they are within range proceeds to apply the new ones. The new ones, however, are not validated. This seems backwards, probably a copy-paste error? Compile tested only. Fixes: d851b47b22fc ("net: fec: add interrupt coalescence feature support") Signed-off-by: Jakub Kicinski Acked-by: Fugang Duan Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 6a62f2dd47a7ec7ea4f638fa8c0acea086c2d3de Author: Eric Dumazet Date: Wed Mar 4 15:51:43 2020 -0800 slip: make slhc_compress() more robust against malicious packets [ Upstream commit 110a40dfb708fe940a3f3704d470e431c368d256 ] Before accessing various fields in IPV4 network header and TCP header, make sure the packet : - Has IP version 4 (ip->version == 4) - Has not a silly network length (ip->ihl >= 5) - Is big enough to hold network and transport headers - Has not a silly TCP header size (th->doff >= sizeof(struct tcphdr) / 4) syzbot reported : BUG: KMSAN: uninit-value in slhc_compress+0x5b9/0x2e60 drivers/net/slip/slhc.c:270 CPU: 0 PID: 11728 Comm: syz-executor231 Not tainted 5.6.0-rc2-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c9/0x220 lib/dump_stack.c:118 kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:118 __msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:215 slhc_compress+0x5b9/0x2e60 drivers/net/slip/slhc.c:270 ppp_send_frame drivers/net/ppp/ppp_generic.c:1637 [inline] __ppp_xmit_process+0x1902/0x2970 drivers/net/ppp/ppp_generic.c:1495 ppp_xmit_process+0x147/0x2f0 drivers/net/ppp/ppp_generic.c:1516 ppp_write+0x6bb/0x790 drivers/net/ppp/ppp_generic.c:512 do_loop_readv_writev fs/read_write.c:717 [inline] do_iter_write+0x812/0xdc0 fs/read_write.c:1000 compat_writev+0x2df/0x5a0 fs/read_write.c:1351 do_compat_pwritev64 fs/read_write.c:1400 [inline] __do_compat_sys_pwritev fs/read_write.c:1420 [inline] __se_compat_sys_pwritev fs/read_write.c:1414 [inline] __ia32_compat_sys_pwritev+0x349/0x3f0 fs/read_write.c:1414 do_syscall_32_irqs_on arch/x86/entry/common.c:339 [inline] do_fast_syscall_32+0x3c7/0x6e0 arch/x86/entry/common.c:410 entry_SYSENTER_compat+0x68/0x77 arch/x86/entry/entry_64_compat.S:139 RIP: 0023:0xf7f7cd99 Code: 90 e8 0b 00 00 00 f3 90 0f ae e8 eb f9 8d 74 26 00 89 3c 24 c3 90 90 90 90 90 90 90 90 90 90 90 90 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 eb 0d 90 90 90 90 90 90 90 90 90 90 90 90 RSP: 002b:00000000ffdb84ac EFLAGS: 00000217 ORIG_RAX: 000000000000014e RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000200001c0 RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000003 RBP: 0000000040047459 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:144 [inline] kmsan_internal_poison_shadow+0x66/0xd0 mm/kmsan/kmsan.c:127 kmsan_slab_alloc+0x8a/0xe0 mm/kmsan/kmsan_hooks.c:82 slab_alloc_node mm/slub.c:2793 [inline] __kmalloc_node_track_caller+0xb40/0x1200 mm/slub.c:4401 __kmalloc_reserve net/core/skbuff.c:142 [inline] __alloc_skb+0x2fd/0xac0 net/core/skbuff.c:210 alloc_skb include/linux/skbuff.h:1051 [inline] ppp_write+0x115/0x790 drivers/net/ppp/ppp_generic.c:500 do_loop_readv_writev fs/read_write.c:717 [inline] do_iter_write+0x812/0xdc0 fs/read_write.c:1000 compat_writev+0x2df/0x5a0 fs/read_write.c:1351 do_compat_pwritev64 fs/read_write.c:1400 [inline] __do_compat_sys_pwritev fs/read_write.c:1420 [inline] __se_compat_sys_pwritev fs/read_write.c:1414 [inline] __ia32_compat_sys_pwritev+0x349/0x3f0 fs/read_write.c:1414 do_syscall_32_irqs_on arch/x86/entry/common.c:339 [inline] do_fast_syscall_32+0x3c7/0x6e0 arch/x86/entry/common.c:410 entry_SYSENTER_compat+0x68/0x77 arch/x86/entry/entry_64_compat.S:139 Fixes: b5451d783ade ("slip: Move the SLIP drivers") Signed-off-by: Eric Dumazet Reported-by: syzbot Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 4dff63b3ff055e6402ffa435ec09a28ab6a5d2f8 Author: Eric Dumazet Date: Wed Mar 4 09:32:16 2020 -0800 bonding/alb: make sure arp header is pulled before accessing it commit b7469e83d2add567e4e0b063963db185f3167cea upstream. Similar to commit 38f88c454042 ("bonding/alb: properly access headers in bond_alb_xmit()"), we need to make sure arp header was pulled in skb->head before blindly accessing it in rlb_arp_xmit(). Remove arp_pkt() private helper, since it is more readable/obvious to have the following construct back to back : if (!pskb_network_may_pull(skb, sizeof(*arp))) return NULL; arp = (struct arp_pkt *)skb_network_header(skb); syzbot reported : BUG: KMSAN: uninit-value in bond_slave_has_mac_rx include/net/bonding.h:704 [inline] BUG: KMSAN: uninit-value in rlb_arp_xmit drivers/net/bonding/bond_alb.c:662 [inline] BUG: KMSAN: uninit-value in bond_alb_xmit+0x575/0x25e0 drivers/net/bonding/bond_alb.c:1477 CPU: 0 PID: 12743 Comm: syz-executor.4 Not tainted 5.6.0-rc2-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c9/0x220 lib/dump_stack.c:118 kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:118 __msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:215 bond_slave_has_mac_rx include/net/bonding.h:704 [inline] rlb_arp_xmit drivers/net/bonding/bond_alb.c:662 [inline] bond_alb_xmit+0x575/0x25e0 drivers/net/bonding/bond_alb.c:1477 __bond_start_xmit drivers/net/bonding/bond_main.c:4257 [inline] bond_start_xmit+0x85d/0x2f70 drivers/net/bonding/bond_main.c:4282 __netdev_start_xmit include/linux/netdevice.h:4524 [inline] netdev_start_xmit include/linux/netdevice.h:4538 [inline] xmit_one net/core/dev.c:3470 [inline] dev_hard_start_xmit+0x531/0xab0 net/core/dev.c:3486 __dev_queue_xmit+0x37de/0x4220 net/core/dev.c:4063 dev_queue_xmit+0x4b/0x60 net/core/dev.c:4096 packet_snd net/packet/af_packet.c:2967 [inline] packet_sendmsg+0x8347/0x93b0 net/packet/af_packet.c:2992 sock_sendmsg_nosec net/socket.c:652 [inline] sock_sendmsg net/socket.c:672 [inline] __sys_sendto+0xc1b/0xc50 net/socket.c:1998 __do_sys_sendto net/socket.c:2010 [inline] __se_sys_sendto+0x107/0x130 net/socket.c:2006 __x64_sys_sendto+0x6e/0x90 net/socket.c:2006 do_syscall_64+0xb8/0x160 arch/x86/entry/common.c:296 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x45c479 Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007fc77ffbbc78 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 00007fc77ffbc6d4 RCX: 000000000045c479 RDX: 000000000000000e RSI: 00000000200004c0 RDI: 0000000000000003 RBP: 000000000076bf20 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff R13: 0000000000000a04 R14: 00000000004cc7b0 R15: 000000000076bf2c Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:144 [inline] kmsan_internal_poison_shadow+0x66/0xd0 mm/kmsan/kmsan.c:127 kmsan_slab_alloc+0x8a/0xe0 mm/kmsan/kmsan_hooks.c:82 slab_alloc_node mm/slub.c:2793 [inline] __kmalloc_node_track_caller+0xb40/0x1200 mm/slub.c:4401 __kmalloc_reserve net/core/skbuff.c:142 [inline] __alloc_skb+0x2fd/0xac0 net/core/skbuff.c:210 alloc_skb include/linux/skbuff.h:1051 [inline] alloc_skb_with_frags+0x18c/0xa70 net/core/skbuff.c:5766 sock_alloc_send_pskb+0xada/0xc60 net/core/sock.c:2242 packet_alloc_skb net/packet/af_packet.c:2815 [inline] packet_snd net/packet/af_packet.c:2910 [inline] packet_sendmsg+0x66a0/0x93b0 net/packet/af_packet.c:2992 sock_sendmsg_nosec net/socket.c:652 [inline] sock_sendmsg net/socket.c:672 [inline] __sys_sendto+0xc1b/0xc50 net/socket.c:1998 __do_sys_sendto net/socket.c:2010 [inline] __se_sys_sendto+0x107/0x130 net/socket.c:2006 __x64_sys_sendto+0x6e/0x90 net/socket.c:2006 do_syscall_64+0xb8/0x160 arch/x86/entry/common.c:296 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet Reported-by: syzbot Cc: Jay Vosburgh Cc: Veaceslav Falico Cc: Andy Gospodarek Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 48aea14b8840a60662bac14f670ea4e265c546c8 Author: Jakub Kicinski Date: Mon Mar 2 21:05:12 2020 -0800 devlink: validate length of region addr/len [ Upstream commit ff3b63b8c299b73ac599b120653b47e275407656 ] DEVLINK_ATTR_REGION_CHUNK_ADDR and DEVLINK_ATTR_REGION_CHUNK_LEN lack entries in the netlink policy. Corresponding nla_get_u64()s may read beyond the end of the message. Fixes: 4e54795a27f5 ("devlink: Add support for region snapshot read command") Signed-off-by: Jakub Kicinski Reviewed-by: Jiri Pirko Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 3c2893a77cef0b05127c2d5c21cf08e19904b092 Author: Jakub Kicinski Date: Mon Mar 2 21:05:23 2020 -0800 tipc: add missing attribute validation for MTU property [ Upstream commit 213320a67962ff6e7b83b704d55cbebc341426db ] Add missing attribute validation for TIPC_NLA_PROP_MTU to the netlink policy. Fixes: 901271e0403a ("tipc: implement configuration of UDP media MTU") Signed-off-by: Jakub Kicinski Signed-off-by: Jakub Kicinski Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit dfe7df51a58b893a1293c631d8f59609b95a6851 Author: Hangbin Liu Date: Tue Mar 3 14:37:35 2020 +0800 net/ipv6: remove the old peer route if change it to a new one [ Upstream commit d0098e4c6b83e502cc1cd96d67ca86bc79a6c559 ] When we modify the peer route and changed it to a new one, we should remove the old route first. Before the fix: + ip addr add dev dummy1 2001:db8::1 peer 2001:db8::2 + ip -6 route show dev dummy1 2001:db8::1 proto kernel metric 256 pref medium 2001:db8::2 proto kernel metric 256 pref medium + ip addr change dev dummy1 2001:db8::1 peer 2001:db8::3 + ip -6 route show dev dummy1 2001:db8::1 proto kernel metric 256 pref medium 2001:db8::2 proto kernel metric 256 pref medium After the fix: + ip addr change dev dummy1 2001:db8::1 peer 2001:db8::3 + ip -6 route show dev dummy1 2001:db8::1 proto kernel metric 256 pref medium 2001:db8::3 proto kernel metric 256 pref medium This patch depend on the previous patch "net/ipv6: need update peer route when modify metric" to update new peer route after delete old one. Signed-off-by: Hangbin Liu Reviewed-by: David Ahern Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit d8cfddaf47d26757634f6889d5be04333f60a2eb Author: Hangbin Liu Date: Tue Mar 3 14:37:34 2020 +0800 net/ipv6: need update peer route when modify metric [ Upstream commit 617940123e0140521f3080d2befc2bf55bcda094 ] When we modify the route metric, the peer address's route need also be updated. Before the fix: + ip addr add dev dummy1 2001:db8::1 peer 2001:db8::2 metric 60 + ip -6 route show dev dummy1 2001:db8::1 proto kernel metric 60 pref medium 2001:db8::2 proto kernel metric 60 pref medium + ip addr change dev dummy1 2001:db8::1 peer 2001:db8::2 metric 61 + ip -6 route show dev dummy1 2001:db8::1 proto kernel metric 61 pref medium 2001:db8::2 proto kernel metric 60 pref medium After the fix: + ip addr change dev dummy1 2001:db8::1 peer 2001:db8::2 metric 61 + ip -6 route show dev dummy1 2001:db8::1 proto kernel metric 61 pref medium 2001:db8::2 proto kernel metric 61 pref medium Fixes: 8308f3ff1753 ("net/ipv6: Add support for specifying metric of connected routes") Signed-off-by: Hangbin Liu Reviewed-by: David Ahern Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit b5a8261a458823261c3a4f51f54a64c6bcf332e0 Author: Hangbin Liu Date: Tue Mar 3 14:37:36 2020 +0800 selftests/net/fib_tests: update addr_metric_test for peer route testing [ Upstream commit 0d29169a708bf730ede287248e429d579f432d1d ] This patch update {ipv4, ipv6}_addr_metric_test with 1. Set metric of address with peer route and see if the route added correctly. 2. Modify metric and peer address for peer route and see if the route changed correctly. Signed-off-by: Hangbin Liu Reviewed-by: David Ahern Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit e37fc53a1ea7ed12e7acf065de2607b00b1e1841 Author: Heiner Kallweit Date: Thu Mar 12 22:25:20 2020 +0100 net: phy: fix MDIO bus PM PHY resuming [ Upstream commit 611d779af7cad2b87487ff58e4931a90c20b113c ] So far we have the unfortunate situation that mdio_bus_phy_may_suspend() is called in suspend AND resume path, assuming that function result is the same. After the original change this is no longer the case, resulting in broken resume as reported by Geert. To fix this call mdio_bus_phy_may_suspend() in the suspend path only, and let the phy_device store the info whether it was suspended by MDIO bus PM. Fixes: 503ba7c69610 ("net: phy: Avoid multiple suspends") Reported-by: Geert Uytterhoeven Tested-by: Geert Uytterhoeven Signed-off-by: Heiner Kallweit Reviewed-by: Florian Fainelli Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 18164e7902ba77816d5b70d74876826d46469729 Author: Jakub Kicinski Date: Mon Mar 2 21:05:26 2020 -0800 nfc: add missing attribute validation for vendor subcommand [ Upstream commit 6ba3da446551f2150fadbf8c7788edcb977683d3 ] Add missing attribute validation for vendor subcommand attributes to the netlink policy. Fixes: 9e58095f9660 ("NFC: netlink: Implement vendor command support") Signed-off-by: Jakub Kicinski Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit b198061962a94a15e9d306b0f625e722b32ce769 Author: Jakub Kicinski Date: Mon Mar 2 21:05:25 2020 -0800 nfc: add missing attribute validation for deactivate target [ Upstream commit 88e706d5168b07df4792dbc3d1bc37b83e4bd74d ] Add missing attribute validation for NFC_ATTR_TARGET_INDEX to the netlink policy. Fixes: 4d63adfe12dd ("NFC: Add NFC_CMD_DEACTIVATE_TARGET support") Signed-off-by: Jakub Kicinski Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit ae855e782d4d640cc10f68a6cf371484b723f14c Author: Jakub Kicinski Date: Mon Mar 2 21:05:24 2020 -0800 nfc: add missing attribute validation for SE API [ Upstream commit 361d23e41ca6e504033f7e66a03b95788377caae ] Add missing attribute validation for NFC_ATTR_SE_INDEX to the netlink policy. Fixes: 5ce3f32b5264 ("NFC: netlink: SE API implementation") Signed-off-by: Jakub Kicinski Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 98be45047c3304a0b6d687ce2829d25b0d32e4a3 Author: Jakub Kicinski Date: Mon Mar 2 21:05:22 2020 -0800 team: add missing attribute validation for array index [ Upstream commit 669fcd7795900cd1880237cbbb57a7db66cb9ac8 ] Add missing attribute validation for TEAM_ATTR_OPTION_ARRAY_INDEX to the netlink policy. Fixes: b13033262d24 ("team: introduce array options") Signed-off-by: Jakub Kicinski Reviewed-by: Jiri Pirko Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 1c2dcaf80a0f3e239e042c15dcdea96e8f03c970 Author: Jakub Kicinski Date: Mon Mar 2 21:05:21 2020 -0800 team: add missing attribute validation for port ifindex [ Upstream commit dd25cb272ccce4db67dc8509278229099e4f5e99 ] Add missing attribute validation for TEAM_ATTR_OPTION_PORT_IFINDEX to the netlink policy. Fixes: 80f7c6683fe0 ("team: add support for per-port options") Signed-off-by: Jakub Kicinski Reviewed-by: Jiri Pirko Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 09ec15bbde5fe1f4c85f56e020469fdc076f0be9 Author: Jakub Kicinski Date: Mon Mar 2 21:05:19 2020 -0800 net: fq: add missing attribute validation for orphan mask [ Upstream commit 7e6dc03eeb023e18427a373522f1d247b916a641 ] Add missing attribute validation for TCA_FQ_ORPHAN_MASK to the netlink policy. Fixes: 06eb395fa985 ("pkt_sched: fq: better control of DDOS traffic") Signed-off-by: Jakub Kicinski Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 388b9d45aecd19c018ce8f70d9483a7d16e2cda3 Author: Jakub Kicinski Date: Mon Mar 2 21:05:17 2020 -0800 macsec: add missing attribute validation for port [ Upstream commit 31d9a1c524964bac77b7f9d0a1ac140dc6b57461 ] Add missing attribute validation for IFLA_MACSEC_PORT to the netlink policy. Fixes: c09440f7dcb3 ("macsec: introduce IEEE 802.1AE driver") Signed-off-by: Jakub Kicinski Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 89b4332a9f241cf4171f25586072440aaa415f02 Author: Jakub Kicinski Date: Mon Mar 2 21:05:16 2020 -0800 can: add missing attribute validation for termination [ Upstream commit ab02ad660586b94f5d08912a3952b939cf4c4430 ] Add missing attribute validation for IFLA_CAN_TERMINATION to the netlink policy. Fixes: 12a6075cabc0 ("can: dev: add CAN interface termination API") Signed-off-by: Jakub Kicinski Acked-by: Oliver Hartkopp Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 1768ebf32e5ecb19611dcd731dc47f1b5eab913a Author: Jakub Kicinski Date: Mon Mar 2 21:05:15 2020 -0800 nl802154: add missing attribute validation for dev_type [ Upstream commit b60673c4c418bef7550d02faf53c34fbfeb366bf ] Add missing attribute type validation for IEEE802154_ATTR_DEV_TYPE to the netlink policy. Fixes: 90c049b2c6ae ("ieee802154: interface type to be added") Signed-off-by: Jakub Kicinski Acked-by: Stefan Schmidt Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 4785c665973848c3380756705d6bff10ccd2819e Author: Jakub Kicinski Date: Mon Mar 2 21:05:14 2020 -0800 nl802154: add missing attribute validation [ Upstream commit 9322cd7c4af2ccc7fe7c5f01adb53f4f77949e92 ] Add missing attribute validation for several u8 types. Fixes: 2c21d11518b6 ("net: add NL802154 interface for configuration of 802.15.4 devices") Signed-off-by: Jakub Kicinski Acked-by: Stefan Schmidt Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 4e4a3292c69a69ce380609a9fad6ede3a8710a33 Author: Jakub Kicinski Date: Mon Mar 2 21:05:13 2020 -0800 fib: add missing attribute validation for tun_id [ Upstream commit 4c16d64ea04056f1b1b324ab6916019f6a064114 ] Add missing netlink policy entry for FRA_TUN_ID. Fixes: e7030878fc84 ("fib: Add fib rule match on tunnel id") Signed-off-by: Jakub Kicinski Reviewed-by: David Ahern Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit f9f04772a2a8fd05ef74badf4e5195b05803020b Author: Jakub Kicinski Date: Mon Mar 2 21:05:11 2020 -0800 devlink: validate length of param values [ Upstream commit 8750939b6ad86abc3f53ec8a9683a1cded4a5654 ] DEVLINK_ATTR_PARAM_VALUE_DATA may have different types so it's not checked by the normal netlink policy. Make sure the attribute length is what we expect. Fixes: e3b7ca18ad7b ("devlink: Add param set command") Signed-off-by: Jakub Kicinski Reviewed-by: Jiri Pirko Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 34636d249831ed4a21068e310ba57a5f1d9baf07 Author: Eric Dumazet Date: Wed Mar 11 11:44:26 2020 -0700 net: memcg: fix lockdep splat in inet_csk_accept() commit 06669ea346e476a5339033d77ef175566a40efbb upstream. Locking newsk while still holding the listener lock triggered a lockdep splat [1] We can simply move the memcg code after we release the listener lock, as this can also help if multiple threads are sharing a common listener. Also fix a typo while reading socket sk_rmem_alloc. [1] WARNING: possible recursive locking detected 5.6.0-rc3-syzkaller #0 Not tainted -------------------------------------------- syz-executor598/9524 is trying to acquire lock: ffff88808b5b8b90 (sk_lock-AF_INET6){+.+.}, at: lock_sock include/net/sock.h:1541 [inline] ffff88808b5b8b90 (sk_lock-AF_INET6){+.+.}, at: inet_csk_accept+0x69f/0xd30 net/ipv4/inet_connection_sock.c:492 but task is already holding lock: ffff88808b5b9590 (sk_lock-AF_INET6){+.+.}, at: lock_sock include/net/sock.h:1541 [inline] ffff88808b5b9590 (sk_lock-AF_INET6){+.+.}, at: inet_csk_accept+0x8d/0xd30 net/ipv4/inet_connection_sock.c:445 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(sk_lock-AF_INET6); lock(sk_lock-AF_INET6); *** DEADLOCK *** May be due to missing lock nesting notation 1 lock held by syz-executor598/9524: #0: ffff88808b5b9590 (sk_lock-AF_INET6){+.+.}, at: lock_sock include/net/sock.h:1541 [inline] #0: ffff88808b5b9590 (sk_lock-AF_INET6){+.+.}, at: inet_csk_accept+0x8d/0xd30 net/ipv4/inet_connection_sock.c:445 stack backtrace: CPU: 0 PID: 9524 Comm: syz-executor598 Not tainted 5.6.0-rc3-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x188/0x20d lib/dump_stack.c:118 print_deadlock_bug kernel/locking/lockdep.c:2370 [inline] check_deadlock kernel/locking/lockdep.c:2411 [inline] validate_chain kernel/locking/lockdep.c:2954 [inline] __lock_acquire.cold+0x114/0x288 kernel/locking/lockdep.c:3954 lock_acquire+0x197/0x420 kernel/locking/lockdep.c:4484 lock_sock_nested+0xc5/0x110 net/core/sock.c:2947 lock_sock include/net/sock.h:1541 [inline] inet_csk_accept+0x69f/0xd30 net/ipv4/inet_connection_sock.c:492 inet_accept+0xe9/0x7c0 net/ipv4/af_inet.c:734 __sys_accept4_file+0x3ac/0x5b0 net/socket.c:1758 __sys_accept4+0x53/0x90 net/socket.c:1809 __do_sys_accept4 net/socket.c:1821 [inline] __se_sys_accept4 net/socket.c:1818 [inline] __x64_sys_accept4+0x93/0xf0 net/socket.c:1818 do_syscall_64+0xf6/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x4445c9 Code: e8 0c 0d 03 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007ffc35b37608 EFLAGS: 00000246 ORIG_RAX: 0000000000000120 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004445c9 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003 RBP: 0000000000000000 R08: 0000000000306777 R09: 0000000000306777 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 00000000004053d0 R14: 0000000000000000 R15: 0000000000000000 Fixes: d752a4986532 ("net: memcg: late association of sock to memcg") Signed-off-by: Eric Dumazet Cc: Shakeel Butt Reported-by: syzbot Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 9d9141948a02ba57b6746e37c1f62ad983f1b311 Author: Shakeel Butt Date: Mon Mar 9 22:16:06 2020 -0700 net: memcg: late association of sock to memcg [ Upstream commit d752a4986532cb6305dfd5290a614cde8072769d ] If a TCP socket is allocated in IRQ context or cloned from unassociated (i.e. not associated to a memcg) in IRQ context then it will remain unassociated for its whole life. Almost half of the TCPs created on the system are created in IRQ context, so, memory used by such sockets will not be accounted by the memcg. This issue is more widespread in cgroup v1 where network memory accounting is opt-in but it can happen in cgroup v2 if the source socket for the cloning was created in root memcg. To fix the issue, just do the association of the sockets at the accept() time in the process context and then force charge the memory buffer already used and reserved by the socket. Signed-off-by: Shakeel Butt Reviewed-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 941464dcbcd349256149d47ff66f683ec0a3bc31 Author: Shakeel Butt Date: Mon Mar 9 22:16:05 2020 -0700 cgroup: memcg: net: do not associate sock with unrelated cgroup [ Upstream commit e876ecc67db80dfdb8e237f71e5b43bb88ae549c ] We are testing network memory accounting in our setup and noticed inconsistent network memory usage and often unrelated cgroups network usage correlates with testing workload. On further inspection, it seems like mem_cgroup_sk_alloc() and cgroup_sk_alloc() are broken in irq context specially for cgroup v1. mem_cgroup_sk_alloc() and cgroup_sk_alloc() can be called in irq context and kind of assumes that this can only happen from sk_clone_lock() and the source sock object has already associated cgroup. However in cgroup v1, where network memory accounting is opt-in, the source sock can be unassociated with any cgroup and the new cloned sock can get associated with unrelated interrupted cgroup. Cgroup v2 can also suffer if the source sock object was created by process in the root cgroup or if sk_alloc() is called in irq context. The fix is to just do nothing in interrupt. WARNING: Please note that about half of the TCP sockets are allocated from the IRQ context, so, memory used by such sockets will not be accouted by the memcg. The stack trace of mem_cgroup_sk_alloc() from IRQ-context: CPU: 70 PID: 12720 Comm: ssh Tainted: 5.6.0-smp-DEV #1 Hardware name: ... Call Trace: dump_stack+0x57/0x75 mem_cgroup_sk_alloc+0xe9/0xf0 sk_clone_lock+0x2a7/0x420 inet_csk_clone_lock+0x1b/0x110 tcp_create_openreq_child+0x23/0x3b0 tcp_v6_syn_recv_sock+0x88/0x730 tcp_check_req+0x429/0x560 tcp_v6_rcv+0x72d/0xa40 ip6_protocol_deliver_rcu+0xc9/0x400 ip6_input+0x44/0xd0 ? ip6_protocol_deliver_rcu+0x400/0x400 ip6_rcv_finish+0x71/0x80 ipv6_rcv+0x5b/0xe0 ? ip6_sublist_rcv+0x2e0/0x2e0 process_backlog+0x108/0x1e0 net_rx_action+0x26b/0x460 __do_softirq+0x104/0x2a6 do_softirq_own_stack+0x2a/0x40 do_softirq.part.19+0x40/0x50 __local_bh_enable_ip+0x51/0x60 ip6_finish_output2+0x23d/0x520 ? ip6table_mangle_hook+0x55/0x160 __ip6_finish_output+0xa1/0x100 ip6_finish_output+0x30/0xd0 ip6_output+0x73/0x120 ? __ip6_finish_output+0x100/0x100 ip6_xmit+0x2e3/0x600 ? ipv6_anycast_cleanup+0x50/0x50 ? inet6_csk_route_socket+0x136/0x1e0 ? skb_free_head+0x1e/0x30 inet6_csk_xmit+0x95/0xf0 __tcp_transmit_skb+0x5b4/0xb20 __tcp_send_ack.part.60+0xa3/0x110 tcp_send_ack+0x1d/0x20 tcp_rcv_state_process+0xe64/0xe80 ? tcp_v6_connect+0x5d1/0x5f0 tcp_v6_do_rcv+0x1b1/0x3f0 ? tcp_v6_do_rcv+0x1b1/0x3f0 __release_sock+0x7f/0xd0 release_sock+0x30/0xa0 __inet_stream_connect+0x1c3/0x3b0 ? prepare_to_wait+0xb0/0xb0 inet_stream_connect+0x3b/0x60 __sys_connect+0x101/0x120 ? __sys_getsockopt+0x11b/0x140 __x64_sys_connect+0x1a/0x20 do_syscall_64+0x51/0x200 entry_SYSCALL_64_after_hwframe+0x44/0xa9 The stack trace of mem_cgroup_sk_alloc() from IRQ-context: Fixes: 2d7580738345 ("mm: memcontrol: consolidate cgroup socket tracking") Fixes: d979a39d7242 ("cgroup: duplicate cgroup reference when cloning sockets") Signed-off-by: Shakeel Butt Reviewed-by: Roman Gushchin Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit aadf2a728fd714d683c3b2910784d9083e501405 Author: Vasundhara Volam Date: Sun Mar 1 22:07:17 2020 -0500 bnxt_en: reinitialize IRQs when MTU is modified [ Upstream commit a9b952d267e59a3b405e644930f46d252cea7122 ] MTU changes may affect the number of IRQs so we must call bnxt_close_nic()/bnxt_open_nic() with the irq_re_init parameter set to true. The reason is that a larger MTU may require aggregation rings not needed with smaller MTU. We may not be able to allocate the required number of aggregation rings and so we reduce the number of channels which will change the number of IRQs. Without this patch, it may crash eventually in pci_disable_msix() when the IRQs are not properly unwound. Fixes: c0c050c58d84 ("bnxt_en: New Broadcom ethernet driver.") Signed-off-by: Vasundhara Volam Signed-off-by: Michael Chan Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 989e5462fae3f51aa1fc8468676c3297ec79a9ce Author: Edward Cree Date: Mon Mar 9 18:16:24 2020 +0000 sfc: detach from cb_page in efx_copy_channel() [ Upstream commit 4b1bd9db078f7d5332c8601a2f5bd43cf0458fd4 ] It's a resource, not a parameter, so we can't copy it into the new channel's TX queues, otherwise aliasing will lead to resource- management bugs if the channel is subsequently torn down without being initialised. Before the Fixes:-tagged commit there was a similar bug with tsoh_page, but I'm not sure it's worth doing another fix for such old kernels. Fixes: e9117e5099ea ("sfc: Firmware-Assisted TSO version 2") Suggested-by: Derek Shute Signed-off-by: Edward Cree Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 8a207f28f55d7657364ca66920c7bb44a296ecf0 Author: You-Sheng Yang Date: Wed Feb 26 23:37:10 2020 +0800 r8152: check disconnect status after long sleep [ Upstream commit d64c7a08034b32c285e576208ae44fc3ba3fa7df ] Dell USB Type C docking WD19/WD19DC attaches additional peripherals as: /: Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/6p, 5000M |__ Port 1: Dev 11, If 0, Class=Hub, Driver=hub/4p, 5000M |__ Port 3: Dev 12, If 0, Class=Hub, Driver=hub/4p, 5000M |__ Port 4: Dev 13, If 0, Class=Vendor Specific Class, Driver=r8152, 5000M where usb 2-1-3 is a hub connecting all USB Type-A/C ports on the dock. When hotplugging such dock with additional usb devices already attached on it, the probing process may reset usb 2.1 port, therefore r8152 ethernet device is also reset. However, during r8152 device init there are several for-loops that, when it's unable to retrieve hardware registers due to being disconnected from USB, may take up to 14 seconds each in practice, and that has to be completed before USB may re-enumerate devices on the bus. As a result, devices attached to the dock will only be available after nearly 1 minute after the dock was plugged in: [ 216.388290] [250] r8152 2-1.4:1.0: usb_probe_interface [ 216.388292] [250] r8152 2-1.4:1.0: usb_probe_interface - got id [ 258.830410] r8152 2-1.4:1.0 (unnamed net_device) (uninitialized): PHY not ready [ 258.830460] r8152 2-1.4:1.0 (unnamed net_device) (uninitialized): Invalid header when reading pass-thru MAC addr [ 258.830464] r8152 2-1.4:1.0 (unnamed net_device) (uninitialized): Get ether addr fail This happens in, for example, r8153_init: static int generic_ocp_read(struct r8152 *tp, u16 index, u16 size, void *data, u16 type) { if (test_bit(RTL8152_UNPLUG, &tp->flags)) return -ENODEV; ... } static u16 ocp_read_word(struct r8152 *tp, u16 type, u16 index) { u32 data; ... generic_ocp_read(tp, index, sizeof(tmp), &tmp, type | byen); data = __le32_to_cpu(tmp); ... return (u16)data; } static void r8153_init(struct r8152 *tp) { ... if (test_bit(RTL8152_UNPLUG, &tp->flags)) return; for (i = 0; i < 500; i++) { if (ocp_read_word(tp, MCU_TYPE_PLA, PLA_BOOT_CTRL) & AUTOLOAD_DONE) break; msleep(20); } ... } Since ocp_read_word() doesn't check the return status of generic_ocp_read(), and the only exit condition for the loop is to have a match in the returned value, such loops will only ends after exceeding its maximum runs when the device has been marked as disconnected, which takes 500 * 20ms = 10 seconds in theory, 14 in practice. To solve this long latency another test to RTL8152_UNPLUG flag should be added after those 20ms sleep to skip unnecessary loops, so that the device probe can complete early and proceed to parent port reset/reprobe process. This can be reproduced on all kernel versions up to latest v5.6-rc2, but after v5.5-rc7 the reproduce rate is dramatically lowered to 1/30 or less while it was around 1/2. Signed-off-by: You-Sheng Yang Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit c8149165ac27482b575620c55fc91afe8832d8ad Author: Colin Ian King Date: Thu Mar 12 15:04:30 2020 +0000 net: systemport: fix index check to avoid an array out of bounds access [ Upstream commit c0368595c1639947839c0db8294ee96aca0b3b86 ] Currently the bounds check on index is off by one and can lead to an out of bounds access on array priv->filters_loc when index is RXCHK_BRCM_TAG_MAX. Fixes: bb9051a2b230 ("net: systemport: Add support for WAKE_FILTER") Signed-off-by: Colin Ian King Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit ba245652e198d706887412982c17ec666dd46fa7 Author: Remi Pommarel Date: Sun Mar 8 10:25:56 2020 +0100 net: stmmac: dwmac1000: Disable ACS if enhanced descs are not used [ Upstream commit b723bd933980f4956dabc8a8d84b3e83be8d094c ] ACS (auto PAD/FCS stripping) removes FCS off 802.3 packets (LLC) so that there is no need to manually strip it for such packets. The enhanced DMA descriptors allow to flag LLC packets so that the receiving callback can use that to strip FCS manually or not. On the other hand, normal descriptors do not support that. Thus in order to not truncate LLC packet ACS should be disabled when using normal DMA descriptors. Fixes: 47dd7a540b8a0 ("net: add support for STMicroelectronics Ethernet controllers.") Signed-off-by: Remi Pommarel Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 64fabf9bcadfb93fb21fb023a72a2ad8b7a40047 Author: Willem de Bruijn Date: Mon Mar 9 11:34:35 2020 -0400 net/packet: tpacket_rcv: do not increment ring index on drop [ Upstream commit 46e4c421a053c36bf7a33dda2272481bcaf3eed3 ] In one error case, tpacket_rcv drops packets after incrementing the ring producer index. If this happens, it does not update tp_status to TP_STATUS_USER and thus the reader is stalled for an iteration of the ring, causing out of order arrival. The only such error path is when virtio_net_hdr_from_skb fails due to encountering an unknown GSO type. Signed-off-by: Willem de Bruijn Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 7e78a7fdcc8d8c854e5dc8441a452baca24dec67 Author: Dan Carpenter Date: Wed Mar 4 17:24:31 2020 +0300 net: nfc: fix bounds checking bugs on "pipe" [ Upstream commit a3aefbfe45751bf7b338c181b97608e276b5bb73 ] This is similar to commit 674d9de02aa7 ("NFC: Fix possible memory corruption when handling SHDLC I-Frame commands") and commit d7ee81ad09f0 ("NFC: nci: Add some bounds checking in nci_hci_cmd_received()") which added range checks on "pipe". The "pipe" variable comes skb->data[0] in nfc_hci_msg_rx_work(). It's in the 0-255 range. We're using it as the array index into the hdev->pipes[] array which has NFC_HCI_MAX_PIPES (128) members. Fixes: 118278f20aa8 ("NFC: hci: Add pipes table to reference them with a tuple {gate, host}") Signed-off-by: Dan Carpenter Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 28cedae50965a504c3dffaa7ecdbc0a6f056543f Author: Dmitry Bogdanov Date: Tue Mar 10 18:22:24 2020 +0300 net: macsec: update SCI upon MAC address change. [ Upstream commit 6fc498bc82929ee23aa2f35a828c6178dfd3f823 ] SCI should be updated, because it contains MAC in its first 6 octets. Fixes: c09440f7dcb3 ("macsec: introduce IEEE 802.1AE driver") Signed-off-by: Dmitry Bogdanov Signed-off-by: Mark Starovoytov Signed-off-by: Igor Russkikh Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 7aa760f0f545dc859d1f273e6d52dfac4c07f45e Author: Pablo Neira Ayuso Date: Wed Feb 26 19:47:34 2020 +0100 netlink: Use netlink header as base to calculate bad attribute offset [ Upstream commit 84b3268027641401bb8ad4427a90a3cce2eb86f5 ] Userspace might send a batch that is composed of several netlink messages. The netlink_ack() function must use the pointer to the netlink header as base to calculate the bad attribute offset. Fixes: 2d4bc93368f5 ("netlink: extended ACK reporting") Signed-off-by: Pablo Neira Ayuso Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 53e404eddda4044c5a005a27a55e06037de7408c Author: Hangbin Liu Date: Sat Feb 29 17:27:13 2020 +0800 net/ipv6: use configured metric when add peer route [ Upstream commit 07758eb9ff52794fba15d03aa88d92dbd1b7d125 ] When we add peer address with metric configured, IPv4 could set the dest metric correctly, but IPv6 do not. e.g. ]# ip addr add 192.0.2.1 peer 192.0.2.2/32 dev eth1 metric 20 ]# ip route show dev eth1 192.0.2.2 proto kernel scope link src 192.0.2.1 metric 20 ]# ip addr add 2001:db8::1 peer 2001:db8::2/128 dev eth1 metric 20 ]# ip -6 route show dev eth1 2001:db8::1 proto kernel metric 20 pref medium 2001:db8::2 proto kernel metric 256 pref medium Fix this by using configured metric instead of default one. Reported-by: Jianlin Shi Fixes: 8308f3ff1753 ("net/ipv6: Add support for specifying metric of connected routes") Reviewed-by: David Ahern Signed-off-by: Hangbin Liu Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit eb273bb8205c7eeed8da0bca7842fa68fd62d0bb Author: Mahesh Bandewar Date: Mon Mar 9 15:56:56 2020 -0700 ipvlan: don't deref eth hdr before checking it's set [ Upstream commit ad8192767c9f9cf97da57b9ffcea70fb100febef ] IPvlan in L3 mode discards outbound multicast packets but performs the check before ensuring the ether-header is set or not. This is an error that Eric found through code browsing. Fixes: 2ad7bf363841 (“ipvlan: Initial check-in of the IPVLAN driver.”) Signed-off-by: Mahesh Bandewar Reported-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit cb9e7197bbebbdfd762c34f64b1e55d8b526c345 Author: Eric Dumazet Date: Mon Mar 9 18:22:58 2020 -0700 ipvlan: do not use cond_resched_rcu() in ipvlan_process_multicast() [ Upstream commit afe207d80a61e4d6e7cfa0611a4af46d0ba95628 ] Commit e18b353f102e ("ipvlan: add cond_resched_rcu() while processing muticast backlog") added a cond_resched_rcu() in a loop using rcu protection to iterate over slaves. This is breaking rcu rules, so lets instead use cond_resched() at a point we can reschedule Fixes: e18b353f102e ("ipvlan: add cond_resched_rcu() while processing muticast backlog") Signed-off-by: Eric Dumazet Cc: Mahesh Bandewar Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit ee98e615c8eee9586d0c2f2104c663a15c38ec54 Author: Jiri Wiesner Date: Sat Mar 7 13:31:57 2020 +0100 ipvlan: do not add hardware address of master to its unicast filter list [ Upstream commit 63aae7b17344d4b08a7d05cb07044de4c0f9dcc6 ] There is a problem when ipvlan slaves are created on a master device that is a vmxnet3 device (ipvlan in VMware guests). The vmxnet3 driver does not support unicast address filtering. When an ipvlan device is brought up in ipvlan_open(), the ipvlan driver calls dev_uc_add() to add the hardware address of the vmxnet3 master device to the unicast address list of the master device, phy_dev->uc. This inevitably leads to the vmxnet3 master device being forced into promiscuous mode by __dev_set_rx_mode(). Promiscuous mode is switched on the master despite the fact that there is still only one hardware address that the master device should use for filtering in order for the ipvlan device to be able to receive packets. The comment above struct net_device describes the uc_promisc member as a "counter, that indicates, that promiscuous mode has been enabled due to the need to listen to additional unicast addresses in a device that does not implement ndo_set_rx_mode()". Moreover, the design of ipvlan guarantees that only the hardware address of a master device, phy_dev->dev_addr, will be used to transmit and receive all packets from its ipvlan slaves. Thus, the unicast address list of the master device should not be modified by ipvlan_open() and ipvlan_stop() in order to make ipvlan a workable option on masters that do not support unicast address filtering. Fixes: 2ad7bf3638411 ("ipvlan: Initial check-in of the IPVLAN driver") Reported-by: Per Sundstrom Signed-off-by: Jiri Wiesner Reviewed-by: Eric Dumazet Acked-by: Mahesh Bandewar Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 79a958d8a1e55bb66fe74a49d31fd8aa8474dfc0 Author: Mahesh Bandewar Date: Mon Mar 9 15:57:02 2020 -0700 ipvlan: add cond_resched_rcu() while processing muticast backlog [ Upstream commit e18b353f102e371580f3f01dd47567a25acc3c1d ] If there are substantial number of slaves created as simulated by Syzbot, the backlog processing could take much longer and result into the issue found in the Syzbot report. INFO: rcu_sched detected stalls on CPUs/tasks: (detected by 1, t=10502 jiffies, g=5049, c=5048, q=752) All QSes seen, last rcu_sched kthread activity 10502 (4294965563-4294955061), jiffies_till_next_fqs=1, root ->qsmask 0x0 syz-executor.1 R running task on cpu 1 10984 11210 3866 0x30020008 179034491270 Call Trace: [] _sched_show_task kernel/sched/core.c:8063 [inline] [] _sched_show_task.cold+0x2fd/0x392 kernel/sched/core.c:8030 [] sched_show_task+0xb/0x10 kernel/sched/core.c:8073 [] print_other_cpu_stall kernel/rcu/tree.c:1577 [inline] [] check_cpu_stall kernel/rcu/tree.c:1695 [inline] [] __rcu_pending kernel/rcu/tree.c:3478 [inline] [] rcu_pending kernel/rcu/tree.c:3540 [inline] [] rcu_check_callbacks.cold+0xbb4/0xc29 kernel/rcu/tree.c:2876 [] update_process_times+0x32/0x80 kernel/time/timer.c:1635 [] tick_sched_handle+0xa0/0x180 kernel/time/tick-sched.c:161 [] tick_sched_timer+0x44/0x130 kernel/time/tick-sched.c:1193 [] __run_hrtimer kernel/time/hrtimer.c:1393 [inline] [] __hrtimer_run_queues+0x307/0xd90 kernel/time/hrtimer.c:1455 [] hrtimer_interrupt+0x2ea/0x730 kernel/time/hrtimer.c:1513 [] local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1031 [inline] [] smp_apic_timer_interrupt+0x144/0x5e0 arch/x86/kernel/apic/apic.c:1056 [] apic_timer_interrupt+0x8e/0xa0 arch/x86/entry/entry_64.S:778 RIP: 0010:do_raw_read_lock+0x22/0x80 kernel/locking/spinlock_debug.c:153 RSP: 0018:ffff8801dad07ab8 EFLAGS: 00000a02 ORIG_RAX: ffffffffffffff12 RAX: 0000000000000000 RBX: ffff8801c4135680 RCX: 0000000000000000 RDX: 1ffff10038826afe RSI: ffff88019d816bb8 RDI: ffff8801c41357f0 RBP: ffff8801dad07ac0 R08: 0000000000004b15 R09: 0000000000310273 R10: ffff88019d816bb8 R11: 0000000000000001 R12: ffff8801c41357e8 R13: 0000000000000000 R14: ffff8801cfb19850 R15: ffff8801cfb198b0 [] __raw_read_lock_bh include/linux/rwlock_api_smp.h:177 [inline] [] _raw_read_lock_bh+0x3e/0x50 kernel/locking/spinlock.c:240 [] ipv6_chk_mcast_addr+0x11a/0x6f0 net/ipv6/mcast.c:1006 [] ip6_mc_input+0x319/0x8e0 net/ipv6/ip6_input.c:482 [] dst_input include/net/dst.h:449 [inline] [] ip6_rcv_finish+0x408/0x610 net/ipv6/ip6_input.c:78 [] NF_HOOK include/linux/netfilter.h:292 [inline] [] NF_HOOK include/linux/netfilter.h:286 [inline] [] ipv6_rcv+0x10e/0x420 net/ipv6/ip6_input.c:278 [] __netif_receive_skb_one_core+0x12a/0x1f0 net/core/dev.c:5303 [] __netif_receive_skb+0x2c/0x1b0 net/core/dev.c:5417 [] process_backlog+0x216/0x6c0 net/core/dev.c:6243 [] napi_poll net/core/dev.c:6680 [inline] [] net_rx_action+0x47b/0xfb0 net/core/dev.c:6748 [] __do_softirq+0x2c8/0x99a kernel/softirq.c:317 [] invoke_softirq kernel/softirq.c:399 [inline] [] irq_exit+0x16a/0x1a0 kernel/softirq.c:439 [] exiting_irq arch/x86/include/asm/apic.h:561 [inline] [] smp_apic_timer_interrupt+0x165/0x5e0 arch/x86/kernel/apic/apic.c:1058 [] apic_timer_interrupt+0x8e/0xa0 arch/x86/entry/entry_64.S:778 RIP: 0010:__sanitizer_cov_trace_pc+0x26/0x50 kernel/kcov.c:102 RSP: 0018:ffff880196033bd8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff12 RAX: ffff88019d8161c0 RBX: 00000000ffffffff RCX: ffffc90003501000 RDX: 0000000000000002 RSI: ffffffff816236d1 RDI: 0000000000000005 RBP: ffff880196033bd8 R08: ffff88019d8161c0 R09: 0000000000000000 R10: 1ffff10032c067f0 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000080 R14: 0000000000000000 R15: 0000000000000000 [] do_futex+0x151/0x1d50 kernel/futex.c:3548 [] C_SYSC_futex kernel/futex_compat.c:201 [inline] [] compat_SyS_futex+0x270/0x3b0 kernel/futex_compat.c:175 [] do_syscall_32_irqs_on arch/x86/entry/common.c:353 [inline] [] do_fast_syscall_32+0x357/0xe1c arch/x86/entry/common.c:415 [] entry_SYSENTER_compat+0x8b/0x9d arch/x86/entry/entry_64_compat.S:139 RIP: 0023:0xf7f23c69 RSP: 002b:00000000f5d1f12c EFLAGS: 00000282 ORIG_RAX: 00000000000000f0 RAX: ffffffffffffffda RBX: 000000000816af88 RCX: 0000000000000080 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000816af8c RBP: 00000000f5d1f228 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 rcu_sched kthread starved for 10502 jiffies! g5049 c5048 f0x2 RCU_GP_WAIT_FQS(3) ->state=0x0 ->cpu=1 rcu_sched R running task on cpu 1 13048 8 2 0x90000000 179099587640 Call Trace: [] context_switch+0x60f/0xa60 kernel/sched/core.c:3209 [] __schedule+0x5aa/0x1da0 kernel/sched/core.c:3934 [] schedule+0x8f/0x1b0 kernel/sched/core.c:4011 [] schedule_timeout+0x50d/0xee0 kernel/time/timer.c:1803 [] rcu_gp_kthread+0xda1/0x3b50 kernel/rcu/tree.c:2327 [] kthread+0x348/0x420 kernel/kthread.c:246 [] ret_from_fork+0x56/0x70 arch/x86/entry/entry_64.S:393 Fixes: ba35f8588f47 (“ipvlan: Defer multicast / broadcast processing to a work-queue”) Signed-off-by: Mahesh Bandewar Reported-by: syzbot Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 3d95a5e34b40f7e1b25fa70cef129f7f3da4cdd3 Author: Hangbin Liu Date: Tue Mar 10 15:27:37 2020 +0800 ipv6/addrconf: call ipv6_mc_up() for non-Ethernet interface [ Upstream commit 60380488e4e0b95e9e82aa68aa9705baa86de84c ] Rafał found an issue that for non-Ethernet interface, if we down and up frequently, the memory will be consumed slowly. The reason is we add allnodes/allrouters addressed in multicast list in ipv6_add_dev(). When link down, we call ipv6_mc_down(), store all multicast addresses via mld_add_delrec(). But when link up, we don't call ipv6_mc_up() for non-Ethernet interface to remove the addresses. This makes idev->mc_tomb getting bigger and bigger. The call stack looks like: addrconf_notify(NETDEV_REGISTER) ipv6_add_dev ipv6_dev_mc_inc(ff01::1) ipv6_dev_mc_inc(ff02::1) ipv6_dev_mc_inc(ff02::2) addrconf_notify(NETDEV_UP) addrconf_dev_config /* Alas, we support only Ethernet autoconfiguration. */ return; addrconf_notify(NETDEV_DOWN) addrconf_ifdown ipv6_mc_down igmp6_group_dropped(ff02::2) mld_add_delrec(ff02::2) igmp6_group_dropped(ff02::1) igmp6_group_dropped(ff01::1) After investigating, I can't found a rule to disable multicast on non-Ethernet interface. In RFC2460, the link could be Ethernet, PPP, ATM, tunnels, etc. In IPv4, it doesn't check the dev type when calls ip_mc_up() in inetdev_event(). Even for IPv6, we don't check the dev type and call ipv6_add_dev(), ipv6_dev_mc_inc() after register device. So I think it's OK to fix this memory consumer by calling ipv6_mc_up() for non-Ethernet interface. v2: Also check IFF_MULTICAST flag to make sure the interface supports multicast Reported-by: Rafał Miłecki Tested-by: Rafał Miłecki Fixes: 74235a25c673 ("[IPV6] addrconf: Fix IPv6 on tuntap tunnels") Fixes: 1666d49e1d41 ("mld: do not remove mld souce list info when set link down") Signed-off-by: Hangbin Liu Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 24dd755fcec63d73d005ede430af288502a839cd Author: Dmitry Yakunin Date: Thu Mar 5 15:33:12 2020 +0300 inet_diag: return classid for all socket types [ Upstream commit 83f73c5bb7b9a9135173f0ba2b1aa00c06664ff9 ] In commit 1ec17dbd90f8 ("inet_diag: fix reporting cgroup classid and fallback to priority") croup classid reporting was fixed. But this works only for TCP sockets because for other socket types icsk parameter can be NULL and classid code path is skipped. This change moves classid handling to inet_diag_msg_attrs_fill() function. Also inet_diag_msg_attrs_size() helper was added and addends in nlmsg_new() were reordered to save order from inet_sk_diag_fill(). Fixes: 1ec17dbd90f8 ("inet_diag: fix reporting cgroup classid and fallback to priority") Signed-off-by: Dmitry Yakunin Reviewed-by: Konstantin Khlebnikov Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 33f0d95c107963d325af61de1f1811d17141c42c Author: Eric Dumazet Date: Sat Mar 7 22:05:14 2020 -0800 gre: fix uninit-value in __iptunnel_pull_header [ Upstream commit 17c25cafd4d3e74c83dce56b158843b19c40b414 ] syzbot found an interesting case of the kernel reading an uninit-value [1] Problem is in the handling of ETH_P_WCCP in gre_parse_header() We look at the byte following GRE options to eventually decide if the options are four bytes longer. Use skb_header_pointer() to not pull bytes if we found that no more bytes were needed. All callers of gre_parse_header() are properly using pskb_may_pull() anyway before proceeding to next header. [1] BUG: KMSAN: uninit-value in pskb_may_pull include/linux/skbuff.h:2303 [inline] BUG: KMSAN: uninit-value in __iptunnel_pull_header+0x30c/0xbd0 net/ipv4/ip_tunnel_core.c:94 CPU: 1 PID: 11784 Comm: syz-executor940 Not tainted 5.6.0-rc2-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c9/0x220 lib/dump_stack.c:118 kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:118 __msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:215 pskb_may_pull include/linux/skbuff.h:2303 [inline] __iptunnel_pull_header+0x30c/0xbd0 net/ipv4/ip_tunnel_core.c:94 iptunnel_pull_header include/net/ip_tunnels.h:411 [inline] gre_rcv+0x15e/0x19c0 net/ipv6/ip6_gre.c:606 ip6_protocol_deliver_rcu+0x181b/0x22c0 net/ipv6/ip6_input.c:432 ip6_input_finish net/ipv6/ip6_input.c:473 [inline] NF_HOOK include/linux/netfilter.h:307 [inline] ip6_input net/ipv6/ip6_input.c:482 [inline] ip6_mc_input+0xdf2/0x1460 net/ipv6/ip6_input.c:576 dst_input include/net/dst.h:442 [inline] ip6_rcv_finish net/ipv6/ip6_input.c:76 [inline] NF_HOOK include/linux/netfilter.h:307 [inline] ipv6_rcv+0x683/0x710 net/ipv6/ip6_input.c:306 __netif_receive_skb_one_core net/core/dev.c:5198 [inline] __netif_receive_skb net/core/dev.c:5312 [inline] netif_receive_skb_internal net/core/dev.c:5402 [inline] netif_receive_skb+0x66b/0xf20 net/core/dev.c:5461 tun_rx_batched include/linux/skbuff.h:4321 [inline] tun_get_user+0x6aef/0x6f60 drivers/net/tun.c:1997 tun_chr_write_iter+0x1f2/0x360 drivers/net/tun.c:2026 call_write_iter include/linux/fs.h:1901 [inline] new_sync_write fs/read_write.c:483 [inline] __vfs_write+0xa5a/0xca0 fs/read_write.c:496 vfs_write+0x44a/0x8f0 fs/read_write.c:558 ksys_write+0x267/0x450 fs/read_write.c:611 __do_sys_write fs/read_write.c:623 [inline] __se_sys_write fs/read_write.c:620 [inline] __ia32_sys_write+0xdb/0x120 fs/read_write.c:620 do_syscall_32_irqs_on arch/x86/entry/common.c:339 [inline] do_fast_syscall_32+0x3c7/0x6e0 arch/x86/entry/common.c:410 entry_SYSENTER_compat+0x68/0x77 arch/x86/entry/entry_64_compat.S:139 RIP: 0023:0xf7f62d99 Code: 90 e8 0b 00 00 00 f3 90 0f ae e8 eb f9 8d 74 26 00 89 3c 24 c3 90 90 90 90 90 90 90 90 90 90 90 90 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 eb 0d 90 90 90 90 90 90 90 90 90 90 90 90 RSP: 002b:00000000fffedb2c EFLAGS: 00000217 ORIG_RAX: 0000000000000004 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000020002580 RDX: 0000000000000fca RSI: 0000000000000036 RDI: 0000000000000004 RBP: 0000000000008914 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:144 [inline] kmsan_internal_poison_shadow+0x66/0xd0 mm/kmsan/kmsan.c:127 kmsan_slab_alloc+0x8a/0xe0 mm/kmsan/kmsan_hooks.c:82 slab_alloc_node mm/slub.c:2793 [inline] __kmalloc_node_track_caller+0xb40/0x1200 mm/slub.c:4401 __kmalloc_reserve net/core/skbuff.c:142 [inline] __alloc_skb+0x2fd/0xac0 net/core/skbuff.c:210 alloc_skb include/linux/skbuff.h:1051 [inline] alloc_skb_with_frags+0x18c/0xa70 net/core/skbuff.c:5766 sock_alloc_send_pskb+0xada/0xc60 net/core/sock.c:2242 tun_alloc_skb drivers/net/tun.c:1529 [inline] tun_get_user+0x10ae/0x6f60 drivers/net/tun.c:1843 tun_chr_write_iter+0x1f2/0x360 drivers/net/tun.c:2026 call_write_iter include/linux/fs.h:1901 [inline] new_sync_write fs/read_write.c:483 [inline] __vfs_write+0xa5a/0xca0 fs/read_write.c:496 vfs_write+0x44a/0x8f0 fs/read_write.c:558 ksys_write+0x267/0x450 fs/read_write.c:611 __do_sys_write fs/read_write.c:623 [inline] __se_sys_write fs/read_write.c:620 [inline] __ia32_sys_write+0xdb/0x120 fs/read_write.c:620 do_syscall_32_irqs_on arch/x86/entry/common.c:339 [inline] do_fast_syscall_32+0x3c7/0x6e0 arch/x86/entry/common.c:410 entry_SYSENTER_compat+0x68/0x77 arch/x86/entry/entry_64_compat.S:139 Fixes: 95f5c64c3c13 ("gre: Move utility functions to common headers") Fixes: c54419321455 ("GRE: Refactor GRE tunneling code.") Signed-off-by: Eric Dumazet Reported-by: syzbot Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit d6d11db20e8fdd565868d6d7cf24cecd00f02ebb Author: Dmitry Yakunin Date: Thu Mar 5 17:45:57 2020 +0300 cgroup, netclassid: periodically release file_lock on classid updating [ Upstream commit 018d26fcd12a75fb9b5fe233762aa3f2f0854b88 ] In our production environment we have faced with problem that updating classid in cgroup with heavy tasks cause long freeze of the file tables in this tasks. By heavy tasks we understand tasks with many threads and opened sockets (e.g. balancers). This freeze leads to an increase number of client timeouts. This patch implements following logic to fix this issue: аfter iterating 1000 file descriptors file table lock will be released thus providing a time gap for socket creation/deletion. Now update is non atomic and socket may be skipped using calls: dup2(oldfd, newfd); close(oldfd); But this case is not typical. Moreover before this patch skip is possible too by hiding socket fd in unix socket buffer. New sockets will be allocated with updated classid because cgroup state is updated before start of the file descriptors iteration. So in common cases this patch has no side effects. Signed-off-by: Dmitry Yakunin Reviewed-by: Konstantin Khlebnikov Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit ba389f3620301d29016831ebcdb7f164fc94d29f Author: Florian Fainelli Date: Thu Feb 20 15:34:53 2020 -0800 net: phy: Avoid multiple suspends commit 503ba7c6961034ff0047707685644cad9287c226 upstream. It is currently possible for a PHY device to be suspended as part of a network device driver's suspend call while it is still being attached to that net_device, either via phy_suspend() or implicitly via phy_stop(). Later on, when the MDIO bus controller get suspended, we would attempt to suspend again the PHY because it is still attached to a network device. This is both a waste of time and creates an opportunity for improper clock/power management bugs to creep in. Fixes: 803dd9c77ac3 ("net: phy: avoid suspending twice a PHY") Signed-off-by: Florian Fainelli Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 458c058c9cd4ace37ebf9f93250bf8c5b5cc1582 Author: David S. Miller Date: Tue Dec 4 08:47:44 2018 -0800 phy: Revert toggling reset changes. commit 7b566f70e1bf65b189b66eb3de6f431c30f7dff2 upstream. This reverts: ef1b5bf506b1 ("net: phy: Fix not to call phy_resume() if PHY is not attached") 8c85f4b81296 ("net: phy: micrel: add toggling phy reset if PHY is not attached") Andrew Lunn informs me that there are alternative efforts underway to fix this more properly. Signed-off-by: David S. Miller [just take the ef1b5bf506b1 revert - gregkh] Signed-off-by: Greg Kroah-Hartman