Compare commits

...

84 Commits

Author SHA1 Message Date
Linus Torvalds
fd95357fd8 Merge tag 'sched_ext-for-6.18-rc6-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext
Pull sched_ext fix from Tejun Heo:
 "One low risk and obvious fix: scx_enable() was dereferencing an error
  pointer on helper kthread creation failure. Fixed"

* tag 'sched_ext-for-6.18-rc6-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext:
  sched_ext: Fix scx_enable() crash on helper kthread creation failure
2025-11-20 11:04:37 -08:00
Linus Torvalds
c966813ea1 Merge tag 'slab-for-6.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab
Pull slab fix from Vlastimil Babka:

 - Fix mempool poisoning order>0 pages with CONFIG_HIGHMEM (Vlastimil Babka)

* tag 'slab-for-6.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
  mm/mempool: fix poisoning order>0 pages with HIGHMEM
2025-11-20 10:49:12 -08:00
Saket Kumar Bhaskar
7b6216baae sched_ext: Fix scx_enable() crash on helper kthread creation failure
A crash was observed when the sched_ext selftests runner was
terminated with Ctrl+\ while test 15 was running:

NIP [c00000000028fa58] scx_enable.constprop.0+0x358/0x12b0
LR [c00000000028fa2c] scx_enable.constprop.0+0x32c/0x12b0
Call Trace:
scx_enable.constprop.0+0x32c/0x12b0 (unreliable)
bpf_struct_ops_link_create+0x18c/0x22c
__sys_bpf+0x23f8/0x3044
sys_bpf+0x2c/0x6c
system_call_exception+0x124/0x320
system_call_vectored_common+0x15c/0x2ec

kthread_run_worker() returns an ERR_PTR() on failure rather than NULL,
but the current code in scx_alloc_and_add_sched() only checks for a NULL
helper. Incase of failure on SIGQUIT, the error is not handled in
scx_alloc_and_add_sched() and scx_enable() ends up dereferencing an
error pointer.

Error handling is fixed in scx_alloc_and_add_sched() to propagate
PTR_ERR() into ret, so that scx_enable() jumps to the existing error
path, avoiding random dereference on failure.

Fixes: bff3b5aec1 ("sched_ext: Move disable machinery into scx_sched")
Cc: stable@vger.kernel.org # v6.16+
Reported-and-tested-by: Samir Mulani <samir@linux.ibm.com>
Signed-off-by: Saket Kumar Bhaskar <skb99@linux.ibm.com>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Reviewed-by: Andrea Righi <arighi@nvidia.com>
Reviewed-by: Vishal Chourasia <vishalc@linux.ibm.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-11-20 08:45:43 -10:00
Linus Torvalds
07e09c3233 Merge tag 'pm-6.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fix from Rafael Wysocki:
 "Fix a regression introduced during the 6.16 development cycle that may
  cause runtime PM to be enabled by mistake for devices that do not
  support it (which may lead to some serious trouble) if there is a
  system wakeup event during the "late suspend" phase of system suspend"

* tag 'pm-6.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM: sleep: core: Fix runtime PM enabling in device_resume_early()
2025-11-20 09:46:52 -08:00
Linus Torvalds
1753d40dce Merge tag 'acpi-6.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
 "This fixes EINJV2 support introduced during the 6.17 cycle by
  unbreaking the initialization broken by a previous attempted fix,
  adding sanity checks for data coming from the platform firmware, and
  updating the code to handle injecting legacy error types on an EINJV2
  capable systems properly (Tony Luck)"

* tag 'acpi-6.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: APEI: EINJ: Fix EINJV2 initialization and injection
2025-11-20 09:44:27 -08:00
Linus Torvalds
6ba3bb3348 Merge tag 'platform-drivers-x86-v6.18-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver fixes from Ilpo Järvinen:
 "This one has lots of new HW entries which adds to the size in diffstat
  but the individual changes are simple.

  Fixes

   - acer-wmi: Ignore backlight event

   - alienware-wmi-wmax: Fix quirk match table order & drop redundant
     entries

   - amd/pmc:
      - Add Xbox Ally to spurious 8042 quirk list
      - Quirk list Lenovo Legion Go 2 NVMe resume

   - msi-wmi-platform:
      - Correct GUID to uppercase
      - GUID is uncleverly copy-pasted from an example so add a DMI
        whitelist

   - intel/speed_select_if: PCIBIOS_* return code conversion

   - intel-uncore-freq & ISST: Fix kernel doc warnings

  New HW support

   - alienware-wmi-wmax:
      - Alienware 16 Aurora support
      - Alienware M support
      - Alienware X support
      - Dell G support

   - amd/pmc:
      - ROG Xbox Ally (non-X) support

   - huaway-wmi: HONOR MagicBoox X16/X14 PrintScreen & YOYO keys

   - hp-wmi:
      - Omen 16-wf1xxx fan support
      - Omen MAX 16-ah0xx fan + thermal profile support
      - Victus 16-r0 and 16-s0 fan + thermal profile support

   - intel/hid: Intel Nova Lake support

   - intel-uncore-freq:
      - Intel Panther Lake support
      - Intel Wildcat Lake support
      - Intel Nova Lake support"

* tag 'platform-drivers-x86-v6.18-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (21 commits)
  platform/x86: intel-uncore-freq: fix all header kernel-doc warnings
  platform/x86: acer-wmi: Ignore backlight event
  platform/x86/intel/speed_select_if: Convert PCIBIOS_* return codes to errnos
  platform/x86/intel/hid: Add Nova Lake support
  platform/x86: alienware-wmi-wmax: Add AWCC support to Alienware 16 Aurora
  platform/x86: hp-wmi: Add Omen MAX 16-ah0xx fan support and thermal profile
  platform/x86: msi-wmi-platform: Fix typo in WMI GUID
  platform/x86: msi-wmi-platform: Only load on MSI devices
  platform/x86/amd: pmc: Add Lenovo Legion Go 2 to pmc quirk list
  platform/x86/amd/pmc: Add spurious_8042 to Xbox Ally
  platform/x86/amd/pmc: Add support for Van Gogh SoC
  platform/x86: alienware-wmi-wmax: Add support for the whole "G" family
  platform/x86: alienware-wmi-wmax: Add support for the whole "X" family
  platform/x86: alienware-wmi-wmax: Add support for the whole "M" family
  platform/x86: alienware-wmi-wmax: Drop redundant DMI entries
  platform/x86: alienware-wmi-wmax: Fix "Alienware m16 R1 AMD" quirk order
  platform/x86: ISST: isst_if.h: fix all kernel-doc warnings
  platform/x86: intel-uncore-freq: Add additional client processors
  platform/x86: hp-wmi: Add Omen 16-wf1xxx fan support
  platform/x86: huawei-wmi: add keys for HONOR models
  ...
2025-11-20 09:39:34 -08:00
Linus Torvalds
8e621c9a33 Merge tag 'net-6.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "Including fixes from IPsec and wireless.

  Previous releases - regressions:

   - prevent NULL deref in generic_hwtstamp_ioctl_lower(),
     newer APIs don't populate all the pointers in the request

   - phylink: add missing supported link modes for the fixed-link

   - mptcp: fix false positive warning in mptcp_pm_nl_rm_addr

  Previous releases - always broken:

   - openvswitch: remove never-working support for setting NSH fields

   - xfrm: number of fixes for error paths of xfrm_state creation/
     modification/deletion

   - xfrm: fixes for offload
      - fix the determination of the protocol of the inner packet
      - don't push locally generated packets directly to L2 tunnel
        mode offloading, they still need processing from the standard
        xfrm path

   - mptcp: fix a couple of corner cases in fallback and fastclose
     handling

   - wifi: rtw89: hw_scan: prevent connections from getting stuck,
     work around apparent bug in FW by tweaking messages we send

   - af_unix: fix duplicate data if PEEK w/ peek_offset needs to wait

   - veth: more robust handing of race to avoid txq getting stuck

   - eth: ps3_gelic_net: handle skb allocation failures"

* tag 'net-6.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (47 commits)
  vsock: Ignore signal/timeout on connect() if already established
  be2net: pass wrb_params in case of OS2BMC
  l2tp: reset skb control buffer on xmit
  net: dsa: microchip: lan937x: Fix RGMII delay tuning
  selftests: mptcp: add a check for 'add_addr_accepted'
  mptcp: fix address removal logic in mptcp_pm_nl_rm_addr
  selftests: mptcp: join: userspace: longer timeout
  selftests: mptcp: join: endpoints: longer timeout
  selftests: mptcp: join: fastclose: remove flaky marks
  mptcp: fix duplicate reset on fastclose
  mptcp: decouple mptcp fastclose from tcp close
  mptcp: do not fallback when OoO is present
  mptcp: fix premature close in case of fallback
  mptcp: avoid unneeded subflow-level drops
  mptcp: fix ack generation for fallback msk
  wifi: rtw89: hw_scan: Don't let the operating channel be last
  net: phylink: add missing supported link modes for the fixed-link
  selftest: af_unix: Add test for SO_PEEK_OFF.
  af_unix: Read sk_peek_offset() again after sleeping in unix_stream_read_generic().
  net/mlx5: Clean up only new IRQ glue on request_irq() failure
  ...
2025-11-20 08:52:07 -08:00
Michal Luczaj
002541ef65 vsock: Ignore signal/timeout on connect() if already established
During connect(), acting on a signal/timeout by disconnecting an already
established socket leads to several issues:

1. connect() invoking vsock_transport_cancel_pkt() ->
   virtio_transport_purge_skbs() may race with sendmsg() invoking
   virtio_transport_get_credit(). This results in a permanently elevated
   `vvs->bytes_unsent`. Which, in turn, confuses the SOCK_LINGER handling.

2. connect() resetting a connected socket's state may race with socket
   being placed in a sockmap. A disconnected socket remaining in a sockmap
   breaks sockmap's assumptions. And gives rise to WARNs.

3. connect() transitioning SS_CONNECTED -> SS_UNCONNECTED allows for a
   transport change/drop after TCP_ESTABLISHED. Which poses a problem for
   any simultaneous sendmsg() or connect() and may result in a
   use-after-free/null-ptr-deref.

Do not disconnect socket on signal/timeout. Keep the logic for unconnected
sockets: they don't linger, can't be placed in a sockmap, are rejected by
sendmsg().

[1]: https://lore.kernel.org/netdev/e07fd95c-9a38-4eea-9638-133e38c2ec9b@rbox.co/
[2]: https://lore.kernel.org/netdev/20250317-vsock-trans-signal-race-v4-0-fc8837f3f1d4@rbox.co/
[3]: https://lore.kernel.org/netdev/60f1b7db-3099-4f6a-875e-af9f6ef194f6@rbox.co/

Fixes: d021c34405 ("VSOCK: Introduce VM Sockets")
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://patch.msgid.link/20251119-vsock-interrupted-connect-v2-1-70734cf1233f@rbox.co
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-20 07:40:06 -08:00
Andrey Vatoropin
7d277a7a58 be2net: pass wrb_params in case of OS2BMC
be_insert_vlan_in_pkt() is called with the wrb_params argument being NULL
at be_send_pkt_to_bmc() call site.  This may lead to dereferencing a NULL
pointer when processing a workaround for specific packet, as commit
bc0c3405ab ("be2net: fix a Tx stall bug caused by a specific ipv6
packet") states.

The correct way would be to pass the wrb_params from be_xmit().

Fixes: 760c295e0e ("be2net: Support for OS2BMC.")
Cc: stable@vger.kernel.org
Signed-off-by: Andrey Vatoropin <a.vatoropin@crpt.ru>
Link: https://patch.msgid.link/20251119105015.194501-1-a.vatoropin@crpt.ru
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-20 07:39:54 -08:00
Paolo Abeni
dc9e7e652f Merge tag 'wireless-2025-11-20' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless
Johannes Berg says:

====================
wireless-2025-11-20

A single fix for scanning on some rtw89 devices.

* tag 'wireless-2025-11-20' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
  wifi: rtw89: hw_scan: Don't let the operating channel be last
====================

Link: https://patch.msgid.link/20251120085433.8601-3-johannes@sipsolutions.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-20 13:03:43 +01:00
David Bauer
d70b592551 l2tp: reset skb control buffer on xmit
The L2TP stack did not reset the skb control buffer before sending the
encapsulated package.

In a setup with an ath10k radio and batman-adv over an L2TP tunnel
massive fragmentations happen sporadically if the L2TP tunnel is
established over IPv4.

L2TP might reset some of the fields in the IP control buffer, but L2TP
assumes the type of the control buffer to be of an IPv4 packet.

In case the L2TP interface is used as a batadv hardif or the packet is
an IPv6 packet, this assumption breaks.

Clear the entire control buffer to avoid such mishaps altogether.

Fixes: f77ae93904 ("[PPPOL2TP]: Reset meta-data in xmit function")
Signed-off-by: David Bauer <mail@david-bauer.net>
Link: https://patch.msgid.link/20251118001619.242107-1-mail@david-bauer.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-20 11:52:24 +01:00
Oleksij Rempel
3ceb6ac211 net: dsa: microchip: lan937x: Fix RGMII delay tuning
Correct RGMII delay application logic in lan937x_set_tune_adj().

The function was missing `data16 &= ~PORT_TUNE_ADJ` before setting the
new delay value. This caused the new value to be bitwise-OR'd with the
existing PORT_TUNE_ADJ field instead of replacing it.

For example, when setting the RGMII 2 TX delay on port 4, the
intended TUNE_ADJUST value of 0 (RGMII_2_TX_DELAY_2NS) was
incorrectly OR'd with the default 0x1B (from register value 0xDA3),
leaving the delay at the wrong setting.

This patch adds the missing mask to clear the field, ensuring the
correct delay value is written. Physical measurements on the RGMII TX
lines confirm the fix, showing the delay changing from ~1ns (before
change) to ~2ns.

While testing on i.MX 8MP showed this was within the platform's timing
tolerance, it did not match the intended hardware-characterized value.

Fixes: b19ac41faa ("net: dsa: microchip: apply rgmii tx and rx delay in phylink mac config")
Cc: stable@vger.kernel.org
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://patch.msgid.link/20251114090951.4057261-1-o.rempel@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-20 11:26:14 +01:00
Johannes Berg
0ff8eeafba Merge tag 'rtw-2025-11-20' of https://github.com/pkshih/rtw
Ping-Ke Shih says:
==================
rtw patches for v6.18-rc7

Fix firmware goes wrong and causes device unusable after scanning. This
issue presents under certain regulatory domain reported from end users.
==================

Link: https://patch.msgid.link/8217bee0-96c4-44c1-9593-2e9ca12eccc5@RTKEXHMBS03.realtek.com.tw
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-11-20 09:44:08 +01:00
Jakub Kicinski
f170b1dc26 Merge branch '200GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2025-11-18 (idpf, ice)

This series contains updates to idpf and ice drivers.

Emil adds a check for NULL vport_config during removal to avoid NULL
pointer dereference in idpf.

Grzegorz fixes PTP teardown paths to account for some missed cleanups
for ice driver.

* '200GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  ice: fix PTP cleanup on driver removal in error path
  idpf: fix possible vport_config NULL pointer deref in remove
====================

Link: https://patch.msgid.link/20251118235207.2165495-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-19 20:10:53 -08:00
Jakub Kicinski
4026310a04 Merge branch 'mptcp-misc-fixes-for-v6-18-rc7'
Matthieu Baerts says:

====================
mptcp: misc fixes for v6.18-rc7

Here are various unrelated fixes:

- Patch 1: Fix window space computation for fallback connections which
  can affect ACK generation. A fix for v5.11.

- Patch 2: Avoid unneeded subflow-level drops due to unsynced received
  window. A fix for v5.11.

- Patch 3: Avoid premature close for fallback connections with PREEMPT
  kernels. A fix for v5.12.

- Patch 4: Reset instead of fallback in case of data in the MPTCP
  out-of-order queue. A fix for v5.7.

- Patches 5-7: Avoid also sending "plain" TCP reset when closing with an
  MP_FASTCLOSE. A fix for v6.1.

- Patches 8-9: Longer timeout for background connections in MPTCP Join
  selftests. An additional fix for recent patches for v5.13/v6.1.

- Patches 10-11: Fix typo in a check introduce in a recent refactoring.
  A fix for v6.15.
====================

Link: https://patch.msgid.link/20251118-net-mptcp-misc-fixes-6-18-rc6-v1-0-806d3781c95f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-19 20:07:19 -08:00
Gang Yan
0eee0fdf9b selftests: mptcp: add a check for 'add_addr_accepted'
The previous patch fixed an issue with the 'add_addr_accepted' counter.
This was not spot by the test suite.

Check this counter and 'add_addr_signal' in MPTCP Join 'delete re-add
signal' test. This should help spotting similar regressions later on.
These counters are crucial for ensuring the MPTCP path manager correctly
handles the subflow creation via 'ADD_ADDR'.

Signed-off-by: Gang Yan <yangang@kylinos.cn>
Reviewed-by: Geliang Tang <geliang@kernel.org>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251118-net-mptcp-misc-fixes-6-18-rc6-v1-11-806d3781c95f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-19 20:07:16 -08:00
Gang Yan
92e239e36d mptcp: fix address removal logic in mptcp_pm_nl_rm_addr
Fix inverted WARN_ON_ONCE condition that prevented normal address
removal counter updates. The current code only executes decrement
logic when the counter is already 0 (abnormal state), while
normal removals (counter > 0) are ignored.

Signed-off-by: Gang Yan <yangang@kylinos.cn>
Fixes: 6361139185 ("mptcp: pm: remove '_nl' from mptcp_pm_nl_rm_addr_received")
Cc: stable@vger.kernel.org
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251118-net-mptcp-misc-fixes-6-18-rc6-v1-10-806d3781c95f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-19 20:07:15 -08:00
Matthieu Baerts (NGI0)
0e4ec14dc1 selftests: mptcp: join: userspace: longer timeout
In rare cases, when the test environment is very slow, some userspace
tests can fail because some expected events have not been seen.

Because the tests are expecting a long on-going connection, and they are
not waiting for the end of the transfer, it is fine to have a longer
timeout, and even go over the default one. This connection will be
killed at the end, after the verifications: increasing the timeout
doesn't change anything, apart from avoiding it to end before the end of
the verifications.

To play it safe, all userspace tests not waiting for the end of the
transfer are now having a longer timeout: 2 minutes.

The Fixes commit was making the connection longer, but still, the
default timeout would have stopped it after 1 minute, which might not be
enough in very slow environments.

Fixes: 290493078b ("selftests: mptcp: join: userspace: longer transfer")
Cc: stable@vger.kernel.org
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Geliang Tang <geliang@kernel.org>
Link: https://patch.msgid.link/20251118-net-mptcp-misc-fixes-6-18-rc6-v1-9-806d3781c95f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-19 20:07:15 -08:00
Matthieu Baerts (NGI0)
fb13c6bb81 selftests: mptcp: join: endpoints: longer timeout
In rare cases, when the test environment is very slow, some endpoints
tests can fail because some expected events have not been seen.

Because the tests are expecting a long on-going connection, and they are
not waiting for the end of the transfer, it is fine to have a longer
timeout, and even go over the default one. This connection will be
killed at the end, after the verifications: increasing the timeout
doesn't change anything, apart from avoiding it to end before the end of
the verifications.

To play it safe, all endpoints tests not waiting for the end of the
transfer are now having a longer timeout: 2 minutes.

The Fixes commit was making the connection longer, but still, the
default timeout would have stopped it after 1 minute, which might not be
enough in very slow environments.

Fixes: 6457595db9 ("selftests: mptcp: join: endpoints: longer transfer")
Cc: stable@vger.kernel.org
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Geliang Tang <geliang@kernel.org>
Link: https://patch.msgid.link/20251118-net-mptcp-misc-fixes-6-18-rc6-v1-8-806d3781c95f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-19 20:07:15 -08:00
Matthieu Baerts (NGI0)
efff6cd53a selftests: mptcp: join: fastclose: remove flaky marks
After recent fixes like the parent commit, and "selftests: mptcp:
connect: trunc: read all recv data", the two fastclose subtests no
longer look flaky any more.

It then feels fine to remove these flaky marks, to no longer ignore
these subtests in case of errors.

Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251118-net-mptcp-misc-fixes-6-18-rc6-v1-7-806d3781c95f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-19 20:07:15 -08:00
Paolo Abeni
ae15506024 mptcp: fix duplicate reset on fastclose
The CI reports sporadic failures of the fastclose self-tests. The root
cause is a duplicate reset, not carrying the relevant MPTCP option.
In the failing scenario the bad reset is received by the peer before
the fastclose one, preventing the reception of the latter.

Indeed there is window of opportunity at fastclose time for the
following race:

  mptcp_do_fastclose
    __mptcp_close_ssk
      __tcp_close()
        tcp_set_state() [1]
        tcp_send_active_reset() [2]

After [1] the stack will send reset to in-flight data reaching the now
closed port. Such reset may race with [2].

Address the issue explicitly sending a single reset on fastclose before
explicitly moving the subflow to close status.

Fixes: d21f834855 ("mptcp: use fastclose on more edge scenarios")
Cc: stable@vger.kernel.org
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/596
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Geliang Tang <geliang@kernel.org>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251118-net-mptcp-misc-fixes-6-18-rc6-v1-6-806d3781c95f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-19 20:07:15 -08:00
Paolo Abeni
fff0c87996 mptcp: decouple mptcp fastclose from tcp close
With the current fastclose implementation, the mptcp_do_fastclose()
helper is in charge of two distinct actions: send the fastclose reset
and cleanup the subflows.

Formally decouple the two steps, ensuring that mptcp explicitly closes
all the subflows after the mentioned helper.

This will make the upcoming fix simpler, and allows dropping the 2nd
argument from mptcp_destroy_common(). The Fixes tag is then the same as
in the next commit to help with the backports.

Fixes: d21f834855 ("mptcp: use fastclose on more edge scenarios")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Geliang Tang <geliang@kernel.org>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251118-net-mptcp-misc-fixes-6-18-rc6-v1-5-806d3781c95f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-19 20:07:15 -08:00
Paolo Abeni
1bba3f219c mptcp: do not fallback when OoO is present
In case of DSS corruption, the MPTCP protocol tries to avoid the subflow
reset if fallback is possible. Such corruptions happen in the receive
path; to ensure fallback is possible the stack additionally needs to
check for OoO data, otherwise the fallback will break the data stream.

Fixes: e32d262c89 ("mptcp: handle consistently DSS corruption")
Cc: stable@vger.kernel.org
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/598
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251118-net-mptcp-misc-fixes-6-18-rc6-v1-4-806d3781c95f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-19 20:07:14 -08:00
Paolo Abeni
17393fa7b7 mptcp: fix premature close in case of fallback
I'm observing very frequent self-tests failures in case of fallback when
running on a CONFIG_PREEMPT kernel.

The root cause is that subflow_sched_work_if_closed() closes any subflow
as soon as it is half-closed and has no incoming data pending.

That works well for regular subflows - MPTCP needs bi-directional
connectivity to operate on a given subflow - but for fallback socket is
race prone.

When TCP peer closes the connection before the MPTCP one,
subflow_sched_work_if_closed() will schedule the MPTCP worker to
gracefully close the subflow, and shortly after will do another schedule
to inject and process a dummy incoming DATA_FIN.

On CONFIG_PREEMPT kernel, the MPTCP worker can kick-in and close the
fallback subflow before subflow_sched_work_if_closed() is able to create
the dummy DATA_FIN, unexpectedly interrupting the transfer.

Address the issue explicitly avoiding closing fallback subflows on when
the peer is only half-closed.

Note that, when the subflow is able to create the DATA_FIN before the
worker invocation, the worker will change the msk state before trying to
close the subflow and will skip the latter operation as the msk will not
match anymore the precondition in __mptcp_close_subflow().

Fixes: f09b0ad55a ("mptcp: close subflow when receiving TCP+FIN")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251118-net-mptcp-misc-fixes-6-18-rc6-v1-3-806d3781c95f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-19 20:07:14 -08:00
Paolo Abeni
4f102d747c mptcp: avoid unneeded subflow-level drops
The rcv window is shared among all the subflows. Currently, MPTCP sync
the TCP-level rcv window with the MPTCP one at tcp_transmit_skb() time.

The above means that incoming data may sporadically observe outdated
TCP-level rcv window and being wrongly dropped by TCP.

Address the issue checking for the edge condition before queuing the
data at TCP level, and eventually syncing the rcv window as needed.

Note that the issue is actually present from the very first MPTCP
implementation, but backports older than the blamed commit below will
range from impossible to useless.

Before:

  $ nstat -n; sleep 1; nstat -z TcpExtBeyondWindow
  TcpExtBeyondWindow              14                 0.0

After:

  $ nstat -n; sleep 1; nstat -z TcpExtBeyondWindow
  TcpExtBeyondWindow              0                  0.0

Fixes: fa3fe2b150 ("mptcp: track window announced to peer")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251118-net-mptcp-misc-fixes-6-18-rc6-v1-2-806d3781c95f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-19 20:07:14 -08:00
Paolo Abeni
5e15395f6d mptcp: fix ack generation for fallback msk
mptcp_cleanup_rbuf() needs to know the last most recent, mptcp-level
rcv_wnd sent, and such information is tracked into the msk->old_wspace
field, updated at ack transmission time by mptcp_write_options().

Fallback socket do not add any mptcp options, such helper is never
invoked, and msk->old_wspace value remain stale. That in turn makes
ack generation at recvmsg() time quite random.

Address the issue ensuring mptcp_write_options() is invoked even for
fallback sockets, and just update the needed info in such a case.

The issue went unnoticed for a long time, as mptcp currently overshots
the fallback socket receive buffer autotune significantly. It is going
to change in the near future.

Fixes: e3859603ba ("mptcp: better msk receive window updates")
Cc: stable@vger.kernel.org
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/594
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Geliang Tang <geliang@kernel.org>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251118-net-mptcp-misc-fixes-6-18-rc6-v1-1-806d3781c95f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-19 20:07:14 -08:00
Bitterblue Smith
e837b9091b wifi: rtw89: hw_scan: Don't let the operating channel be last
Scanning can be offloaded to the firmware. To that end, the driver
prepares a list of channels to scan, including periodic visits back to
the operating channel, and sends the list to the firmware.

When the channel list is too long to fit in a single H2C message, the
driver splits the list, sends the first part, and tells the firmware to
scan. When the scan is complete, the driver sends the next part of the
list and tells the firmware to scan.

When the last channel that fit in the H2C message is the operating
channel something seems to go wrong in the firmware. It will
acknowledge receiving the list of channels but apparently it will not
do anything more. The AP can't be pinged anymore. The driver still
receives beacons, though.

One way to avoid this is to split the list of channels before the
operating channel.

Affected devices:

* RTL8851BU with firmware 0.29.41.3
* RTL8832BU with firmware 0.29.29.8
* RTL8852BE with firmware 0.29.29.8

The commit 57a5fbe39a ("wifi: rtw89: refactor flow that hw scan handles channel list")
is found by git blame, but it is actually to refine the scan flow, but not
a culprit, so skip Fixes tag.

Reported-by: Bitterblue Smith <rtl8821cerfe2@gmail.com>
Closes: https://lore.kernel.org/linux-wireless/0abbda91-c5c2-4007-84c8-215679e652e1@gmail.com/
Cc: stable@vger.kernel.org # 6.16+
Signed-off-by: Bitterblue Smith <rtl8821cerfe2@gmail.com>
Acked-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/c1e61744-8db4-4646-867f-241b47d30386@gmail.com
2025-11-20 11:36:01 +08:00
Wei Fang
e31a11be41 net: phylink: add missing supported link modes for the fixed-link
Pause, Asym_Pause and Autoneg bits are not set when pl->supported is
initialized, so these link modes will not work for the fixed-link. This
leads to a TCP performance degradation issue observed on the i.MX943
platform.

The switch CPU port of i.MX943 is connected to an ENETC MAC, this link
is a fixed link and the link speed is 2.5Gbps. And one of the switch
user ports is the RGMII interface, and its link speed is 1Gbps. If the
flow-control of the fixed link is not enabled, we can easily observe
the iperf performance of TCP packets is very low. Because the inbound
rate on the CPU port is greater than the outbound rate on the user port,
the switch is prone to congestion, leading to the loss of some TCP
packets and requiring multiple retransmissions.

Solving this problem should be as simple as setting the Asym_Pause and
Pause bits. The reason why the Autoneg bit needs to be set, Russell
has gave a very good explanation in the thread [1], see below.

"As the advertising and lp_advertising bitmasks have to be non-empty,
and the swphy reports aneg capable, aneg complete, and AN enabled, then
for consistency with that state, Autoneg should be set. This is how it
was prior to the blamed commit."

Fixes: de7d3f87be ("net: phylink: Use phy_caps_lookup for fixed-link configuration")
Link: https://lore.kernel.org/aRjqLN8eQDIQfBjS@shell.armlinux.org.uk # [1]
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20251117102943.1862680-1-wei.fang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-19 08:31:11 -08:00
Tony Luck
d2932a59c2 ACPI: APEI: EINJ: Fix EINJV2 initialization and injection
ACPI 6.6 specification for EINJV2 appends an extra structure to
the end of the existing struct set_error_type_with_address.

Several issues showed up in testing.

 1) Initialization was broken by an earlier fix [1] since is_v2 is only
    set while performing an injection, not during initialization.

 2) A buggy BIOS provided invalid "revision" and "length" for the
    extension structure. Add several sanity checks.

 3) When injecting legacy error types on an EINJV2 capable system,
    don't copy the component arrays.

Fixes: 6c70585149 ("ACPI: APEI: EINJ: Check if user asked for EINJV2 injection") # [1]
Fixes: b47610296d ("ACPI: APEI: EINJ: Enable EINJv2 error injections")
Signed-off-by: Tony Luck <tony.luck@intel.com>
[ rjw: Changelog edits ]
Cc: 6.17+ <stable@vger.kernel.org> # 6.17+
Link: https://patch.msgid.link/20251119012712.178715-1-tony.luck@intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-19 13:36:29 +01:00
Jakub Kicinski
106a67494c Merge branch 'af_unix-fix-so_peek_off-bug-in-unix_stream_read_generic'
Kuniyuki Iwashima says:

====================
af_unix: Fix SO_PEEK_OFF bug in unix_stream_read_generic().

Miao Wang reported a bug of SO_PEEK_OFF on AF_UNIX SOCK_STREAM socket.

Patch 1 fixes the bug and Patch 2 adds a new selftest to cover the case.
====================

Link: https://patch.msgid.link/20251117174740.3684604-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-18 19:19:29 -08:00
Kuniyuki Iwashima
e1bb28bf13 selftest: af_unix: Add test for SO_PEEK_OFF.
The test covers various cases to verify SO_PEEK_OFF behaviour
for all AF_UNIX socket types.

two_chunks_blocking and two_chunks_overlap_blocking reproduce
the issue mentioned in the previous patch.

Without the patch, the two tests fail:

  #  RUN           so_peek_off.stream.two_chunks_blocking ...
  # so_peek_off.c:121:two_chunks_blocking:Expected 'bbbb' == 'aaaabbbb'.
  # two_chunks_blocking: Test terminated by assertion
  #          FAIL  so_peek_off.stream.two_chunks_blocking
  not ok 3 so_peek_off.stream.two_chunks_blocking

  #  RUN           so_peek_off.stream.two_chunks_overlap_blocking ...
  # so_peek_off.c:159:two_chunks_overlap_blocking:Expected 'bbbb' == 'aaaabbbb'.
  # two_chunks_overlap_blocking: Test terminated by assertion
  #          FAIL  so_peek_off.stream.two_chunks_overlap_blocking
  not ok 5 so_peek_off.stream.two_chunks_overlap_blocking

With the patch, all tests pass:

  # PASSED: 15 / 15 tests passed.
  # Totals: pass:15 fail:0 xfail:0 xpass:0 skip:0 error:0

Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20251117174740.3684604-3-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-18 19:19:09 -08:00
Kuniyuki Iwashima
7bf3a476ce af_unix: Read sk_peek_offset() again after sleeping in unix_stream_read_generic().
Miao Wang reported a bug of SO_PEEK_OFF on AF_UNIX SOCK_STREAM
socket.

The unexpected behaviour is triggered when the peek offset is
larger than the recv queue and the thread is unblocked by new
data.

Let's assume a socket which has "aaaa" in the recv queue and
the peek offset is 4.

First, unix_stream_read_generic() reads the offset 4 and skips
the skb(s) of "aaaa" with the code below:

	skip = max(sk_peek_offset(sk, flags), 0);	/* @skip is 4. */

	do {
	...
		while (skip >= unix_skb_len(skb)) {
			skip -= unix_skb_len(skb);
		...
			skb = skb_peek_next(skb, &sk->sk_receive_queue);
			if (!skb)
				goto again;		/* @skip is 0. */
		}

The thread jumps to the 'again' label and goes to sleep since
new data has not arrived yet.

Later, new data "bbbb" unblocks the thread, and the thread jumps
to the 'redo:' label to restart the entire process from the first
skb in the recv queue.

	do {
		...
redo:
		...
		last = skb = skb_peek(&sk->sk_receive_queue);
		...
again:
		if (skb == NULL) {
			...
			timeo = unix_stream_data_wait(sk, timeo, last,
						      last_len, freezable);
			...
			goto redo;			/* @skip is 0 !! */

However, the peek offset is not reset in the path.

If the buffer size is 8, recv() will return "aaaabbbb" without
skipping any data, and the final offset will be 12 (the original
offset 4 + peeked skbs' length 8).

After sleeping in unix_stream_read_generic(), we have to fetch the
peek offset again.

Let's move the redo label before mutex_lock(&u->iolock).

Fixes: 9f389e3567 ("af_unix: return data from multiple SKBs on recv() with MSG_PEEK flag")
Reported-by: Miao Wang <shankerwangmiao@gmail.com>
Closes: https://lore.kernel.org/netdev/3B969F90-F51F-4B9D-AB1A-994D9A54D460@gmail.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20251117174740.3684604-2-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-18 19:19:09 -08:00
Pradyumn Rahar
d47515af6c net/mlx5: Clean up only new IRQ glue on request_irq() failure
The mlx5_irq_alloc() function can inadvertently free the entire rmap
and end up in a crash[1] when the other threads tries to access this,
when request_irq() fails due to exhausted IRQ vectors. This commit
modifies the cleanup to remove only the specific IRQ mapping that was
just added.

This prevents removal of other valid mappings and ensures precise
cleanup of the failed IRQ allocation's associated glue object.

Note: This error is observed when both fwctl and rds configs are enabled.

[1]
mlx5_core 0000:05:00.0: Successfully registered panic handler for port 1
mlx5_core 0000:05:00.0: mlx5_irq_alloc:293:(pid 66740): Failed to
request irq. err = -28
infiniband mlx5_0: mlx5_ib_test_wc:290:(pid 66740): Error -28 while
trying to test write-combining support
mlx5_core 0000:05:00.0: Successfully unregistered panic handler for port 1
mlx5_core 0000:06:00.0: Successfully registered panic handler for port 1
mlx5_core 0000:06:00.0: mlx5_irq_alloc:293:(pid 66740): Failed to
request irq. err = -28
infiniband mlx5_0: mlx5_ib_test_wc:290:(pid 66740): Error -28 while
trying to test write-combining support
mlx5_core 0000:06:00.0: Successfully unregistered panic handler for port 1
mlx5_core 0000:03:00.0: mlx5_irq_alloc:293:(pid 28895): Failed to
request irq. err = -28
mlx5_core 0000:05:00.0: mlx5_irq_alloc:293:(pid 28895): Failed to
request irq. err = -28
general protection fault, probably for non-canonical address
0xe277a58fde16f291: 0000 [#1] SMP NOPTI

RIP: 0010:free_irq_cpu_rmap+0x23/0x7d
Call Trace:
   <TASK>
   ? show_trace_log_lvl+0x1d6/0x2f9
   ? show_trace_log_lvl+0x1d6/0x2f9
   ? mlx5_irq_alloc.cold+0x5d/0xf3 [mlx5_core]
   ? __die_body.cold+0x8/0xa
   ? die_addr+0x39/0x53
   ? exc_general_protection+0x1c4/0x3e9
   ? dev_vprintk_emit+0x5f/0x90
   ? asm_exc_general_protection+0x22/0x27
   ? free_irq_cpu_rmap+0x23/0x7d
   mlx5_irq_alloc.cold+0x5d/0xf3 [mlx5_core]
   irq_pool_request_vector+0x7d/0x90 [mlx5_core]
   mlx5_irq_request+0x2e/0xe0 [mlx5_core]
   mlx5_irq_request_vector+0xad/0xf7 [mlx5_core]
   comp_irq_request_pci+0x64/0xf0 [mlx5_core]
   create_comp_eq+0x71/0x385 [mlx5_core]
   ? mlx5e_open_xdpsq+0x11c/0x230 [mlx5_core]
   mlx5_comp_eqn_get+0x72/0x90 [mlx5_core]
   ? xas_load+0x8/0x91
   mlx5_comp_irqn_get+0x40/0x90 [mlx5_core]
   mlx5e_open_channel+0x7d/0x3c7 [mlx5_core]
   mlx5e_open_channels+0xad/0x250 [mlx5_core]
   mlx5e_open_locked+0x3e/0x110 [mlx5_core]
   mlx5e_open+0x23/0x70 [mlx5_core]
   __dev_open+0xf1/0x1a5
   __dev_change_flags+0x1e1/0x249
   dev_change_flags+0x21/0x5c
   do_setlink+0x28b/0xcc4
   ? __nla_parse+0x22/0x3d
   ? inet6_validate_link_af+0x6b/0x108
   ? cpumask_next+0x1f/0x35
   ? __snmp6_fill_stats64.constprop.0+0x66/0x107
   ? __nla_validate_parse+0x48/0x1e6
   __rtnl_newlink+0x5ff/0xa57
   ? kmem_cache_alloc_trace+0x164/0x2ce
   rtnl_newlink+0x44/0x6e
   rtnetlink_rcv_msg+0x2bb/0x362
   ? __netlink_sendskb+0x4c/0x6c
   ? netlink_unicast+0x28f/0x2ce
   ? rtnl_calcit.isra.0+0x150/0x146
   netlink_rcv_skb+0x5f/0x112
   netlink_unicast+0x213/0x2ce
   netlink_sendmsg+0x24f/0x4d9
   __sock_sendmsg+0x65/0x6a
   ____sys_sendmsg+0x28f/0x2c9
   ? import_iovec+0x17/0x2b
   ___sys_sendmsg+0x97/0xe0
   __sys_sendmsg+0x81/0xd8
   do_syscall_64+0x35/0x87
   entry_SYSCALL_64_after_hwframe+0x6e/0x0
RIP: 0033:0x7fc328603727
Code: c3 66 90 41 54 41 89 d4 55 48 89 f5 53 89 fb 48 83 ec 10 e8 0b ed
ff ff 44 89 e2 48 89 ee 89 df 41 89 c0 b8 2e 00 00 00 0f 05 <48> 3d 00
f0 ff ff 77 35 44 89 c7 48 89 44 24 08 e8 44 ed ff ff 48
RSP: 002b:00007ffe8eb3f1a0 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007fc328603727
RDX: 0000000000000000 RSI: 00007ffe8eb3f1f0 RDI: 000000000000000d
RBP: 00007ffe8eb3f1f0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000
R13: 0000000000000000 R14: 00007ffe8eb3f3c8 R15: 00007ffe8eb3f3bc
   </TASK>
---[ end trace f43ce73c3c2b13a2 ]---
RIP: 0010:free_irq_cpu_rmap+0x23/0x7d
Code: 0f 1f 80 00 00 00 00 48 85 ff 74 6b 55 48 89 fd 53 66 83 7f 06 00
74 24 31 db 48 8b 55 08 0f b7 c3 48 8b 04 c2 48 85 c0 74 09 <8b> 38 31
f6 e8 c4 0a b8 ff 83 c3 01 66 3b 5d 06 72 de b8 ff ff ff
RSP: 0018:ff384881640eaca0 EFLAGS: 00010282
RAX: e277a58fde16f291 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ff2335e2e20b3600 RSI: 0000000000000000 RDI: ff2335e2e20b3400
RBP: ff2335e2e20b3400 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 00000000ffffffe4 R12: ff384881640ead88
R13: ff2335c3760751e0 R14: ff2335e2e1672200 R15: ff2335c3760751f8
FS:  00007fc32ac22480(0000) GS:ff2335e2d6e00000(0000)
knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f651ab54000 CR3: 00000029f1206003 CR4: 0000000000771ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x1dc00000 from 0xffffffff81000000 (relocation range:
0xffffffff80000000-0xffffffffbfffffff)
kvm-guest: disable async PF for cpu 0

Fixes: 3354822cde ("net/mlx5: Use dynamic msix vectors allocation")
Signed-off-by: Mohith Kumar Thummaluru<mohith.k.kumar.thummaluru@oracle.com>
Tested-by: Mohith Kumar Thummaluru<mohith.k.kumar.thummaluru@oracle.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Shay Drori <shayd@nvidia.com>
Signed-off-by: Pradyumn Rahar <pradyumn.rahar@oracle.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1763381768-1234998-1-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-18 18:47:09 -08:00
Eric Dumazet
426358d9be mptcp: fix a race in mptcp_pm_del_add_timer()
mptcp_pm_del_add_timer() can call sk_stop_timer_sync(sk, &entry->add_timer)
while another might have free entry already, as reported by syzbot.

Add RCU protection to fix this issue.

Also change confusing add_timer variable with stop_timer boolean.

syzbot report:

BUG: KASAN: slab-use-after-free in __timer_delete_sync+0x372/0x3f0 kernel/time/timer.c:1616
Read of size 4 at addr ffff8880311e4150 by task kworker/1:1/44

CPU: 1 UID: 0 PID: 44 Comm: kworker/1:1 Not tainted syzkaller #0 PREEMPT_{RT,(full)}
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/02/2025
Workqueue: events mptcp_worker
Call Trace:
 <TASK>
  dump_stack_lvl+0x189/0x250 lib/dump_stack.c:120
  print_address_description mm/kasan/report.c:378 [inline]
  print_report+0xca/0x240 mm/kasan/report.c:482
  kasan_report+0x118/0x150 mm/kasan/report.c:595
  __timer_delete_sync+0x372/0x3f0 kernel/time/timer.c:1616
  sk_stop_timer_sync+0x1b/0x90 net/core/sock.c:3631
  mptcp_pm_del_add_timer+0x283/0x310 net/mptcp/pm.c:362
  mptcp_incoming_options+0x1357/0x1f60 net/mptcp/options.c:1174
  tcp_data_queue+0xca/0x6450 net/ipv4/tcp_input.c:5361
  tcp_rcv_established+0x1335/0x2670 net/ipv4/tcp_input.c:6441
  tcp_v4_do_rcv+0x98b/0xbf0 net/ipv4/tcp_ipv4.c:1931
  tcp_v4_rcv+0x252a/0x2dc0 net/ipv4/tcp_ipv4.c:2374
  ip_protocol_deliver_rcu+0x221/0x440 net/ipv4/ip_input.c:205
  ip_local_deliver_finish+0x3bb/0x6f0 net/ipv4/ip_input.c:239
  NF_HOOK+0x30c/0x3a0 include/linux/netfilter.h:318
  NF_HOOK+0x30c/0x3a0 include/linux/netfilter.h:318
  __netif_receive_skb_one_core net/core/dev.c:6079 [inline]
  __netif_receive_skb+0x143/0x380 net/core/dev.c:6192
  process_backlog+0x31e/0x900 net/core/dev.c:6544
  __napi_poll+0xb6/0x540 net/core/dev.c:7594
  napi_poll net/core/dev.c:7657 [inline]
  net_rx_action+0x5f7/0xda0 net/core/dev.c:7784
  handle_softirqs+0x22f/0x710 kernel/softirq.c:622
  __do_softirq kernel/softirq.c:656 [inline]
  __local_bh_enable_ip+0x1a0/0x2e0 kernel/softirq.c:302
  mptcp_pm_send_ack net/mptcp/pm.c:210 [inline]
 mptcp_pm_addr_send_ack+0x41f/0x500 net/mptcp/pm.c:-1
  mptcp_pm_worker+0x174/0x320 net/mptcp/pm.c:1002
  mptcp_worker+0xd5/0x1170 net/mptcp/protocol.c:2762
  process_one_work kernel/workqueue.c:3263 [inline]
  process_scheduled_works+0xae1/0x17b0 kernel/workqueue.c:3346
  worker_thread+0x8a0/0xda0 kernel/workqueue.c:3427
  kthread+0x711/0x8a0 kernel/kthread.c:463
  ret_from_fork+0x4bc/0x870 arch/x86/kernel/process.c:158
  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
 </TASK>

Allocated by task 44:
  kasan_save_stack mm/kasan/common.c:56 [inline]
  kasan_save_track+0x3e/0x80 mm/kasan/common.c:77
  poison_kmalloc_redzone mm/kasan/common.c:400 [inline]
  __kasan_kmalloc+0x93/0xb0 mm/kasan/common.c:417
  kasan_kmalloc include/linux/kasan.h:262 [inline]
  __kmalloc_cache_noprof+0x1ef/0x6c0 mm/slub.c:5748
  kmalloc_noprof include/linux/slab.h:957 [inline]
  mptcp_pm_alloc_anno_list+0x104/0x460 net/mptcp/pm.c:385
  mptcp_pm_create_subflow_or_signal_addr+0xf9d/0x1360 net/mptcp/pm_kernel.c:355
  mptcp_pm_nl_fully_established net/mptcp/pm_kernel.c:409 [inline]
  __mptcp_pm_kernel_worker+0x417/0x1ef0 net/mptcp/pm_kernel.c:1529
  mptcp_pm_worker+0x1ee/0x320 net/mptcp/pm.c:1008
  mptcp_worker+0xd5/0x1170 net/mptcp/protocol.c:2762
  process_one_work kernel/workqueue.c:3263 [inline]
  process_scheduled_works+0xae1/0x17b0 kernel/workqueue.c:3346
  worker_thread+0x8a0/0xda0 kernel/workqueue.c:3427
  kthread+0x711/0x8a0 kernel/kthread.c:463
  ret_from_fork+0x4bc/0x870 arch/x86/kernel/process.c:158
  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245

Freed by task 6630:
  kasan_save_stack mm/kasan/common.c:56 [inline]
  kasan_save_track+0x3e/0x80 mm/kasan/common.c:77
  __kasan_save_free_info+0x46/0x50 mm/kasan/generic.c:587
  kasan_save_free_info mm/kasan/kasan.h:406 [inline]
  poison_slab_object mm/kasan/common.c:252 [inline]
  __kasan_slab_free+0x5c/0x80 mm/kasan/common.c:284
  kasan_slab_free include/linux/kasan.h:234 [inline]
  slab_free_hook mm/slub.c:2523 [inline]
  slab_free mm/slub.c:6611 [inline]
  kfree+0x197/0x950 mm/slub.c:6818
  mptcp_remove_anno_list_by_saddr+0x2d/0x40 net/mptcp/pm.c:158
  mptcp_pm_flush_addrs_and_subflows net/mptcp/pm_kernel.c:1209 [inline]
  mptcp_nl_flush_addrs_list net/mptcp/pm_kernel.c:1240 [inline]
  mptcp_pm_nl_flush_addrs_doit+0x593/0xbb0 net/mptcp/pm_kernel.c:1281
  genl_family_rcv_msg_doit+0x215/0x300 net/netlink/genetlink.c:1115
  genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline]
  genl_rcv_msg+0x60e/0x790 net/netlink/genetlink.c:1210
  netlink_rcv_skb+0x208/0x470 net/netlink/af_netlink.c:2552
  genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219
  netlink_unicast_kernel net/netlink/af_netlink.c:1320 [inline]
  netlink_unicast+0x846/0xa10 net/netlink/af_netlink.c:1346
  netlink_sendmsg+0x805/0xb30 net/netlink/af_netlink.c:1896
  sock_sendmsg_nosec net/socket.c:727 [inline]
  __sock_sendmsg+0x21c/0x270 net/socket.c:742
  ____sys_sendmsg+0x508/0x820 net/socket.c:2630
  ___sys_sendmsg+0x21f/0x2a0 net/socket.c:2684
  __sys_sendmsg net/socket.c:2716 [inline]
  __do_sys_sendmsg net/socket.c:2721 [inline]
  __se_sys_sendmsg net/socket.c:2719 [inline]
  __x64_sys_sendmsg+0x1a1/0x260 net/socket.c:2719
  do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
  do_syscall_64+0xfa/0xfa0 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Cc: stable@vger.kernel.org
Fixes: 00cfd77b90 ("mptcp: retransmit ADD_ADDR when timeout")
Reported-by: syzbot+2a6fbf0f0530375968df@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/691ad3c3.a70a0220.f6df1.0004.GAE@google.com
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Geliang Tang <geliang@kernel.org>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251117100745.1913963-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-18 18:33:01 -08:00
Jakub Kicinski
c3995fc1a8 Merge tag 'ipsec-2025-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:

====================
pull request (net): ipsec 2025-11-18

1) Misc fixes for xfrm_state creation/modification/deletion.
   Patchset from Sabrina Dubroca.

2) Fix inner packet family determination for xfrm offloads.
   From Jianbo Liu.

3) Don't push locally generated packets directly to L2 tunnel
   mode offloading, they still need processing from the standard
   xfrm path. From Jianbo Liu.

4) Fix memory leaks in xfrm_add_acquire for policy offloads and policy
   security contexts. From Zilin Guan.

* tag 'ipsec-2025-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
  xfrm: fix memory leak in xfrm_add_acquire()
  xfrm: Prevent locally generated packets from direct output in tunnel mode
  xfrm: Determine inner GSO type from packet inner protocol
  xfrm: Check inner packet family directly from skb_dst
  xfrm: check all hash buckets for leftover states during netns deletion
  xfrm: set err and extack on failure to create pcpu SA
  xfrm: call xfrm_dev_state_delete when xfrm_state_migrate fails to add the state
  xfrm: make state as DEAD before final put when migrate fails
  xfrm: also call xfrm_state_delete_tunnel at destroy time for states that were never added
  xfrm: drop SA reference in xfrm_state_update if dir doesn't match
====================

Link: https://patch.msgid.link/20251118085344.2199815-1-steffen.klassert@secunet.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-18 17:58:44 -08:00
Shay Drory
f94c1a114a devlink: rate: Unset parent pointer in devl_rate_nodes_destroy
The function devl_rate_nodes_destroy is documented to "Unset parent for
all rate objects". However, it was only calling the driver-specific
`rate_leaf_parent_set` or `rate_node_parent_set` ops and decrementing
the parent's refcount, without actually setting the
`devlink_rate->parent` pointer to NULL.

This leaves a dangling pointer in the `devlink_rate` struct, which cause
refcount error in netdevsim[1] and mlx5[2]. In addition, this is
inconsistent with the behavior of `devlink_nl_rate_parent_node_set`,
where the parent pointer is correctly cleared.

This patch fixes the issue by explicitly setting `devlink_rate->parent`
to NULL after notifying the driver, thus fulfilling the function's
documented behavior for all rate objects.

[1]
repro steps:
echo 1 > /sys/bus/netdevsim/new_device
devlink dev eswitch set netdevsim/netdevsim1 mode switchdev
echo 1 > /sys/bus/netdevsim/devices/netdevsim1/sriov_numvfs
devlink port function rate add netdevsim/netdevsim1/test_node
devlink port function rate set netdevsim/netdevsim1/128 parent test_node
echo 1 > /sys/bus/netdevsim/del_device

dmesg:
refcount_t: decrement hit 0; leaking memory.
WARNING: CPU: 8 PID: 1530 at lib/refcount.c:31 refcount_warn_saturate+0x42/0xe0
CPU: 8 UID: 0 PID: 1530 Comm: bash Not tainted 6.18.0-rc4+ #1 NONE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:refcount_warn_saturate+0x42/0xe0
Call Trace:
 <TASK>
 devl_rate_leaf_destroy+0x8d/0x90
 __nsim_dev_port_del+0x6c/0x70 [netdevsim]
 nsim_dev_reload_destroy+0x11c/0x140 [netdevsim]
 nsim_drv_remove+0x2b/0xb0 [netdevsim]
 device_release_driver_internal+0x194/0x1f0
 bus_remove_device+0xc6/0x130
 device_del+0x159/0x3c0
 device_unregister+0x1a/0x60
 del_device_store+0x111/0x170 [netdevsim]
 kernfs_fop_write_iter+0x12e/0x1e0
 vfs_write+0x215/0x3d0
 ksys_write+0x5f/0xd0
 do_syscall_64+0x55/0x10f0
 entry_SYSCALL_64_after_hwframe+0x4b/0x53

[2]
devlink dev eswitch set pci/0000:08:00.0 mode switchdev
devlink port add pci/0000:08:00.0 flavour pcisf pfnum 0 sfnum 1000
devlink port function rate add pci/0000:08:00.0/group1
devlink port function rate set pci/0000:08:00.0/32768 parent group1
modprobe -r mlx5_ib mlx5_fwctl mlx5_core

dmesg:
refcount_t: decrement hit 0; leaking memory.
WARNING: CPU: 7 PID: 16151 at lib/refcount.c:31 refcount_warn_saturate+0x42/0xe0
CPU: 7 UID: 0 PID: 16151 Comm: bash Not tainted 6.17.0-rc7_for_upstream_min_debug_2025_10_02_12_44 #1 NONE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
RIP: 0010:refcount_warn_saturate+0x42/0xe0
Call Trace:
 <TASK>
 devl_rate_leaf_destroy+0x8d/0x90
 mlx5_esw_offloads_devlink_port_unregister+0x33/0x60 [mlx5_core]
 mlx5_esw_offloads_unload_rep+0x3f/0x50 [mlx5_core]
 mlx5_eswitch_unload_sf_vport+0x40/0x90 [mlx5_core]
 mlx5_sf_esw_event+0xc4/0x120 [mlx5_core]
 notifier_call_chain+0x33/0xa0
 blocking_notifier_call_chain+0x3b/0x50
 mlx5_eswitch_disable_locked+0x50/0x110 [mlx5_core]
 mlx5_eswitch_disable+0x63/0x90 [mlx5_core]
 mlx5_unload+0x1d/0x170 [mlx5_core]
 mlx5_uninit_one+0xa2/0x130 [mlx5_core]
 remove_one+0x78/0xd0 [mlx5_core]
 pci_device_remove+0x39/0xa0
 device_release_driver_internal+0x194/0x1f0
 unbind_store+0x99/0xa0
 kernfs_fop_write_iter+0x12e/0x1e0
 vfs_write+0x215/0x3d0
 ksys_write+0x5f/0xd0
 do_syscall_64+0x53/0x1f0
 entry_SYSCALL_64_after_hwframe+0x4b/0x53

Fixes: d755598450 ("devlink: Allow setting parent node of rate objects")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1763381149-1234377-1-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-18 17:12:21 -08:00
Grzegorz Nitka
23a5b9b12d ice: fix PTP cleanup on driver removal in error path
Improve the cleanup on releasing PTP resources in error path.
The error case might happen either at the driver probe and PTP
feature initialization or on PTP restart (errors in reset handling, NVM
update etc). In both cases, calls to PF PTP cleanup (ice_ptp_cleanup_pf
function) and 'ps_lock' mutex deinitialization were missed.
Additionally, ptp clock was not unregistered in the latter case.

Keep PTP state as 'uninitialized' on init to distinguish between error
scenarios and to avoid resource release duplication at driver removal.

The consequence of missing ice_ptp_cleanup_pf call is the following call
trace dumped when ice_adapter object is freed (port list is not empty,
as it is required at this stage):

[  T93022] ------------[ cut here ]------------
[  T93022] WARNING: CPU: 10 PID: 93022 at
ice/ice_adapter.c:67 ice_adapter_put+0xef/0x100 [ice]
...
[  T93022] RIP: 0010:ice_adapter_put+0xef/0x100 [ice]
...
[  T93022] Call Trace:
[  T93022]  <TASK>
[  T93022]  ? ice_adapter_put+0xef/0x100 [ice
33d2647ad4f6d866d41eefff1806df37c68aef0c]
[  T93022]  ? __warn.cold+0xb0/0x10e
[  T93022]  ? ice_adapter_put+0xef/0x100 [ice
33d2647ad4f6d866d41eefff1806df37c68aef0c]
[  T93022]  ? report_bug+0xd8/0x150
[  T93022]  ? handle_bug+0xe9/0x110
[  T93022]  ? exc_invalid_op+0x17/0x70
[  T93022]  ? asm_exc_invalid_op+0x1a/0x20
[  T93022]  ? ice_adapter_put+0xef/0x100 [ice
33d2647ad4f6d866d41eefff1806df37c68aef0c]
[  T93022]  pci_device_remove+0x42/0xb0
[  T93022]  device_release_driver_internal+0x19f/0x200
[  T93022]  driver_detach+0x48/0x90
[  T93022]  bus_remove_driver+0x70/0xf0
[  T93022]  pci_unregister_driver+0x42/0xb0
[  T93022]  ice_module_exit+0x10/0xdb0 [ice
33d2647ad4f6d866d41eefff1806df37c68aef0c]
...
[  T93022] ---[ end trace 0000000000000000 ]---
[  T93022] ice: module unloaded

Fixes: e800654e85 ("ice: Use ice_adapter for PTP shared data instead of auxdev")
Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-11-18 14:30:11 -08:00
Emil Tantilov
118082368c idpf: fix possible vport_config NULL pointer deref in remove
Attempting to remove the driver will cause a crash in cases where
the vport failed to initialize. Following trace is from an instance where
the driver failed during an attempt to create a VF:
[ 1661.543624] idpf 0000:84:00.7: Device HW Reset initiated
[ 1722.923726] idpf 0000:84:00.7: Transaction timed-out (op:1 cookie:2900 vc_op:1 salt:29 timeout:60000ms)
[ 1723.353263] BUG: kernel NULL pointer dereference, address: 0000000000000028
...
[ 1723.358472] RIP: 0010:idpf_remove+0x11c/0x200 [idpf]
...
[ 1723.364973] Call Trace:
[ 1723.365475]  <TASK>
[ 1723.365972]  pci_device_remove+0x42/0xb0
[ 1723.366481]  device_release_driver_internal+0x1a9/0x210
[ 1723.366987]  pci_stop_bus_device+0x6d/0x90
[ 1723.367488]  pci_stop_and_remove_bus_device+0x12/0x20
[ 1723.367971]  pci_iov_remove_virtfn+0xbd/0x120
[ 1723.368309]  sriov_disable+0x34/0xe0
[ 1723.368643]  idpf_sriov_configure+0x58/0x140 [idpf]
[ 1723.368982]  sriov_numvfs_store+0xda/0x1c0

Avoid the NULL pointer dereference by adding NULL pointer check for
vport_config[i], before freeing user_config.q_coalesce.

Fixes: e1e3fec3e3 ("idpf: preserve coalescing settings across resets")
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Reviewed-by: Chittim Madhu <madhu.chittim@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Samuel Salin <Samuel.salin@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-11-18 13:22:11 -08:00
Rafael J. Wysocki
f384497a76 PM: sleep: core: Fix runtime PM enabling in device_resume_early()
Runtime PM should only be enabled in device_resume_early() if it has
been disabled for the given device by device_suspend_late().  Otherwise,
it may cause runtime PM callbacks to run prematurely in some cases
which leads to further functional issues.

Make two changes to address this problem.

First, reorder device_suspend_late() to only disable runtime PM for a
device when it is going to look for the device's callback or if the
device is a "syscore" one.  In all of the other cases, disabling runtime
PM for the device is not in fact necessary.  However, if the device's
callback returns an error and the power.is_late_suspended flag is not
going to be set, enable runtime PM so it only remains disabled when
power.is_late_suspended is set.

Second, make device_resume_early() only enable runtime PM for the
devices with the power.is_late_suspended flag set.

Fixes: 443046d1ad ("PM: sleep: Make suspend of devices more asynchronous")
Reported-by: Rose Wu <ya-jou.wu@mediatek.com>
Closes: https://lore.kernel.org/linux-pm/70b25dca6f8c2756d78f076f4a7dee7edaaffc33.camel@mediatek.com/
Cc: 6.16+ <stable@vger.kernel.org> # 6.16+
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/12784270.O9o76ZdvQC@rafael.j.wysocki
2025-11-18 15:47:55 +01:00
Florian Fuchs
0f08f0b0fb net: ps3_gelic_net: handle skb allocation failures
Handle skb allocation failures in RX path, to avoid NULL pointer
dereference and RX stalls under memory pressure. If the refill fails
with -ENOMEM, complete napi polling and wake up later to retry via timer.
Also explicitly re-enable RX DMA after oom, so the dmac doesn't remain
stopped in this situation.

Previously, memory pressure could lead to skb allocation failures and
subsequent Oops like:

	Oops: Kernel access of bad area, sig: 11 [#2]
	Hardware name: SonyPS3 Cell Broadband Engine 0x701000 PS3
	NIP [c0003d0000065900] gelic_net_poll+0x6c/0x2d0 [ps3_gelic] (unreliable)
	LR [c0003d00000659c4] gelic_net_poll+0x130/0x2d0 [ps3_gelic]
	Call Trace:
	  gelic_net_poll+0x130/0x2d0 [ps3_gelic] (unreliable)
	  __napi_poll+0x44/0x168
	  net_rx_action+0x178/0x290

Steps to reproduce the issue:
	1. Start a continuous network traffic, like scp of a 20GB file
	2. Inject failslab errors using the kernel fault injection:
	    echo -1 > /sys/kernel/debug/failslab/times
	    echo 30 > /sys/kernel/debug/failslab/interval
	    echo 100 > /sys/kernel/debug/failslab/probability
	3. After some time, traces start to appear, kernel Oopses
	   and the system stops

Step 2 is not always necessary, as it is usually already triggered by
the transfer of a big enough file.

Fixes: 02c1889166 ("ps3: gigabit ethernet driver for PS3, take3")
Signed-off-by: Florian Fuchs <fuchsfl@gmail.com>
Link: https://patch.msgid.link/20251113181000.3914980-1-fuchsfl@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-18 12:31:09 +01:00
Pavel Zhigulin
896f1a2493 net: qlogic/qede: fix potential out-of-bounds read in qede_tpa_cont() and qede_tpa_end()
The loops in 'qede_tpa_cont()' and 'qede_tpa_end()', iterate
over 'cqe->len_list[]' using only a zero-length terminator as
the stopping condition. If the terminator was missing or
malformed, the loop could run past the end of the fixed-size array.

Add an explicit bound check using ARRAY_SIZE() in both loops to prevent
a potential out-of-bounds access.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: 55482edc25 ("qede: Add slowpath/fastpath support and enable hardware GRO")
Signed-off-by: Pavel Zhigulin <Pavel.Zhigulin@kaspersky.com>
Link: https://patch.msgid.link/20251113112757.4166625-1-Pavel.Zhigulin@kaspersky.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-18 11:09:58 +01:00
Randy Dunlap
db30233361 platform/x86: intel-uncore-freq: fix all header kernel-doc warnings
In file uncore-frequency/uncore-frequency-common.h,
correct all kernel-doc warnings by adding missing leading " *" to some
lines, adding a missing kernel-doc entry, and fixing a name typo.

Warning: uncore-frequency-common.h:50 bad line:
   Storage for kobject attribute elc_low_threshold_percent
Warning: uncore-frequency-common.h:52 bad line:
   Storage for kobject attribute elc_high_threshold_percent
Warning: uncore-frequency-common.h:54 bad line:
   Storage for kobject attribute elc_high_threshold_enable
Warning: uncore-frequency-common.h:92 struct member
 'min_freq_khz_kobj_attr' not described in 'uncore_data'
Warning: uncore-frequency-common.h:92 struct member
 'die_id_kobj_attr' not described in 'uncore_data'

Fixes: 24b6616355 ("platform/x86/intel-uncore-freq: Add efficiency latency control to sysfs interface")
Fixes: 416de0246f ("platform/x86: intel-uncore-freq: Fix types in sysfs callbacks")
Fixes: 247b43fcd8 ("platform/x86/intel-uncore-freq: Add attributes to show die_id")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://patch.msgid.link/20251111060938.1998542-1-rdunlap@infradead.org
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-11-18 09:37:21 +02:00
Armin Wolf
444a9256f8 platform/x86: acer-wmi: Ignore backlight event
On the Acer Nitro AN515-58, the event 4 - 0 is send by the ACPI
firmware when the backlight up/down keys are pressed. Ignore this
event to avoid spamming the kernel log with error messages, as the
acpi-video driver already handles brightness up/down events.

Reported-by: Bugaddr <Bugaddr@protonmail.com>
Closes: https://bugaddr.tech/posts/2025-11-16-debugging-the-acer-nitro-5-an515-58-fn-f10-keyboard-backlight-bug-on-linux/#wmi-interface-issues
Tested-by: Bugaddr <Bugaddr@protonmail.com>
Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Link: https://patch.msgid.link/20251117155938.3030-1-W_Armin@gmx.de
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-11-18 09:35:53 +02:00
Haotian Zhang
d8bb447efc platform/x86/intel/speed_select_if: Convert PCIBIOS_* return codes to errnos
isst_if_probe() uses pci_read_config_dword() that returns PCIBIOS_*
codes. The return code is returned from the probe function as is but
probe functions should return normal errnos. A proper implementation
can be found in drivers/leds/leds-ss4200.c.

Convert PCIBIOS_* return codes using pcibios_err_to_errno() into
normal errno before returning.

Fixes: d3a2358429 ("platform/x86: ISST: Add Intel Speed Select mmio interface")
Signed-off-by: Haotian Zhang <vulab@iscas.ac.cn>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20251117033354.132-1-vulab@iscas.ac.cn
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-11-18 09:35:12 +02:00
Srinivas Pandruvada
ddf5ffff3a platform/x86/intel/hid: Add Nova Lake support
Add ACPI ID for Nova Lake.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20251110235041.123685-1-srinivas.pandruvada@linux.intel.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-11-18 09:34:24 +02:00
Anthony Wong
6f91ad24c6 platform/x86: alienware-wmi-wmax: Add AWCC support to Alienware 16 Aurora
Add AWCC support to Alienware 16 Aurora

Cc: stable@vger.kernel.org
Signed-off-by: Anthony Wong <anthony.wong@ubuntu.com>
Reviewed-by: Kurt Borja <kuurtb@gmail.com>
Link: https://patch.msgid.link/20251116185311.18074-1-anthony.wong@canonical.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-11-18 09:32:23 +02:00
Lorenzo Bianconi
8e0a754b08 net: airoha: Do not loopback traffic to GDM2 if it is available on the device
Airoha_eth driver forwards offloaded uplink traffic (packets received
on GDM1 and forwarded to GDM{3,4}) to GDM2 in order to apply hw QoS.
This is correct if the device does not support a dedicated GDM2 port.
In this case, in order to enable hw offloading for uplink traffic,
the packets should be sent to GDM{3,4} directly.

Fixes: 9cd451d414 ("net: airoha: Add loopback support for GDM2")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20251113-airoha-hw-offload-gdm2-fix-v1-1-7e4ca300872f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-17 20:00:07 -08:00
Ido Schimmel
bed22c7b90 selftests: net: lib: Do not overwrite error messages
ret_set_ksft_status() calls ksft_status_merge() with the current return
status and the last one. It treats a non-zero return code from
ksft_status_merge() as an indication that the return status was
overwritten by the last one and therefore overwrites the return message
with the last one.

Currently, ksft_status_merge() returns a non-zero return code even if
the current return status and the last one are equal. This results in
return messages being overwritten which is counter-productive since we
are more interested in the first failure message and not the last one.

Fix by changing ksft_status_merge() to only return a non-zero return
code if the current return status was actually changed.

Add a test case which checks that the first error message is not
overwritten.

Before:

 # ./lib_sh_test.sh
 [...]
 TEST: RET tfail2 tfail -> fail                                      [FAIL]
        retmsg=tfail expected tfail2
 [...]
 # echo $?
 1

After:

 # ./lib_sh_test.sh
 [...]
 TEST: RET tfail2 tfail -> fail                                      [ OK ]
 [...]
 # echo $?
 0

Fixes: 596c8819cb ("selftests: forwarding: Have RET track kselftest framework constants")
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20251116081029.69112-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-17 19:32:12 -08:00
Aleksei Nikiforov
da02a18248 s390/ctcm: Fix double-kfree
The function 'mpc_rcvd_sweep_req(mpcginfo)' is called conditionally
from function 'ctcmpc_unpack_skb'. It frees passed mpcginfo.
After that a call to function 'kfree' in function 'ctcmpc_unpack_skb'
frees it again.

Remove 'kfree' call in function 'mpc_rcvd_sweep_req(mpcginfo)'.

Bug detected by the clang static analyzer.

Fixes: 0c0b20587b ("s390/ctcm: fix potential memory leak")
Reviewed-by: Aswin Karuvally <aswin@linux.ibm.com>
Signed-off-by: Aleksei Nikiforov <aleksei.nikiforov@linux.ibm.com>
Signed-off-by: Aswin Karuvally <aswin@linux.ibm.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251112182724.1109474-1-aswin@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-17 16:58:25 -08:00
Jesper Dangaard Brouer
5442a9da69 veth: more robust handing of race to avoid txq getting stuck
Commit dc82a33297 ("veth: apply qdisc backpressure on full ptr_ring to
reduce TX drops") introduced a race condition that can lead to a permanently
stalled TXQ. This was observed in production on ARM64 systems (Ampere Altra
Max).

The race occurs in veth_xmit(). The producer observes a full ptr_ring and
stops the queue (netif_tx_stop_queue()). The subsequent conditional logic,
intended to re-wake the queue if the consumer had just emptied it (if
(__ptr_ring_empty(...)) netif_tx_wake_queue()), can fail. This leads to a
"lost wakeup" where the TXQ remains stopped (QUEUE_STATE_DRV_XOFF) and
traffic halts.

This failure is caused by an incorrect use of the __ptr_ring_empty() API
from the producer side. As noted in kernel comments, this check is not
guaranteed to be correct if a consumer is operating on another CPU. The
empty test is based on ptr_ring->consumer_head, making it reliable only for
the consumer. Using this check from the producer side is fundamentally racy.

This patch fixes the race by adopting the more robust logic from an earlier
version V4 of the patchset, which always flushed the peer:

(1) In veth_xmit(), the racy conditional wake-up logic and its memory barrier
are removed. Instead, after stopping the queue, we unconditionally call
__veth_xdp_flush(rq). This guarantees that the NAPI consumer is scheduled,
making it solely responsible for re-waking the TXQ.
  This handles the race where veth_poll() consumes all packets and completes
NAPI *before* veth_xmit() on the producer side has called netif_tx_stop_queue.
The __veth_xdp_flush(rq) will observe rx_notify_masked is false and schedule
NAPI.

(2) On the consumer side, the logic for waking the peer TXQ is moved out of
veth_xdp_rcv() and placed at the end of the veth_poll() function. This
placement is part of fixing the race, as the netif_tx_queue_stopped() check
must occur after rx_notify_masked is potentially set to false during NAPI
completion.
  This handles the race where veth_poll() consumes all packets, but haven't
finished (rx_notify_masked is still true). The producer veth_xmit() stops the
TXQ and __veth_xdp_flush(rq) will observe rx_notify_masked is true, meaning
not starting NAPI.  Then veth_poll() change rx_notify_masked to false and
stops NAPI.  Before exiting veth_poll() will observe TXQ is stopped and wake
it up.

Fixes: dc82a33297 ("veth: apply qdisc backpressure on full ptr_ring to reduce TX drops")
Reviewed-by: Toshiaki Makita <toshiaki.makita1@gmail.com>
Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org>
Link: https://patch.msgid.link/176295323282.307447.14790015927673763094.stgit@firesoul
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-14 18:16:53 -08:00
Ilya Maximets
dfe28c4167 net: openvswitch: remove never-working support for setting nsh fields
The validation of the set(nsh(...)) action is completely wrong.
It runs through the nsh_key_put_from_nlattr() function that is the
same function that validates NSH keys for the flow match and the
push_nsh() action.  However, the set(nsh(...)) has a very different
memory layout.  Nested attributes in there are doubled in size in
case of the masked set().  That makes proper validation impossible.

There is also confusion in the code between the 'masked' flag, that
says that the nested attributes are doubled in size containing both
the value and the mask, and the 'is_mask' that says that the value
we're parsing is the mask.  This is causing kernel crash on trying to
write into mask part of the match with SW_FLOW_KEY_PUT() during
validation, while validate_nsh() doesn't allocate any memory for it:

  BUG: kernel NULL pointer dereference, address: 0000000000000018
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 1c2383067 P4D 1c2383067 PUD 20b703067 PMD 0
  Oops: Oops: 0000 [#1] SMP NOPTI
  CPU: 8 UID: 0 Kdump: loaded Not tainted 6.17.0-rc4+ #107 PREEMPT(voluntary)
  RIP: 0010:nsh_key_put_from_nlattr+0x19d/0x610 [openvswitch]
  Call Trace:
   <TASK>
   validate_nsh+0x60/0x90 [openvswitch]
   validate_set.constprop.0+0x270/0x3c0 [openvswitch]
   __ovs_nla_copy_actions+0x477/0x860 [openvswitch]
   ovs_nla_copy_actions+0x8d/0x100 [openvswitch]
   ovs_packet_cmd_execute+0x1cc/0x310 [openvswitch]
   genl_family_rcv_msg_doit+0xdb/0x130
   genl_family_rcv_msg+0x14b/0x220
   genl_rcv_msg+0x47/0xa0
   netlink_rcv_skb+0x53/0x100
   genl_rcv+0x24/0x40
   netlink_unicast+0x280/0x3b0
   netlink_sendmsg+0x1f7/0x430
   ____sys_sendmsg+0x36b/0x3a0
   ___sys_sendmsg+0x87/0xd0
   __sys_sendmsg+0x6d/0xd0
   do_syscall_64+0x7b/0x2c0
   entry_SYSCALL_64_after_hwframe+0x76/0x7e

The third issue with this process is that while trying to convert
the non-masked set into masked one, validate_set() copies and doubles
the size of the OVS_KEY_ATTR_NSH as if it didn't have any nested
attributes.  It should be copying each nested attribute and doubling
them in size independently.  And the process must be properly reversed
during the conversion back from masked to a non-masked variant during
the flow dump.

In the end, the only two outcomes of trying to use this action are
either validation failure or a kernel crash.  And if somehow someone
manages to install a flow with such an action, it will most definitely
not do what it is supposed to, since all the keys and the masks are
mixed up.

Fixing all the issues is a complex task as it requires re-writing
most of the validation code.

Given that and the fact that this functionality never worked since
introduction, let's just remove it altogether.  It's better to
re-introduce it later with a proper implementation instead of trying
to fix it in stable releases.

Fixes: b2d0f5d5dc ("openvswitch: enable NSH support")
Reported-by: Junvy Yang <zhuque@tencent.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Reviewed-by: Aaron Conole <aconole@redhat.com>
Link: https://patch.msgid.link/20251112112246.95064-1-i.maximets@ovn.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-14 18:13:24 -08:00
Eric Dumazet
035bca3f01 mptcp: fix race condition in mptcp_schedule_work()
syzbot reported use-after-free in mptcp_schedule_work() [1]

Issue here is that mptcp_schedule_work() schedules a work,
then gets a refcount on sk->sk_refcnt if the work was scheduled.
This refcount will be released by mptcp_worker().

[A] if (schedule_work(...)) {
[B]     sock_hold(sk);
        return true;
    }

Problem is that mptcp_worker() can run immediately and complete before [B]

We need instead :

    sock_hold(sk);
    if (schedule_work(...))
        return true;
    sock_put(sk);

[1]
refcount_t: addition on 0; use-after-free.
 WARNING: CPU: 1 PID: 29 at lib/refcount.c:25 refcount_warn_saturate+0xfa/0x1d0 lib/refcount.c:25
Call Trace:
 <TASK>
 __refcount_add include/linux/refcount.h:-1 [inline]
  __refcount_inc include/linux/refcount.h:366 [inline]
  refcount_inc include/linux/refcount.h:383 [inline]
  sock_hold include/net/sock.h:816 [inline]
  mptcp_schedule_work+0x164/0x1a0 net/mptcp/protocol.c:943
  mptcp_tout_timer+0x21/0xa0 net/mptcp/protocol.c:2316
  call_timer_fn+0x17e/0x5f0 kernel/time/timer.c:1747
  expire_timers kernel/time/timer.c:1798 [inline]
  __run_timers kernel/time/timer.c:2372 [inline]
  __run_timer_base+0x648/0x970 kernel/time/timer.c:2384
  run_timer_base kernel/time/timer.c:2393 [inline]
  run_timer_softirq+0xb7/0x180 kernel/time/timer.c:2403
  handle_softirqs+0x22f/0x710 kernel/softirq.c:622
  __do_softirq kernel/softirq.c:656 [inline]
  run_ktimerd+0xcf/0x190 kernel/softirq.c:1138
  smpboot_thread_fn+0x542/0xa60 kernel/smpboot.c:160
  kthread+0x711/0x8a0 kernel/kthread.c:463
  ret_from_fork+0x4bc/0x870 arch/x86/kernel/process.c:158
  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245

Cc: stable@vger.kernel.org
Fixes: 3b1d6210a9 ("mptcp: implement and use MPTCP-level retransmission")
Reported-by: syzbot+355158e7e301548a1424@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/6915b46f.050a0220.3565dc.0028.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251113103924.3737425-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-14 18:12:35 -08:00
Pavel Zhigulin
b0c959fec1 net: mlxsw: linecards: fix missing error check in mlxsw_linecard_devlink_info_get()
The call to devlink_info_version_fixed_put() in
mlxsw_linecard_devlink_info_get() did not check for errors,
although it is checked everywhere in the code.

Add missed 'err' check to the mlxsw_linecard_devlink_info_get()

Fixes: 3fc0c51905 ("mlxsw: core_linecards: Expose device PSID over device info")
Signed-off-by: Pavel Zhigulin <Pavel.Zhigulin@kaspersky.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20251113161922.813828-1-Pavel.Zhigulin@kaspersky.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-14 18:01:32 -08:00
Pavel Zhigulin
e6751b0b19 net: dsa: hellcreek: fix missing error handling in LED registration
The LED setup routine registered both led_sync_good
and led_is_gm devices without checking the return
values of led_classdev_register(). If either registration
failed, the function continued silently, leaving the
driver in a partially-initialized state and leaking
a registered LED classdev.

Add proper error handling

Fixes: 7d9ee2e8ff ("net: dsa: hellcreek: Add PTP status LEDs")
Signed-off-by: Pavel Zhigulin <Pavel.Zhigulin@kaspersky.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Kurt Kanzenbach <kurt@linutronix.de>
Link: https://patch.msgid.link/20251113135745.92375-1-Pavel.Zhigulin@kaspersky.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-14 17:46:32 -08:00
Vlastimil Babka
ec33b59542 mm/mempool: fix poisoning order>0 pages with HIGHMEM
The kernel test has reported:

  BUG: unable to handle page fault for address: fffba000
  #PF: supervisor write access in kernel mode
  #PF: error_code(0x0002) - not-present page
  *pde = 03171067 *pte = 00000000
  Oops: Oops: 0002 [#1]
  CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Tainted: G                T   6.18.0-rc2-00031-gec7f31b2a2d3 #1 NONE  a1d066dfe789f54bc7645c7989957d2bdee593ca
  Tainted: [T]=RANDSTRUCT
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
  EIP: memset (arch/x86/include/asm/string_32.h:168 arch/x86/lib/memcpy_32.c:17)
  Code: a5 8b 4d f4 83 e1 03 74 02 f3 a4 83 c4 04 5e 5f 5d 2e e9 73 41 01 00 90 90 90 3e 8d 74 26 00 55 89 e5 57 56 89 c6 89 d0 89 f7 <f3> aa 89 f0 5e 5f 5d 2e e9 53 41 01 00 cc cc cc 55 89 e5 53 57 56
  EAX: 0000006b EBX: 00000015 ECX: 001fefff EDX: 0000006b
  ESI: fffb9000 EDI: fffba000 EBP: c611fbf0 ESP: c611fbe8
  DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 EFLAGS: 00010287
  CR0: 80050033 CR2: fffba000 CR3: 0316e000 CR4: 00040690
  Call Trace:
   poison_element (mm/mempool.c:83 mm/mempool.c:102)
   mempool_init_node (mm/mempool.c:142 mm/mempool.c:226)
   mempool_init_noprof (mm/mempool.c:250 (discriminator 1))
   ? mempool_alloc_pages (mm/mempool.c:640)
   bio_integrity_initfn (block/bio-integrity.c:483 (discriminator 8))
   ? mempool_alloc_pages (mm/mempool.c:640)
   do_one_initcall (init/main.c:1283)

Christoph found out this is due to the poisoning code not dealing
properly with CONFIG_HIGHMEM because only the first page is mapped but
then the whole potentially high-order page is accessed.

We could give up on HIGHMEM here, but it's straightforward to fix this
with a loop that's mapping, poisoning or checking and unmapping
individual pages.

Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202511111411.9ebfa1ba-lkp@intel.com
Analyzed-by: Christoph Hellwig <hch@lst.de>
Fixes: bdfedb76f4 ("mm, mempool: poison elements backed by slab allocator")
Cc: stable@vger.kernel.org
Tested-by: kernel test robot <oliver.sang@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://patch.msgid.link/20251113-mempool-poison-v1-1-233b3ef984c3@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2025-11-14 17:55:23 +01:00
Zilin Guan
a55ef3bff8 xfrm: fix memory leak in xfrm_add_acquire()
The xfrm_add_acquire() function constructs an xfrm policy by calling
xfrm_policy_construct(). This allocates the policy structure and
potentially associates a security context and a device policy with it.

However, at the end of the function, the policy object is freed using
only kfree() . This skips the necessary cleanup for the security context
and device policy, leading to a memory leak.

To fix this, invoke the proper cleanup functions xfrm_dev_policy_delete(),
xfrm_dev_policy_free(), and security_xfrm_policy_free() before freeing the
policy object. This approach mirrors the error handling path in
xfrm_add_policy(), ensuring that all associated resources are correctly
released.

Fixes: 980ebd2579 ("[IPSEC]: Sync series - acquire insert")
Signed-off-by: Zilin Guan <zilin@seu.edu.cn>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2025-11-14 10:12:36 +01:00
Zilin Guan
407a06507c mlxsw: spectrum: Fix memory leak in mlxsw_sp_flower_stats()
The function mlxsw_sp_flower_stats() calls mlxsw_sp_acl_ruleset_get() to
obtain a ruleset reference. If the subsequent call to
mlxsw_sp_acl_rule_lookup() fails to find a rule, the function returns
an error without releasing the ruleset reference, causing a memory leak.

Fix this by using a goto to the existing error handling label, which
calls mlxsw_sp_acl_ruleset_put() to properly release the reference.

Fixes: 7c1b8eb175 ("mlxsw: spectrum: Add support for TC flower offload statistics")
Signed-off-by: Zilin Guan <zilin@seu.edu.cn>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20251112052114.1591695-1-zilin@seu.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-13 17:25:33 -08:00
Jiaming Zhang
f796a8dec9 net: core: prevent NULL deref in generic_hwtstamp_ioctl_lower()
The ethtool tsconfig Netlink path can trigger a null pointer
dereference. A call chain such as:

  tsconfig_prepare_data() ->
  dev_get_hwtstamp_phylib() ->
  vlan_hwtstamp_get() ->
  generic_hwtstamp_get_lower() ->
  generic_hwtstamp_ioctl_lower()

results in generic_hwtstamp_ioctl_lower() being called with
kernel_cfg->ifr as NULL.

The generic_hwtstamp_ioctl_lower() function does not expect
a NULL ifr and dereferences it, leading to a system crash.

Fix this by adding a NULL check for kernel_cfg->ifr in
generic_hwtstamp_ioctl_lower(). If ifr is NULL, return -EINVAL.

Fixes: 6e9e2eed4f ("net: ethtool: Add support for tsconfig command to get/set hwtstamp config")
Closes: https://lore.kernel.org/cd6a7056-fa6d-43f8-b78a-f5e811247ba8@linux.dev
Signed-off-by: Jiaming Zhang <r772577952@gmail.com>
Reviewed-by: Kory Maincent <kory.maincent@bootlin.com>
Link: https://patch.msgid.link/20251111173652.749159-2-r772577952@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-13 17:23:21 -08:00
Baruch Siach
7410c86fc0 MAINTAINERS: Remove eth bridge website
Ethernet bridge website URL shows "This page isn’t available".

Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/0a32aaf7fa4473e7574f7327480e8fbc4fef2741.1762946223.git.baruch@tkos.co.il
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-13 16:24:14 -08:00
Marcos Vega
fa0498f804 platform/x86: hp-wmi: Add Omen MAX 16-ah0xx fan support and thermal profile
New HP Omen laptops follow the same WMI thermal profile as Victus
16-r1000 and 16-s1000.

Add DMI board 8D41 to victus_s_thermal_profile_boards.

Signed-off-by: Marcos Vega <marcosmola2@gmail.com>
Link: https://patch.msgid.link/20251108114739.9255-3-marcosmola2@gmail.com
[ij: changelog taken partially from v1]
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-11-12 14:28:44 +02:00
Armin Wolf
97b726eb1d platform/x86: msi-wmi-platform: Fix typo in WMI GUID
The WMI driver core only supports GUID strings containing only
uppercase characters, however the GUID string used by the
msi-wmi-platform driver contains a single lowercase character.
This prevents the WMI driver core from matching said driver to
its WMI device.

Fix this by turning the lowercase character into a uppercase
character. Also update the WMI driver development guide to warn
about this.

Reported-by: Antheas Kapenekakis <lkml@antheas.dev>
Fixes: 9c0beb6b29 ("platform/x86: wmi: Add MSI WMI Platform driver")
Tested-by: Antheas Kapenekakis <lkml@antheas.dev>
Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Link: https://patch.msgid.link/20251110111253.16204-3-W_Armin@gmx.de
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-11-10 19:13:29 +02:00
Armin Wolf
c93433fd4e platform/x86: msi-wmi-platform: Only load on MSI devices
It turns out that the GUID used by the msi-wmi-platform driver
(ABBC0F60-8EA1-11D1-00A0-C90629100000) is not unique, but was instead
copied from the WIndows Driver Samples. This means that this driver
could load on devices from other manufacturers that also copied this
GUID, potentially causing hardware errors.

Prevent this by only loading on devices whitelisted via DMI. The DMI
matches where taken from the msi-ec driver.

Reported-by: Antheas Kapenekakis <lkml@antheas.dev>
Fixes: 9c0beb6b29 ("platform/x86: wmi: Add MSI WMI Platform driver")
Tested-by: Antheas Kapenekakis <lkml@antheas.dev>
Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Link: https://patch.msgid.link/20251110111253.16204-2-W_Armin@gmx.de
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-11-10 19:12:40 +02:00
Antheas Kapenekakis
f945afe01c platform/x86/amd: pmc: Add Lenovo Legion Go 2 to pmc quirk list
The Lenovo Legion Go 2 takes a long time to resume from suspend.
This is due to it having an nvme resume handler that interferes
with IOMMU mappings. It is a common issue with older Lenovo
laptops. Adding it to that quirk list fixes this issue.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4618
Suggested-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Antheas Kapenekakis <lkml@antheas.dev>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Link: https://patch.msgid.link/20251008135057.731928-1-lkml@antheas.dev
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-11-06 14:19:32 +02:00
Antheas Kapenekakis
c0ddc54016 platform/x86/amd/pmc: Add spurious_8042 to Xbox Ally
The Xbox Ally features a Van Gogh SoC that has spurious interrupts
during resume. We get the following logs:

atkbd_receive_byte: 20 callbacks suppressed
atkbd serio0: Spurious ACK on isa0060/serio0. Some program might be trying to access hardware directly.

So, add the spurious_8042 quirk for it. It does not have a keyboard, so
this does not result in any functional loss.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4659
Signed-off-by: Antheas Kapenekakis <lkml@antheas.dev>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Link: https://patch.msgid.link/20251024152152.3981721-3-lkml@antheas.dev
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-11-06 14:19:31 +02:00
Antheas Kapenekakis
db4a3f0fbe platform/x86/amd/pmc: Add support for Van Gogh SoC
The ROG Xbox Ally (non-X) SoC features a similar architecture to the
Steam Deck. While the Steam Deck supports S3 (s2idle causes a crash),
this support was dropped by the Xbox Ally which only S0ix suspend.

Since the handler is missing here, this causes the device to not suspend
and the AMD GPU driver to crash while trying to resume afterwards due to
a power hang.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4659
Signed-off-by: Antheas Kapenekakis <lkml@antheas.dev>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Acked-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Link: https://patch.msgid.link/20251024152152.3981721-2-lkml@antheas.dev
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-11-06 14:19:29 +02:00
Kurt Borja
a6003d90f0 platform/x86: alienware-wmi-wmax: Add support for the whole "G" family
Add support for the whole "Dell G" laptop family.

Cc: stable@vger.kernel.org
Signed-off-by: Kurt Borja <kuurtb@gmail.com>
Link: https://patch.msgid.link/20251103-family-supp-v1-5-a241075d1787@gmail.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-11-06 14:19:27 +02:00
Kurt Borja
21ebfff1cf platform/x86: alienware-wmi-wmax: Add support for the whole "X" family
Add support for the whole "Alienware X" laptop family.

Cc: stable@vger.kernel.org
Signed-off-by: Kurt Borja <kuurtb@gmail.com>
Link: https://patch.msgid.link/20251103-family-supp-v1-4-a241075d1787@gmail.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-11-06 14:19:26 +02:00
Kurt Borja
e8c3c875e1 platform/x86: alienware-wmi-wmax: Add support for the whole "M" family
Add support for the whole "Alienware M" laptop family.

Cc: stable@vger.kernel.org
Signed-off-by: Kurt Borja <kuurtb@gmail.com>
Link: https://patch.msgid.link/20251103-family-supp-v1-3-a241075d1787@gmail.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-11-06 14:19:24 +02:00
Kurt Borja
173b238087 platform/x86: alienware-wmi-wmax: Drop redundant DMI entries
The awcc_dmi_table[] uses DMI_MATCH() that supports partial matches. As
there is already "Alienware Area-51m" entry, "Alienware Area-51m R2" entry
is redundant.

Signed-off-by: Kurt Borja <kuurtb@gmail.com>
Link: https://patch.msgid.link/20251103-family-supp-v1-2-a241075d1787@gmail.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-11-06 14:19:23 +02:00
Kurt Borja
bd4f9f113d platform/x86: alienware-wmi-wmax: Fix "Alienware m16 R1 AMD" quirk order
Quirks are matched using dmi_first_match(), therefore move the
"Alienware m16 R1 AMD" entry above other m16 entries.

Reported-by: Cihan Ozakca <cozakca@outlook.com>
Fixes: e2468dc700 ("Revert "platform/x86: alienware-wmi-wmax: Add G-Mode support to Alienware m16 R1"")
Cc: stable@vger.kernel.org
Signed-off-by: Kurt Borja <kuurtb@gmail.com>
Link: https://patch.msgid.link/20251103-family-supp-v1-1-a241075d1787@gmail.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-11-06 14:19:21 +02:00
Randy Dunlap
5f20bc206b platform/x86: ISST: isst_if.h: fix all kernel-doc warnings
Fix all kernel-doc warnings in <uapi/linux/isst_if.h>:

- don't use "[]" in the variable name in kernel-doc
- add a few missing entries
- change "power_domain" to "power_domain_id" in kernel-doc to match
  the struct member name
- add a leading '@' on a few existing kernel-doc lines
- use '_' instead of '-' in struct member names

Examples (but not all 27 warnings):

Warning: include/uapi/linux/isst_if.h:63 struct member 'cpu_map'
 not described in 'isst_if_cpu_maps'
Warning: ../include/uapi/linux/isst_if.h:95 struct member 'req_count'
 not described in 'isst_if_io_regs'
Warning: include/uapi/linux/isst_if.h:132 struct member 'mbox_cmd'
 not described in 'isst_if_mbox_cmds'
Warning: ../include/uapi/linux/isst_if.h:183 struct member 'supported'
 not described in 'isst_core_power'
Warning: ../include/uapi/linux/isst_if.h:206 struct member
 'power_domain_id' not described in 'isst_clos_param'
Warning: ../include/uapi/linux/isst_if.h:239 struct member 'assoc_info'
 not described in 'isst_if_clos_assoc_cmds'
Warning: ../include/uapi/linux/isst_if.h:286 struct member 'sst_tf_support'
 not described in 'isst_perf_level_info'
Warning: ../include/uapi/linux/isst_if.h:375 struct member 'trl_freq_mhz'
 not described in 'isst_perf_level_data_info'
Warning: ../include/uapi/linux/isst_if.h:475 struct member 'max_buckets'
 not described in 'isst_turbo_freq_info'

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://patch.msgid.link/20251023194615.180824-1-rdunlap@infradead.org
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-11-06 14:19:20 +02:00
Kuppuswamy Sathyanarayanan
a229809c18 platform/x86: intel-uncore-freq: Add additional client processors
Add Intel uncore frequency driver support for Pantherlake, Wildcatlake
and Novalake processors.

Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: https://patch.msgid.link/20251022211733.3565526-1-sathyanarayanan.kuppuswamy@linux.intel.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-11-06 14:19:18 +02:00
Krishna Chomal
fb146a38cb platform/x86: hp-wmi: Add Omen 16-wf1xxx fan support
The newer HP Omen laptops, such as Omen 16-wf1xxx, use the same
WMI-based thermal profile interface as Victus 16-r1000 and 16-s1000
models.

Add the DMI board name "8C78" to the victus_s_thermal_profile_boards
list to enable proper fan and thermal mode control.

Tested on: HP Omen 16-wf1xxx (board 8C78)
Result:
* Fan RPMs are readable
* echo 0 | sudo tee /sys/devices/platform/hp-wmi/hwmon/*/pwm1_enable
  allows the fans to run on max RPM.

Signed-off-by: Krishna Chomal <krishna.chomal108@gmail.com>
Link: https://patch.msgid.link/20251018111001.56625-1-krishna.chomal108@gmail.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-11-06 14:19:06 +02:00
Jia Ston
5c72329716 platform/x86: huawei-wmi: add keys for HONOR models
HONOR MagicBook X16/X14 models produced in 2025 cannot use the Print
Screen and YOYO keys properly, with the system reporting them as
unknown key presses (codes: 0x028b and 0x028e).

To resolve this, a key_entry is added for both the HONOR Print Screen
key and the HONOR YOYO key, ensuring they function correctly on these
models.

Signed-off-by: Ston Jia <ston.jia@outlook.com>
Link: https://patch.msgid.link/20251029051804.220111-1-ston.jia@outlook.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-11-06 14:11:43 +02:00
Edip Hazuri
54afb047cd platform/x86: hp-wmi: mark Victus 16-r0 and 16-s0 for victus_s fan and thermal profile support
This patch adds Victus 16-r0 (8bbe) and Victus 16-s0(8bd4, 8bd5) laptop
DMI board name into existing list

Signed-off-by: Edip Hazuri <edip@medip.dev>
Link: https://patch.msgid.link/20251015181042.23961-3-edip@medip.dev
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-11-06 14:10:59 +02:00
Jianbo Liu
59630e2ccd xfrm: Prevent locally generated packets from direct output in tunnel mode
Add a check to ensure locally generated packets (skb->sk != NULL) do
not use direct output in tunnel mode, as these packets require proper
L2 header setup that is handled by the normal XFRM processing path.

Fixes: 5eddd76ec2 ("xfrm: fix tunnel mode TX datapath in packet offload mode")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2025-10-30 11:52:38 +01:00
Jianbo Liu
61fafbee6c xfrm: Determine inner GSO type from packet inner protocol
The GSO segmentation functions for ESP tunnel mode
(xfrm4_tunnel_gso_segment and xfrm6_tunnel_gso_segment) were
determining the inner packet's L2 protocol type by checking the static
x->inner_mode.family field from the xfrm state.

This is unreliable. In tunnel mode, the state's actual inner family
could be defined by x->inner_mode.family or by
x->inner_mode_iaf.family. Checking only the former can lead to a
mismatch with the actual packet being processed, causing GSO to create
segments with the wrong L2 header type.

This patch fixes the bug by deriving the inner mode directly from the
packet's inner protocol stored in XFRM_MODE_SKB_CB(skb)->protocol.

Instead of replicating the code, this patch modifies the
xfrm_ip2inner_mode helper function. It now correctly returns
&x->inner_mode if the selector family (x->sel.family) is already
specified, thereby handling both specific and AF_UNSPEC cases
appropriately.

With this change, ESP GSO can use xfrm_ip2inner_mode to get the
correct inner mode. It doesn't affect existing callers, as the updated
logic now mirrors the checks they were already performing externally.

Fixes: 26dbd66eab ("esp: choose the correct inner protocol for GSO on inter address family tunnels")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2025-10-30 11:52:31 +01:00
Jianbo Liu
082ef944e5 xfrm: Check inner packet family directly from skb_dst
In the output path, xfrm_dev_offload_ok and xfrm_get_inner_ipproto
need to determine the protocol family of the inner packet (skb) before
it gets encapsulated.

In xfrm_dev_offload_ok, the code checked x->inner_mode.family. This is
unreliable because, for states handling both IPv4 and IPv6, the
relevant inner family could be either x->inner_mode.family or
x->inner_mode_iaf.family. Checking only the former can lead to a
mismatch with the actual packet being processed.

In xfrm_get_inner_ipproto, the code checked x->outer_mode.family. This
is also incorrect for tunnel mode, as the inner packet's family can be
different from the outer header's family.

At both of these call sites, the skb variable holds the original inner
packet. The most direct and reliable source of truth for its protocol
family is its destination entry. This patch fixes the issue by using
skb_dst(skb)->ops->family to ensure protocol-specific headers are only
accessed for the correct packet type.

Fixes: 91d8a53db2 ("xfrm: fix offloading of cross-family tunnels")
Fixes: 45a98ef492 ("net/xfrm: IPsec tunnel mode fix inner_ipproto setting in sec_path")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2025-10-30 11:52:06 +01:00
Sabrina Dubroca
f2bc8231fd xfrm: check all hash buckets for leftover states during netns deletion
The current hlist_empty checks only test the first bucket of each
hashtable, ignoring any other bucket. They should be caught by the
WARN_ON for state_all, but better to make all the checks accurate.

Fixes: 73d189dce4 ("netns xfrm: per-netns xfrm_state_bydst hash")
Fixes: d320bbb306 ("netns xfrm: per-netns xfrm_state_bysrc hash")
Fixes: b754a4fd8f ("netns xfrm: per-netns xfrm_state_byspi hash")
Fixes: fe9f1d8779 ("xfrm: add state hashtable keyed by seq")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2025-10-21 10:42:45 +02:00
Sabrina Dubroca
1dcf617bec xfrm: set err and extack on failure to create pcpu SA
xfrm_state_construct can fail without setting an error if the
requested pcpu_num value is too big. Set err and add an extack message
to avoid confusing userspace.

Fixes: 1ddf9916ac ("xfrm: Add support for per cpu xfrm state handling.")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2025-10-21 10:42:45 +02:00
Sabrina Dubroca
7f02285764 xfrm: call xfrm_dev_state_delete when xfrm_state_migrate fails to add the state
In case xfrm_state_migrate fails after calling xfrm_dev_state_add, we
directly release the last reference and destroy the new state, without
calling xfrm_dev_state_delete (this only happens in
__xfrm_state_delete, which we're not calling on this path, since the
state was never added).

Call xfrm_dev_state_delete on error when an offload configuration was
provided.

Fixes: ab244a394c ("xfrm: Migrate offload configuration")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2025-10-21 10:42:44 +02:00
Sabrina Dubroca
5502bc4746 xfrm: make state as DEAD before final put when migrate fails
xfrm_state_migrate/xfrm_state_clone_and_setup create a new state, and
call xfrm_state_put to destroy it in case of
failure. __xfrm_state_destroy expects the state to be in
XFRM_STATE_DEAD, but we currently don't do that.

Reported-by: syzbot+5cd6299ede4d4f70987b@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=5cd6299ede4d4f70987b
Fixes: 78347c8c6b ("xfrm: Fix xfrm_state_migrate leak")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2025-10-21 10:42:43 +02:00
Sabrina Dubroca
10deb69864 xfrm: also call xfrm_state_delete_tunnel at destroy time for states that were never added
In commit b441cf3f8c ("xfrm: delete x->tunnel as we delete x"), I
missed the case where state creation fails between full
initialization (->init_state has been called) and being inserted on
the lists.

In this situation, ->init_state has been called, so for IPcomp
tunnels, the fallback tunnel has been created and added onto the
lists, but the user state never gets added, because we fail before
that. The user state doesn't go through __xfrm_state_delete, so we
don't call xfrm_state_delete_tunnel for those states, and we end up
leaking the FB tunnel.

There are several codepaths affected by this: the add/update paths, in
both net/key and xfrm, and the migrate code (xfrm_migrate,
xfrm_state_migrate). A "proper" rollback of the init_state work would
probably be doable in the add/update code, but for migrate it gets
more complicated as multiple states may be involved.

At some point, the new (not-inserted) state will be destroyed, so call
xfrm_state_delete_tunnel during xfrm_state_gc_destroy. Most states
will have their fallback tunnel cleaned up during __xfrm_state_delete,
which solves the issue that b441cf3f8c (and other patches before it)
aimed at. All states (including FB tunnels) will be removed from the
lists once xfrm_state_fini has called flush_work(&xfrm_state_gc_work).

Reported-by: syzbot+999eb23467f83f9bf9bf@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=999eb23467f83f9bf9bf
Fixes: b441cf3f8c ("xfrm: delete x->tunnel as we delete x")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2025-10-21 10:42:43 +02:00
Sabrina Dubroca
8d2a2a49c3 xfrm: drop SA reference in xfrm_state_update if dir doesn't match
We're not updating x1, but we still need to put() it.

Fixes: a4a87fa4e9 ("xfrm: Add Direction to the SA in or out")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2025-10-21 10:42:42 +02:00
62 changed files with 759 additions and 408 deletions

View File

@@ -54,6 +54,7 @@ to matching WMI devices using a struct wmi_device_id table:
::
static const struct wmi_device_id foo_id_table[] = {
/* Only use uppercase letters! */
{ "936DA01F-9ABD-4D9D-80C7-02AF85C822A8", NULL },
{ }
};

View File

@@ -9266,7 +9266,6 @@ M: Ido Schimmel <idosch@nvidia.com>
L: bridge@lists.linux.dev
L: netdev@vger.kernel.org
S: Maintained
W: http://www.linuxfoundation.org/en/Net:Bridge
F: include/linux/if_bridge.h
F: include/uapi/linux/if_bridge.h
F: include/linux/netfilter_bridge/

View File

@@ -182,6 +182,7 @@ bool einj_initialized __ro_after_init;
static void __iomem *einj_param;
static u32 v5param_size;
static u32 v66param_size;
static bool is_v2;
static void einj_exec_ctx_init(struct apei_exec_context *ctx)
@@ -283,6 +284,24 @@ static void check_vendor_extension(u64 paddr,
acpi_os_unmap_iomem(p, sizeof(v));
}
static u32 einjv2_init(struct einjv2_extension_struct *e)
{
if (e->revision != 1) {
pr_info("Unknown v2 extension revision %u\n", e->revision);
return 0;
}
if (e->length < sizeof(*e) || e->length > PAGE_SIZE) {
pr_info(FW_BUG "Bad1 v2 extension length %u\n", e->length);
return 0;
}
if ((e->length - sizeof(*e)) % sizeof(e->component_arr[0])) {
pr_info(FW_BUG "Bad2 v2 extension length %u\n", e->length);
return 0;
}
return (e->length - sizeof(*e)) / sizeof(e->component_arr[0]);
}
static void __iomem *einj_get_parameter_address(void)
{
int i;
@@ -310,28 +329,21 @@ static void __iomem *einj_get_parameter_address(void)
v5param_size = sizeof(v5param);
p = acpi_os_map_iomem(pa_v5, sizeof(*p));
if (p) {
int offset, len;
memcpy_fromio(&v5param, p, v5param_size);
acpi5 = 1;
check_vendor_extension(pa_v5, &v5param);
if (is_v2 && available_error_type & ACPI65_EINJV2_SUPP) {
len = v5param.einjv2_struct.length;
offset = offsetof(struct einjv2_extension_struct, component_arr);
max_nr_components = (len - offset) /
sizeof(v5param.einjv2_struct.component_arr[0]);
/*
* The first call to acpi_os_map_iomem above does not include the
* component array, instead it is used to read and calculate maximum
* number of components supported by the system. Below, the mapping
* is expanded to include the component array.
*/
if (available_error_type & ACPI65_EINJV2_SUPP) {
struct einjv2_extension_struct *e;
e = &v5param.einjv2_struct;
max_nr_components = einjv2_init(e);
/* remap including einjv2_extension_struct */
acpi_os_unmap_iomem(p, v5param_size);
offset = offsetof(struct set_error_type_with_address, einjv2_struct);
v5param_size = offset + struct_size(&v5param.einjv2_struct,
component_arr, max_nr_components);
p = acpi_os_map_iomem(pa_v5, v5param_size);
v66param_size = v5param_size - sizeof(*e) + e->length;
p = acpi_os_map_iomem(pa_v5, v66param_size);
}
return p;
}
}
@@ -527,6 +539,7 @@ static int __einj_error_inject(u32 type, u32 flags, u64 param1, u64 param2,
u64 param3, u64 param4)
{
struct apei_exec_context ctx;
u32 param_size = is_v2 ? v66param_size : v5param_size;
u64 val, trigger_paddr, timeout = FIRMWARE_TIMEOUT;
int i, rc;
@@ -539,11 +552,11 @@ static int __einj_error_inject(u32 type, u32 flags, u64 param1, u64 param2,
if (acpi5) {
struct set_error_type_with_address *v5param;
v5param = kmalloc(v5param_size, GFP_KERNEL);
v5param = kmalloc(param_size, GFP_KERNEL);
if (!v5param)
return -ENOMEM;
memcpy_fromio(v5param, einj_param, v5param_size);
memcpy_fromio(v5param, einj_param, param_size);
v5param->type = type;
if (type & ACPI5_VENDOR_BIT) {
switch (vendor_flags) {
@@ -601,7 +614,7 @@ static int __einj_error_inject(u32 type, u32 flags, u64 param1, u64 param2,
break;
}
}
memcpy_toio(einj_param, v5param, v5param_size);
memcpy_toio(einj_param, v5param, param_size);
kfree(v5param);
} else {
rc = apei_exec_run(&ctx, ACPI_EINJ_SET_ERROR_TYPE);
@@ -1132,9 +1145,14 @@ static void einj_remove(struct faux_device *fdev)
struct apei_exec_context ctx;
if (einj_param) {
acpi_size size = (acpi5) ?
v5param_size :
sizeof(struct einj_parameter);
acpi_size size;
if (v66param_size)
size = v66param_size;
else if (acpi5)
size = v5param_size;
else
size = sizeof(struct einj_parameter);
acpi_os_unmap_iomem(einj_param, size);
if (vendor_errors.size)

View File

@@ -888,12 +888,15 @@ static void device_resume_early(struct device *dev, pm_message_t state, bool asy
TRACE_DEVICE(dev);
TRACE_RESUME(0);
if (dev->power.syscore || dev->power.direct_complete)
if (dev->power.direct_complete)
goto Out;
if (!dev->power.is_late_suspended)
goto Out;
if (dev->power.syscore)
goto Skip;
if (!dpm_wait_for_superior(dev, async))
goto Out;
@@ -926,11 +929,11 @@ Run:
Skip:
dev->power.is_late_suspended = false;
pm_runtime_enable(dev);
Out:
TRACE_RESUME(error);
pm_runtime_enable(dev);
complete_all(&dev->power.completion);
if (error) {
@@ -1615,12 +1618,6 @@ static void device_suspend_late(struct device *dev, pm_message_t state, bool asy
TRACE_DEVICE(dev);
TRACE_SUSPEND(0);
/*
* Disable runtime PM for the device without checking if there is a
* pending resume request for it.
*/
__pm_runtime_disable(dev, false);
dpm_wait_for_subordinate(dev, async);
if (READ_ONCE(async_error))
@@ -1631,9 +1628,18 @@ static void device_suspend_late(struct device *dev, pm_message_t state, bool asy
goto Complete;
}
if (dev->power.syscore || dev->power.direct_complete)
if (dev->power.direct_complete)
goto Complete;
/*
* Disable runtime PM for the device without checking if there is a
* pending resume request for it.
*/
__pm_runtime_disable(dev, false);
if (dev->power.syscore)
goto Skip;
if (dev->pm_domain) {
info = "late power domain ";
callback = pm_late_early_op(&dev->pm_domain->ops, state);
@@ -1664,6 +1670,7 @@ Run:
WRITE_ONCE(async_error, error);
dpm_save_failed_dev(dev_name(dev));
pm_dev_err(dev, state, async ? " async late" : " late", error);
pm_runtime_enable(dev);
goto Complete;
}
dpm_propagate_wakeup_to_parent(dev);

View File

@@ -376,8 +376,18 @@ static int hellcreek_led_setup(struct hellcreek *hellcreek)
hellcreek_set_brightness(hellcreek, STATUS_OUT_IS_GM, 1);
/* Register both leds */
led_classdev_register(hellcreek->dev, &hellcreek->led_sync_good);
led_classdev_register(hellcreek->dev, &hellcreek->led_is_gm);
ret = led_classdev_register(hellcreek->dev, &hellcreek->led_sync_good);
if (ret) {
dev_err(hellcreek->dev, "Failed to register sync_good LED\n");
goto out;
}
ret = led_classdev_register(hellcreek->dev, &hellcreek->led_is_gm);
if (ret) {
dev_err(hellcreek->dev, "Failed to register is_gm LED\n");
led_classdev_unregister(&hellcreek->led_sync_good);
goto out;
}
ret = 0;

View File

@@ -540,6 +540,7 @@ static void lan937x_set_tune_adj(struct ksz_device *dev, int port,
ksz_pread16(dev, port, reg, &data16);
/* Update tune Adjust */
data16 &= ~PORT_TUNE_ADJ;
data16 |= FIELD_PREP(PORT_TUNE_ADJ, val);
ksz_pwrite16(dev, port, reg, data16);

View File

@@ -282,7 +282,7 @@ static int airoha_ppe_foe_entry_prepare(struct airoha_eth *eth,
if (!airoha_is_valid_gdm_port(eth, port))
return -EINVAL;
if (dsa_port >= 0)
if (dsa_port >= 0 || eth->ports[1])
pse_port = port->id == 4 ? FE_PSE_PORT_GDM4
: port->id;
else

View File

@@ -1296,7 +1296,8 @@ static void be_xmit_flush(struct be_adapter *adapter, struct be_tx_obj *txo)
(adapter->bmc_filt_mask & BMC_FILT_MULTICAST)
static bool be_send_pkt_to_bmc(struct be_adapter *adapter,
struct sk_buff **skb)
struct sk_buff **skb,
struct be_wrb_params *wrb_params)
{
struct ethhdr *eh = (struct ethhdr *)(*skb)->data;
bool os2bmc = false;
@@ -1360,7 +1361,7 @@ done:
* to BMC, asic expects the vlan to be inline in the packet.
*/
if (os2bmc)
*skb = be_insert_vlan_in_pkt(adapter, *skb, NULL);
*skb = be_insert_vlan_in_pkt(adapter, *skb, wrb_params);
return os2bmc;
}
@@ -1387,7 +1388,7 @@ static netdev_tx_t be_xmit(struct sk_buff *skb, struct net_device *netdev)
/* if os2bmc is enabled and if the pkt is destined to bmc,
* enqueue the pkt a 2nd time with mgmt bit set.
*/
if (be_send_pkt_to_bmc(adapter, &skb)) {
if (be_send_pkt_to_bmc(adapter, &skb, &wrb_params)) {
BE_WRB_F_SET(wrb_params.features, OS2BMC, 1);
wrb_cnt = be_xmit_enqueue(adapter, txo, skb, &wrb_params);
if (unlikely(!wrb_cnt))

View File

@@ -3246,7 +3246,7 @@ void ice_ptp_init(struct ice_pf *pf)
err = ice_ptp_init_port(pf, &ptp->port);
if (err)
goto err_exit;
goto err_clean_pf;
/* Start the PHY timestamping block */
ice_ptp_reset_phy_timestamping(pf);
@@ -3263,13 +3263,19 @@ void ice_ptp_init(struct ice_pf *pf)
dev_info(ice_pf_to_dev(pf), "PTP init successful\n");
return;
err_clean_pf:
mutex_destroy(&ptp->port.ps_lock);
ice_ptp_cleanup_pf(pf);
err_exit:
/* If we registered a PTP clock, release it */
if (pf->ptp.clock) {
ptp_clock_unregister(ptp->clock);
pf->ptp.clock = NULL;
}
ptp->state = ICE_PTP_ERROR;
/* Keep ICE_PTP_UNINIT state to avoid ambiguity at driver unload
* and to avoid duplicated resources release.
*/
ptp->state = ICE_PTP_UNINIT;
dev_err(ice_pf_to_dev(pf), "PTP failed %d\n", err);
}
@@ -3282,9 +3288,19 @@ err_exit:
*/
void ice_ptp_release(struct ice_pf *pf)
{
if (pf->ptp.state != ICE_PTP_READY)
if (pf->ptp.state == ICE_PTP_UNINIT)
return;
if (pf->ptp.state != ICE_PTP_READY) {
mutex_destroy(&pf->ptp.port.ps_lock);
ice_ptp_cleanup_pf(pf);
if (pf->ptp.clock) {
ptp_clock_unregister(pf->ptp.clock);
pf->ptp.clock = NULL;
}
return;
}
pf->ptp.state = ICE_PTP_UNINIT;
/* Disable timestamping for both Tx and Rx */

View File

@@ -63,6 +63,8 @@ destroy_wqs:
destroy_workqueue(adapter->vc_event_wq);
for (i = 0; i < adapter->max_vports; i++) {
if (!adapter->vport_config[i])
continue;
kfree(adapter->vport_config[i]->user_config.q_coalesce);
kfree(adapter->vport_config[i]);
adapter->vport_config[i] = NULL;

View File

@@ -324,10 +324,8 @@ err_xa:
free_irq(irq->map.virq, &irq->nh);
err_req_irq:
#ifdef CONFIG_RFS_ACCEL
if (i && rmap && *rmap) {
free_irq_cpu_rmap(*rmap);
*rmap = NULL;
}
if (i && rmap && *rmap)
irq_cpu_rmap_remove(*rmap, irq->map.virq);
err_irq_rmap:
#endif
if (i && pci_msix_can_alloc_dyn(dev->pdev))

View File

@@ -601,6 +601,8 @@ int mlxsw_linecard_devlink_info_get(struct mlxsw_linecard *linecard,
err = devlink_info_version_fixed_put(req,
DEVLINK_INFO_VERSION_GENERIC_FW_PSID,
info->psid);
if (err)
goto unlock;
sprintf(buf, "%u.%u.%u", info->fw_major, info->fw_minor,
info->fw_sub_minor);

View File

@@ -830,8 +830,10 @@ int mlxsw_sp_flower_stats(struct mlxsw_sp *mlxsw_sp,
return -EINVAL;
rule = mlxsw_sp_acl_rule_lookup(mlxsw_sp, ruleset, f->cookie);
if (!rule)
return -EINVAL;
if (!rule) {
err = -EINVAL;
goto err_rule_get_stats;
}
err = mlxsw_sp_acl_rule_get_stats(mlxsw_sp, rule, &packets, &bytes,
&drops, &lastuse, &used_hw_stats);

View File

@@ -4,6 +4,7 @@
* Copyright (c) 2019-2020 Marvell International Ltd.
*/
#include <linux/array_size.h>
#include <linux/netdevice.h>
#include <linux/etherdevice.h>
#include <linux/skbuff.h>
@@ -960,7 +961,7 @@ static inline void qede_tpa_cont(struct qede_dev *edev,
{
int i;
for (i = 0; cqe->len_list[i]; i++)
for (i = 0; cqe->len_list[i] && i < ARRAY_SIZE(cqe->len_list); i++)
qede_fill_frag_skb(edev, rxq, cqe->tpa_agg_index,
le16_to_cpu(cqe->len_list[i]));
@@ -985,7 +986,7 @@ static int qede_tpa_end(struct qede_dev *edev,
dma_unmap_page(rxq->dev, tpa_info->buffer.mapping,
PAGE_SIZE, rxq->data_direction);
for (i = 0; cqe->len_list[i]; i++)
for (i = 0; cqe->len_list[i] && i < ARRAY_SIZE(cqe->len_list); i++)
qede_fill_frag_skb(edev, rxq, cqe->tpa_agg_index,
le16_to_cpu(cqe->len_list[i]));
if (unlikely(i > 1))

View File

@@ -260,6 +260,7 @@ void gelic_card_down(struct gelic_card *card)
if (atomic_dec_if_positive(&card->users) == 0) {
pr_debug("%s: real do\n", __func__);
napi_disable(&card->napi);
timer_delete_sync(&card->rx_oom_timer);
/*
* Disable irq. Wireless interrupts will
* be disabled later if any
@@ -970,7 +971,8 @@ static void gelic_net_pass_skb_up(struct gelic_descr *descr,
* gelic_card_decode_one_descr - processes an rx descriptor
* @card: card structure
*
* returns 1 if a packet has been sent to the stack, otherwise 0
* returns 1 if a packet has been sent to the stack, -ENOMEM on skb alloc
* failure, otherwise 0
*
* processes an rx descriptor by iommu-unmapping the data buffer and passing
* the packet up to the stack
@@ -981,16 +983,18 @@ static int gelic_card_decode_one_descr(struct gelic_card *card)
struct gelic_descr_chain *chain = &card->rx_chain;
struct gelic_descr *descr = chain->head;
struct net_device *netdev = NULL;
int dmac_chain_ended;
int dmac_chain_ended = 0;
int prepare_rx_ret;
status = gelic_descr_get_status(descr);
if (status == GELIC_DESCR_DMA_CARDOWNED)
return 0;
if (status == GELIC_DESCR_DMA_NOT_IN_USE) {
if (status == GELIC_DESCR_DMA_NOT_IN_USE || !descr->skb) {
dev_dbg(ctodev(card), "dormant descr? %p\n", descr);
return 0;
dmac_chain_ended = 1;
goto refill;
}
/* netdevice select */
@@ -1048,9 +1052,10 @@ static int gelic_card_decode_one_descr(struct gelic_card *card)
refill:
/* is the current descriptor terminated with next_descr == NULL? */
dmac_chain_ended =
be32_to_cpu(descr->hw_regs.dmac_cmd_status) &
GELIC_DESCR_RX_DMA_CHAIN_END;
if (!dmac_chain_ended)
dmac_chain_ended =
be32_to_cpu(descr->hw_regs.dmac_cmd_status) &
GELIC_DESCR_RX_DMA_CHAIN_END;
/*
* So that always DMAC can see the end
* of the descriptor chain to avoid
@@ -1062,10 +1067,11 @@ refill:
gelic_descr_set_status(descr, GELIC_DESCR_DMA_NOT_IN_USE);
/*
* this call can fail, but for now, just leave this
* descriptor without skb
* this call can fail, propagate the error
*/
gelic_descr_prepare_rx(card, descr);
prepare_rx_ret = gelic_descr_prepare_rx(card, descr);
if (prepare_rx_ret)
return prepare_rx_ret;
chain->tail = descr;
chain->head = descr->next;
@@ -1087,6 +1093,13 @@ refill:
return 1;
}
static void gelic_rx_oom_timer(struct timer_list *t)
{
struct gelic_card *card = timer_container_of(card, t, rx_oom_timer);
napi_schedule(&card->napi);
}
/**
* gelic_net_poll - NAPI poll function called by the stack to return packets
* @napi: napi structure
@@ -1099,14 +1112,22 @@ static int gelic_net_poll(struct napi_struct *napi, int budget)
{
struct gelic_card *card = container_of(napi, struct gelic_card, napi);
int packets_done = 0;
int work_result = 0;
while (packets_done < budget) {
if (!gelic_card_decode_one_descr(card))
work_result = gelic_card_decode_one_descr(card);
if (work_result != 1)
break;
packets_done++;
}
if (work_result == -ENOMEM) {
napi_complete_done(napi, packets_done);
mod_timer(&card->rx_oom_timer, jiffies + 1);
return packets_done;
}
if (packets_done < budget) {
napi_complete_done(napi, packets_done);
gelic_card_rx_irq_on(card);
@@ -1576,6 +1597,8 @@ static struct gelic_card *gelic_alloc_card_net(struct net_device **netdev)
mutex_init(&card->updown_lock);
atomic_set(&card->users, 0);
timer_setup(&card->rx_oom_timer, gelic_rx_oom_timer, 0);
return card;
}

View File

@@ -268,6 +268,7 @@ struct gelic_vlan_id {
struct gelic_card {
struct napi_struct napi;
struct net_device *netdev[GELIC_PORT_MAX];
struct timer_list rx_oom_timer;
/*
* hypervisor requires irq_status should be
* 8 bytes aligned, but u64 member is

View File

@@ -637,6 +637,9 @@ static int phylink_validate(struct phylink *pl, unsigned long *supported,
static void phylink_fill_fixedlink_supported(unsigned long *supported)
{
linkmode_set_bit(ETHTOOL_LINK_MODE_Pause_BIT, supported);
linkmode_set_bit(ETHTOOL_LINK_MODE_Asym_Pause_BIT, supported);
linkmode_set_bit(ETHTOOL_LINK_MODE_Autoneg_BIT, supported);
linkmode_set_bit(ETHTOOL_LINK_MODE_10baseT_Half_BIT, supported);
linkmode_set_bit(ETHTOOL_LINK_MODE_10baseT_Full_BIT, supported);
linkmode_set_bit(ETHTOOL_LINK_MODE_100baseT_Half_BIT, supported);

View File

@@ -392,14 +392,12 @@ static netdev_tx_t veth_xmit(struct sk_buff *skb, struct net_device *dev)
}
/* Restore Eth hdr pulled by dev_forward_skb/eth_type_trans */
__skb_push(skb, ETH_HLEN);
/* Depend on prior success packets started NAPI consumer via
* __veth_xdp_flush(). Cancel TXQ stop if consumer stopped,
* paired with empty check in veth_poll().
*/
netif_tx_stop_queue(txq);
smp_mb__after_atomic();
if (unlikely(__ptr_ring_empty(&rq->xdp_ring)))
netif_tx_wake_queue(txq);
/* Makes sure NAPI peer consumer runs. Consumer is responsible
* for starting txq again, until then ndo_start_xmit (this
* function) will not be invoked by the netstack again.
*/
__veth_xdp_flush(rq);
break;
case NET_RX_DROP: /* same as NET_XMIT_DROP */
drop:
@@ -900,17 +898,9 @@ static int veth_xdp_rcv(struct veth_rq *rq, int budget,
struct veth_xdp_tx_bq *bq,
struct veth_stats *stats)
{
struct veth_priv *priv = netdev_priv(rq->dev);
int queue_idx = rq->xdp_rxq.queue_index;
struct netdev_queue *peer_txq;
struct net_device *peer_dev;
int i, done = 0, n_xdpf = 0;
void *xdpf[VETH_XDP_BATCH];
/* NAPI functions as RCU section */
peer_dev = rcu_dereference_check(priv->peer, rcu_read_lock_bh_held());
peer_txq = peer_dev ? netdev_get_tx_queue(peer_dev, queue_idx) : NULL;
for (i = 0; i < budget; i++) {
void *ptr = __ptr_ring_consume(&rq->xdp_ring);
@@ -959,9 +949,6 @@ static int veth_xdp_rcv(struct veth_rq *rq, int budget,
rq->stats.vs.xdp_packets += done;
u64_stats_update_end(&rq->stats.syncp);
if (peer_txq && unlikely(netif_tx_queue_stopped(peer_txq)))
netif_tx_wake_queue(peer_txq);
return done;
}
@@ -969,12 +956,20 @@ static int veth_poll(struct napi_struct *napi, int budget)
{
struct veth_rq *rq =
container_of(napi, struct veth_rq, xdp_napi);
struct veth_priv *priv = netdev_priv(rq->dev);
int queue_idx = rq->xdp_rxq.queue_index;
struct netdev_queue *peer_txq;
struct veth_stats stats = {};
struct net_device *peer_dev;
struct veth_xdp_tx_bq bq;
int done;
bq.count = 0;
/* NAPI functions as RCU section */
peer_dev = rcu_dereference_check(priv->peer, rcu_read_lock_bh_held());
peer_txq = peer_dev ? netdev_get_tx_queue(peer_dev, queue_idx) : NULL;
xdp_set_return_frame_no_direct();
done = veth_xdp_rcv(rq, budget, &bq, &stats);
@@ -996,6 +991,13 @@ static int veth_poll(struct napi_struct *napi, int budget)
veth_xdp_flush(rq, &bq);
xdp_clear_return_frame_no_direct();
/* Release backpressure per NAPI poll */
smp_rmb(); /* Paired with netif_tx_stop_queue set_bit */
if (peer_txq && netif_tx_queue_stopped(peer_txq)) {
txq_trans_cond_update(peer_txq);
netif_tx_wake_queue(peer_txq);
}
return done;
}

View File

@@ -7694,6 +7694,13 @@ int rtw89_hw_scan_add_chan_list_ax(struct rtw89_dev *rtwdev,
INIT_LIST_HEAD(&list);
list_for_each_entry_safe(ch_info, tmp, &scan_info->chan_list, list) {
/* The operating channel (tx_null == true) should
* not be last in the list, to avoid breaking
* RTL8851BU and RTL8832BU.
*/
if (list_len + 1 == RTW89_SCAN_LIST_LIMIT_AX && ch_info->tx_null)
break;
list_move_tail(&ch_info->list, &list);
list_len++;

View File

@@ -545,6 +545,7 @@ config MSI_WMI
config MSI_WMI_PLATFORM
tristate "MSI WMI Platform features"
depends on ACPI_WMI
depends on DMI
depends on HWMON
help
Say Y here if you want to have support for WMI-based platform features

View File

@@ -102,6 +102,7 @@ MODULE_ALIAS("wmi:676AA15E-6A47-4D9F-A2CC-1E6D18D14026");
enum acer_wmi_event_ids {
WMID_HOTKEY_EVENT = 0x1,
WMID_BACKLIGHT_EVENT = 0x4,
WMID_ACCEL_OR_KBD_DOCK_EVENT = 0x5,
WMID_GAMING_TURBO_KEY_EVENT = 0x7,
WMID_AC_EVENT = 0x8,
@@ -2369,6 +2370,9 @@ static void acer_wmi_notify(union acpi_object *obj, void *context)
sparse_keymap_report_event(acer_wmi_input_dev, scancode, 1, true);
}
break;
case WMID_BACKLIGHT_EVENT:
/* Already handled by acpi-video */
break;
case WMID_ACCEL_OR_KBD_DOCK_EVENT:
acer_gsensor_event();
acer_kbd_dock_event(&return_value);

View File

@@ -122,6 +122,14 @@ static const struct dmi_system_id fwbug_list[] = {
DMI_MATCH(DMI_PRODUCT_NAME, "21A1"),
}
},
{
.ident = "ROG Xbox Ally RC73YA",
.driver_data = &quirk_spurious_8042,
.matches = {
DMI_MATCH(DMI_BOARD_VENDOR, "ASUSTeK COMPUTER INC."),
DMI_MATCH(DMI_BOARD_NAME, "RC73YA"),
}
},
/* https://bugzilla.kernel.org/show_bug.cgi?id=218024 */
{
.ident = "V14 G4 AMN",
@@ -204,6 +212,23 @@ static const struct dmi_system_id fwbug_list[] = {
DMI_MATCH(DMI_PRODUCT_NAME, "82ND"),
}
},
/* https://gitlab.freedesktop.org/drm/amd/-/issues/4618 */
{
.ident = "Lenovo Legion Go 2",
.driver_data = &quirk_s2idle_bug,
.matches = {
DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
DMI_MATCH(DMI_PRODUCT_NAME, "83N0"),
}
},
{
.ident = "Lenovo Legion Go 2",
.driver_data = &quirk_s2idle_bug,
.matches = {
DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
DMI_MATCH(DMI_PRODUCT_NAME, "83N1"),
}
},
/* https://gitlab.freedesktop.org/drm/amd/-/issues/2684 */
{
.ident = "HP Laptop 15s-eq2xxx",

View File

@@ -106,6 +106,7 @@ static void amd_pmc_get_ip_info(struct amd_pmc_dev *dev)
switch (dev->cpu_id) {
case AMD_CPU_ID_PCO:
case AMD_CPU_ID_RN:
case AMD_CPU_ID_VG:
case AMD_CPU_ID_YC:
case AMD_CPU_ID_CB:
dev->num_ips = 12;
@@ -517,6 +518,7 @@ static int amd_pmc_get_os_hint(struct amd_pmc_dev *dev)
case AMD_CPU_ID_PCO:
return MSG_OS_HINT_PCO;
case AMD_CPU_ID_RN:
case AMD_CPU_ID_VG:
case AMD_CPU_ID_YC:
case AMD_CPU_ID_CB:
case AMD_CPU_ID_PS:
@@ -717,6 +719,7 @@ static const struct pci_device_id pmc_pci_ids[] = {
{ PCI_DEVICE(PCI_VENDOR_ID_AMD, AMD_CPU_ID_RV) },
{ PCI_DEVICE(PCI_VENDOR_ID_AMD, AMD_CPU_ID_SP) },
{ PCI_DEVICE(PCI_VENDOR_ID_AMD, AMD_CPU_ID_SHP) },
{ PCI_DEVICE(PCI_VENDOR_ID_AMD, AMD_CPU_ID_VG) },
{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_1AH_M20H_ROOT) },
{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_1AH_M60H_ROOT) },
{ }

View File

@@ -156,6 +156,7 @@ void amd_mp2_stb_deinit(struct amd_pmc_dev *dev);
#define AMD_CPU_ID_RN 0x1630
#define AMD_CPU_ID_PCO AMD_CPU_ID_RV
#define AMD_CPU_ID_CZN AMD_CPU_ID_RN
#define AMD_CPU_ID_VG 0x1645
#define AMD_CPU_ID_YC 0x14B5
#define AMD_CPU_ID_CB 0x14D8
#define AMD_CPU_ID_PS 0x14E8

View File

@@ -89,6 +89,14 @@ static struct awcc_quirks generic_quirks = {
static struct awcc_quirks empty_quirks;
static const struct dmi_system_id awcc_dmi_table[] __initconst = {
{
.ident = "Alienware 16 Aurora",
.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "Alienware"),
DMI_MATCH(DMI_PRODUCT_NAME, "Alienware 16 Aurora"),
},
.driver_data = &g_series_quirks,
},
{
.ident = "Alienware Area-51m",
.matches = {
@@ -98,26 +106,18 @@ static const struct dmi_system_id awcc_dmi_table[] __initconst = {
.driver_data = &generic_quirks,
},
{
.ident = "Alienware Area-51m R2",
.ident = "Alienware m15",
.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "Alienware"),
DMI_MATCH(DMI_PRODUCT_NAME, "Alienware Area-51m R2"),
DMI_MATCH(DMI_PRODUCT_NAME, "Alienware m15"),
},
.driver_data = &generic_quirks,
},
{
.ident = "Alienware m15 R5",
.ident = "Alienware m16 R1 AMD",
.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "Alienware"),
DMI_MATCH(DMI_PRODUCT_NAME, "Alienware m15 R5"),
},
.driver_data = &generic_quirks,
},
{
.ident = "Alienware m15 R7",
.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "Alienware"),
DMI_MATCH(DMI_PRODUCT_NAME, "Alienware m15 R7"),
DMI_MATCH(DMI_PRODUCT_NAME, "Alienware m16 R1 AMD"),
},
.driver_data = &generic_quirks,
},
@@ -129,14 +129,6 @@ static const struct dmi_system_id awcc_dmi_table[] __initconst = {
},
.driver_data = &g_series_quirks,
},
{
.ident = "Alienware m16 R1 AMD",
.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "Alienware"),
DMI_MATCH(DMI_PRODUCT_NAME, "Alienware m16 R1 AMD"),
},
.driver_data = &generic_quirks,
},
{
.ident = "Alienware m16 R2",
.matches = {
@@ -146,114 +138,66 @@ static const struct dmi_system_id awcc_dmi_table[] __initconst = {
.driver_data = &generic_quirks,
},
{
.ident = "Alienware m17 R5",
.ident = "Alienware m17",
.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "Alienware"),
DMI_MATCH(DMI_PRODUCT_NAME, "Alienware m17 R5 AMD"),
DMI_MATCH(DMI_PRODUCT_NAME, "Alienware m17"),
},
.driver_data = &generic_quirks,
},
{
.ident = "Alienware m18 R2",
.ident = "Alienware m18",
.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "Alienware"),
DMI_MATCH(DMI_PRODUCT_NAME, "Alienware m18 R2"),
DMI_MATCH(DMI_PRODUCT_NAME, "Alienware m18"),
},
.driver_data = &generic_quirks,
},
{
.ident = "Alienware x15 R1",
.ident = "Alienware x15",
.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "Alienware"),
DMI_MATCH(DMI_PRODUCT_NAME, "Alienware x15 R1"),
DMI_MATCH(DMI_PRODUCT_NAME, "Alienware x15"),
},
.driver_data = &generic_quirks,
},
{
.ident = "Alienware x15 R2",
.ident = "Alienware x17",
.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "Alienware"),
DMI_MATCH(DMI_PRODUCT_NAME, "Alienware x15 R2"),
DMI_MATCH(DMI_PRODUCT_NAME, "Alienware x17"),
},
.driver_data = &generic_quirks,
},
{
.ident = "Alienware x17 R2",
.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "Alienware"),
DMI_MATCH(DMI_PRODUCT_NAME, "Alienware x17 R2"),
},
.driver_data = &generic_quirks,
},
{
.ident = "Dell Inc. G15 5510",
.ident = "Dell Inc. G15",
.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
DMI_MATCH(DMI_PRODUCT_NAME, "Dell G15 5510"),
DMI_MATCH(DMI_PRODUCT_NAME, "Dell G15"),
},
.driver_data = &g_series_quirks,
},
{
.ident = "Dell Inc. G15 5511",
.ident = "Dell Inc. G16",
.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
DMI_MATCH(DMI_PRODUCT_NAME, "Dell G15 5511"),
DMI_MATCH(DMI_PRODUCT_NAME, "Dell G16"),
},
.driver_data = &g_series_quirks,
},
{
.ident = "Dell Inc. G15 5515",
.ident = "Dell Inc. G3",
.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
DMI_MATCH(DMI_PRODUCT_NAME, "Dell G15 5515"),
DMI_MATCH(DMI_PRODUCT_NAME, "G3"),
},
.driver_data = &g_series_quirks,
},
{
.ident = "Dell Inc. G15 5530",
.ident = "Dell Inc. G5",
.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
DMI_MATCH(DMI_PRODUCT_NAME, "Dell G15 5530"),
},
.driver_data = &g_series_quirks,
},
{
.ident = "Dell Inc. G16 7630",
.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
DMI_MATCH(DMI_PRODUCT_NAME, "Dell G16 7630"),
},
.driver_data = &g_series_quirks,
},
{
.ident = "Dell Inc. G3 3500",
.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
DMI_MATCH(DMI_PRODUCT_NAME, "G3 3500"),
},
.driver_data = &g_series_quirks,
},
{
.ident = "Dell Inc. G3 3590",
.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
DMI_MATCH(DMI_PRODUCT_NAME, "G3 3590"),
},
.driver_data = &g_series_quirks,
},
{
.ident = "Dell Inc. G5 5500",
.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
DMI_MATCH(DMI_PRODUCT_NAME, "G5 5500"),
},
.driver_data = &g_series_quirks,
},
{
.ident = "Dell Inc. G5 5505",
.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
DMI_MATCH(DMI_PRODUCT_NAME, "G5 5505"),
DMI_MATCH(DMI_PRODUCT_NAME, "G5"),
},
.driver_data = &g_series_quirks,
},

View File

@@ -92,9 +92,11 @@ static const char * const victus_thermal_profile_boards[] = {
"8A25"
};
/* DMI Board names of Victus 16-r1000 and Victus 16-s1000 laptops */
/* DMI Board names of Victus 16-r and Victus 16-s laptops */
static const char * const victus_s_thermal_profile_boards[] = {
"8C99", "8C9C"
"8BBE", "8BD4", "8BD5",
"8C78", "8C99", "8C9C",
"8D41",
};
enum hp_wmi_radio {

View File

@@ -81,6 +81,10 @@ static const struct key_entry huawei_wmi_keymap[] = {
{ KE_KEY, 0x289, { KEY_WLAN } },
// Huawei |M| key
{ KE_KEY, 0x28a, { KEY_CONFIG } },
// HONOR YOYO key
{ KE_KEY, 0x28b, { KEY_NOTIFICATION_CENTER } },
// HONOR print screen
{ KE_KEY, 0x28e, { KEY_PRINT } },
// Keyboard backlit
{ KE_IGNORE, 0x293, { KEY_KBDILLUMTOGGLE } },
{ KE_IGNORE, 0x294, { KEY_KBDILLUMUP } },

View File

@@ -55,6 +55,7 @@ static const struct acpi_device_id intel_hid_ids[] = {
{ "INTC10CB" },
{ "INTC10CC" },
{ "INTC10F1" },
{ "INTC10F2" },
{ }
};
MODULE_DEVICE_TABLE(acpi, intel_hid_ids);

View File

@@ -108,11 +108,11 @@ static int isst_if_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
ret = pci_read_config_dword(pdev, 0xD0, &mmio_base);
if (ret)
return ret;
return pcibios_err_to_errno(ret);
ret = pci_read_config_dword(pdev, 0xFC, &pcu_base);
if (ret)
return ret;
return pcibios_err_to_errno(ret);
pcu_base &= GENMASK(10, 0);
base_addr = (u64)mmio_base << 23 | (u64) pcu_base << 12;

View File

@@ -40,7 +40,7 @@
* @agent_type_mask: Bit mask of all hardware agents for this domain
* @uncore_attr_group: Attribute group storage
* @max_freq_khz_kobj_attr: Storage for kobject attribute max_freq_khz
* @mix_freq_khz_kobj_attr: Storage for kobject attribute min_freq_khz
* @min_freq_khz_kobj_attr: Storage for kobject attribute min_freq_khz
* @initial_max_freq_khz_kobj_attr: Storage for kobject attribute initial_max_freq_khz
* @initial_min_freq_khz_kobj_attr: Storage for kobject attribute initial_min_freq_khz
* @current_freq_khz_kobj_attr: Storage for kobject attribute current_freq_khz
@@ -48,13 +48,14 @@
* @fabric_cluster_id_kobj_attr: Storage for kobject attribute fabric_cluster_id
* @package_id_kobj_attr: Storage for kobject attribute package_id
* @elc_low_threshold_percent_kobj_attr:
Storage for kobject attribute elc_low_threshold_percent
* Storage for kobject attribute elc_low_threshold_percent
* @elc_high_threshold_percent_kobj_attr:
Storage for kobject attribute elc_high_threshold_percent
* Storage for kobject attribute elc_high_threshold_percent
* @elc_high_threshold_enable_kobj_attr:
Storage for kobject attribute elc_high_threshold_enable
* Storage for kobject attribute elc_high_threshold_enable
* @elc_floor_freq_khz_kobj_attr: Storage for kobject attribute elc_floor_freq_khz
* @agent_types_kobj_attr: Storage for kobject attribute agent_type
* @die_id_kobj_attr: Attribute storage for die_id information
* @uncore_attrs: Attribute storage for group creation
*
* This structure is used to encapsulate all data related to uncore sysfs

View File

@@ -256,6 +256,10 @@ static const struct x86_cpu_id intel_uncore_cpu_ids[] = {
X86_MATCH_VFM(INTEL_ARROWLAKE, NULL),
X86_MATCH_VFM(INTEL_ARROWLAKE_H, NULL),
X86_MATCH_VFM(INTEL_LUNARLAKE_M, NULL),
X86_MATCH_VFM(INTEL_PANTHERLAKE_L, NULL),
X86_MATCH_VFM(INTEL_WILDCATLAKE_L, NULL),
X86_MATCH_VFM(INTEL_NOVALAKE, NULL),
X86_MATCH_VFM(INTEL_NOVALAKE_L, NULL),
{}
};
MODULE_DEVICE_TABLE(x86cpu, intel_uncore_cpu_ids);

View File

@@ -14,6 +14,7 @@
#include <linux/debugfs.h>
#include <linux/device.h>
#include <linux/device/driver.h>
#include <linux/dmi.h>
#include <linux/errno.h>
#include <linux/hwmon.h>
#include <linux/kernel.h>
@@ -28,7 +29,7 @@
#define DRIVER_NAME "msi-wmi-platform"
#define MSI_PLATFORM_GUID "ABBC0F6E-8EA1-11d1-00A0-C90629100000"
#define MSI_PLATFORM_GUID "ABBC0F6E-8EA1-11D1-00A0-C90629100000"
#define MSI_WMI_PLATFORM_INTERFACE_VERSION 2
@@ -448,7 +449,45 @@ static struct wmi_driver msi_wmi_platform_driver = {
.probe = msi_wmi_platform_probe,
.no_singleton = true,
};
module_wmi_driver(msi_wmi_platform_driver);
/*
* MSI reused the WMI GUID from the WMI-ACPI sample code provided by Microsoft,
* so other manufacturers might use it as well for their WMI-ACPI implementations.
*/
static const struct dmi_system_id msi_wmi_platform_whitelist[] __initconst = {
{
.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "MICRO-STAR INT"),
},
},
{
.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "Micro-Star International"),
},
},
{ }
};
static int __init msi_wmi_platform_module_init(void)
{
if (!dmi_check_system(msi_wmi_platform_whitelist)) {
if (!force)
return -ENODEV;
pr_warn("Ignoring DMI whitelist\n");
}
return wmi_driver_register(&msi_wmi_platform_driver);
}
static void __exit msi_wmi_platform_module_exit(void)
{
wmi_driver_unregister(&msi_wmi_platform_driver);
}
module_init(msi_wmi_platform_module_init);
module_exit(msi_wmi_platform_module_exit);
MODULE_AUTHOR("Armin Wolf <W_Armin@gmx.de>");
MODULE_DESCRIPTION("MSI WMI platform features");

View File

@@ -701,7 +701,6 @@ static void mpc_rcvd_sweep_req(struct mpcg_info *mpcginfo)
grp->sweep_req_pend_num--;
ctcmpc_send_sweep_resp(ch);
kfree(mpcginfo);
return;
}

View File

@@ -536,7 +536,8 @@ static inline int xfrm_af2proto(unsigned int family)
static inline const struct xfrm_mode *xfrm_ip2inner_mode(struct xfrm_state *x, int ipproto)
{
if ((ipproto == IPPROTO_IPIP && x->props.family == AF_INET) ||
if ((x->sel.family != AF_UNSPEC) ||
(ipproto == IPPROTO_IPIP && x->props.family == AF_INET) ||
(ipproto == IPPROTO_IPV6 && x->props.family == AF_INET6))
return &x->inner_mode;
else

View File

@@ -52,7 +52,7 @@ struct isst_if_cpu_map {
/**
* struct isst_if_cpu_maps - structure for CPU map IOCTL
* @cmd_count: Number of CPU mapping command in cpu_map[]
* @cpu_map[]: Holds one or more CPU map data structure
* @cpu_map: Holds one or more CPU map data structure
*
* This structure used with ioctl ISST_IF_GET_PHY_ID to send
* one or more CPU mapping commands. Here IOCTL return value indicates
@@ -82,8 +82,8 @@ struct isst_if_io_reg {
/**
* struct isst_if_io_regs - structure for IO register commands
* @cmd_count: Number of io reg commands in io_reg[]
* @io_reg[]: Holds one or more io_reg command structure
* @req_count: Number of io reg commands in io_reg[]
* @io_reg: Holds one or more io_reg command structure
*
* This structure used with ioctl ISST_IF_IO_CMD to send
* one or more read/write commands to PUNIT. Here IOCTL return value
@@ -120,7 +120,7 @@ struct isst_if_mbox_cmd {
/**
* struct isst_if_mbox_cmds - structure for mailbox commands
* @cmd_count: Number of mailbox commands in mbox_cmd[]
* @mbox_cmd[]: Holds one or more mbox commands
* @mbox_cmd: Holds one or more mbox commands
*
* This structure used with ioctl ISST_IF_MBOX_COMMAND to send
* one or more mailbox commands to PUNIT. Here IOCTL return value
@@ -152,7 +152,7 @@ struct isst_if_msr_cmd {
/**
* struct isst_if_msr_cmds - structure for msr commands
* @cmd_count: Number of mailbox commands in msr_cmd[]
* @msr_cmd[]: Holds one or more msr commands
* @msr_cmd: Holds one or more msr commands
*
* This structure used with ioctl ISST_IF_MSR_COMMAND to send
* one or more MSR commands. IOCTL return value indicates number of
@@ -167,8 +167,9 @@ struct isst_if_msr_cmds {
* struct isst_core_power - Structure to get/set core_power feature
* @get_set: 0: Get, 1: Set
* @socket_id: Socket/package id
* @power_domain: Power Domain id
* @power_domain_id: Power Domain id
* @enable: Feature enable status
* @supported: Power domain supports SST_CP interface
* @priority_type: Priority type for the feature (ordered/proportional)
*
* Structure to get/set core_power feature state using IOCTL
@@ -187,11 +188,11 @@ struct isst_core_power {
* struct isst_clos_param - Structure to get/set clos praram
* @get_set: 0: Get, 1: Set
* @socket_id: Socket/package id
* @power_domain: Power Domain id
* clos: Clos ID for the parameters
* min_freq_mhz: Minimum frequency in MHz
* max_freq_mhz: Maximum frequency in MHz
* prop_prio: Proportional priority from 0-15
* @power_domain_id: Power Domain id
* @clos: Clos ID for the parameters
* @min_freq_mhz: Minimum frequency in MHz
* @max_freq_mhz: Maximum frequency in MHz
* @prop_prio: Proportional priority from 0-15
*
* Structure to get/set per clos property using IOCTL
* ISST_IF_CLOS_PARAM.
@@ -209,7 +210,7 @@ struct isst_clos_param {
/**
* struct isst_if_clos_assoc - Structure to assign clos to a CPU
* @socket_id: Socket/package id
* @power_domain: Power Domain id
* @power_domain_id: Power Domain id
* @logical_cpu: CPU number
* @clos: Clos ID to assign to the logical CPU
*
@@ -228,6 +229,7 @@ struct isst_if_clos_assoc {
* @get_set: Request is for get or set
* @punit_cpu_map: Set to 1 if the CPU number is punit numbering not
* Linux CPU number
* @assoc_info: CLOS data for this CPU
*
* Structure used to get/set associate CPUs to clos using IOCTL
* ISST_IF_CLOS_ASSOC.
@@ -257,7 +259,7 @@ struct isst_tpmi_instance_count {
/**
* struct isst_perf_level_info - Structure to get information on SST-PP levels
* @socket_id: Socket/package id
* @power_domain: Power Domain id
* @power_domain_id: Power Domain id
* @logical_cpu: CPU number
* @clos: Clos ID to assign to the logical CPU
* @max_level: Maximum performance level supported by the platform
@@ -267,8 +269,8 @@ struct isst_tpmi_instance_count {
* @feature_state: SST-BF and SST-TF (enabled/disabled) status at current level
* @locked: SST-PP performance level change is locked/unlocked
* @enabled: SST-PP feature is enabled or not
* @sst-tf_support: SST-TF support status at this level
* @sst-bf_support: SST-BF support status at this level
* @sst_tf_support: SST-TF support status at this level
* @sst_bf_support: SST-BF support status at this level
*
* Structure to get SST-PP details using IOCTL ISST_IF_PERF_LEVELS.
*/
@@ -289,7 +291,7 @@ struct isst_perf_level_info {
/**
* struct isst_perf_level_control - Structure to set SST-PP level
* @socket_id: Socket/package id
* @power_domain: Power Domain id
* @power_domain_id: Power Domain id
* @level: level to set
*
* Structure used change SST-PP level using IOCTL ISST_IF_PERF_SET_LEVEL.
@@ -303,7 +305,7 @@ struct isst_perf_level_control {
/**
* struct isst_perf_feature_control - Structure to activate SST-BF/SST-TF
* @socket_id: Socket/package id
* @power_domain: Power Domain id
* @power_domain_id: Power Domain id
* @feature: bit 0 = SST-BF state, bit 1 = SST-TF state
*
* Structure used to enable SST-BF/SST-TF using IOCTL ISST_IF_PERF_SET_FEATURE.
@@ -320,7 +322,7 @@ struct isst_perf_feature_control {
/**
* struct isst_perf_level_data_info - Structure to get SST-PP level details
* @socket_id: Socket/package id
* @power_domain: Power Domain id
* @power_domain_id: Power Domain id
* @level: SST-PP level for which caller wants to get information
* @tdp_ratio: TDP Ratio
* @base_freq_mhz: Base frequency in MHz
@@ -341,8 +343,8 @@ struct isst_perf_feature_control {
* @pm_fabric_freq_mhz: Fabric (Uncore) minimum frequency
* @max_buckets: Maximum trl buckets
* @max_trl_levels: Maximum trl levels
* @bucket_core_counts[TRL_MAX_BUCKETS]: Number of cores per bucket
* @trl_freq_mhz[TRL_MAX_LEVELS][TRL_MAX_BUCKETS]: maximum frequency
* @bucket_core_counts: Number of cores per bucket
* @trl_freq_mhz: maximum frequency
* for a bucket and trl level
*
* Structure used to get information on frequencies and TDP for a SST-PP
@@ -402,7 +404,7 @@ struct isst_perf_level_fabric_info {
/**
* struct isst_perf_level_cpu_mask - Structure to get SST-PP level CPU mask
* @socket_id: Socket/package id
* @power_domain: Power Domain id
* @power_domain_id: Power Domain id
* @level: SST-PP level for which caller wants to get information
* @punit_cpu_map: Set to 1 if the CPU number is punit numbering not
* Linux CPU number. If 0 CPU buffer is copied to user space
@@ -430,7 +432,7 @@ struct isst_perf_level_cpu_mask {
/**
* struct isst_base_freq_info - Structure to get SST-BF frequencies
* @socket_id: Socket/package id
* @power_domain: Power Domain id
* @power_domain_id: Power Domain id
* @level: SST-PP level for which caller wants to get information
* @high_base_freq_mhz: High priority CPU base frequency
* @low_base_freq_mhz: Low priority CPU base frequency
@@ -453,9 +455,11 @@ struct isst_base_freq_info {
/**
* struct isst_turbo_freq_info - Structure to get SST-TF frequencies
* @socket_id: Socket/package id
* @power_domain: Power Domain id
* @power_domain_id: Power Domain id
* @level: SST-PP level for which caller wants to get information
* @max_clip_freqs: Maximum number of low priority core clipping frequencies
* @max_buckets: Maximum trl buckets
* @max_trl_levels: Maximum trl levels
* @lp_clip_freq_mhz: Clip frequencies per trl level
* @bucket_core_counts: Maximum number of cores for a bucket
* @trl_freq_mhz: Frequencies per trl level for each bucket

View File

@@ -4479,8 +4479,11 @@ static struct scx_sched *scx_alloc_and_add_sched(struct sched_ext_ops *ops)
goto err_free_gdsqs;
sch->helper = kthread_run_worker(0, "sched_ext_helper");
if (!sch->helper)
if (IS_ERR(sch->helper)) {
ret = PTR_ERR(sch->helper);
goto err_free_pcpu;
}
sched_set_fifo(sch->helper->task);
atomic_set(&sch->exit_kind, SCX_EXIT_NONE);

View File

@@ -68,10 +68,20 @@ static void check_element(mempool_t *pool, void *element)
} else if (pool->free == mempool_free_pages) {
/* Mempools backed by page allocator */
int order = (int)(long)pool->pool_data;
void *addr = kmap_local_page((struct page *)element);
__check_element(pool, addr, 1UL << (PAGE_SHIFT + order));
kunmap_local(addr);
#ifdef CONFIG_HIGHMEM
for (int i = 0; i < (1 << order); i++) {
struct page *page = (struct page *)element;
void *addr = kmap_local_page(page + i);
__check_element(pool, addr, PAGE_SIZE);
kunmap_local(addr);
}
#else
void *addr = page_address((struct page *)element);
__check_element(pool, addr, PAGE_SIZE << order);
#endif
}
}
@@ -97,10 +107,20 @@ static void poison_element(mempool_t *pool, void *element)
} else if (pool->alloc == mempool_alloc_pages) {
/* Mempools backed by page allocator */
int order = (int)(long)pool->pool_data;
void *addr = kmap_local_page((struct page *)element);
__poison_element(addr, 1UL << (PAGE_SHIFT + order));
kunmap_local(addr);
#ifdef CONFIG_HIGHMEM
for (int i = 0; i < (1 << order); i++) {
struct page *page = (struct page *)element;
void *addr = kmap_local_page(page + i);
__poison_element(addr, PAGE_SIZE);
kunmap_local(addr);
}
#else
void *addr = page_address((struct page *)element);
__poison_element(addr, PAGE_SIZE << order);
#endif
}
}
#else /* CONFIG_SLUB_DEBUG_ON */

View File

@@ -443,6 +443,9 @@ static int generic_hwtstamp_ioctl_lower(struct net_device *dev, int cmd,
struct ifreq ifrr;
int err;
if (!kernel_cfg->ifr)
return -EINVAL;
strscpy_pad(ifrr.ifr_name, dev->name, IFNAMSIZ);
ifrr.ifr_ifru = kernel_cfg->ifr->ifr_ifru;

View File

@@ -828,13 +828,15 @@ void devl_rate_nodes_destroy(struct devlink *devlink)
if (!devlink_rate->parent)
continue;
refcount_dec(&devlink_rate->parent->refcnt);
if (devlink_rate_is_leaf(devlink_rate))
ops->rate_leaf_parent_set(devlink_rate, NULL, devlink_rate->priv,
NULL, NULL);
else if (devlink_rate_is_node(devlink_rate))
ops->rate_node_parent_set(devlink_rate, NULL, devlink_rate->priv,
NULL, NULL);
refcount_dec(&devlink_rate->parent->refcnt);
devlink_rate->parent = NULL;
}
list_for_each_entry_safe(devlink_rate, tmp, &devlink->rate_list, list) {
if (devlink_rate_is_node(devlink_rate)) {

View File

@@ -122,8 +122,10 @@ static struct sk_buff *xfrm4_tunnel_gso_segment(struct xfrm_state *x,
struct sk_buff *skb,
netdev_features_t features)
{
__be16 type = x->inner_mode.family == AF_INET6 ? htons(ETH_P_IPV6)
: htons(ETH_P_IP);
const struct xfrm_mode *inner_mode = xfrm_ip2inner_mode(x,
XFRM_MODE_SKB_CB(skb)->protocol);
__be16 type = inner_mode->family == AF_INET6 ? htons(ETH_P_IPV6)
: htons(ETH_P_IP);
return skb_eth_gso_segment(skb, features, type);
}

View File

@@ -158,8 +158,10 @@ static struct sk_buff *xfrm6_tunnel_gso_segment(struct xfrm_state *x,
struct sk_buff *skb,
netdev_features_t features)
{
__be16 type = x->inner_mode.family == AF_INET ? htons(ETH_P_IP)
: htons(ETH_P_IPV6);
const struct xfrm_mode *inner_mode = xfrm_ip2inner_mode(x,
XFRM_MODE_SKB_CB(skb)->protocol);
__be16 type = inner_mode->family == AF_INET ? htons(ETH_P_IP)
: htons(ETH_P_IPV6);
return skb_eth_gso_segment(skb, features, type);
}

View File

@@ -1246,9 +1246,9 @@ static int l2tp_xmit_core(struct l2tp_session *session, struct sk_buff *skb, uns
else
l2tp_build_l2tpv3_header(session, __skb_push(skb, session->hdr_len));
/* Reset skb netfilter state */
memset(&(IPCB(skb)->opt), 0, sizeof(IPCB(skb)->opt));
IPCB(skb)->flags &= ~(IPSKB_XFRM_TUNNEL_SIZE | IPSKB_XFRM_TRANSFORMED | IPSKB_REROUTED);
/* Reset control buffer */
memset(skb->cb, 0, sizeof(skb->cb));
nf_reset_ct(skb);
/* L2TP uses its own lockdep subclass to avoid lockdep splats caused by

View File

@@ -838,8 +838,11 @@ bool mptcp_established_options(struct sock *sk, struct sk_buff *skb,
opts->suboptions = 0;
/* Force later mptcp_write_options(), but do not use any actual
* option space.
*/
if (unlikely(__mptcp_check_fallback(msk) && !mptcp_check_infinite_map(skb)))
return false;
return true;
if (unlikely(skb && TCP_SKB_CB(skb)->tcp_flags & TCPHDR_RST)) {
if (mptcp_established_options_fastclose(sk, &opt_size, remaining, opts) ||
@@ -1041,6 +1044,31 @@ static void __mptcp_snd_una_update(struct mptcp_sock *msk, u64 new_snd_una)
WRITE_ONCE(msk->snd_una, new_snd_una);
}
static void rwin_update(struct mptcp_sock *msk, struct sock *ssk,
struct sk_buff *skb)
{
struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
struct tcp_sock *tp = tcp_sk(ssk);
u64 mptcp_rcv_wnd;
/* Avoid touching extra cachelines if TCP is going to accept this
* skb without filling the TCP-level window even with a possibly
* outdated mptcp-level rwin.
*/
if (!skb->len || skb->len < tcp_receive_window(tp))
return;
mptcp_rcv_wnd = atomic64_read(&msk->rcv_wnd_sent);
if (!after64(mptcp_rcv_wnd, subflow->rcv_wnd_sent))
return;
/* Some other subflow grew the mptcp-level rwin since rcv_wup,
* resync.
*/
tp->rcv_wnd += mptcp_rcv_wnd - subflow->rcv_wnd_sent;
subflow->rcv_wnd_sent = mptcp_rcv_wnd;
}
static void ack_update_msk(struct mptcp_sock *msk,
struct sock *ssk,
struct mptcp_options_received *mp_opt)
@@ -1208,6 +1236,7 @@ bool mptcp_incoming_options(struct sock *sk, struct sk_buff *skb)
*/
if (mp_opt.use_ack)
ack_update_msk(msk, sk, &mp_opt);
rwin_update(msk, sk, skb);
/* Zero-data-length packets are dropped by the caller and not
* propagated to the MPTCP layer, so the skb extension does not
@@ -1294,6 +1323,10 @@ static void mptcp_set_rwin(struct tcp_sock *tp, struct tcphdr *th)
if (rcv_wnd_new != rcv_wnd_old) {
raise_win:
/* The msk-level rcv wnd is after the tcp level one,
* sync the latter.
*/
rcv_wnd_new = rcv_wnd_old;
win = rcv_wnd_old - ack_seq;
tp->rcv_wnd = min_t(u64, win, U32_MAX);
new_win = tp->rcv_wnd;
@@ -1317,6 +1350,21 @@ raise_win:
update_wspace:
WRITE_ONCE(msk->old_wspace, tp->rcv_wnd);
subflow->rcv_wnd_sent = rcv_wnd_new;
}
static void mptcp_track_rwin(struct tcp_sock *tp)
{
const struct sock *ssk = (const struct sock *)tp;
struct mptcp_subflow_context *subflow;
struct mptcp_sock *msk;
if (!ssk)
return;
subflow = mptcp_subflow_ctx(ssk);
msk = mptcp_sk(subflow->conn);
WRITE_ONCE(msk->old_wspace, tp->rcv_wnd);
}
__sum16 __mptcp_make_csum(u64 data_seq, u32 subflow_seq, u16 data_len, __wsum sum)
@@ -1611,6 +1659,10 @@ mp_rst:
opts->reset_transient,
opts->reset_reason);
return;
} else if (unlikely(!opts->suboptions)) {
/* Fallback to TCP */
mptcp_track_rwin(tp);
return;
}
if (OPTION_MPTCP_PRIO & opts->suboptions) {

View File

@@ -18,6 +18,7 @@ struct mptcp_pm_add_entry {
u8 retrans_times;
struct timer_list add_timer;
struct mptcp_sock *sock;
struct rcu_head rcu;
};
static DEFINE_SPINLOCK(mptcp_pm_list_lock);
@@ -155,7 +156,7 @@ bool mptcp_remove_anno_list_by_saddr(struct mptcp_sock *msk,
entry = mptcp_pm_del_add_timer(msk, addr, false);
ret = entry;
kfree(entry);
kfree_rcu(entry, rcu);
return ret;
}
@@ -345,22 +346,27 @@ mptcp_pm_del_add_timer(struct mptcp_sock *msk,
{
struct mptcp_pm_add_entry *entry;
struct sock *sk = (struct sock *)msk;
struct timer_list *add_timer = NULL;
bool stop_timer = false;
rcu_read_lock();
spin_lock_bh(&msk->pm.lock);
entry = mptcp_lookup_anno_list_by_saddr(msk, addr);
if (entry && (!check_id || entry->addr.id == addr->id)) {
entry->retrans_times = ADD_ADDR_RETRANS_MAX;
add_timer = &entry->add_timer;
stop_timer = true;
}
if (!check_id && entry)
list_del(&entry->list);
spin_unlock_bh(&msk->pm.lock);
/* no lock, because sk_stop_timer_sync() is calling timer_delete_sync() */
if (add_timer)
sk_stop_timer_sync(sk, add_timer);
/* Note: entry might have been removed by another thread.
* We hold rcu_read_lock() to ensure it is not freed under us.
*/
if (stop_timer)
sk_stop_timer_sync(sk, &entry->add_timer);
rcu_read_unlock();
return entry;
}
@@ -415,7 +421,7 @@ static void mptcp_pm_free_anno_list(struct mptcp_sock *msk)
list_for_each_entry_safe(entry, tmp, &free_list, list) {
sk_stop_timer_sync(sk, &entry->add_timer);
kfree(entry);
kfree_rcu(entry, rcu);
}
}

View File

@@ -672,7 +672,7 @@ static void mptcp_pm_nl_add_addr_received(struct mptcp_sock *msk)
void mptcp_pm_nl_rm_addr(struct mptcp_sock *msk, u8 rm_id)
{
if (rm_id && WARN_ON_ONCE(msk->pm.add_addr_accepted == 0)) {
if (rm_id && !WARN_ON_ONCE(msk->pm.add_addr_accepted == 0)) {
u8 limit_add_addr_accepted =
mptcp_pm_get_limit_add_addr_accepted(msk);

View File

@@ -78,6 +78,13 @@ bool __mptcp_try_fallback(struct mptcp_sock *msk, int fb_mib)
if (__mptcp_check_fallback(msk))
return true;
/* The caller possibly is not holding the msk socket lock, but
* in the fallback case only the current subflow is touching
* the OoO queue.
*/
if (!RB_EMPTY_ROOT(&msk->out_of_order_queue))
return false;
spin_lock_bh(&msk->fallback_lock);
if (!msk->allow_infinite_fallback) {
spin_unlock_bh(&msk->fallback_lock);
@@ -937,14 +944,19 @@ static void mptcp_reset_rtx_timer(struct sock *sk)
bool mptcp_schedule_work(struct sock *sk)
{
if (inet_sk_state_load(sk) != TCP_CLOSE &&
schedule_work(&mptcp_sk(sk)->work)) {
/* each subflow already holds a reference to the sk, and the
* workqueue is invoked by a subflow, so sk can't go away here.
*/
sock_hold(sk);
if (inet_sk_state_load(sk) == TCP_CLOSE)
return false;
/* Get a reference on this socket, mptcp_worker() will release it.
* As mptcp_worker() might complete before us, we can not avoid
* a sock_hold()/sock_put() if schedule_work() returns false.
*/
sock_hold(sk);
if (schedule_work(&mptcp_sk(sk)->work))
return true;
}
sock_put(sk);
return false;
}
@@ -2399,7 +2411,6 @@ bool __mptcp_retransmit_pending_data(struct sock *sk)
/* flags for __mptcp_close_ssk() */
#define MPTCP_CF_PUSH BIT(1)
#define MPTCP_CF_FASTCLOSE BIT(2)
/* be sure to send a reset only if the caller asked for it, also
* clean completely the subflow status when the subflow reaches
@@ -2410,7 +2421,7 @@ static void __mptcp_subflow_disconnect(struct sock *ssk,
unsigned int flags)
{
if (((1 << ssk->sk_state) & (TCPF_CLOSE | TCPF_LISTEN)) ||
(flags & MPTCP_CF_FASTCLOSE)) {
subflow->send_fastclose) {
/* The MPTCP code never wait on the subflow sockets, TCP-level
* disconnect should never fail
*/
@@ -2457,14 +2468,8 @@ static void __mptcp_close_ssk(struct sock *sk, struct sock *ssk,
lock_sock_nested(ssk, SINGLE_DEPTH_NESTING);
if ((flags & MPTCP_CF_FASTCLOSE) && !__mptcp_check_fallback(msk)) {
/* be sure to force the tcp_close path
* to generate the egress reset
*/
ssk->sk_lingertime = 0;
sock_set_flag(ssk, SOCK_LINGER);
subflow->send_fastclose = 1;
}
if (subflow->send_fastclose && ssk->sk_state != TCP_CLOSE)
tcp_set_state(ssk, TCP_CLOSE);
need_push = (flags & MPTCP_CF_PUSH) && __mptcp_retransmit_pending_data(sk);
if (!dispose_it) {
@@ -2560,7 +2565,8 @@ static void __mptcp_close_subflow(struct sock *sk)
if (ssk_state != TCP_CLOSE &&
(ssk_state != TCP_CLOSE_WAIT ||
inet_sk_state_load(sk) != TCP_ESTABLISHED))
inet_sk_state_load(sk) != TCP_ESTABLISHED ||
__mptcp_check_fallback(msk)))
continue;
/* 'subflow_data_ready' will re-sched once rx queue is empty */
@@ -2768,9 +2774,26 @@ static void mptcp_do_fastclose(struct sock *sk)
struct mptcp_sock *msk = mptcp_sk(sk);
mptcp_set_state(sk, TCP_CLOSE);
mptcp_for_each_subflow_safe(msk, subflow, tmp)
__mptcp_close_ssk(sk, mptcp_subflow_tcp_sock(subflow),
subflow, MPTCP_CF_FASTCLOSE);
/* Explicitly send the fastclose reset as need */
if (__mptcp_check_fallback(msk))
return;
mptcp_for_each_subflow_safe(msk, subflow, tmp) {
struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
lock_sock(ssk);
/* Some subflow socket states don't allow/need a reset.*/
if ((1 << ssk->sk_state) & (TCPF_LISTEN | TCPF_CLOSE))
goto unlock;
subflow->send_fastclose = 1;
tcp_send_active_reset(ssk, ssk->sk_allocation,
SK_RST_REASON_TCP_ABORT_ON_CLOSE);
unlock:
release_sock(ssk);
}
}
static void mptcp_worker(struct work_struct *work)
@@ -2797,7 +2820,11 @@ static void mptcp_worker(struct work_struct *work)
__mptcp_close_subflow(sk);
if (mptcp_close_tout_expired(sk)) {
struct mptcp_subflow_context *subflow, *tmp;
mptcp_do_fastclose(sk);
mptcp_for_each_subflow_safe(msk, subflow, tmp)
__mptcp_close_ssk(sk, subflow->tcp_sock, subflow, 0);
mptcp_close_wake_up(sk);
}
@@ -3222,7 +3249,8 @@ static int mptcp_disconnect(struct sock *sk, int flags)
/* msk->subflow is still intact, the following will not free the first
* subflow
*/
mptcp_destroy_common(msk, MPTCP_CF_FASTCLOSE);
mptcp_do_fastclose(sk);
mptcp_destroy_common(msk);
/* The first subflow is already in TCP_CLOSE status, the following
* can't overlap with a fallback anymore
@@ -3401,7 +3429,7 @@ void mptcp_rcv_space_init(struct mptcp_sock *msk, const struct sock *ssk)
msk->rcvq_space.space = TCP_INIT_CWND * TCP_MSS_DEFAULT;
}
void mptcp_destroy_common(struct mptcp_sock *msk, unsigned int flags)
void mptcp_destroy_common(struct mptcp_sock *msk)
{
struct mptcp_subflow_context *subflow, *tmp;
struct sock *sk = (struct sock *)msk;
@@ -3410,7 +3438,7 @@ void mptcp_destroy_common(struct mptcp_sock *msk, unsigned int flags)
/* join list will be eventually flushed (with rst) at sock lock release time */
mptcp_for_each_subflow_safe(msk, subflow, tmp)
__mptcp_close_ssk(sk, mptcp_subflow_tcp_sock(subflow), subflow, flags);
__mptcp_close_ssk(sk, mptcp_subflow_tcp_sock(subflow), subflow, 0);
__skb_queue_purge(&sk->sk_receive_queue);
skb_rbtree_purge(&msk->out_of_order_queue);
@@ -3428,7 +3456,7 @@ static void mptcp_destroy(struct sock *sk)
/* allow the following to close even the initial subflow */
msk->free_first = 1;
mptcp_destroy_common(msk, 0);
mptcp_destroy_common(msk);
sk_sockets_allocated_dec(sk);
}

View File

@@ -509,6 +509,7 @@ struct mptcp_subflow_context {
u64 remote_key;
u64 idsn;
u64 map_seq;
u64 rcv_wnd_sent;
u32 snd_isn;
u32 token;
u32 rel_write_seq;
@@ -976,7 +977,7 @@ static inline void mptcp_propagate_sndbuf(struct sock *sk, struct sock *ssk)
local_bh_enable();
}
void mptcp_destroy_common(struct mptcp_sock *msk, unsigned int flags);
void mptcp_destroy_common(struct mptcp_sock *msk);
#define MPTCP_TOKEN_MAX_RETRIES 4

View File

@@ -572,69 +572,6 @@ static int set_ipv6(struct sk_buff *skb, struct sw_flow_key *flow_key,
return 0;
}
static int set_nsh(struct sk_buff *skb, struct sw_flow_key *flow_key,
const struct nlattr *a)
{
struct nshhdr *nh;
size_t length;
int err;
u8 flags;
u8 ttl;
int i;
struct ovs_key_nsh key;
struct ovs_key_nsh mask;
err = nsh_key_from_nlattr(a, &key, &mask);
if (err)
return err;
/* Make sure the NSH base header is there */
if (!pskb_may_pull(skb, skb_network_offset(skb) + NSH_BASE_HDR_LEN))
return -ENOMEM;
nh = nsh_hdr(skb);
length = nsh_hdr_len(nh);
/* Make sure the whole NSH header is there */
err = skb_ensure_writable(skb, skb_network_offset(skb) +
length);
if (unlikely(err))
return err;
nh = nsh_hdr(skb);
skb_postpull_rcsum(skb, nh, length);
flags = nsh_get_flags(nh);
flags = OVS_MASKED(flags, key.base.flags, mask.base.flags);
flow_key->nsh.base.flags = flags;
ttl = nsh_get_ttl(nh);
ttl = OVS_MASKED(ttl, key.base.ttl, mask.base.ttl);
flow_key->nsh.base.ttl = ttl;
nsh_set_flags_and_ttl(nh, flags, ttl);
nh->path_hdr = OVS_MASKED(nh->path_hdr, key.base.path_hdr,
mask.base.path_hdr);
flow_key->nsh.base.path_hdr = nh->path_hdr;
switch (nh->mdtype) {
case NSH_M_TYPE1:
for (i = 0; i < NSH_MD1_CONTEXT_SIZE; i++) {
nh->md1.context[i] =
OVS_MASKED(nh->md1.context[i], key.context[i],
mask.context[i]);
}
memcpy(flow_key->nsh.context, nh->md1.context,
sizeof(nh->md1.context));
break;
case NSH_M_TYPE2:
memset(flow_key->nsh.context, 0,
sizeof(flow_key->nsh.context));
break;
default:
return -EINVAL;
}
skb_postpush_rcsum(skb, nh, length);
return 0;
}
/* Must follow skb_ensure_writable() since that can move the skb data. */
static void set_tp_port(struct sk_buff *skb, __be16 *port,
__be16 new_port, __sum16 *check)
@@ -1130,10 +1067,6 @@ static int execute_masked_set_action(struct sk_buff *skb,
get_mask(a, struct ovs_key_ethernet *));
break;
case OVS_KEY_ATTR_NSH:
err = set_nsh(skb, flow_key, a);
break;
case OVS_KEY_ATTR_IPV4:
err = set_ipv4(skb, flow_key, nla_data(a),
get_mask(a, struct ovs_key_ipv4 *));
@@ -1170,6 +1103,7 @@ static int execute_masked_set_action(struct sk_buff *skb,
case OVS_KEY_ATTR_CT_LABELS:
case OVS_KEY_ATTR_CT_ORIG_TUPLE_IPV4:
case OVS_KEY_ATTR_CT_ORIG_TUPLE_IPV6:
case OVS_KEY_ATTR_NSH:
err = -EINVAL;
break;
}

View File

@@ -1305,6 +1305,11 @@ static int metadata_from_nlattrs(struct net *net, struct sw_flow_match *match,
return 0;
}
/*
* Constructs NSH header 'nh' from attributes of OVS_ACTION_ATTR_PUSH_NSH,
* where 'nh' points to a memory block of 'size' bytes. It's assumed that
* attributes were previously validated with validate_push_nsh().
*/
int nsh_hdr_from_nlattr(const struct nlattr *attr,
struct nshhdr *nh, size_t size)
{
@@ -1314,8 +1319,6 @@ int nsh_hdr_from_nlattr(const struct nlattr *attr,
u8 ttl = 0;
int mdlen = 0;
/* validate_nsh has check this, so we needn't do duplicate check here
*/
if (size < NSH_BASE_HDR_LEN)
return -ENOBUFS;
@@ -1359,46 +1362,6 @@ int nsh_hdr_from_nlattr(const struct nlattr *attr,
return 0;
}
int nsh_key_from_nlattr(const struct nlattr *attr,
struct ovs_key_nsh *nsh, struct ovs_key_nsh *nsh_mask)
{
struct nlattr *a;
int rem;
/* validate_nsh has check this, so we needn't do duplicate check here
*/
nla_for_each_nested(a, attr, rem) {
int type = nla_type(a);
switch (type) {
case OVS_NSH_KEY_ATTR_BASE: {
const struct ovs_nsh_key_base *base = nla_data(a);
const struct ovs_nsh_key_base *base_mask = base + 1;
nsh->base = *base;
nsh_mask->base = *base_mask;
break;
}
case OVS_NSH_KEY_ATTR_MD1: {
const struct ovs_nsh_key_md1 *md1 = nla_data(a);
const struct ovs_nsh_key_md1 *md1_mask = md1 + 1;
memcpy(nsh->context, md1->context, sizeof(*md1));
memcpy(nsh_mask->context, md1_mask->context,
sizeof(*md1_mask));
break;
}
case OVS_NSH_KEY_ATTR_MD2:
/* Not supported yet */
return -ENOTSUPP;
default:
return -EINVAL;
}
}
return 0;
}
static int nsh_key_put_from_nlattr(const struct nlattr *attr,
struct sw_flow_match *match, bool is_mask,
bool is_push_nsh, bool log)
@@ -2839,17 +2802,13 @@ static int validate_and_copy_set_tun(const struct nlattr *attr,
return err;
}
static bool validate_nsh(const struct nlattr *attr, bool is_mask,
bool is_push_nsh, bool log)
static bool validate_push_nsh(const struct nlattr *attr, bool log)
{
struct sw_flow_match match;
struct sw_flow_key key;
int ret = 0;
ovs_match_init(&match, &key, true, NULL);
ret = nsh_key_put_from_nlattr(attr, &match, is_mask,
is_push_nsh, log);
return !ret;
return !nsh_key_put_from_nlattr(attr, &match, false, true, log);
}
/* Return false if there are any non-masked bits set.
@@ -2997,13 +2956,6 @@ static int validate_set(const struct nlattr *a,
break;
case OVS_KEY_ATTR_NSH:
if (eth_type != htons(ETH_P_NSH))
return -EINVAL;
if (!validate_nsh(nla_data(a), masked, false, log))
return -EINVAL;
break;
default:
return -EINVAL;
}
@@ -3437,7 +3389,7 @@ static int __ovs_nla_copy_actions(struct net *net, const struct nlattr *attr,
return -EINVAL;
}
mac_proto = MAC_PROTO_NONE;
if (!validate_nsh(nla_data(a), false, true, true))
if (!validate_push_nsh(nla_data(a), log))
return -EINVAL;
break;

View File

@@ -65,8 +65,6 @@ int ovs_nla_put_actions(const struct nlattr *attr,
void ovs_nla_free_flow_actions(struct sw_flow_actions *);
void ovs_nla_free_flow_actions_rcu(struct sw_flow_actions *);
int nsh_key_from_nlattr(const struct nlattr *attr, struct ovs_key_nsh *nsh,
struct ovs_key_nsh *nsh_mask);
int nsh_hdr_from_nlattr(const struct nlattr *attr, struct nshhdr *nh,
size_t size);

View File

@@ -2954,6 +2954,7 @@ static int unix_stream_read_generic(struct unix_stream_read_state *state,
u = unix_sk(sk);
redo:
/* Lock the socket to prevent queue disordering
* while sleeps in memcpy_tomsg
*/
@@ -2965,7 +2966,6 @@ static int unix_stream_read_generic(struct unix_stream_read_state *state,
struct sk_buff *skb, *last;
int chunk;
redo:
unix_state_lock(sk);
if (sock_flag(sk, SOCK_DEAD)) {
err = -ECONNRESET;
@@ -3015,7 +3015,6 @@ again:
goto out;
}
mutex_lock(&u->iolock);
goto redo;
unlock:
unix_state_unlock(sk);

View File

@@ -1661,18 +1661,40 @@ static int vsock_connect(struct socket *sock, struct sockaddr *addr,
timeout = schedule_timeout(timeout);
lock_sock(sk);
if (signal_pending(current)) {
err = sock_intr_errno(timeout);
sk->sk_state = sk->sk_state == TCP_ESTABLISHED ? TCP_CLOSING : TCP_CLOSE;
sock->state = SS_UNCONNECTED;
vsock_transport_cancel_pkt(vsk);
vsock_remove_connected(vsk);
goto out_wait;
} else if ((sk->sk_state != TCP_ESTABLISHED) && (timeout == 0)) {
err = -ETIMEDOUT;
/* Connection established. Whatever happens to socket once we
* release it, that's not connect()'s concern. No need to go
* into signal and timeout handling. Call it a day.
*
* Note that allowing to "reset" an already established socket
* here is racy and insecure.
*/
if (sk->sk_state == TCP_ESTABLISHED)
break;
/* If connection was _not_ established and a signal/timeout came
* to be, we want the socket's state reset. User space may want
* to retry.
*
* sk_state != TCP_ESTABLISHED implies that socket is not on
* vsock_connected_table. We keep the binding and the transport
* assigned.
*/
if (signal_pending(current) || timeout == 0) {
err = timeout == 0 ? -ETIMEDOUT : sock_intr_errno(timeout);
/* Listener might have already responded with
* VIRTIO_VSOCK_OP_RESPONSE. Its handling expects our
* sk_state == TCP_SYN_SENT, which hereby we break.
* In such case VIRTIO_VSOCK_OP_RST will follow.
*/
sk->sk_state = TCP_CLOSE;
sock->state = SS_UNCONNECTED;
/* Try to cancel VIRTIO_VSOCK_OP_REQUEST skb sent out by
* transport->connect().
*/
vsock_transport_cancel_pkt(vsk);
goto out_wait;
}

View File

@@ -438,7 +438,7 @@ ok:
check_tunnel_size = x->xso.type == XFRM_DEV_OFFLOAD_PACKET &&
x->props.mode == XFRM_MODE_TUNNEL;
switch (x->inner_mode.family) {
switch (skb_dst(skb)->ops->family) {
case AF_INET:
/* Check for IPv4 options */
if (ip_hdr(skb)->ihl != 5)

View File

@@ -698,7 +698,7 @@ static void xfrm_get_inner_ipproto(struct sk_buff *skb, struct xfrm_state *x)
return;
if (x->outer_mode.encap == XFRM_MODE_TUNNEL) {
switch (x->outer_mode.family) {
switch (skb_dst(skb)->ops->family) {
case AF_INET:
xo->inner_ipproto = ip_hdr(skb)->protocol;
break;
@@ -772,8 +772,12 @@ int xfrm_output(struct sock *sk, struct sk_buff *skb)
/* Exclusive direct xmit for tunnel mode, as
* some filtering or matching rules may apply
* in transport mode.
* Locally generated packets also require
* the normal XFRM path for L2 header setup,
* as the hardware needs the L2 header to match
* for encryption, so skip direct output as well.
*/
if (x->props.mode == XFRM_MODE_TUNNEL)
if (x->props.mode == XFRM_MODE_TUNNEL && !skb->sk)
return xfrm_dev_direct_output(sk, x, skb);
return xfrm_output_resume(sk, skb, 0);

View File

@@ -592,6 +592,7 @@ void xfrm_state_free(struct xfrm_state *x)
}
EXPORT_SYMBOL(xfrm_state_free);
static void xfrm_state_delete_tunnel(struct xfrm_state *x);
static void xfrm_state_gc_destroy(struct xfrm_state *x)
{
if (x->mode_cbs && x->mode_cbs->destroy_state)
@@ -607,6 +608,7 @@ static void xfrm_state_gc_destroy(struct xfrm_state *x)
kfree(x->replay_esn);
kfree(x->preplay_esn);
xfrm_unset_type_offload(x);
xfrm_state_delete_tunnel(x);
if (x->type) {
x->type->destructor(x);
xfrm_put_type(x->type);
@@ -806,7 +808,6 @@ void __xfrm_state_destroy(struct xfrm_state *x)
}
EXPORT_SYMBOL(__xfrm_state_destroy);
static void xfrm_state_delete_tunnel(struct xfrm_state *x);
int __xfrm_state_delete(struct xfrm_state *x)
{
struct net *net = xs_net(x);
@@ -2073,6 +2074,7 @@ static struct xfrm_state *xfrm_state_clone_and_setup(struct xfrm_state *orig,
return x;
error:
x->km.state = XFRM_STATE_DEAD;
xfrm_state_put(x);
out:
return NULL;
@@ -2157,11 +2159,15 @@ struct xfrm_state *xfrm_state_migrate(struct xfrm_state *x,
xfrm_state_insert(xc);
} else {
if (xfrm_state_add(xc) < 0)
goto error;
goto error_add;
}
return xc;
error_add:
if (xuo)
xfrm_dev_state_delete(xc);
error:
xc->km.state = XFRM_STATE_DEAD;
xfrm_state_put(xc);
return NULL;
}
@@ -2191,14 +2197,18 @@ int xfrm_state_update(struct xfrm_state *x)
}
if (x1->km.state == XFRM_STATE_ACQ) {
if (x->dir && x1->dir != x->dir)
if (x->dir && x1->dir != x->dir) {
to_put = x1;
goto out;
}
__xfrm_state_insert(x);
x = NULL;
} else {
if (x1->dir != x->dir)
if (x1->dir != x->dir) {
to_put = x1;
goto out;
}
}
err = 0;
@@ -3298,6 +3308,7 @@ out_bydst:
void xfrm_state_fini(struct net *net)
{
unsigned int sz;
int i;
flush_work(&net->xfrm.state_hash_work);
xfrm_state_flush(net, 0, false);
@@ -3305,14 +3316,17 @@ void xfrm_state_fini(struct net *net)
WARN_ON(!list_empty(&net->xfrm.state_all));
for (i = 0; i <= net->xfrm.state_hmask; i++) {
WARN_ON(!hlist_empty(net->xfrm.state_byseq + i));
WARN_ON(!hlist_empty(net->xfrm.state_byspi + i));
WARN_ON(!hlist_empty(net->xfrm.state_bysrc + i));
WARN_ON(!hlist_empty(net->xfrm.state_bydst + i));
}
sz = (net->xfrm.state_hmask + 1) * sizeof(struct hlist_head);
WARN_ON(!hlist_empty(net->xfrm.state_byseq));
xfrm_hash_free(net->xfrm.state_byseq, sz);
WARN_ON(!hlist_empty(net->xfrm.state_byspi));
xfrm_hash_free(net->xfrm.state_byspi, sz);
WARN_ON(!hlist_empty(net->xfrm.state_bysrc));
xfrm_hash_free(net->xfrm.state_bysrc, sz);
WARN_ON(!hlist_empty(net->xfrm.state_bydst));
xfrm_hash_free(net->xfrm.state_bydst, sz);
free_percpu(net->xfrm.state_cache_input);
}

View File

@@ -947,8 +947,11 @@ static struct xfrm_state *xfrm_state_construct(struct net *net,
if (attrs[XFRMA_SA_PCPU]) {
x->pcpu_num = nla_get_u32(attrs[XFRMA_SA_PCPU]);
if (x->pcpu_num >= num_possible_cpus())
if (x->pcpu_num >= num_possible_cpus()) {
err = -ERANGE;
NL_SET_ERR_MSG(extack, "pCPU number too big");
goto error;
}
}
err = __xfrm_init_state(x, extack);
@@ -3035,6 +3038,9 @@ static int xfrm_add_acquire(struct sk_buff *skb, struct nlmsghdr *nlh,
}
xfrm_state_free(x);
xfrm_dev_policy_delete(xp);
xfrm_dev_policy_free(xp);
security_xfrm_policy_free(xp->security);
kfree(xp);
return 0;

View File

@@ -45,6 +45,7 @@ skf_net_off
socket
so_incoming_cpu
so_netns_cookie
so_peek_off
so_txtime
so_rcv_listener
stress_reuseport_listen

View File

@@ -6,6 +6,7 @@ TEST_GEN_PROGS := \
scm_inq \
scm_pidfd \
scm_rights \
so_peek_off \
unix_connect \
# end of TEST_GEN_PROGS

View File

@@ -0,0 +1,162 @@
// SPDX-License-Identifier: GPL-2.0
/* Copyright 2025 Google LLC */
#include <stdlib.h>
#include <unistd.h>
#include <sys/socket.h>
#include "../../kselftest_harness.h"
FIXTURE(so_peek_off)
{
int fd[2]; /* 0: sender, 1: receiver */
};
FIXTURE_VARIANT(so_peek_off)
{
int type;
};
FIXTURE_VARIANT_ADD(so_peek_off, stream)
{
.type = SOCK_STREAM,
};
FIXTURE_VARIANT_ADD(so_peek_off, dgram)
{
.type = SOCK_DGRAM,
};
FIXTURE_VARIANT_ADD(so_peek_off, seqpacket)
{
.type = SOCK_SEQPACKET,
};
FIXTURE_SETUP(so_peek_off)
{
struct timeval timeout = {
.tv_sec = 0,
.tv_usec = 3000,
};
int ret;
ret = socketpair(AF_UNIX, variant->type, 0, self->fd);
ASSERT_EQ(0, ret);
ret = setsockopt(self->fd[1], SOL_SOCKET, SO_RCVTIMEO_NEW,
&timeout, sizeof(timeout));
ASSERT_EQ(0, ret);
ret = setsockopt(self->fd[1], SOL_SOCKET, SO_PEEK_OFF,
&(int){0}, sizeof(int));
ASSERT_EQ(0, ret);
}
FIXTURE_TEARDOWN(so_peek_off)
{
close_range(self->fd[0], self->fd[1], 0);
}
#define sendeq(fd, str, flags) \
do { \
int bytes, len = strlen(str); \
\
bytes = send(fd, str, len, flags); \
ASSERT_EQ(len, bytes); \
} while (0)
#define recveq(fd, str, buflen, flags) \
do { \
char buf[(buflen) + 1] = {}; \
int bytes; \
\
bytes = recv(fd, buf, buflen, flags); \
ASSERT_NE(-1, bytes); \
ASSERT_STREQ(str, buf); \
} while (0)
#define async \
for (pid_t pid = (pid = fork(), \
pid < 0 ? \
__TH_LOG("Failed to start async {}"), \
_metadata->exit_code = KSFT_FAIL, \
__bail(1, _metadata), \
0xdead : \
pid); \
!pid; exit(0))
TEST_F(so_peek_off, single_chunk)
{
sendeq(self->fd[0], "aaaabbbb", 0);
recveq(self->fd[1], "aaaa", 4, MSG_PEEK);
recveq(self->fd[1], "bbbb", 100, MSG_PEEK);
}
TEST_F(so_peek_off, two_chunks)
{
sendeq(self->fd[0], "aaaa", 0);
sendeq(self->fd[0], "bbbb", 0);
recveq(self->fd[1], "aaaa", 4, MSG_PEEK);
recveq(self->fd[1], "bbbb", 100, MSG_PEEK);
}
TEST_F(so_peek_off, two_chunks_blocking)
{
async {
usleep(1000);
sendeq(self->fd[0], "aaaa", 0);
}
recveq(self->fd[1], "aaaa", 4, MSG_PEEK);
async {
usleep(1000);
sendeq(self->fd[0], "bbbb", 0);
}
/* goto again; -> goto redo; in unix_stream_read_generic(). */
recveq(self->fd[1], "bbbb", 100, MSG_PEEK);
}
TEST_F(so_peek_off, two_chunks_overlap)
{
sendeq(self->fd[0], "aaaa", 0);
recveq(self->fd[1], "aa", 2, MSG_PEEK);
sendeq(self->fd[0], "bbbb", 0);
if (variant->type == SOCK_STREAM) {
/* SOCK_STREAM tries to fill the buffer. */
recveq(self->fd[1], "aabb", 4, MSG_PEEK);
recveq(self->fd[1], "bb", 100, MSG_PEEK);
} else {
/* SOCK_DGRAM and SOCK_SEQPACKET returns at the skb boundary. */
recveq(self->fd[1], "aa", 100, MSG_PEEK);
recveq(self->fd[1], "bbbb", 100, MSG_PEEK);
}
}
TEST_F(so_peek_off, two_chunks_overlap_blocking)
{
async {
usleep(1000);
sendeq(self->fd[0], "aaaa", 0);
}
recveq(self->fd[1], "aa", 2, MSG_PEEK);
async {
usleep(1000);
sendeq(self->fd[0], "bbbb", 0);
}
/* Even SOCK_STREAM does not wait if at least one byte is read. */
recveq(self->fd[1], "aa", 100, MSG_PEEK);
recveq(self->fd[1], "bbbb", 100, MSG_PEEK);
}
TEST_HARNESS_MAIN

View File

@@ -30,6 +30,11 @@ tfail()
do_test "tfail" false
}
tfail2()
{
do_test "tfail2" false
}
txfail()
{
FAIL_TO_XFAIL=yes do_test "txfail" false
@@ -132,6 +137,8 @@ test_ret()
ret_subtest $ksft_fail "tfail" txfail tfail
ret_subtest $ksft_xfail "txfail" txfail txfail
ret_subtest $ksft_fail "tfail2" tfail2 tfail
}
exit_status_tests_run()

View File

@@ -43,7 +43,7 @@ __ksft_status_merge()
weights[$i]=$((weight++))
done
if [[ ${weights[$a]} > ${weights[$b]} ]]; then
if [[ ${weights[$a]} -ge ${weights[$b]} ]]; then
echo "$a"
return 0
else

View File

@@ -3500,7 +3500,6 @@ fullmesh_tests()
fastclose_tests()
{
if reset_check_counter "fastclose test" "MPTcpExtMPFastcloseTx"; then
MPTCP_LIB_SUBTEST_FLAKY=1
test_linkfail=1024 fastclose=client \
run_tests $ns1 $ns2 10.0.1.1
chk_join_nr 0 0 0
@@ -3509,7 +3508,6 @@ fastclose_tests()
fi
if reset_check_counter "fastclose server test" "MPTcpExtMPFastcloseRx"; then
MPTCP_LIB_SUBTEST_FLAKY=1
test_linkfail=1024 fastclose=server \
run_tests $ns1 $ns2 10.0.1.1
join_rst_nr=1 \
@@ -3806,7 +3804,7 @@ userspace_tests()
continue_if mptcp_lib_has_file '/proc/sys/net/mptcp/pm_type'; then
set_userspace_pm $ns1
pm_nl_set_limits $ns2 2 2
{ test_linkfail=128 speed=5 \
{ timeout_test=120 test_linkfail=128 speed=5 \
run_tests $ns1 $ns2 10.0.1.1 & } 2>/dev/null
local tests_pid=$!
wait_mpj $ns1
@@ -3839,7 +3837,7 @@ userspace_tests()
continue_if mptcp_lib_has_file '/proc/sys/net/mptcp/pm_type'; then
set_userspace_pm $ns2
pm_nl_set_limits $ns1 0 1
{ test_linkfail=128 speed=5 \
{ timeout_test=120 test_linkfail=128 speed=5 \
run_tests $ns1 $ns2 10.0.1.1 & } 2>/dev/null
local tests_pid=$!
wait_mpj $ns2
@@ -3867,7 +3865,7 @@ userspace_tests()
continue_if mptcp_lib_has_file '/proc/sys/net/mptcp/pm_type'; then
set_userspace_pm $ns2
pm_nl_set_limits $ns1 0 1
{ test_linkfail=128 speed=5 \
{ timeout_test=120 test_linkfail=128 speed=5 \
run_tests $ns1 $ns2 10.0.1.1 & } 2>/dev/null
local tests_pid=$!
wait_mpj $ns2
@@ -3888,7 +3886,7 @@ userspace_tests()
continue_if mptcp_lib_has_file '/proc/sys/net/mptcp/pm_type'; then
set_userspace_pm $ns2
pm_nl_set_limits $ns1 0 1
{ test_linkfail=128 speed=5 \
{ timeout_test=120 test_linkfail=128 speed=5 \
run_tests $ns1 $ns2 10.0.1.1 & } 2>/dev/null
local tests_pid=$!
wait_mpj $ns2
@@ -3912,7 +3910,7 @@ userspace_tests()
continue_if mptcp_lib_has_file '/proc/sys/net/mptcp/pm_type'; then
set_userspace_pm $ns1
pm_nl_set_limits $ns2 1 1
{ test_linkfail=128 speed=5 \
{ timeout_test=120 test_linkfail=128 speed=5 \
run_tests $ns1 $ns2 10.0.1.1 & } 2>/dev/null
local tests_pid=$!
wait_mpj $ns1
@@ -3943,7 +3941,7 @@ endpoint_tests()
pm_nl_set_limits $ns1 2 2
pm_nl_set_limits $ns2 2 2
pm_nl_add_endpoint $ns1 10.0.2.1 flags signal
{ test_linkfail=128 speed=slow \
{ timeout_test=120 test_linkfail=128 speed=slow \
run_tests $ns1 $ns2 10.0.1.1 & } 2>/dev/null
local tests_pid=$!
@@ -3970,7 +3968,7 @@ endpoint_tests()
pm_nl_set_limits $ns2 0 3
pm_nl_add_endpoint $ns2 10.0.1.2 id 1 dev ns2eth1 flags subflow
pm_nl_add_endpoint $ns2 10.0.2.2 id 2 dev ns2eth2 flags subflow
{ test_linkfail=128 speed=5 \
{ timeout_test=120 test_linkfail=128 speed=5 \
run_tests $ns1 $ns2 10.0.1.1 & } 2>/dev/null
local tests_pid=$!
@@ -4048,7 +4046,7 @@ endpoint_tests()
# broadcast IP: no packet for this address will be received on ns1
pm_nl_add_endpoint $ns1 224.0.0.1 id 2 flags signal
pm_nl_add_endpoint $ns1 10.0.1.1 id 42 flags signal
{ test_linkfail=128 speed=5 \
{ timeout_test=120 test_linkfail=128 speed=5 \
run_tests $ns1 $ns2 10.0.1.1 & } 2>/dev/null
local tests_pid=$!
@@ -4057,38 +4055,45 @@ endpoint_tests()
$ns1 10.0.2.1 id 1 flags signal
chk_subflow_nr "before delete" 2
chk_mptcp_info subflows 1 subflows 1
chk_mptcp_info add_addr_signal 2 add_addr_accepted 1
pm_nl_del_endpoint $ns1 1 10.0.2.1
pm_nl_del_endpoint $ns1 2 224.0.0.1
sleep 0.5
chk_subflow_nr "after delete" 1
chk_mptcp_info subflows 0 subflows 0
chk_mptcp_info add_addr_signal 0 add_addr_accepted 0
pm_nl_add_endpoint $ns1 10.0.2.1 id 1 flags signal
pm_nl_add_endpoint $ns1 10.0.3.1 id 2 flags signal
wait_mpj $ns2
chk_subflow_nr "after re-add" 3
chk_mptcp_info subflows 2 subflows 2
chk_mptcp_info add_addr_signal 2 add_addr_accepted 2
pm_nl_del_endpoint $ns1 42 10.0.1.1
sleep 0.5
chk_subflow_nr "after delete ID 0" 2
chk_mptcp_info subflows 2 subflows 2
chk_mptcp_info add_addr_signal 2 add_addr_accepted 2
pm_nl_add_endpoint $ns1 10.0.1.1 id 99 flags signal
wait_mpj $ns2
chk_subflow_nr "after re-add ID 0" 3
chk_mptcp_info subflows 3 subflows 3
chk_mptcp_info add_addr_signal 3 add_addr_accepted 2
pm_nl_del_endpoint $ns1 99 10.0.1.1
sleep 0.5
chk_subflow_nr "after re-delete ID 0" 2
chk_mptcp_info subflows 2 subflows 2
chk_mptcp_info add_addr_signal 2 add_addr_accepted 2
pm_nl_add_endpoint $ns1 10.0.1.1 id 88 flags signal
wait_mpj $ns2
chk_subflow_nr "after re-re-add ID 0" 3
chk_mptcp_info subflows 3 subflows 3
chk_mptcp_info add_addr_signal 3 add_addr_accepted 2
mptcp_lib_kill_group_wait $tests_pid
kill_events_pids
@@ -4121,7 +4126,7 @@ endpoint_tests()
# broadcast IP: no packet for this address will be received on ns1
pm_nl_add_endpoint $ns1 224.0.0.1 id 2 flags signal
pm_nl_add_endpoint $ns2 10.0.3.2 id 3 flags subflow
{ test_linkfail=128 speed=20 \
{ timeout_test=120 test_linkfail=128 speed=20 \
run_tests $ns1 $ns2 10.0.1.1 & } 2>/dev/null
local tests_pid=$!