linux

mirror of https://github.com/torvalds/linux.git synced 2025-12-01 07:26:02 +07:00

Author	SHA1	Message	Date
Linus Torvalds	5bc55a333a	Linux 6.13-rc7	2025-01-12 14:37:56 -08:00
Linus Torvalds	0cbe10470b	Merge tag 'char-misc-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc/IIO driver fixes from Greg KH: "Here are a bunch of small IIO and interconnect and other driver fixes to resolve reported issues. Included in here are: - loads of iio driver fixes as a result of an audit of places where uninitialized data would leak to userspace. - other smaller, and normal, iio driver fixes. - mhi driver fix - interconnect driver fixes - pci1xxxx driver fix All of these have been in linux-next for a while with no reported issues" * tag 'char-misc-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (32 commits) misc: microchip: pci1xxxx: Resolve return code mismatch during GPIO set config misc: microchip: pci1xxxx: Resolve kernel panic during GPIO IRQ handling interconnect: icc-clk: check return values of devm_kasprintf() interconnect: qcom: icc-rpm: Set the count member before accessing the flex array iio: adc: ti-ads1119: fix sample size in scan struct for triggered buffer iio: temperature: tmp006: fix information leak in triggered buffer iio: inkern: call iio_device_put() only on mapped devices iio: adc: ad9467: Fix the "don't allow reading vref if not available" case iio: adc: at91: call input_free_device() on allocated iio_dev iio: adc: ad7173: fix using shared static info struct iio: adc: ti-ads124s08: Use gpiod_set_value_cansleep() iio: adc: ti-ads1119: fix information leak in triggered buffer iio: pressure: zpa2326: fix information leak in triggered buffer iio: adc: rockchip_saradc: fix information leak in triggered buffer iio: imu: kmx61: fix information leak in triggered buffer iio: light: vcnl4035: fix information leak in triggered buffer iio: light: bh1745: fix information leak in triggered buffer iio: adc: ti-ads8688: fix information leak in triggered buffer iio: dummy: iio_simply_dummy_buffer: fix information leak in triggered buffer iio: test: Fix GTS test config ...	2025-01-12 14:34:00 -08:00
Linus Torvalds	083f9fac67	Merge tag 'driver-core-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core and debugfs fixes from Greg KH: "Here are some small driver core and debugfs fixes that resolve some reported problems: - debugfs runtime error reporting fixes - topology cpumask race-condition fix - MAINTAINERS file email update All of these have been in linux-next this week with no reported issues" * tag 'driver-core-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: fs: debugfs: fix open proxy for unsafe files MAINTAINERS: align Danilo's maintainer entries topology: Keep the cpumask unchanged when printing cpumap debugfs: fix missing mutex_destroy() in short_fops case fs: debugfs: differentiate short fops with proxy ops	2025-01-12 14:26:31 -08:00
Linus Torvalds	91fff6fa94	Merge tag 'staging-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging driver fixes from Greg KH: "Here are some small staging driver fixes that resolve some reported issues and have been in my tree for too long due to the holiday break. They resolve the following issues: - lots of gpib build-time fixes as reported by testers and 0-day - gpib logical fixes - mailmap fix All of these have been in linux-next for a while, with no reported issues other than the duplicated change" * tag 'staging-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: staging: gpib: mite: remove unused global functions staging: gpib: refer to correct config symbol in tnt4882 Makefile mailmap: update Bingwu Zhang's email address staging: gpib: fix address space mixup staging: gpib: use ioport_map staging: gpib: fix pcmcia dependencies staging: gpib: add module author and description fields staging: gpib: fix Makefiles staging: gpib: make global 'usec_diff' functions static staging: gpib: Modify mismatched function name staging: gpib: Add lower bound check for secondary address staging: gpib: Fix erroneous removal of blank before newline	2025-01-12 14:22:13 -08:00
Linus Torvalds	4bd9e3b4c5	Merge tag 'tty-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull serial driver fixes from Greg KH: "Here are three small serial driver fixes tree. They resolve some reported issues: - stm32 break control fix - 8250 runtime pm usage counter fix - imx driver locking fix All have been in my tree and linux-next for three weeks now, with no reported issues" * tag 'tty-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: serial: stm32: use port lock wrappers for break control serial: imx: Use uart_port_lock_irq() instead of uart_port_lock() tty: serial: 8250: Fix another runtime PM usage counter underflow	2025-01-12 13:27:15 -08:00
Linus Torvalds	196856db7c	Merge tag 'usb-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are some small USB driver fixes and new device ids for 6.13-rc7. Included in here are: - usb serial new device ids - typec bugfixes for reported issues - dwc3 driver fixes - chipidea driver fixes - gadget driver fixes - other minor fixes for reported problems. All of these have been in linux-next for a while, with no reported issues" * tag 'usb-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: USB: serial: option: add Neoway N723-EA support USB: serial: option: add MeiG Smart SRM815 USB: serial: cp210x: add Phoenix Contact UPS Device usb: typec: fix pm usage counter imbalance in ucsi_ccg_sync_control() usb-storage: Add max sectors quirk for Nokia 208 usb: gadget: midi2: Reverse-select at the right place usb: gadget: f_fs: Remove WARN_ON in functionfs_bind USB: core: Disable LPM only for non-suspended ports usb: fix reference leak in usb_new_device() usb: typec: tcpci: fix NULL pointer issue on shared irq case usb: gadget: u_serial: Disable ep before setting port to null to fix the crash caused by port being null usb: chipidea: ci_hdrc_imx: decrement device's refcount in .remove() and in the error path of .probe() usb: typec: ucsi: Set orientation as none when connector is unplugged usb: gadget: configfs: Ignore trailing LF for user strings to cdev USB: usblp: return error when setting unsupported protocol usb: gadget: f_uac2: Fix incorrect setting of bNumEndpoints usb: typec: tcpm/tcpci_maxim: fix error code in max_contaminant_read_resistance_kohm() usb: host: xhci-plat: set skip_phy_initialization if software node has XHCI_SKIP_PHY_INIT property usb: dwc3-am62: Disable autosuspend during remove usb: dwc3: gadget: fix writing NYET threshold	2025-01-12 13:09:00 -08:00
Linus Torvalds	be54864552	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm fixes from Paolo Bonzini: "The largest part here is for KVM/PPC, where a NULL pointer dereference was introduced in the 6.13 merge window and is now fixed. There's some "holiday-induced lateness", as the s390 submaintainer put it, but otherwise things looks fine. s390: - fix a latent bug when the kernel is compiled in debug mode - two small UCONTROL fixes and their selftests arm64: - always check page state in hyp_ack_unshare() - align set_id_regs selftest with the fact that ASIDBITS field is RO - various vPMU fixes for bugs that only affect nested virt PPC e500: - Fix a mostly impossible (but just wrong) case where IRQs were never re-enabled - Observe host permissions instead of mapping readonly host pages as guest-writable. This fixes a NULL-pointer dereference in 6.13 - Replace brittle VMA-based attempts at building huge shadow TLB entries with PTE lookups" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: e500: perform hugepage check after looking up the PFN KVM: e500: map readonly host pages for read KVM: e500: track host-writability of pages KVM: e500: use shadow TLB entry as witness for writability KVM: e500: always restore irqs KVM: s390: selftests: Add has device attr check to uc_attr_mem_limit selftest KVM: s390: selftests: Add ucontrol gis routing test KVM: s390: Reject KVM_SET_GSI_ROUTING on ucontrol VMs KVM: s390: selftests: Add ucontrol flic attr selftests KVM: s390: Reject setting flic pfault attributes on ucontrol VMs KVM: s390: vsie: fix virtual/physical address in unpin_scb() KVM: arm64: Only apply PMCR_EL0.P to the guest range of counters KVM: arm64: nv: Reload PMU events upon MDCR_EL2.HPME change KVM: arm64: Use KVM_REQ_RELOAD_PMU to handle PMCR_EL0.E change KVM: arm64: Add unified helper for reprogramming counters by mask KVM: arm64: Always check the state from hyp_ack_unshare() KVM: arm64: Fix set_id_regs selftest for ASIDBITS becoming unwritable	2025-01-12 12:04:53 -08:00
Linus Torvalds	a603abe345	Merge tag 'perf_urgent_for_v6.13_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fix from Borislav Petkov: - Fix a #GP in the perf user callchain code caused by a race between uprobe freeing the task and the bpf profiler unwinding the task's user stack * tag 'perf_urgent_for_v6.13_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: uprobes: Fix race in uprobe_free_utask	2025-01-12 11:57:45 -08:00
Linus Torvalds	f31acaef55	Merge tag 'x86_urgent_for_v6.13_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Check whether shadow stack is active before using the ptrace regset getter - Remove a wrong BUG_ON in the early static call code which breaks Xen PVH when booting as dom0 * tag 'x86_urgent_for_v6.13_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/fpu: Ensure shadow stack is active before "getting" registers x86/static-call: Remove early_boot_irqs_disabled check to fix Xen PVH dom0	2025-01-12 11:55:48 -08:00
Paolo Bonzini	a5546c2f0d	Merge tag 'kvm-s390-master-6.13-1' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390: three small bugfixes Fix a latent bug when the kernel is compiled in debug mode. Two small UCONTROL fixes and their selftests.	2025-01-12 12:51:05 +01:00
Paolo Bonzini	5c99a684c9	Merge tag 'kvmarm-fixes-6.13-3' of https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 changes for 6.13, part #3 - Always check page state in hyp_ack_unshare() - Align set_id_regs selftest with the fact that ASIDBITS field is RO - Various vPMU fixes for bugs that only affect nested virt	2025-01-12 12:50:39 +01:00
Paolo Bonzini	71b7bf1702	Merge branch 'kvm-e500-check-writable-pfn' into HEAD The new __kvm_faultin_pfn() function is upset by the fact that e500 KVM ignores host page permissions - __kvm_faultin requires a "writable" outgoing argument, but e500 KVM is passing NULL. While a simple fix would be possible that simply allows writable to be NULL, it is quite ugly to have e500 KVM ignore completely the host permissions and map readonly host pages as guest-writable. Merge a more complete fix and remove the VMA-based attempts at building huge shadow TLB entries. Using a PTE lookup, similar to what is done for x86, is better and works with remap_pfn_range() because it does not assume that VM_PFNMAP areas are contiguous. Note that the same incorrect logic is there in ARM's get_vma_page_shift() and RISC-V's kvm_riscv_gstage_ioremap(). Fortunately, for e500 most of the code is already there; it just has to be changed to compute the range from find_linux_pte()'s output rather than find_vma(). The new code works for both VM_PFNMAP and hugetlb mappings, so the latter is removed. Patches 2-5 were tested by the reporter, Christian Zigotzky. Since the difference with v1 is minimal, I am going to send it to Linus today.	2025-01-12 12:48:14 +01:00
Paolo Bonzini	55f4db79c4	KVM: e500: perform hugepage check after looking up the PFN e500 KVM tries to bypass __kvm_faultin_pfn() in order to map VM_PFNMAP VMAs as huge pages. This is a Bad Idea because VM_PFNMAP VMAs could become noncontiguous as a result of callsto remap_pfn_range(). Instead, use the already existing host PTE lookup to retrieve a valid host-side mapping level after __kvm_faultin_pfn() has returned. Then find the largest size that will satisfy the guest's request while staying within a single host PTE. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-01-12 10:53:00 +01:00
Paolo Bonzini	03b755b2aa	KVM: e500: map readonly host pages for read The new __kvm_faultin_pfn() function is upset by the fact that e500 KVM ignores host page permissions - __kvm_faultin requires a "writable" outgoing argument, but e500 KVM is nonchalantly passing NULL. If the host page permissions do not include writability, the shadow TLB entry is forcibly mapped read-only. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-01-12 10:35:45 +01:00
Paolo Bonzini	f2104bf22f	KVM: e500: track host-writability of pages Add the possibility of marking a page so that the UW and SW bits are force-cleared. This is stored in the private info so that it persists across multiple calls to kvmppc_e500_setup_stlbe. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-01-12 10:35:41 +01:00
Paolo Bonzini	e97fbb43fb	KVM: e500: use shadow TLB entry as witness for writability kvmppc_e500_ref_setup is returning whether the guest TLB entry is writable, which is than passed to kvm_release_faultin_page. This makes little sense for two reasons: first, because the function sets up the private data for the page and the return value feels like it has been bolted on the side; second, because what really matters is whether the _shadow_ TLB entry is writable. If it is not writable, the page can be released as non-dirty. Shift from using tlbe_is_writable(gtlbe) to doing the same check on the shadow TLB entry. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-01-12 10:35:36 +01:00
Paolo Bonzini	87ecfdbc69	KVM: e500: always restore irqs If find_linux_pte fails, IRQs will not be restored. This is unlikely to happen in practice since it would have been reported as hanging hosts, but it should of course be fixed anyway. Cc: stable@vger.kernel.org Reported-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2025-01-12 10:34:44 +01:00
Linus Torvalds	a87d1203bb	Merge tag 'probes-fixes-v6.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull probes fix from Masami Hiramatsu: "Fix to free trace_kprobe objects at a failure path in __trace_kprobe_create() function. This fixes a memory leak" * tag 'probes-fixes-v6.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing/kprobes: Fix to free objects when failed to copy a symbol	2025-01-11 20:34:12 -08:00
Linus Torvalds	b62cef9a5c	Merge tag 'hwmon-for-v6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon fix from Guenter Roeck: "One patch to fix error handling in drivetemp driver" * tag 'hwmon-for-v6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: (drivetemp) Fix driver producing garbage data when SCSI errors occur	2025-01-11 11:42:48 -08:00
Linus Torvalds	05c2d1f272	Merge tag 'block-6.13-20250111' of git://git.kernel.dk/linux Pull block fix from Jens Axboe: "A single fix for a use-after-free in the BFQ IO scheduler" * tag 'block-6.13-20250111' of git://git.kernel.dk/linux: block, bfq: fix waker_bfqq UAF after bfq_split_bfqq()	2025-01-11 11:17:08 -08:00
Linus Torvalds	52a5a22d8a	Merge tag 'io_uring-6.13-20250111' of git://git.kernel.dk/linux Pull io_uring fixes from Jens Axboe: - Fix for multishot timeout updates only using the updated value for the first invocation, not subsequent ones - Silence a false positive lockdep warning - Fix the eventfd signaling and putting RCU logic - Fix fault injected SQPOLL setup not clearing the task pointer in the error path - Fix local task_work looking at the SQPOLL thread rather than just signaling the safe variant. Again one of those theoretical issues, which should be closed up none the less. * tag 'io_uring-6.13-20250111' of git://git.kernel.dk/linux: io_uring: don't touch sqd->thread off tw add io_uring/sqpoll: zero sqd->thread on tctx errors io_uring/eventfd: ensure io_eventfd_signal() defers another RCU period io_uring: silence false positive warnings io_uring/timeout: fix multishot updates	2025-01-11 10:59:43 -08:00
Linus Torvalds	57162361c3	Merge tag '6.13-rc6-SMB3-client-fix' of git://git.samba.org/sfrench/cifs-2.6 Pull smb client fix from Steve French: - fix unneeded session setup retry due to stale password e.g. for DFS automounts * tag '6.13-rc6-SMB3-client-fix' of git://git.samba.org/sfrench/cifs-2.6: smb: client: sync the root session and superblock context passwords before automounting	2025-01-11 10:49:50 -08:00
Linus Torvalds	da60d1547d	Merge tag 'soc-fixes-6.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull SoC fixes from Arnd Bergmann: "Over the Christmas break a couple of devicetree fixes came in for Rockchips, Qualcomm and NXP/i.MX. These add some missing board specific properties, address build time warnings, The USB/TOG supoprt on X1 Elite regressed, so two earlier DT changes get reverted for now. Aside from the devicetree fixes, there is One build fix for the stm32 firewall driver, and a defconfig change to enable SPDIF support for i.MX" * tag 'soc-fixes-6.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: firewall: remove misplaced semicolon from stm32_firewall_get_firewall arm64: dts: rockchip: add hevc power domain clock to rk3328 arm64: dts: rockchip: Fix the SD card detection on NanoPi R6C/R6S arm64: dts: qcom: sa8775p: fix the secure device bootup issue Revert "arm64: dts: qcom: x1e80100: enable OTG on USB-C controllers" Revert "arm64: dts: qcom: x1e80100-crd: enable otg on usb ports" arm64: dts: qcom: x1e80100: Fix up BAR space size for PCIe6a Revert "arm64: dts: qcom: x1e78100-t14s: enable otg on usb-c ports" ARM: dts: imxrt1050: Fix clocks for mmc ARM: imx_v6_v7_defconfig: enable SND_SOC_SPDIF arm64: dts: imx95: correct the address length of netcmix_blk_ctrl arm64: dts: imx8-ss-audio: add fallback compatible string fsl,imx6ull-esai for esai arm64: dts: rockchip: rename rfkill label for Radxa ROCK 5B arm64: dts: rockchip: add reset-names for combphy on rk3568 arm64: dts: qcom: sa8775p: Fix the size of 'addr_space' regions	2025-01-11 10:42:05 -08:00
Michael Ellerman	77a903cd8e	MAINTAINERS: powerpc: Update my status Maddy is taking over the day-to-day maintenance of powerpc. I will still be around to help, and as a backup. Re-order the main POWERPC list to put Maddy first to reflect that. KVM/powerpc patches will be handled by Maddy via the powerpc tree with review from Nick, so replace myself with Maddy there. Remove myself from BPF, leaving Hari & Christophe as maintainers. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2025-01-10 17:19:33 -08:00
Meetakshi Setiya	20b1aa9123	smb: client: sync the root session and superblock context passwords before automounting In some cases, when password2 becomes the working password, the client swaps the two password fields in the root session struct, but not in the smb3_fs_context struct in cifs_sb. DFS automounts inherit fs context from their parent mounts. Therefore, they might end up getting the passwords in the stale order. The automount should succeed, because the mount function will end up retrying with the actual password anyway. But to reduce these unnecessary session setup retries for automounts, we can sync the parent context's passwords with the root session's passwords before duplicating it to the child's fs context. Cc: stable@vger.kernel.org Signed-off-by: Meetakshi Setiya <msetiya@microsoft.com> Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-01-10 17:55:35 -06:00
Linus Torvalds	2e3f3090bd	Merge tag 'sched_ext-for-6.13-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext Pull sched_ext fixes from Tejun Heo: - Fix corner case bug where ops.dispatch() couldn't extend the execution of the current task if SCX_OPS_ENQ_LAST is set. - Fix ops.cpu_release() not being called when a SCX task is preempted by a higher priority sched class task. - Fix buitin idle mask being incorrectly left as busy after an idle CPU is picked and kicked. - scx_ops_bypass() was unnecessarily using rq_lock() which comes with rq pinning related sanity checks which could trigger spuriously. Switch to raw_spin_rq_lock(). * tag 'sched_ext-for-6.13-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: sched_ext: idle: Refresh idle masks during idle-to-idle transitions sched_ext: switch class when preempted by higher priority scheduler sched_ext: Replace rq_lock() to raw_spin_rq_lock() in scx_ops_bypass() sched_ext: keep running prev when prev->scx.slice != 0	2025-01-10 15:11:58 -08:00
Linus Torvalds	58624e4bc8	Merge tag 'cgroup-for-6.13-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fixes from Tejun Heo: "Cpuset fixes: - Fix isolated CPUs leaking into sched domains - Remove now unnecessary kernfs active break which can trigger a warning - Comment updates" * tag 'cgroup-for-6.13-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup/cpuset: remove kernfs active break cgroup/cpuset: Prevent leakage of isolated CPUs into sched domains cgroup/cpuset: Remove stale text	2025-01-10 15:03:02 -08:00
Linus Torvalds	257a8be4e9	Merge tag 'wq-for-6.13-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq Pull workqueue fix from Tejun Heo: - Add a WARN_ON_ONCE() on queue_delayed_work_on() on an offline CPU as such work items won't get executed till the CPU comes back online * tag 'wq-for-6.13-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: warn if delayed_work is queued to an offlined cpu.	2025-01-10 14:52:30 -08:00
Linus Torvalds	da13af8392	Merge tag 'thermal-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull thermal control fix from Rafael Wysocki: "Fix an OF node leak in the code parsing thermal zone DT properties (Joe Hattori)" * tag 'thermal-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: thermal: of: fix OF node leak in of_thermal_zone_find()	2025-01-10 14:46:49 -08:00
Linus Torvalds	475c9f5854	Merge tag 'acpi-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "Add two more ACPI IRQ override quirks and update the code using them to avoid unnecessary overhead (Hans de Goede)" * tag 'acpi-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: resource: acpi_dev_irq_override(): Check DMI match last ACPI: resource: Add TongFang GM5HG0A to irq1_edge_low_force_override[] ACPI: resource: Add Asus Vivobook X1504VAP to irq1_level_low_skip_override[]	2025-01-10 14:41:46 -08:00
Andrea Righi	a2a3374c47	sched_ext: idle: Refresh idle masks during idle-to-idle transitions With the consolidation of put_prev_task/set_next_task(), see commit `436f3eed5c` ("sched: Combine the last put_prev_task() and the first set_next_task()"), we are now skipping the transition between these two functions when the previous and the next tasks are the same. As a result, the scx idle state of a CPU is updated only when transitioning to or from the idle thread. While this is generally correct, it can lead to uneven and inefficient core utilization in certain scenarios [1]. A typical scenario involves proactive wake-ups: scx_bpf_pick_idle_cpu() selects and marks an idle CPU as busy, followed by a wake-up via scx_bpf_kick_cpu(), without dispatching any tasks. In this case, the CPU continues running the idle thread, returns to idle, but remains marked as busy, preventing it from being selected again as an idle CPU (until a task eventually runs on it and releases the CPU). For example, running a workload that uses 20% of each CPU, combined with an scx scheduler using proactive wake-ups, results in the following core utilization: CPU 0: 25.7% CPU 1: 29.3% CPU 2: 26.5% CPU 3: 25.5% CPU 4: 0.0% CPU 5: 25.5% CPU 6: 0.0% CPU 7: 10.5% To address this, refresh the idle state also in pick_task_idle(), during idle-to-idle transitions, but only trigger ops.update_idle() on actual state changes to prevent unnecessary updates to the scx scheduler and maintain balanced state transitions. With this change in place, the core utilization in the previous example becomes the following: CPU 0: 18.8% CPU 1: 19.4% CPU 2: 18.0% CPU 3: 18.7% CPU 4: 19.3% CPU 5: 18.9% CPU 6: 18.7% CPU 7: 19.3% [1] https://github.com/sched-ext/scx/pull/1139 Fixes: `7c65ae81ea` ("sched_ext: Don't call put_prev_task_scx() before picking the next task") Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2025-01-10 12:40:42 -10:00
Pavel Begunkov	bd2703b42d	io_uring: don't touch sqd->thread off tw add With IORING_SETUP_SQPOLL all requests are created by the SQPOLL task, which means that req->task should always match sqd->thread. Since accesses to sqd->thread should be separately protected, use req->task in io_req_normal_work_add() instead. Note, in the eyes of io_req_normal_work_add(), the SQPOLL task struct is always pinned and alive, and sqd->thread can either be the task or NULL. It's only problematic if the compiler decides to reload the value after the null check, which is not so likely. Cc: stable@vger.kernel.org Cc: Bui Quang Minh <minhquangbui99@gmail.com> Reported-by: lizetao <lizetao1@huawei.com> Fixes: `78f9b61bd8` ("io_uring: wake SQPOLL task when task_work is added to an empty queue") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/1cbbe72cf32c45a8fee96026463024cd8564a7d7.1736541357.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-01-10 14:00:25 -07:00
Pavel Begunkov	4b7cfa8b6c	io_uring/sqpoll: zero sqd->thread on tctx errors Syzkeller reports: BUG: KASAN: slab-use-after-free in thread_group_cputime+0x409/0x700 kernel/sched/cputime.c:341 Read of size 8 at addr ffff88803578c510 by task syz.2.3223/27552 Call Trace: <TASK> ... kasan_report+0x143/0x180 mm/kasan/report.c:602 thread_group_cputime+0x409/0x700 kernel/sched/cputime.c:341 thread_group_cputime_adjusted+0xa6/0x340 kernel/sched/cputime.c:639 getrusage+0x1000/0x1340 kernel/sys.c:1863 io_uring_show_fdinfo+0xdfe/0x1770 io_uring/fdinfo.c:197 seq_show+0x608/0x770 fs/proc/fd.c:68 ... That's due to sqd->task not being cleared properly in cases where SQPOLL task tctx setup fails, which can essentially only happen with fault injection to insert allocation errors. Cc: stable@vger.kernel.org Fixes: `1251d2025c` ("io_uring/sqpoll: early exit thread if task_context wasn't allocated") Reported-by: syzbot+3d92cfcfa84070b0a470@syzkaller.appspotmail.com Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/efc7ec7010784463b2e7466d7b5c02c2cb381635.1736519461.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-01-10 14:00:19 -07:00
Linus Torvalds	e0daef7de1	Merge tag 'drm-fixes-2025-01-11' of https://gitlab.freedesktop.org/drm/kernel Pull drm fixes from Dave Airlie: "Regular weekly fixes, this has the usual amdgpu/xe/i915 bits. There is a bigger bunch of mediatek patches that I considered not including at this stage, but all the changes (except for one were obvious small fixes, and the rotation one is a few lines, and I suppose will help someone have their screen up the right way), I decided to include it since I expect it got slowed down by holidays etc, and it's not that mainstream a hw platform. i915: - Revert "drm/i915/hdcp: Don't enable HDCP1.4 directly from check_link" amdgpu: - Display interrupt fixes - Fix display max surface mismatches - Fix divide error in DM plane scale calcs - Display divide by 0 checks in dml helpers - SMU 13 AD/DC interrrupt handling fix - Fix locking around buddy trim handling amdkfd: - Fix page fault with shader debugger enabled - Fix eviction fence wq handling xe: - Avoid a NULL ptr deref when wedging - Fix power gate sequence on DG1 mediatek: - Revert "drm/mediatek: dsi: Correct calculation formula of PHY Timing" - Set private->all_drm_private[i]->drm to NULL if mtk_drm_bind returns err - Move mtk_crtc_finish_page_flip() to ddp_cmdq_cb() - Only touch DISP_REG_OVL_PITCH_MSB if AFBC is supported - Add support for 180-degree rotation in the display driver - Stop selecting foreign drivers - Revert "drm/mediatek: Switch to for_each_child_of_node_scoped()" - Fix YCbCr422 color format issue for DP - Fix mode valid issue for dp - dp: Reference common DAI properties - dsi: Add registers to pdata to fix MT8186/MT8188 - Remove unneeded semicolon - Add return value check when reading DPCD - Initialize pointer in mtk_drm_of_ddp_path_build_one()" * tag 'drm-fixes-2025-01-11' of https://gitlab.freedesktop.org/drm/kernel: (26 commits) drm/xe/dg1: Fix power gate sequence. drm/xe: Fix tlb invalidation when wedging Revert "drm/i915/hdcp: Don't enable HDCP1.4 directly from check_link" drm/amdgpu: Add a lock when accessing the buddy trim function drm/amd/pm: fix BUG: scheduling while atomic drm/amdkfd: wq_release signals dma_fence only when available drm/amd/display: Add check for granularity in dml ceil/floor helpers drm/amdkfd: fixed page fault when enable MES shader debugger drm/amd/display: fix divide error in DM plane scale calcs drm/amd/display: increase MAX_SURFACES to the value supported by hw drm/amd/display: fix page fault due to max surface definition mismatch drm/amd/display: Remove unnecessary amdgpu_irq_get/put drm/mediatek: Initialize pointer in mtk_drm_of_ddp_path_build_one() drm/mediatek: Add return value check when reading DPCD drm/mediatek: Remove unneeded semicolon drm/mediatek: mtk_dsi: Add registers to pdata to fix MT8186/MT8188 dt-bindings: display: mediatek: dp: Reference common DAI properties drm/mediatek: Fix mode valid issue for dp drm/mediatek: Fix YCbCr422 color format issue for DP Revert "drm/mediatek: Switch to for_each_child_of_node_scoped()" ...	2025-01-10 12:35:46 -08:00
Linus Torvalds	f8c6263347	Merge tag 'riscv-for-linus-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - a handful of selftest fixes - fix a memory leak in relocation processing during module loading - avoid sleeping in die() - fix kprobe instruction slot address calculations - fix DT node reference leak in SBI idle probing - avoid initializing out of bounds pages on sparse vmemmap systems with a gap at the start of their physical memory map - fix backtracing through exceptions - _Q_PENDING_LOOPS is now defined whenever QUEUED_SPINLOCKS=y - local labels in entry.S are now marked with ".L", which prevents them from trashing backtraces - a handful of fixes for SBI-based performance counters * tag 'riscv-for-linus-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: drivers/perf: riscv: Do not allow invalid raw event config drivers/perf: riscv: Return error for default case drivers/perf: riscv: Fix Platform firmware event data tools: selftests: riscv: Add test count for vstate_prctl tools: selftests: riscv: Add pass message for v_initval_nolibc riscv: use local label names instead of global ones in assembly riscv: qspinlock: Fixup _Q_PENDING_LOOPS definition riscv: stacktrace: fix backtracing through exceptions riscv: mm: Fix the out of bound issue of vmemmap address cpuidle: riscv-sbi: fix device node release in early exit of for_each_possible_cpu riscv: kprobes: Fix incorrect address calculation riscv: Fix sleeping in invalid context in die() riscv: module: remove relocation_head rel_entry member allocation riscv: selftests: Fix warnings pointer masking test	2025-01-10 10:50:30 -08:00
Imran Khan	da30ba227c	workqueue: warn if delayed_work is queued to an offlined cpu. delayed_work submitted to an offlined cpu, will not get executed, after the specified delay if the cpu remains offline. If the cpu never comes online the work will never get executed. checking for online cpu in __queue_delayed_work, does not sound like a good idea because to do this reliably we need hotplug lock and since work may be submitted from atomic contexts, we would have to use cpus_read_trylock. But if trylock fails we would queue the work on any cpu and this may not be optimal because our intended cpu might still be online. Putting a WARN_ON_ONCE for an already offlined cpu, will indicate users of queue_delayed_work_on, if they are (wrongly) trying to queue delayed_work on offlined cpu. Also indicate the problem of using offlined cpu with queue_delayed_work_on, in its description. Signed-off-by: Imran Khan <imran.f.khan@oracle.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2025-01-10 08:33:39 -10:00
Arnd Bergmann	5391c5d8e6	Merge tag 'v6.13-rockchip-dtsfixes1' of https://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into arm/fixes Fixed card-detect on one board and some missing properties added. * tag 'v6.13-rockchip-dtsfixes1' of https://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip: arm64: dts: rockchip: add hevc power domain clock to rk3328 arm64: dts: rockchip: Fix the SD card detection on NanoPi R6C/R6S arm64: dts: rockchip: rename rfkill label for Radxa ROCK 5B arm64: dts: rockchip: add reset-names for combphy on rk3568 Link: https://lore.kernel.org/r/2914560.yaVYbkx8dN@diego Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2025-01-10 18:20:59 +01:00
Linus Torvalds	7110f24f9e	Merge tag 'vfs-6.13-rc7.fixes.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: "afs: - Fix the maximum cell name length - Fix merge preference rule failure condition fuse: - Fix fuse_get_user_pages() so it doesn't risk misleading the caller to think pages have been allocated when they actually haven't - Fix direct-io folio offset and length calculation netfs: - Fix async direct-io handling - Fix read-retry for filesystems that don't provide a ->prepare_read() method vfs: - Prevent truncating 64-bit offsets to 32-bits in iomap - Fix memory barrier interactions when polling - Remove MNT_ONRB to fix concurrent modification of @mnt->mnt_flags leading to MNT_ONRB to not be raised and invalid access to a list member" * tag 'vfs-6.13-rc7.fixes.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: poll: kill poll_does_not_wait() sock_poll_wait: kill the no longer necessary barrier after poll_wait() io_uring_poll: kill the no longer necessary barrier after poll_wait() poll_wait: kill the obsolete wait_address check poll_wait: add mb() to fix theoretical race between waitqueue_active() and .poll() afs: Fix merge preference rule failure condition netfs: Fix read-retry for fs with no ->prepare_read() netfs: Fix kernel async DIO fs: kill MNT_ONRB iomap: avoid avoid truncating 64-bit offset to 32 bits afs: Fix the maximum cell name length fuse: Set *nbytesp=0 in fuse_get_user_pages on allocation failure fuse: fix direct io folio offset and length calculation	2025-01-10 09:11:11 -08:00
Linus Torvalds	36eb21945a	Merge tag 'xfs-fixes-6.13-rc7' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux Pull xfs fixes from Carlos Maiolino: - Fix a missing lock while detaching a dquot buffer - Fix failure on xfs_update_last_rtgroup_size for !XFS_RT * tag 'xfs-fixes-6.13-rc7' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: lock dquot buffer before detaching dquot from b_li_list xfs: don't return an error from xfs_update_last_rtgroup_size for !XFS_RT	2025-01-10 09:04:27 -08:00
Linus Torvalds	8c8d54116f	Merge tag 'platform-drivers-x86-v6.13-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fixes from Ilpo Järvinen: "Fixes and new HW support: - amd/pmc: Match IRQ1 wakeup disable with the enable on i8042 side - intel: power-domains: Clearwater Forest support - intel/pmc: Skip SSRAM setup when no additional devices are present - ISST: Clearwater Forest support" * tag 'platform-drivers-x86-v6.13-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86: intel/pmc: Fix ioremap() of bad address platform/x86: ISST: Add Clearwater Forest to support list platform/x86/intel: power-domains: Add Clearwater Forest support platform/x86/amd/pmc: Only disable IRQ1 wakeup where i8042 actually enabled it	2025-01-10 08:14:22 -08:00
Linus Torvalds	b999e7f92e	Merge tag 'regulator-fix-v6.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fixes from Mark Brown: "A couple of fixes for !REGULATOR and !OF configurations, adding missing stubs" * tag 'regulator-fix-v6.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: Move OF_ API declarations/definitions outside CONFIG_REGULATOR regulator: Guard of_regulator_bulk_get_all() with CONFIG_OF	2025-01-10 08:05:32 -08:00
Linus Torvalds	9d64558493	Merge tag 'gpio-fixes-for-v6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux Pull gpio fixes from Bartosz Golaszewski: "There's one small fix for real HW - gpio-loongson. The rest concern two virtual testing drivers in which some issues were recently found and addressed: - fix resource leaks in error path in gpio-virtuser (and one consistent memory leak triggered on every device removal)) - fix the use-case of having multiple con_ids in a lookup table in gpio-virtuser which has never worked (despite being advertised) - don't allow rmdir() on configfs directories when they are in use in gpio-sim and gpio-virtuser - fix register offsets in gpio-loongson-64" * tag 'gpio-fixes-for-v6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: gpio: loongson: Fix Loongson-2K2000 ACPI GPIO register offset gpio: sim: lock up configfs that an instantiated device depends on gpio: virtuser: lock up configfs that an instantiated device depends on gpio: virtuser: fix handling of multiple conn_ids in lookup table gpio: virtuser: fix missing lookup table cleanups	2025-01-10 07:59:47 -08:00
Greg Kroah-Hartman	f3149ed697	Merge tag 'usb-serial-6.13-rc7' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus Johan writes: USB-serial device ids for 6.13-rc7 Here are some new modem and cp210x device ids. All have been in linux-next with no reported issues. * tag 'usb-serial-6.13-rc7' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial: USB: serial: option: add Neoway N723-EA support USB: serial: option: add MeiG Smart SRM815 USB: serial: cp210x: add Phoenix Contact UPS Device	2025-01-10 14:59:20 +01:00
Christian Brauner	1623bc27a8	Merge branch 'vfs-6.14.poll' into vfs.fixes Bring in the fixes for __pollwait() and waitqueue_active() interactions. Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-01-10 12:01:21 +01:00
Christian Brauner	67cd2e23c0	Merge patch series "poll_wait: add mb() to fix theoretical race between waitqueue_active() and .poll()" Oleg Nesterov <oleg@redhat.com> says: The waitqueue_active() helper can only be used if both waker and waiter have memory barriers that pair with each other. But __pollwait() is broken in this respect. Fix it. * patches from https://lore.kernel.org/r/20250107162649.GA18886@redhat.com: poll: kill poll_does_not_wait() sock_poll_wait: kill the no longer necessary barrier after poll_wait() io_uring_poll: kill the no longer necessary barrier after poll_wait() poll_wait: kill the obsolete wait_address check poll_wait: add mb() to fix theoretical race between waitqueue_active() and .poll() Link: https://lore.kernel.org/r/20250107162649.GA18886@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-01-10 11:59:45 +01:00
Oleg Nesterov	f005bf18a5	poll: kill poll_does_not_wait() It no longer has users. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Link: https://lore.kernel.org/r/20250107162743.GA18947@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-01-10 11:59:00 +01:00
Oleg Nesterov	b2849867b3	sock_poll_wait: kill the no longer necessary barrier after poll_wait() Now that poll_wait() provides a full barrier we can remove smp_mb() from sock_poll_wait(). Also, the poll_does_not_wait() check before poll_wait() just adds the unnecessary confusion, kill it. poll_wait() does the same "p && p->_qproc" check. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Link: https://lore.kernel.org/r/20250107162736.GA18944@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-01-10 11:59:00 +01:00
Oleg Nesterov	4e15fa8305	io_uring_poll: kill the no longer necessary barrier after poll_wait() Now that poll_wait() provides a full barrier we can remove smp_rmb() from io_uring_poll(). In fact I don't think smp_rmb() was correct, it can't serialize LOADs and STOREs. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Link: https://lore.kernel.org/r/20250107162730.GA18940@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-01-10 11:58:59 +01:00
Oleg Nesterov	10b02a2cfe	poll_wait: kill the obsolete wait_address check This check is historical and no longer needed, wait_address is never NULL. These days we rely on the poll_table->_qproc check. NULL if select/poll is not going to sleep, or it already has a data to report, or all waiters have already been registered after the 1st iteration. However, poll_table *p can be NULL, see p9_fd_poll() for example, so we can't remove the "p != NULL" check. Link: https://lore.kernel.org/all/20250106180325.GF7233@redhat.com/ Signed-off-by: Oleg Nesterov <oleg@redhat.com> Link: https://lore.kernel.org/r/20250107162724.GA18926@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-01-10 11:58:59 +01:00
Oleg Nesterov	cacd9ae4bf	poll_wait: add mb() to fix theoretical race between waitqueue_active() and .poll() As the comment above waitqueue_active() explains, it can only be used if both waker and waiter have mb()'s that pair with each other. However __pollwait() is broken in this respect. This is not pipe-specific, but let's look at pipe_poll() for example: poll_wait(...); // -> __pollwait() -> add_wait_queue() LOAD(pipe->head); LOAD(pipe->head); In theory these LOAD()'s can leak into the critical section inside add_wait_queue() and can happen before list_add(entry, wq_head), in this case pipe_poll() can race with wakeup_pipe_readers/writers which do smp_mb(); if (waitqueue_active(wq_head)) wake_up_interruptible(wq_head); There are more __pollwait()-like functions (grep init_poll_funcptr), and it seems that at least ep_ptable_queue_proc() has the same problem, so the patch adds smp_mb() into poll_wait(). Link: https://lore.kernel.org/all/20250102163320.GA17691@redhat.com/ Signed-off-by: Oleg Nesterov <oleg@redhat.com> Link: https://lore.kernel.org/r/20250107162717.GA18922@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-01-10 11:58:59 +01:00
Darrick J. Wong	111d36d627	xfs: lock dquot buffer before detaching dquot from b_li_list We have to lock the buffer before we can delete the dquot log item from the buffer's log item list. Cc: stable@vger.kernel.org # v6.13-rc3 Fixes: `acc8f8628c` ("xfs: attach dquot buffer to dquot log item buffer") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2025-01-10 10:12:48 +01:00
Johannes Berg	67510d7e2e	fs: debugfs: fix open proxy for unsafe files In the previous commit referenced below, I had to split the short fops handling into different proxy fops. This necessitated knowing out-of-band whether or not the ops are short or full, when attempting to convert from fops to allocated fsdata. Unfortunately, I only converted full_proxy_open() which is used for the new full_proxy_open_regular() and full_proxy_open_short(), but forgot about the call in open_proxy_open(), used for debugfs_create_file_unsafe(). Fix that, it never has short fops. Fixes: `f8f25893a4` ("fs: debugfs: differentiate short fops with proxy ops") Reported-by: Suresh Kumar Kurmi <suresh.kumar.kurmi@intel.com> Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202501101055.bb8bf3e7-lkp@intel.com Reported-by: Venkat Rao Bagalkote <venkat88@linux.vnet.ibm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://lore.kernel.org/r/20250110085826.cd74f3b7a36b.I430c79c82ec3f954c2ff9665753bf6ac9e63eef8@changeid Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-01-10 09:41:53 +01:00
Jiri Olsa	b583ef82b6	uprobes: Fix race in uprobe_free_utask Max Makarov reported kernel panic [1] in perf user callchain code. The reason for that is the race between uprobe_free_utask and bpf profiler code doing the perf user stack unwind and is triggered within uprobe_free_utask function: - after current->utask is freed and - before current->utask is set to NULL general protection fault, probably for non-canonical address 0x9e759c37ee555c76: 0000 [#1] SMP PTI RIP: 0010:is_uprobe_at_func_entry+0x28/0x80 ... ? die_addr+0x36/0x90 ? exc_general_protection+0x217/0x420 ? asm_exc_general_protection+0x26/0x30 ? is_uprobe_at_func_entry+0x28/0x80 perf_callchain_user+0x20a/0x360 get_perf_callchain+0x147/0x1d0 bpf_get_stackid+0x60/0x90 bpf_prog_9aac297fb833e2f5_do_perf_event+0x434/0x53b ? __smp_call_single_queue+0xad/0x120 bpf_overflow_handler+0x75/0x110 ... asm_sysvec_apic_timer_interrupt+0x1a/0x20 RIP: 0010:__kmem_cache_free+0x1cb/0x350 ... ? uprobe_free_utask+0x62/0x80 ? acct_collect+0x4c/0x220 uprobe_free_utask+0x62/0x80 mm_release+0x12/0xb0 do_exit+0x26b/0xaa0 __x64_sys_exit+0x1b/0x20 do_syscall_64+0x5a/0x80 It can be easily reproduced by running following commands in separate terminals: # while :; do bpftrace -e 'uprobe:/bin/ls:_start { printf("hit\n"); }' -c ls; done # bpftrace -e 'profile:hz:100000 { @[ustack()] = count(); }' Fixing this by making sure current->utask pointer is set to NULL before we start to release the utask object. [1] https://github.com/grafana/pyroscope/issues/3673 Fixes: `cfa7f3d2c5` ("perf,x86: avoid missing caller address in stack traces captured in uprobe") Reported-by: Max Makarov <maxpain@linux.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20250109141440.2692173-1-jolsa@kernel.org	2025-01-10 09:28:01 +01:00
Dave Airlie	fddb4fd91a	Merge tag 'mediatek-drm-fixes-20250104' of https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux into drm-fixes Mediatek DRM Fixes - 20250104 1. Revert "drm/mediatek: dsi: Correct calculation formula of PHY Timing" 2. Set private->all_drm_private[i]->drm to NULL if mtk_drm_bind returns err 3. Move mtk_crtc_finish_page_flip() to ddp_cmdq_cb() 4. Only touch DISP_REG_OVL_PITCH_MSB if AFBC is supported 5. Add support for 180-degree rotation in the display driver 6. Stop selecting foreign drivers 7. Revert "drm/mediatek: Switch to for_each_child_of_node_scoped()" 8. Fix YCbCr422 color format issue for DP 9. Fix mode valid issue for dp 10. dp: Reference common DAI properties 11. dsi: Add registers to pdata to fix MT8186/MT8188 12. Remove unneeded semicolon 13. Add return value check when reading DPCD 14. Initialize pointer in mtk_drm_of_ddp_path_build_one() Signed-off-by: Dave Airlie <airlied@redhat.com> From: Chun-Kuang Hu <chunkuang.hu@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20250104124227.45505-1-chunkuang.hu@kernel.org	2025-01-10 16:57:59 +10:00
Dave Airlie	85bf89f268	Merge tag 'drm-xe-fixes-2025-01-09' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes Driver Changes: - Avoid a NULL ptr deref when wedging (Lucas) - Fix power gate sequence on DG1 (Rodrigo) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/Z4AcqP3Io_r0pEsR@fedora	2025-01-10 16:43:32 +10:00
Dave Airlie	66d4709abc	Merge tag 'amd-drm-fixes-6.13-2025-01-09' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.13-2025-01-09: amdgpu: - Display interrupt fixes - Fix display max surface mismatches - Fix divide error in DM plane scale calcs - Display divide by 0 checks in dml helpers - SMU 13 AD/DC interrrupt handling fix - Fix locking around buddy trim handling amdkfd: - Fix page fault with shader debugger enabled - Fix eviction fence wq handling Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250109164236.477295-1-alexander.deucher@amd.com	2025-01-10 16:12:28 +10:00
Dave Airlie	7ac9f3366f	Merge tag 'drm-intel-fixes-2025-01-08' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes - Revert "drm/i915/hdcp: Don't enable HDCP1.4 directly from check_link" [hdcp] (Suraj Kandpal) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Tvrtko Ursulin <tursulin@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/Z37BPchEzY0ovIqF@linux	2025-01-10 14:50:21 +10:00
Linus Torvalds	2144da2558	Merge tag '6.13-rc6-ksmbd-server-fixes' of git://git.samba.org/ksmbd Pull smb server fixes from Steve French: "Four ksmbd server fixes, most also for stable: - fix for reporting special file type more accurately when POSIX extensions negotiated - minor cleanup - fix possible incorrect creation path when dirname is not present. In some cases, Windows apps create files without checking if they exist. - fix potential NULL pointer dereference sending interim response" * tag '6.13-rc6-ksmbd-server-fixes' of git://git.samba.org/ksmbd: ksmbd: Implement new SMB3 POSIX type ksmbd: fix unexpectedly changed path in ksmbd_vfs_kern_path_locked ksmbd: Remove unneeded if check in ksmbd_rdma_capable_netdev() ksmbd: fix a missing return value check bug	2025-01-09 18:19:59 -08:00
Masami Hiramatsu (Google)	30c8fd31c5	tracing/kprobes: Fix to free objects when failed to copy a symbol In __trace_kprobe_create(), if something fails it must goto error block to free objects. But when strdup() a symbol, it returns without that. Fix it to goto the error block to free objects correctly. Link: https://lore.kernel.org/all/173643297743.1514810.2408159540454241947.stgit@devnote2/ Fixes: `6212dd2968` ("tracing/kprobes: Use dyn_event framework for kprobe events") Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-01-10 08:57:18 +09:00
guanjing	155c5bf26f	firewall: remove misplaced semicolon from stm32_firewall_get_firewall Remove misplaced colon in stm32_firewall_get_firewall() which results in a syntax error when the code is compiled without CONFIG_STM32_FIREWALL. Fixes: `5c9668cfc6` ("firewall: introduce stm32_firewall framework") Signed-off-by: guanjing <guanjing@cmss.chinamobile.com> Reviewed-by: Gatien Chevallier <gatien.chevallier@foss.st.com> Signed-off-by: Alexandre Torgue <alexandre.torgue@foss.st.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2025-01-09 22:57:34 +01:00
Arnd Bergmann	00f63a0f5f	Merge tag 'imx-fixes-6.13' of https://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into arm/fixes i.MX fixes for 6.13: - Add fallback for i.MX8QM ESAI compatible to fix a dt-schema warning caused by bindings update - Fix uSDHC1 clock for i.MX RT1050 - Enable SND_SOC_SPDIF in imx_v6_v7_defconfig to fix a regression caused by an i.MX6 SPDIF sound card change in DT - Fix address length of i.MX95 netcmix_blk_ctrl * tag 'imx-fixes-6.13' of https://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux: ARM: dts: imxrt1050: Fix clocks for mmc ARM: imx_v6_v7_defconfig: enable SND_SOC_SPDIF arm64: dts: imx95: correct the address length of netcmix_blk_ctrl arm64: dts: imx8-ss-audio: add fallback compatible string fsl,imx6ull-esai for esai Link: https://lore.kernel.org/r/Z3Jf9zbv/xH3YzuB@dragon Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2025-01-09 22:56:15 +01:00
Arnd Bergmann	627522c8bf	Merge tag 'qcom-arm64-fixes-for-6.13' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/fixes Qualcomm Arm64 DeviceTree fixes for v6.13 Revert the enablement of OTG support on primary and secondary USB Type-C controllers of X1 Elite, for now, as this broke support for USB hotplug. Disable the TPDM DCC device on SA8775P, as this is inaccessible per current firmware configuration. Also correct the PCIe "addr_space" region to enable larger BAR sizes. Also fix the address space of PCIe6a found in X1 Elite. * tag 'qcom-arm64-fixes-for-6.13' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux: arm64: dts: qcom: sa8775p: fix the secure device bootup issue Revert "arm64: dts: qcom: x1e80100: enable OTG on USB-C controllers" Revert "arm64: dts: qcom: x1e80100-crd: enable otg on usb ports" arm64: dts: qcom: x1e80100: Fix up BAR space size for PCIe6a Revert "arm64: dts: qcom: x1e78100-t14s: enable otg on usb-c ports" arm64: dts: qcom: sa8775p: Fix the size of 'addr_space' regions Link: https://lore.kernel.org/r/20250103024945.4649-1-andersson@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2025-01-09 22:54:53 +01:00
Linus Torvalds	c77cd47cee	Merge tag 'net-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from netfilter, Bluetooth and WPAN. No outstanding fixes / investigations at this time. Current release - new code bugs: - eth: fbnic: revert HWMON support, it doesn't work at all and revert is similar size as the fixes Previous releases - regressions: - tcp: allow a connection when sk_max_ack_backlog is zero - tls: fix tls_sw_sendmsg error handling Previous releases - always broken: - netdev netlink family: - prevent accessing NAPI instances from another namespace - don't dump Tx and uninitialized NAPIs - net: sysctl: avoid using current->nsproxy, fix null-deref if task is exiting and stick to opener's netns - sched: sch_cake: add bounds checks to host bulk flow fairness counts Misc: - annual cleanup of inactive maintainers" * tag 'net-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (57 commits) rds: sysctl: rds_tcp_{rcv,snd}buf: avoid using current->nsproxy sctp: sysctl: plpmtud_probe_interval: avoid using current->nsproxy sctp: sysctl: udp_port: avoid using current->nsproxy sctp: sysctl: auth_enable: avoid using current->nsproxy sctp: sysctl: rto_min/max: avoid using current->nsproxy sctp: sysctl: cookie_hmac_alg: avoid using current->nsproxy mptcp: sysctl: blackhole timeout: avoid using current->nsproxy mptcp: sysctl: sched: avoid using current->nsproxy mptcp: sysctl: avail sched: remove write access MAINTAINERS: remove Lars Povlsen from Microchip Sparx5 SoC MAINTAINERS: remove Noam Dagan from AMAZON ETHERNET MAINTAINERS: remove Ying Xue from TIPC MAINTAINERS: remove Mark Lee from MediaTek Ethernet MAINTAINERS: mark stmmac ethernet as an Orphan MAINTAINERS: remove Andy Gospodarek from bonding MAINTAINERS: update maintainers for Microchip LAN78xx MAINTAINERS: mark Synopsys DW XPCS as Orphan net/mlx5: Fix variable not being completed when function returns rtase: Fix a check for error in rtase_alloc_msix() net: stmmac: dwmac-tegra: Read iommu stream id from device tree ...	2025-01-09 12:40:58 -08:00
Linus Torvalds	643e2e259c	Merge tag 'for-6.13-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "A few more fixes. Besides the one-liners in Btrfs there's fix to the io_uring and encoded read integration (added in this development cycle). The update to io_uring provides more space for the ongoing command that is then used in Btrfs to handle some cases. - io_uring and encoded read: - provide stable storage for io_uring command data - make a copy of encoded read ioctl call, reuse that in case the call would block and will be called again - properly initialize zlib context for hardware compression on s390 - fix max extent size calculation on filesystems with non-zoned devices - fix crash in scrub on crafted image due to invalid extent tree" * tag 'for-6.13-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: zlib: fix avail_in bytes for s390 zlib HW compression path btrfs: zoned: calculate max_extent_size properly on non-zoned setup btrfs: avoid NULL pointer dereference if no valid extent tree btrfs: don't read from userspace twice in btrfs_uring_encoded_read() io_uring: add io_uring_cmd_get_async_data helper io_uring/cmd: add per-op data to struct io_uring_cmd_data io_uring/cmd: rename struct uring_cache to io_uring_cmd_data	2025-01-09 10:16:45 -08:00
Palmer Dabbelt	6f6ecce59d	Merge patch series "SBI PMU event related fixes" Atish Patra <atishp@rivosinc.com> says: Here are two minor improvement/fixes in the PMU event path. The first patch was part of the series[1]. The 2nd patch was suggested during the series review. While the series can only be merged once SBI v3.0 is frozen, these two patches can be independent of SBI v3.0 and can be merged sooner. Hence, these two patches are sent as a separate series. * b4-shazam-merge: drivers/perf: riscv: Do not allow invalid raw event config drivers/perf: riscv: Return error for default case drivers/perf: riscv: Fix Platform firmware event data Link: https://lore.kernel.org/r/20241212-pmu_event_fixes_v2-v2-0-813e8a4f5962@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2025-01-09 09:37:12 -08:00
Atish Patra	3aff4cdbe5	drivers/perf: riscv: Do not allow invalid raw event config The SBI specification allows only lower 48bits of hpmeventX to be configured via SBI PMU. Currently, the driver masks of the higher bits but doesn't return an error. This will lead to an additional SBI call for config matching which should return for an invalid event error in most of the cases. However, if a platform(i.e Rocket and sifive cores) implements a bitmap of all bits in the event encoding this will lead to an incorrect event being programmed leading to user confusion. Report the error to the user if higher bits are set during the event mapping itself to avoid the confusion and save an additional SBI call. Suggested-by: Samuel Holland <samuel.holland@sifive.com> Signed-off-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20241212-pmu_event_fixes_v2-v2-3-813e8a4f5962@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2025-01-09 09:37:09 -08:00
Atish Patra	2c206cdede	drivers/perf: riscv: Return error for default case If the upper two bits has an invalid valid (0x1), the event mapping is not reliable as it returns an uninitialized variable. Return appropriate value for the default case. Fixes: `f0c9363db2` ("perf/riscv-sbi: Add platform specific firmware event handling") Signed-off-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20241212-pmu_event_fixes_v2-v2-2-813e8a4f5962@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2025-01-09 09:37:09 -08:00
Atish Patra	fc58db9aeb	drivers/perf: riscv: Fix Platform firmware event data Platform firmware event data field is allowed to be 62 bits for Linux as uppper most two bits are reserved to indicate SBI fw or platform specific firmware events. However, the event data field is masked as per the hardware raw event mask which is not correct. Fix the platform firmware event data field with proper mask. Fixes: `f0c9363db2` ("perf/riscv-sbi: Add platform specific firmware event handling") Signed-off-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20241212-pmu_event_fixes_v2-v2-1-813e8a4f5962@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2025-01-09 09:37:08 -08:00
Palmer Dabbelt	89726fb01a	Merge patch series "selftest: fix riscv/vector tests" This contains a pair of fixes for the vector self tests, which avoids some warnings and provides proper status messages. * b4-shazam-merge: tools: selftests: riscv: Add test count for vstate_prctl tools: selftests: riscv: Add pass message for v_initval_nolibc Link: https://lore.kernel.org/r/20241220091730.28006-1-yongxuan.wang@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2025-01-09 09:35:42 -08:00
Yong-Xuan Wang	ebdc22c51a	tools: selftests: riscv: Add test count for vstate_prctl Add the test count to drop the warning message. "Planned tests != run tests (0 != 1)" Fixes: `7cf6198ce2` ("selftests: Test RISC-V Vector prctl interface") Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Andy Chiu <AndybnAC@gmail.com> Link: https://lore.kernel.org/r/20241220091730.28006-3-yongxuan.wang@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2025-01-09 09:35:40 -08:00
Yong-Xuan Wang	503465d4dc	tools: selftests: riscv: Add pass message for v_initval_nolibc Add the pass message after we successfully complete the test. Fixes: `5c93c4c72f` ("selftests: Test RISC-V Vector's first-use handler") Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Andy Chiu <AndybnAC@gmail.com> Link: https://lore.kernel.org/r/20241220091730.28006-2-yongxuan.wang@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2025-01-09 09:35:36 -08:00
Jakub Kicinski	b5cf67a8f7	Merge tag 'nf-25-01-09' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Fix imbalance between flowtable BIND and UNBIND calls to configure hardware offload, this fixes a possible kmemleak. 2) Clamp maximum conntrack hashtable size to INT_MAX to fix a possible WARN_ON_ONCE splat coming from kvmalloc_array(), only possible from init_netns. * tag 'nf-25-01-09' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: conntrack: clamp maximum hashtable size to INT_MAX netfilter: nf_tables: imbalance in flowtable binding ==================== Link: https://patch.msgid.link/20250109123532.41768-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-09 08:54:49 -08:00
Jakub Kicinski	2664bc9c7d	Merge branch 'net-sysctl-avoid-using-current-nsproxy' Matthieu Baerts says: ==================== net: sysctl: avoid using current->nsproxy As pointed out by Al Viro and Eric Dumazet in [1], using the 'net' structure via 'current' is not recommended for different reasons: - Inconsistency: getting info from the reader's/writer's netns vs only from the opener's netns as it is usually done. This could cause unexpected issues when other operations are done on the wrong netns. - current->nsproxy can be NULL in some cases, resulting in an 'Oops' (null-ptr-deref), e.g. when the current task is exiting, as spotted by syzbot [1] using acct(2). The 'net' or 'pernet' structure can be obtained from the table->data using container_of(). Note that table->data could also be used directly in more places, but that would increase the size of this fix to replace all accesses via 'net'. Probably best to avoid that for fixes. Patches 2-9 remove access of net via current->nsproxy in sysfs handlers in MPTCP, SCTP and RDS. There are multiple patches doing almost the same thing, but the reason is to ease the backports. Patch 1 is not directly linked to this, but it is a small fix for MPTCP available_schedulers sysctl knob to explicitly mark it as read-only. Please note that this series does not address Al's comment [2]. In SCTP, some sysctl knobs set other sysfs-exposed variables for the min/max: two processes could then write two linked values at the same time, resulting in new values being outside the new boundaries. It would be great if SCTP developers can look at this problem. Link: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com [1] Link: https://lore.kernel.org/20250105211158.GL1977892@ZenIV [2] ==================== Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-0-5df34b2083e8@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-09 08:53:37 -08:00
Matthieu Baerts (NGI0)	7f5611cbc4	rds: sysctl: rds_tcp_{rcv,snd}buf: avoid using current->nsproxy As mentioned in a previous commit of this series, using the 'net' structure via 'current' is not recommended for different reasons: - Inconsistency: getting info from the reader's/writer's netns vs only from the opener's netns. - current->nsproxy can be NULL in some cases, resulting in an 'Oops' (null-ptr-deref), e.g. when the current task is exiting, as spotted by syzbot [1] using acct(2). The per-netns structure can be obtained from the table->data using container_of(), then the 'net' one can be retrieved from the listen socket (if available). Fixes: `c6a58ffed5` ("RDS: TCP: Add sysctl tunables for sndbuf/rcvbuf on rds-tcp socket") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com [1] Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-9-5df34b2083e8@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-09 08:53:35 -08:00
Matthieu Baerts (NGI0)	6259d2484d	sctp: sysctl: plpmtud_probe_interval: avoid using current->nsproxy As mentioned in a previous commit of this series, using the 'net' structure via 'current' is not recommended for different reasons: - Inconsistency: getting info from the reader's/writer's netns vs only from the opener's netns. - current->nsproxy can be NULL in some cases, resulting in an 'Oops' (null-ptr-deref), e.g. when the current task is exiting, as spotted by syzbot [1] using acct(2). The 'net' structure can be obtained from the table->data using container_of(). Note that table->data could also be used directly, as this is the only member needed from the 'net' structure, but that would increase the size of this fix, to use '*data' everywhere 'net->sctp.probe_interval' is used. Fixes: `d1e462a7a5` ("sctp: add probe_interval in sysctl and sock/asoc/transport") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com [1] Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-8-5df34b2083e8@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-09 08:53:35 -08:00
Matthieu Baerts (NGI0)	c10377bbc1	sctp: sysctl: udp_port: avoid using current->nsproxy As mentioned in a previous commit of this series, using the 'net' structure via 'current' is not recommended for different reasons: - Inconsistency: getting info from the reader's/writer's netns vs only from the opener's netns. - current->nsproxy can be NULL in some cases, resulting in an 'Oops' (null-ptr-deref), e.g. when the current task is exiting, as spotted by syzbot [1] using acct(2). The 'net' structure can be obtained from the table->data using container_of(). Note that table->data could also be used directly, but that would increase the size of this fix, while 'sctp.ctl_sock' still needs to be retrieved from 'net' structure. Fixes: `046c052b47` ("sctp: enable udp tunneling socks") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com [1] Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-7-5df34b2083e8@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-09 08:53:35 -08:00
Matthieu Baerts (NGI0)	15649fd541	sctp: sysctl: auth_enable: avoid using current->nsproxy As mentioned in a previous commit of this series, using the 'net' structure via 'current' is not recommended for different reasons: - Inconsistency: getting info from the reader's/writer's netns vs only from the opener's netns. - current->nsproxy can be NULL in some cases, resulting in an 'Oops' (null-ptr-deref), e.g. when the current task is exiting, as spotted by syzbot [1] using acct(2). The 'net' structure can be obtained from the table->data using container_of(). Note that table->data could also be used directly, but that would increase the size of this fix, while 'sctp.ctl_sock' still needs to be retrieved from 'net' structure. Fixes: `b14878ccb7` ("net: sctp: cache auth_enable per endpoint") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com [1] Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-6-5df34b2083e8@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-09 08:53:35 -08:00
Matthieu Baerts (NGI0)	9fc17b76fc	sctp: sysctl: rto_min/max: avoid using current->nsproxy As mentioned in a previous commit of this series, using the 'net' structure via 'current' is not recommended for different reasons: - Inconsistency: getting info from the reader's/writer's netns vs only from the opener's netns. - current->nsproxy can be NULL in some cases, resulting in an 'Oops' (null-ptr-deref), e.g. when the current task is exiting, as spotted by syzbot [1] using acct(2). The 'net' structure can be obtained from the table->data using container_of(). Note that table->data could also be used directly, as this is the only member needed from the 'net' structure, but that would increase the size of this fix, to use '*data' everywhere 'net->sctp.rto_min/max' is used. Fixes: `4f3fdf3bc5` ("sctp: add check rto_min and rto_max in sysctl") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com [1] Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-5-5df34b2083e8@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-09 08:53:34 -08:00
Matthieu Baerts (NGI0)	ea62dd1383	sctp: sysctl: cookie_hmac_alg: avoid using current->nsproxy As mentioned in a previous commit of this series, using the 'net' structure via 'current' is not recommended for different reasons: - Inconsistency: getting info from the reader's/writer's netns vs only from the opener's netns. - current->nsproxy can be NULL in some cases, resulting in an 'Oops' (null-ptr-deref), e.g. when the current task is exiting, as spotted by syzbot [1] using acct(2). The 'net' structure can be obtained from the table->data using container_of(). Note that table->data could also be used directly, as this is the only member needed from the 'net' structure, but that would increase the size of this fix, to use '*data' everywhere 'net->sctp.sctp_hmac_alg' is used. Fixes: `3c68198e75` ("sctp: Make hmac algorithm selection for cookie generation dynamic") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com [1] Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-4-5df34b2083e8@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-09 08:53:34 -08:00
Matthieu Baerts (NGI0)	92cf7a51bd	mptcp: sysctl: blackhole timeout: avoid using current->nsproxy As mentioned in the previous commit, using the 'net' structure via 'current' is not recommended for different reasons: - Inconsistency: getting info from the reader's/writer's netns vs only from the opener's netns. - current->nsproxy can be NULL in some cases, resulting in an 'Oops' (null-ptr-deref), e.g. when the current task is exiting, as spotted by syzbot [1] using acct(2). The 'pernet' structure can be obtained from the table->data using container_of(). Fixes: `27069e7cb3` ("mptcp: disable active MPTCP in case of blackhole") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com [1] Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-3-5df34b2083e8@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-09 08:53:34 -08:00
Matthieu Baerts (NGI0)	d38e26e362	mptcp: sysctl: sched: avoid using current->nsproxy Using the 'net' structure via 'current' is not recommended for different reasons. First, if the goal is to use it to read or write per-netns data, this is inconsistent with how the "generic" sysctl entries are doing: directly by only using pointers set to the table entry, e.g. table->data. Linked to that, the per-netns data should always be obtained from the table linked to the netns it had been created for, which may not coincide with the reader's or writer's netns. Another reason is that access to current->nsproxy->netns can oops if attempted when current->nsproxy had been dropped when the current task is exiting. This is what syzbot found, when using acct(2): Oops: general protection fault, probably for non-canonical address 0xdffffc0000000005: 0000 [#1] PREEMPT SMP KASAN PTI KASAN: null-ptr-deref in range [0x0000000000000028-0x000000000000002f] CPU: 1 UID: 0 PID: 5924 Comm: syz-executor Not tainted 6.13.0-rc5-syzkaller-00004-gccb98ccef0e5 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 RIP: 0010:proc_scheduler+0xc6/0x3c0 net/mptcp/ctrl.c:125 Code: 03 42 80 3c 38 00 0f 85 fe 02 00 00 4d 8b a4 24 08 09 00 00 48 b8 00 00 00 00 00 fc ff df 49 8d 7c 24 28 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 cc 02 00 00 4d 8b 7c 24 28 48 8d 84 24 c8 00 00 RSP: 0018:ffffc900034774e8 EFLAGS: 00010206 RAX: dffffc0000000000 RBX: 1ffff9200068ee9e RCX: ffffc90003477620 RDX: 0000000000000005 RSI: ffffffff8b08f91e RDI: 0000000000000028 RBP: 0000000000000001 R08: ffffc90003477710 R09: 0000000000000040 R10: 0000000000000040 R11: 00000000726f7475 R12: 0000000000000000 R13: ffffc90003477620 R14: ffffc90003477710 R15: dffffc0000000000 FS: 0000000000000000(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fee3cd452d8 CR3: 000000007d116000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> proc_sys_call_handler+0x403/0x5d0 fs/proc/proc_sysctl.c:601 __kernel_write_iter+0x318/0xa80 fs/read_write.c:612 __kernel_write+0xf6/0x140 fs/read_write.c:632 do_acct_process+0xcb0/0x14a0 kernel/acct.c:539 acct_pin_kill+0x2d/0x100 kernel/acct.c:192 pin_kill+0x194/0x7c0 fs/fs_pin.c:44 mnt_pin_kill+0x61/0x1e0 fs/fs_pin.c:81 cleanup_mnt+0x3ac/0x450 fs/namespace.c:1366 task_work_run+0x14e/0x250 kernel/task_work.c:239 exit_task_work include/linux/task_work.h:43 [inline] do_exit+0xad8/0x2d70 kernel/exit.c:938 do_group_exit+0xd3/0x2a0 kernel/exit.c:1087 get_signal+0x2576/0x2610 kernel/signal.c:3017 arch_do_signal_or_restart+0x90/0x7e0 arch/x86/kernel/signal.c:337 exit_to_user_mode_loop kernel/entry/common.c:111 [inline] exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] syscall_exit_to_user_mode+0x150/0x2a0 kernel/entry/common.c:218 do_syscall_64+0xda/0x250 arch/x86/entry/common.c:89 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7fee3cb87a6a Code: Unable to access opcode bytes at 0x7fee3cb87a40. RSP: 002b:00007fffcccac688 EFLAGS: 00000202 ORIG_RAX: 0000000000000037 RAX: 0000000000000000 RBX: 00007fffcccac710 RCX: 00007fee3cb87a6a RDX: 0000000000000041 RSI: 0000000000000000 RDI: 0000000000000003 RBP: 0000000000000003 R08: 00007fffcccac6ac R09: 00007fffcccacac7 R10: 00007fffcccac710 R11: 0000000000000202 R12: 00007fee3cd49500 R13: 00007fffcccac6ac R14: 0000000000000000 R15: 00007fee3cd4b000 </TASK> Modules linked in: ---[ end trace 0000000000000000 ]--- RIP: 0010:proc_scheduler+0xc6/0x3c0 net/mptcp/ctrl.c:125 Code: 03 42 80 3c 38 00 0f 85 fe 02 00 00 4d 8b a4 24 08 09 00 00 48 b8 00 00 00 00 00 fc ff df 49 8d 7c 24 28 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 cc 02 00 00 4d 8b 7c 24 28 48 8d 84 24 c8 00 00 RSP: 0018:ffffc900034774e8 EFLAGS: 00010206 RAX: dffffc0000000000 RBX: 1ffff9200068ee9e RCX: ffffc90003477620 RDX: 0000000000000005 RSI: ffffffff8b08f91e RDI: 0000000000000028 RBP: 0000000000000001 R08: ffffc90003477710 R09: 0000000000000040 R10: 0000000000000040 R11: 00000000726f7475 R12: 0000000000000000 R13: ffffc90003477620 R14: ffffc90003477710 R15: dffffc0000000000 FS: 0000000000000000(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fee3cd452d8 CR3: 000000007d116000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 ---------------- Code disassembly (best guess), 1 bytes skipped: 0: 42 80 3c 38 00 cmpb $0x0,(%rax,%r15,1) 5: 0f 85 fe 02 00 00 jne 0x309 b: 4d 8b a4 24 08 09 00 mov 0x908(%r12),%r12 12: 00 13: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax 1a: fc ff df 1d: 49 8d 7c 24 28 lea 0x28(%r12),%rdi 22: 48 89 fa mov %rdi,%rdx 25: 48 c1 ea 03 shr $0x3,%rdx * 29: 80 3c 02 00 cmpb $0x0,(%rdx,%rax,1) <-- trapping instruction 2d: 0f 85 cc 02 00 00 jne 0x2ff 33: 4d 8b 7c 24 28 mov 0x28(%r12),%r15 38: 48 rex.W 39: 8d .byte 0x8d 3a: 84 24 c8 test %ah,(%rax,%rcx,8) Here with 'net.mptcp.scheduler', the 'net' structure is not really needed, because the table->data already has a pointer to the current scheduler, the only thing needed from the per-netns data. Simply use 'data', instead of getting (most of the time) the same thing, but from a longer and indirect way. Fixes: `6963c508fd` ("mptcp: only allow set existing scheduler for net.mptcp.scheduler") Cc: stable@vger.kernel.org Reported-by: syzbot+e364f774c6f57f2c86d1@syzkaller.appspotmail.com Closes: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-2-5df34b2083e8@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-09 08:53:34 -08:00
Matthieu Baerts (NGI0)	771ec78dc8	mptcp: sysctl: avail sched: remove write access 'net.mptcp.available_schedulers' sysctl knob is there to list available schedulers, not to modify this list. There are then no reasons to give write access to it. Nothing would have been written anyway, but no errors would have been returned, which is unexpected. Fixes: `73c900aa36` ("mptcp: add net.mptcp.available_schedulers") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-1-5df34b2083e8@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-09 08:53:34 -08:00
Jakub Kicinski	92afd9f3bc	Merge branch 'maintainers-spring-2025-cleanup-of-networking-maintainers' Jakub Kicinski says: ==================== MAINTAINERS: spring 2025 cleanup of networking maintainers Annual cleanup of inactive maintainers. To identify inactive maintainers we use Jon Corbet's maintainer analysis script from gitdm, and some manual scanning of lore. v1: https://lore.kernel.org/20250106165404.1832481-1-kuba@kernel.org ==================== Link: https://patch.msgid.link/20250108155242.2575530-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-09 08:30:05 -08:00
Jakub Kicinski	d9e03c6ffc	MAINTAINERS: remove Lars Povlsen from Microchip Sparx5 SoC We have not seen emails or tags from Lars in almost 4 years. Steen and Daniel are pretty active, but the review coverage isn't stellar (35% of changes go in without a review tag). Subsystem ARM/Microchip Sparx5 SoC support Changes 28 / 79 (35%) Last activity: 2024-11-24 Lars Povlsen <lars.povlsen@microchip.com>: Steen Hegelund <Steen.Hegelund@microchip.com>: Tags `6c7c4b91aa` 2024-04-08 00:00:00 15 Daniel Machon <daniel.machon@microchip.com>: Author `48ba00da2e` 2024-04-09 00:00:00 2 Tags `f164b29663` 2024-11-24 00:00:00 6 Top reviewers: [7]: horms@kernel.org [1]: jacob.e.keller@intel.com [1]: jensemil.schulzostergaard@microchip.com [1]: horatiu.vultur@microchip.com INACTIVE MAINTAINER Lars Povlsen <lars.povlsen@microchip.com> Acked-by: Daniel Machon <daniel.machon@microchip.com> Link: https://patch.msgid.link/20250108155242.2575530-9-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-09 08:30:02 -08:00
Jakub Kicinski	d95e2cc737	MAINTAINERS: remove Noam Dagan from AMAZON ETHERNET Noam Dagan was added to ENA reviewers in 2021, we have not seen a single email from this person to any list, ever (according to lore). Git history mentions the name in 2 SoB tags from 2020. Acked-by: Arthur Kiyanovski <akiyano@amazon.com> Link: https://patch.msgid.link/20250108155242.2575530-8-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-09 08:30:02 -08:00
Jakub Kicinski	d4782fbab1	MAINTAINERS: remove Ying Xue from TIPC There is a steady stream of fixes for TIPC, even tho the development has slowed down a lot. Over last 2 years we have merged almost 70 TIPC patches, but we haven't heard from Ying Xue once: Subsystem TIPC NETWORK LAYER Changes 42 / 69 (60%) Last activity: 2023-10-04 Jon Maloy <jmaloy@redhat.com>: Tags `08e50cf071` 2023-10-04 00:00:00 6 Ying Xue <ying.xue@windriver.com>: Top reviewers: [9]: horms@kernel.org [8]: tung.q.nguyen@dektech.com.au [4]: jiri@nvidia.com [3]: tung.q.nguyen@endava.com [2]: kuniyu@amazon.com INACTIVE MAINTAINER Ying Xue <ying.xue@windriver.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250108155242.2575530-7-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-09 08:30:02 -08:00
Jakub Kicinski	9d7b1191d0	MAINTAINERS: remove Mark Lee from MediaTek Ethernet The mailing lists have seen no email from Mark Lee in the last 4 years. gitdm missingmaints says: Subsystem MEDIATEK ETHERNET DRIVER Changes 103 / 400 (25%) Last activity: 2024-12-19 Felix Fietkau <nbd@nbd.name>: Author `88806efc03` 2024-10-17 00:00:00 44 Tags `88806efc03` 2024-10-17 00:00:00 51 Sean Wang <sean.wang@mediatek.com>: Tags `a5d7553829` 2020-04-07 00:00:00 1 Mark Lee <Mark-MC.Lee@mediatek.com>: Lorenzo Bianconi <lorenzo@kernel.org>: Author `0c7469ee71` 2024-12-19 00:00:00 123 Tags `0c7469ee71` 2024-12-19 00:00:00 139 Top reviewers: [32]: horms@kernel.org [15]: leonro@nvidia.com [9]: andrew@lunn.ch INACTIVE MAINTAINER Mark Lee <Mark-MC.Lee@mediatek.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250108155242.2575530-6-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-09 08:30:02 -08:00
Jakub Kicinski	03868822c5	MAINTAINERS: mark stmmac ethernet as an Orphan I tried a couple of things to reinvigorate the stmmac maintainers over the last few years but with little effect. The maintainers are not active, let the MAINTAINERS file reflect reality. The Synopsys IP this driver supports is very popular we need a solid maintainer to deal with the complexity of the driver. gitdm missingmaints says: Subsystem STMMAC ETHERNET DRIVER Changes 344 / 978 (35%) Last activity: 2020-05-01 Alexandre Torgue <alexandre.torgue@foss.st.com>: Tags `1bb694e208` 2020-05-01 00:00:00 1 Jose Abreu <joabreu@synopsys.com>: Top reviewers: [75]: horms@kernel.org [49]: andrew@lunn.ch [46]: fancer.lancer@gmail.com INACTIVE MAINTAINER Jose Abreu <joabreu@synopsys.com> Acked-by: Alexandre Torgue <alexandre.torgue@foss.st.com> Link: https://patch.msgid.link/20250108155242.2575530-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-09 08:30:02 -08:00
Jakub Kicinski	e049fb86d3	MAINTAINERS: remove Andy Gospodarek from bonding Andy does not participate much in bonding reviews, unfortunately. Move him to CREDITS. gitdm missingmaint says: Subsystem BONDING DRIVER Changes 149 / 336 (44%) Last activity: 2024-09-05 Jay Vosburgh <jv@jvosburgh.net>: Tags `68db604e16` 2024-09-05 00:00:00 8 Andy Gospodarek <andy@greyhouse.net>: Top reviewers: [65]: jay.vosburgh@canonical.com [23]: liuhangbin@gmail.com [16]: razor@blackwall.org INACTIVE MAINTAINER Andy Gospodarek <andy@greyhouse.net> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20250108155242.2575530-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-09 08:30:01 -08:00
Jakub Kicinski	b506668613	MAINTAINERS: update maintainers for Microchip LAN78xx Woojung Huh seems to have only replied to the list 35 times in the last 5 years, and didn't provide any reviews in 3 years. The LAN78XX driver has seen quite a bit of activity lately. gitdm missingmaints says: Subsystem USB LAN78XX ETHERNET DRIVER Changes 35 / 91 (38%) (No activity) Top reviewers: [23]: andrew@lunn.ch [3]: horms@kernel.org [2]: mateusz.polchlopek@intel.com INACTIVE MAINTAINER Woojung Huh <woojung.huh@microchip.com> Move Woojung to CREDITS and add new maintainers who are more likely to review LAN78xx patches. Acked-by: Woojung Huh <woojung.huh@microchip.com> Acked-by: Rengarajan Sundararajan <rengarajan.s@microchip.com> Link: https://patch.msgid.link/20250108155242.2575530-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-09 08:30:01 -08:00
Jakub Kicinski	d58200966e	MAINTAINERS: mark Synopsys DW XPCS as Orphan There's not much review support from Jose, there is a sharp drop in his participation around 4 years ago. The DW XPCS IP is very popular and the driver requires active maintenance. gitdm missingmaints says: Subsystem SYNOPSYS DESIGNWARE ETHERNET XPCS DRIVER Changes 33 / 94 (35%) (No activity) Top reviewers: [16]: andrew@lunn.ch [12]: vladimir.oltean@nxp.com [2]: f.fainelli@gmail.com INACTIVE MAINTAINER Jose Abreu <Jose.Abreu@synopsys.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250108155242.2575530-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-09 08:30:01 -08:00
Chenguang Zhao	0e2909c6be	net/mlx5: Fix variable not being completed when function returns When cmd_alloc_index(), fails cmd_work_handler() needs to complete ent->slotted before returning early. Otherwise the task which issued the command may hang: mlx5_core 0000:01:00.0: cmd_work_handler:877:(pid 3880418): failed to allocate command entry INFO: task kworker/13:2:4055883 blocked for more than 120 seconds. Not tainted 4.19.90-25.44.v2101.ky10.aarch64 #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kworker/13:2 D 0 4055883 2 0x00000228 Workqueue: events mlx5e_tx_dim_work [mlx5_core] Call trace: __switch_to+0xe8/0x150 __schedule+0x2a8/0x9b8 schedule+0x2c/0x88 schedule_timeout+0x204/0x478 wait_for_common+0x154/0x250 wait_for_completion+0x28/0x38 cmd_exec+0x7a0/0xa00 [mlx5_core] mlx5_cmd_exec+0x54/0x80 [mlx5_core] mlx5_core_modify_cq+0x6c/0x80 [mlx5_core] mlx5_core_modify_cq_moderation+0xa0/0xb8 [mlx5_core] mlx5e_tx_dim_work+0x54/0x68 [mlx5_core] process_one_work+0x1b0/0x448 worker_thread+0x54/0x468 kthread+0x134/0x138 ret_from_fork+0x10/0x18 Fixes: `485d65e135` ("net/mlx5: Add a timeout to acquire the command queue semaphore") Signed-off-by: Chenguang Zhao <zhaochenguang@kylinos.cn> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Acked-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250108030009.68520-1-zhaochenguang@kylinos.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-09 08:28:23 -08:00
Dan Carpenter	2055272e3a	rtase: Fix a check for error in rtase_alloc_msix() The pci_irq_vector() function never returns zero. It returns negative error codes or a positive non-zero IRQ number. Fix the error checking to test for negatives. Fixes: `a36e9f5cfe` ("rtase: Add support for a pci table in this module") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Link: https://patch.msgid.link/f2ecc88d-af13-4651-9820-7cc665230019@stanley.mountain Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-09 08:22:05 -08:00
Lizhi Xu	17a4fde81d	afs: Fix merge preference rule failure condition syzbot reported a lock held when returning to userspace[1]. This is because if argc is less than 0 and the function returns directly, the held inode lock is not released. Fix this by store the error in ret and jump to done to clean up instead of returning directly. [dh: Modified Lizhi Xu's original patch to make it honour the error code from afs_split_string()] [1] WARNING: lock held when returning to user space! 6.13.0-rc3-syzkaller-00209-g499551201b5f #0 Not tainted ------------------------------------------------ syz-executor133/5823 is leaving the kernel with locks still held! 1 lock held by syz-executor133/5823: #0: ffff888071cffc00 (&sb->s_type->i_mutex_key#9){++++}-{4:4}, at: inode_lock include/linux/fs.h:818 [inline] #0: ffff888071cffc00 (&sb->s_type->i_mutex_key#9){++++}-{4:4}, at: afs_proc_addr_prefs_write+0x2bb/0x14e0 fs/afs/addr_prefs.c:388 Reported-by: syzbot+76f33569875eb708e575@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=76f33569875eb708e575 Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com> Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/20241226012616.2348907-1-lizhi.xu@windriver.com/ Link: https://lore.kernel.org/r/529850.1736261552@warthog.procyon.org.uk Tested-by: syzbot+76f33569875eb708e575@syzkaller.appspotmail.com cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-01-09 17:21:41 +01:00
Parker Newman	426046e2d6	net: stmmac: dwmac-tegra: Read iommu stream id from device tree Nvidia's Tegra MGBE controllers require the IOMMU "Stream ID" (SID) to be written to the MGBE_WRAP_AXI_ASID0_CTRL register. The current driver is hard coded to use MGBE0's SID for all controllers. This causes softirq time outs and kernel panics when using controllers other than MGBE0. Example dmesg errors when an ethernet cable is connected to MGBE1: [ 116.133290] tegra-mgbe 6910000.ethernet eth1: Link is Up - 1Gbps/Full - flow control rx/tx [ 121.851283] tegra-mgbe 6910000.ethernet eth1: NETDEV WATCHDOG: CPU: 5: transmit queue 0 timed out 5690 ms [ 121.851782] tegra-mgbe 6910000.ethernet eth1: Reset adapter. [ 121.892464] tegra-mgbe 6910000.ethernet eth1: Register MEM_TYPE_PAGE_POOL RxQ-0 [ 121.905920] tegra-mgbe 6910000.ethernet eth1: PHY [stmmac-1:00] driver [Aquantia AQR113] (irq=171) [ 121.907356] tegra-mgbe 6910000.ethernet eth1: Enabling Safety Features [ 121.907578] tegra-mgbe 6910000.ethernet eth1: IEEE 1588-2008 Advanced Timestamp supported [ 121.908399] tegra-mgbe 6910000.ethernet eth1: registered PTP clock [ 121.908582] tegra-mgbe 6910000.ethernet eth1: configuring for phy/10gbase-r link mode [ 125.961292] tegra-mgbe 6910000.ethernet eth1: Link is Up - 1Gbps/Full - flow control rx/tx [ 181.921198] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: [ 181.921404] rcu: 7-....: (1 GPs behind) idle=540c/1/0x4000000000000002 softirq=1748/1749 fqs=2337 [ 181.921684] rcu: (detected by 4, t=6002 jiffies, g=1357, q=1254 ncpus=8) [ 181.921878] Sending NMI from CPU 4 to CPUs 7: [ 181.921886] NMI backtrace for cpu 7 [ 181.922131] CPU: 7 UID: 0 PID: 0 Comm: swapper/7 Kdump: loaded Not tainted 6.13.0-rc3+ #6 [ 181.922390] Hardware name: NVIDIA CTI Forge + Orin AGX/Jetson, BIOS 202402.1-Unknown 10/28/2024 [ 181.922658] pstate: 40400009 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 181.922847] pc : handle_softirqs+0x98/0x368 [ 181.922978] lr : __do_softirq+0x18/0x20 [ 181.923095] sp : ffff80008003bf50 [ 181.923189] x29: ffff80008003bf50 x28: 0000000000000008 x27: 0000000000000000 [ 181.923379] x26: ffffce78ea277000 x25: 0000000000000000 x24: 0000001c61befda0 [ 181.924486] x23: 0000000060400009 x22: ffffce78e99918bc x21: ffff80008018bd70 [ 181.925568] x20: ffffce78e8bb00d8 x19: ffff80008018bc20 x18: 0000000000000000 [ 181.926655] x17: ffff318ebe7d3000 x16: ffff800080038000 x15: 0000000000000000 [ 181.931455] x14: ffff000080816680 x13: ffff318ebe7d3000 x12: 000000003464d91d [ 181.938628] x11: 0000000000000040 x10: ffff000080165a70 x9 : ffffce78e8bb0160 [ 181.945804] x8 : ffff8000827b3160 x7 : f9157b241586f343 x6 : eeb6502a01c81c74 [ 181.953068] x5 : a4acfcdd2e8096bb x4 : ffffce78ea277340 x3 : 00000000ffffd1e1 [ 181.960329] x2 : 0000000000000101 x1 : ffffce78ea277340 x0 : ffff318ebe7d3000 [ 181.967591] Call trace: [ 181.970043] handle_softirqs+0x98/0x368 (P) [ 181.974240] __do_softirq+0x18/0x20 [ 181.977743] ____do_softirq+0x14/0x28 [ 181.981415] call_on_irq_stack+0x24/0x30 [ 181.985180] do_softirq_own_stack+0x20/0x30 [ 181.989379] __irq_exit_rcu+0x114/0x140 [ 181.993142] irq_exit_rcu+0x14/0x28 [ 181.996816] el1_interrupt+0x44/0xb8 [ 182.000316] el1h_64_irq_handler+0x14/0x20 [ 182.004343] el1h_64_irq+0x80/0x88 [ 182.007755] cpuidle_enter_state+0xc4/0x4a8 (P) [ 182.012305] cpuidle_enter+0x3c/0x58 [ 182.015980] cpuidle_idle_call+0x128/0x1c0 [ 182.020005] do_idle+0xe0/0xf0 [ 182.023155] cpu_startup_entry+0x3c/0x48 [ 182.026917] secondary_start_kernel+0xdc/0x120 [ 182.031379] __secondary_switched+0x74/0x78 [ 212.971162] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 7-.... } 6103 jiffies s: 417 root: 0x80/. [ 212.985935] rcu: blocking rcu_node structures (internal RCU debug): [ 212.992758] Sending NMI from CPU 0 to CPUs 7: [ 212.998539] NMI backtrace for cpu 7 [ 213.004304] CPU: 7 UID: 0 PID: 0 Comm: swapper/7 Kdump: loaded Not tainted 6.13.0-rc3+ #6 [ 213.016116] Hardware name: NVIDIA CTI Forge + Orin AGX/Jetson, BIOS 202402.1-Unknown 10/28/2024 [ 213.030817] pstate: 40400009 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 213.040528] pc : handle_softirqs+0x98/0x368 [ 213.046563] lr : __do_softirq+0x18/0x20 [ 213.051293] sp : ffff80008003bf50 [ 213.055839] x29: ffff80008003bf50 x28: 0000000000000008 x27: 0000000000000000 [ 213.067304] x26: ffffce78ea277000 x25: 0000000000000000 x24: 0000001c61befda0 [ 213.077014] x23: 0000000060400009 x22: ffffce78e99918bc x21: ffff80008018bd70 [ 213.087339] x20: ffffce78e8bb00d8 x19: ffff80008018bc20 x18: 0000000000000000 [ 213.097313] x17: ffff318ebe7d3000 x16: ffff800080038000 x15: 0000000000000000 [ 213.107201] x14: ffff000080816680 x13: ffff318ebe7d3000 x12: 000000003464d91d [ 213.116651] x11: 0000000000000040 x10: ffff000080165a70 x9 : ffffce78e8bb0160 [ 213.127500] x8 : ffff8000827b3160 x7 : 0a37b344852820af x6 : 3f049caedd1ff608 [ 213.138002] x5 : cff7cfdbfaf31291 x4 : ffffce78ea277340 x3 : 00000000ffffde04 [ 213.150428] x2 : 0000000000000101 x1 : ffffce78ea277340 x0 : ffff318ebe7d3000 [ 213.162063] Call trace: [ 213.165494] handle_softirqs+0x98/0x368 (P) [ 213.171256] __do_softirq+0x18/0x20 [ 213.177291] ____do_softirq+0x14/0x28 [ 213.182017] call_on_irq_stack+0x24/0x30 [ 213.186565] do_softirq_own_stack+0x20/0x30 [ 213.191815] __irq_exit_rcu+0x114/0x140 [ 213.196891] irq_exit_rcu+0x14/0x28 [ 213.202401] el1_interrupt+0x44/0xb8 [ 213.207741] el1h_64_irq_handler+0x14/0x20 [ 213.213519] el1h_64_irq+0x80/0x88 [ 213.217541] cpuidle_enter_state+0xc4/0x4a8 (P) [ 213.224364] cpuidle_enter+0x3c/0x58 [ 213.228653] cpuidle_idle_call+0x128/0x1c0 [ 213.233993] do_idle+0xe0/0xf0 [ 213.237928] cpu_startup_entry+0x3c/0x48 [ 213.243791] secondary_start_kernel+0xdc/0x120 [ 213.249830] __secondary_switched+0x74/0x78 This bug has existed since the dwmac-tegra driver was added in Dec 2022 (See Fixes tag below for commit hash). The Tegra234 SOC has 4 MGBE controllers, however Nvidia's Developer Kit only uses MGBE0 which is why the bug was not found previously. Connect Tech has many products that use 2 (or more) MGBE controllers. The solution is to read the controller's SID from the existing "iommus" device tree property. The 2nd field of the "iommus" device tree property is the controller's SID. Device tree snippet from tegra234.dtsi showing MGBE1's "iommus" property: smmu_niso0: iommu@12000000 { compatible = "nvidia,tegra234-smmu", "nvidia,smmu-500"; ... } /* MGBE1 */ ethernet@6900000 { compatible = "nvidia,tegra234-mgbe"; ... iommus = <&smmu_niso0 TEGRA234_SID_MGBE_VF1>; ... } Nvidia's arm-smmu driver reads the "iommus" property and stores the SID in the MGBE device's "fwspec" struct. The dwmac-tegra driver can access the SID using the tegra_dev_iommu_get_stream_id() helper function found in linux/iommu.h. Calling tegra_dev_iommu_get_stream_id() should not fail unless the "iommus" property is removed from the device tree or the IOMMU is disabled. While the Tegra234 SOC technically supports bypassing the IOMMU, it is not supported by the current firmware, has not been tested and not recommended. More detailed discussion with Thierry Reding from Nvidia linked below. Fixes: `d8ca113724` ("net: stmmac: tegra: Add MGBE support") Link: https://lore.kernel.org/netdev/cover.1731685185.git.pnewman@connecttech.com Signed-off-by: Parker Newman <pnewman@connecttech.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Thierry Reding <treding@nvidia.com> Link: https://patch.msgid.link/6fb97f32cf4accb4f7cf92846f6b60064ba0a3bd.1736284360.git.pnewman@connecttech.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-09 08:20:34 -08:00
David Howells	904abff4b1	netfs: Fix read-retry for fs with no ->prepare_read() Fix netfslib's read-retry to only call ->prepare_read() in the backing filesystem such a function is provided. We can get to this point if a there's an active cache as failed reads from the cache need negotiating with the server instead. Fixes: `ee4cdf7ba8` ("netfs: Speed up buffered reading") Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/529329.1736261010@warthog.procyon.org.uk cc: Jeff Layton <jlayton@kernel.org> cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-01-09 17:20:04 +01:00
David Howells	3f6bc9e3ab	netfs: Fix kernel async DIO Netfslib needs to be able to handle kernel-initiated asynchronous DIO that is supplied with a bio_vec[] array. Currently, because of the async flag, this gets passed to netfs_extract_user_iter() which throws a warning and fails because it only handles IOVEC and UBUF iterators. This can be triggered through a combination of cifs and a loopback blockdev with something like: mount //my/cifs/share /foo dd if=/dev/zero of=/foo/m0 bs=4K count=1K losetup --sector-size 4096 --direct-io=on /dev/loop2046 /foo/m0 echo hello >/dev/loop2046 This causes the following to appear in syslog: WARNING: CPU: 2 PID: 109 at fs/netfs/iterator.c:50 netfs_extract_user_iter+0x170/0x250 [netfs] and the write to fail. Fix this by removing the check in netfs_unbuffered_write_iter_locked() that causes async kernel DIO writes to be handled as userspace writes. Note that this change relies on the kernel caller maintaining the existence of the bio_vec array (or kvec[] or folio_queue) until the op is complete. Fixes: `153a9961b5` ("netfs: Implement unbuffered/DIO write support") Reported-by: Nicolas Baranger <nicolas.baranger@3xo.fr> Closes: https://lore.kernel.org/r/fedd8a40d54b2969097ffa4507979858@3xo.fr/ Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/608725.1736275167@warthog.procyon.org.uk Tested-by: Nicolas Baranger <nicolas.baranger@3xo.fr> Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> cc: Steve French <smfrench@gmail.com> cc: Jeff Layton <jlayton@kernel.org> cc: netfs@lists.linux.dev cc: linux-cifs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-01-09 17:18:51 +01:00
Toke Høiland-Jørgensen	737d4d91d3	sched: sch_cake: add bounds checks to host bulk flow fairness counts Even though we fixed a logic error in the commit cited below, syzbot still managed to trigger an underflow of the per-host bulk flow counters, leading to an out of bounds memory access. To avoid any such logic errors causing out of bounds memory accesses, this commit factors out all accesses to the per-host bulk flow counters to a series of helpers that perform bounds-checking before any increments and decrements. This also has the benefit of improving readability by moving the conditional checks for the flow mode into these helpers, instead of having them spread out throughout the code (which was the cause of the original logic error). As part of this change, the flow quantum calculation is consolidated into a helper function, which means that the dithering applied to the ost load scaling is now applied both in the DRR rotation and when a sparse flow's quantum is first initiated. The only user-visible effect of this is that the maximum packet size that can be sent while a flow stays sparse will now vary with +/- one byte in some cases. This should not make a noticeable difference in practice, and thus it's not worth complicating the code to preserve the old behaviour. Fixes: `546ea84d07` ("sched: sch_cake: fix bulk flow accounting logic for host fairness") Reported-by: syzbot+f63600d288bfb7057424@syzkaller.appspotmail.com Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Dave Taht <dave.taht@gmail.com> Link: https://patch.msgid.link/20250107120105.70685-1-toke@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-09 08:18:41 -08:00
Christian Brauner	482d520d86	Merge tag 'vfs-6.14-rc7.mount.fixes' Bring in the fix for the mount namespace rbtree. It is used as the base for the vfs mount work for this cycle and so shouldn't be applied directly. Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-01-09 17:03:21 +01:00
Christian Brauner	344bac8f0d	fs: kill MNT_ONRB Move mnt->mnt_node into the union with mnt->mnt_rcu and mnt->mnt_llist instead of keeping it with mnt->mnt_list. This allows us to use RB_CLEAR_NODE(&mnt->mnt_node) in umount_tree() as well as list_empty(&mnt->mnt_node). That in turn allows us to remove MNT_ONRB. This also fixes the bug reported in [1] where seemingly MNT_ONRB wasn't set in @mnt->mnt_flags even though the mount was present in the mount rbtree of the mount namespace. The root cause is the following race. When a btrfs subvolume is mounted a temporary mount is created: btrfs_get_tree_subvol() { mnt = fc_mount() // Register the newly allocated mount with sb->mounts: lock_mount_hash(); list_add_tail(&mnt->mnt_instance, &mnt->mnt.mnt_sb->s_mounts); unlock_mount_hash(); } and registered on sb->s_mounts. Later it is added to an anonymous mount namespace via mount_subvol(): -> mount_subvol() -> mount_subtree() -> alloc_mnt_ns() mnt_add_to_ns() vfs_path_lookup() put_mnt_ns() The mnt_add_to_ns() call raises MNT_ONRB in @mnt->mnt_flags. If someone concurrently does a ro remount: reconfigure_super() -> sb_prepare_remount_readonly() { list_for_each_entry(mnt, &sb->s_mounts, mnt_instance) { } all mounts registered in sb->s_mounts are visited and first MNT_WRITE_HOLD is raised, then MNT_READONLY is raised, and finally MNT_WRITE_HOLD is removed again. The flag modification for MNT_WRITE_HOLD/MNT_READONLY and MNT_ONRB race so MNT_ONRB might be lost. Fixes: `2eea9ce431` ("mounts: keep list of mounts in an rbtree") Cc: <stable@kernel.org> # v6.8+ Link: https://lore.kernel.org/r/20241215-vfs-6-14-mount-work-v1-1-fd55922c4af8@kernel.org Link: https://lore.kernel.org/r/ec6784ed-8722-4695-980a-4400d4e7bd1a@gmx.com [1] Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-01-09 16:58:50 +01:00
Peter Geis	3699f2c43e	arm64: dts: rockchip: add hevc power domain clock to rk3328 There is a race condition at startup between disabling power domains not used and disabling clocks not used on the rk3328. When the clocks are disabled first, the hevc power domain fails to shut off leading to a splat of failures. Add the hevc core clock to the rk3328 power domain node to prevent this condition. rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 3-.... } 1087 jiffies s: 89 root: 0x8/. rcu: blocking rcu_node structures (internal RCU debug): Sending NMI from CPU 0 to CPUs 3: NMI backtrace for cpu 3 CPU: 3 UID: 0 PID: 86 Comm: kworker/3:3 Not tainted 6.12.0-rc5+ #53 Hardware name: Firefly ROC-RK3328-CC (DT) Workqueue: pm genpd_power_off_work_fn pstate: 20400005 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : regmap_unlock_spinlock+0x18/0x30 lr : regmap_read+0x60/0x88 sp : ffff800081123c00 x29: ffff800081123c00 x28: ffff2fa4c62cad80 x27: 0000000000000000 x26: ffffd74e6e660eb8 x25: ffff2fa4c62cae00 x24: 0000000000000040 x23: ffffd74e6d2f3ab8 x22: 0000000000000001 x21: ffff800081123c74 x20: 0000000000000000 x19: ffff2fa4c0412000 x18: 0000000000000000 x17: 77202c31203d2065 x16: 6c6469203a72656c x15: 6c6f72746e6f632d x14: 7265776f703a6e6f x13: 2063766568206e69 x12: 616d6f64202c3431 x11: 347830206f742030 x10: 3430303034783020 x9 : ffffd74e6c7369e0 x8 : 3030316666206e69 x7 : 205d383738353733 x6 : 332e31202020205b x5 : ffffd74e6c73fc88 x4 : ffffd74e6c73fcd4 x3 : ffffd74e6c740b40 x2 : ffff800080015484 x1 : 0000000000000000 x0 : ffff2fa4c0412000 Call trace: regmap_unlock_spinlock+0x18/0x30 rockchip_pmu_set_idle_request+0xac/0x2c0 rockchip_pd_power+0x144/0x5f8 rockchip_pd_power_off+0x1c/0x30 _genpd_power_off+0x9c/0x180 genpd_power_off.part.0.isra.0+0x130/0x2a8 genpd_power_off_work_fn+0x6c/0x98 process_one_work+0x170/0x3f0 worker_thread+0x290/0x4a8 kthread+0xec/0xf8 ret_from_fork+0x10/0x20 rockchip-pm-domain ff100000.syscon:power-controller: failed to get ack on domain 'hevc', val=0x88220 Fixes: `52e02d377a` ("arm64: dts: rockchip: add core dtsi file for RK3328 SoCs") Signed-off-by: Peter Geis <pgwipeout@gmail.com> Reviewed-by: Dragan Simic <dsimic@manjaro.org> Link: https://lore.kernel.org/r/20241214224339.24674-1-pgwipeout@gmail.com Signed-off-by: Heiko Stuebner <heiko@sntech.de>	2025-01-09 16:22:49 +01:00
Marco Nelissen	c13094b894	iomap: avoid avoid truncating 64-bit offset to 32 bits on 32-bit kernels, iomap_write_delalloc_scan() was inadvertently using a 32-bit position due to folio_next_index() returning an unsigned long. This could lead to an infinite loop when writing to an xfs filesystem. Signed-off-by: Marco Nelissen <marco.nelissen@gmail.com> Link: https://lore.kernel.org/r/20250109041253.2494374-1-marco.nelissen@gmail.com Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-01-09 16:09:20 +01:00
Jens Axboe	c9a40292a4	io_uring/eventfd: ensure io_eventfd_signal() defers another RCU period io_eventfd_do_signal() is invoked from an RCU callback, but when dropping the reference to the io_ev_fd, it calls io_eventfd_free() directly if the refcount drops to zero. This isn't correct, as any potential freeing of the io_ev_fd should be deferred another RCU grace period. Just call io_eventfd_put() rather than open-code the dec-and-test and free, which will correctly defer it another RCU grace period. Fixes: `21a091b970` ("io_uring: signal registered eventfd to process deferred task work") Reported-by: Jann Horn <jannh@google.com> Cc: stable@vger.kernel.org Tested-by: Li Zetao <lizetao1@huawei.com> Reviewed-by: Li Zetao<lizetao1@huawei.com> Reviewed-by: Prasanna Kumar T S M <ptsm@linux.microsoft.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-01-09 07:16:45 -07:00
Yu Kuai	fcede1f0a0	block, bfq: fix waker_bfqq UAF after bfq_split_bfqq() Our syzkaller report a following UAF for v6.6: BUG: KASAN: slab-use-after-free in bfq_init_rq+0x175d/0x17a0 block/bfq-iosched.c:6958 Read of size 8 at addr ffff8881b57147d8 by task fsstress/232726 CPU: 2 PID: 232726 Comm: fsstress Not tainted 6.6.0-g3629d1885222 #39 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x91/0xf0 lib/dump_stack.c:106 print_address_description.constprop.0+0x66/0x300 mm/kasan/report.c:364 print_report+0x3e/0x70 mm/kasan/report.c:475 kasan_report+0xb8/0xf0 mm/kasan/report.c:588 hlist_add_head include/linux/list.h:1023 [inline] bfq_init_rq+0x175d/0x17a0 block/bfq-iosched.c:6958 bfq_insert_request.isra.0+0xe8/0xa20 block/bfq-iosched.c:6271 bfq_insert_requests+0x27f/0x390 block/bfq-iosched.c:6323 blk_mq_insert_request+0x290/0x8f0 block/blk-mq.c:2660 blk_mq_submit_bio+0x1021/0x15e0 block/blk-mq.c:3143 __submit_bio+0xa0/0x6b0 block/blk-core.c:639 __submit_bio_noacct_mq block/blk-core.c:718 [inline] submit_bio_noacct_nocheck+0x5b7/0x810 block/blk-core.c:747 submit_bio_noacct+0xca0/0x1990 block/blk-core.c:847 __ext4_read_bh fs/ext4/super.c:205 [inline] ext4_read_bh+0x15e/0x2e0 fs/ext4/super.c:230 __read_extent_tree_block+0x304/0x6f0 fs/ext4/extents.c:567 ext4_find_extent+0x479/0xd20 fs/ext4/extents.c:947 ext4_ext_map_blocks+0x1a3/0x2680 fs/ext4/extents.c:4182 ext4_map_blocks+0x929/0x15a0 fs/ext4/inode.c:660 ext4_iomap_begin_report+0x298/0x480 fs/ext4/inode.c:3569 iomap_iter+0x3dd/0x1010 fs/iomap/iter.c:91 iomap_fiemap+0x1f4/0x360 fs/iomap/fiemap.c:80 ext4_fiemap+0x181/0x210 fs/ext4/extents.c:5051 ioctl_fiemap.isra.0+0x1b4/0x290 fs/ioctl.c:220 do_vfs_ioctl+0x31c/0x11a0 fs/ioctl.c:811 __do_sys_ioctl fs/ioctl.c:869 [inline] __se_sys_ioctl+0xae/0x190 fs/ioctl.c:857 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x70/0x120 arch/x86/entry/common.c:81 entry_SYSCALL_64_after_hwframe+0x78/0xe2 Allocated by task 232719: kasan_save_stack+0x22/0x50 mm/kasan/common.c:45 kasan_set_track+0x25/0x30 mm/kasan/common.c:52 __kasan_slab_alloc+0x87/0x90 mm/kasan/common.c:328 kasan_slab_alloc include/linux/kasan.h:188 [inline] slab_post_alloc_hook mm/slab.h:768 [inline] slab_alloc_node mm/slub.c:3492 [inline] kmem_cache_alloc_node+0x1b8/0x6f0 mm/slub.c:3537 bfq_get_queue+0x215/0x1f00 block/bfq-iosched.c:5869 bfq_get_bfqq_handle_split+0x167/0x5f0 block/bfq-iosched.c:6776 bfq_init_rq+0x13a4/0x17a0 block/bfq-iosched.c:6938 bfq_insert_request.isra.0+0xe8/0xa20 block/bfq-iosched.c:6271 bfq_insert_requests+0x27f/0x390 block/bfq-iosched.c:6323 blk_mq_insert_request+0x290/0x8f0 block/blk-mq.c:2660 blk_mq_submit_bio+0x1021/0x15e0 block/blk-mq.c:3143 __submit_bio+0xa0/0x6b0 block/blk-core.c:639 __submit_bio_noacct_mq block/blk-core.c:718 [inline] submit_bio_noacct_nocheck+0x5b7/0x810 block/blk-core.c:747 submit_bio_noacct+0xca0/0x1990 block/blk-core.c:847 __ext4_read_bh fs/ext4/super.c:205 [inline] ext4_read_bh_nowait+0x15a/0x240 fs/ext4/super.c:217 ext4_read_bh_lock+0xac/0xd0 fs/ext4/super.c:242 ext4_bread_batch+0x268/0x500 fs/ext4/inode.c:958 __ext4_find_entry+0x448/0x10f0 fs/ext4/namei.c:1671 ext4_lookup_entry fs/ext4/namei.c:1774 [inline] ext4_lookup.part.0+0x359/0x6f0 fs/ext4/namei.c:1842 ext4_lookup+0x72/0x90 fs/ext4/namei.c:1839 __lookup_slow+0x257/0x480 fs/namei.c:1696 lookup_slow fs/namei.c:1713 [inline] walk_component+0x454/0x5c0 fs/namei.c:2004 link_path_walk.part.0+0x773/0xda0 fs/namei.c:2331 link_path_walk fs/namei.c:3826 [inline] path_openat+0x1b9/0x520 fs/namei.c:3826 do_filp_open+0x1b7/0x400 fs/namei.c:3857 do_sys_openat2+0x5dc/0x6e0 fs/open.c:1428 do_sys_open fs/open.c:1443 [inline] __do_sys_openat fs/open.c:1459 [inline] __se_sys_openat fs/open.c:1454 [inline] __x64_sys_openat+0x148/0x200 fs/open.c:1454 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x70/0x120 arch/x86/entry/common.c:81 entry_SYSCALL_64_after_hwframe+0x78/0xe2 Freed by task 232726: kasan_save_stack+0x22/0x50 mm/kasan/common.c:45 kasan_set_track+0x25/0x30 mm/kasan/common.c:52 kasan_save_free_info+0x2b/0x50 mm/kasan/generic.c:522 ____kasan_slab_free mm/kasan/common.c:236 [inline] __kasan_slab_free+0x12a/0x1b0 mm/kasan/common.c:244 kasan_slab_free include/linux/kasan.h:164 [inline] slab_free_hook mm/slub.c:1827 [inline] slab_free_freelist_hook mm/slub.c:1853 [inline] slab_free mm/slub.c:3820 [inline] kmem_cache_free+0x110/0x760 mm/slub.c:3842 bfq_put_queue+0x6a7/0xfb0 block/bfq-iosched.c:5428 bfq_forget_entity block/bfq-wf2q.c:634 [inline] bfq_put_idle_entity+0x142/0x240 block/bfq-wf2q.c:645 bfq_forget_idle+0x189/0x1e0 block/bfq-wf2q.c:671 bfq_update_vtime block/bfq-wf2q.c:1280 [inline] __bfq_lookup_next_entity block/bfq-wf2q.c:1374 [inline] bfq_lookup_next_entity+0x350/0x480 block/bfq-wf2q.c:1433 bfq_update_next_in_service+0x1c0/0x4f0 block/bfq-wf2q.c:128 bfq_deactivate_entity+0x10a/0x240 block/bfq-wf2q.c:1188 bfq_deactivate_bfqq block/bfq-wf2q.c:1592 [inline] bfq_del_bfqq_busy+0x2e8/0xad0 block/bfq-wf2q.c:1659 bfq_release_process_ref+0x1cc/0x220 block/bfq-iosched.c:3139 bfq_split_bfqq+0x481/0xdf0 block/bfq-iosched.c:6754 bfq_init_rq+0xf29/0x17a0 block/bfq-iosched.c:6934 bfq_insert_request.isra.0+0xe8/0xa20 block/bfq-iosched.c:6271 bfq_insert_requests+0x27f/0x390 block/bfq-iosched.c:6323 blk_mq_insert_request+0x290/0x8f0 block/blk-mq.c:2660 blk_mq_submit_bio+0x1021/0x15e0 block/blk-mq.c:3143 __submit_bio+0xa0/0x6b0 block/blk-core.c:639 __submit_bio_noacct_mq block/blk-core.c:718 [inline] submit_bio_noacct_nocheck+0x5b7/0x810 block/blk-core.c:747 submit_bio_noacct+0xca0/0x1990 block/blk-core.c:847 __ext4_read_bh fs/ext4/super.c:205 [inline] ext4_read_bh+0x15e/0x2e0 fs/ext4/super.c:230 __read_extent_tree_block+0x304/0x6f0 fs/ext4/extents.c:567 ext4_find_extent+0x479/0xd20 fs/ext4/extents.c:947 ext4_ext_map_blocks+0x1a3/0x2680 fs/ext4/extents.c:4182 ext4_map_blocks+0x929/0x15a0 fs/ext4/inode.c:660 ext4_iomap_begin_report+0x298/0x480 fs/ext4/inode.c:3569 iomap_iter+0x3dd/0x1010 fs/iomap/iter.c:91 iomap_fiemap+0x1f4/0x360 fs/iomap/fiemap.c:80 ext4_fiemap+0x181/0x210 fs/ext4/extents.c:5051 ioctl_fiemap.isra.0+0x1b4/0x290 fs/ioctl.c:220 do_vfs_ioctl+0x31c/0x11a0 fs/ioctl.c:811 __do_sys_ioctl fs/ioctl.c:869 [inline] __se_sys_ioctl+0xae/0x190 fs/ioctl.c:857 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x70/0x120 arch/x86/entry/common.c:81 entry_SYSCALL_64_after_hwframe+0x78/0xe2 commit `1ba0403ac6` ("block, bfq: fix uaf for accessing waker_bfqq after splitting") fix the problem that if waker_bfqq is in the merge chain, and current is the only procress, waker_bfqq can be freed from bfq_split_bfqq(). However, the case that waker_bfqq is not in the merge chain is missed, and if the procress reference of waker_bfqq is 0, waker_bfqq can be freed as well. Fix the problem by checking procress reference if waker_bfqq is not in the merge_chain. Fixes: `1ba0403ac6` ("block, bfq: fix uaf for accessing waker_bfqq after splitting") Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20250108084148.1549973-1-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-01-09 06:52:46 -07:00
Pablo Neira Ayuso	b541ba7d1f	netfilter: conntrack: clamp maximum hashtable size to INT_MAX Use INT_MAX as maximum size for the conntrack hashtable. Otherwise, it is possible to hit WARN_ON_ONCE in __kvmalloc_node_noprof() when resizing hashtable because __GFP_NOWARN is unset. See: `0708a0afe2` ("mm: Consider __GFP_NOWARN flag for oversized kvmalloc() calls") Note: hashtable resize is only possible from init_netns. Fixes: `9cc1c73ad6` ("netfilter: conntrack: avoid integer overflow when resizing") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2025-01-09 13:29:45 +01:00
Pablo Neira Ayuso	13210fc63f	netfilter: nf_tables: imbalance in flowtable binding All these cases cause imbalance between BIND and UNBIND calls: - Delete an interface from a flowtable with multiple interfaces - Add a (device to a) flowtable with --check flag - Delete a netns containing a flowtable - In an interactive nft session, create a table with owner flag and flowtable inside, then quit. Fix it by calling FLOW_BLOCK_UNBIND when unregistering hooks, then remove late FLOW_BLOCK_UNBIND call when destroying flowtable. Fixes: `ff4bf2f42a` ("netfilter: nf_tables: add nft_unregister_flowtable_hook()") Reported-by: Phil Sutter <phil@nwl.cc> Tested-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2025-01-09 13:29:38 +01:00
Leo Yang	2d2d4f60ed	mctp i3c: fix MCTP I3C driver multi-thread issue We found a timeout problem with the pldm command on our system. The reason is that the MCTP-I3C driver has a race condition when receiving multiple-packet messages in multi-thread, resulting in a wrong packet order problem. We identified this problem by adding a debug message to the mctp_i3c_read function. According to the MCTP spec, a multiple-packet message must be composed in sequence, and if there is a wrong sequence, the whole message will be discarded and wait for the next SOM. For example, SOM → Pkt Seq #2 → Pkt Seq #1 → Pkt Seq #3 → EOM. Therefore, we try to solve this problem by adding a mutex to the mctp_i3c_read function. Before the modification, when a command requesting a multiple-packet message response is sent consecutively, an error usually occurs within 100 loops. After the mutex, it can go through 40000 loops without any error, and it seems to run well. Fixes: `c8755b29b5` ("mctp i3c: MCTP I3C driver") Signed-off-by: Leo Yang <Leo-Yang@quantatw.com> Link: https://patch.msgid.link/20250107031529.3296094-1-Leo-Yang@quantatw.com [pabeni@redhat.com: dropped already answered question from changelog] Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-01-09 11:52:38 +01:00
Rodrigo Vivi	b84e1cd22f	drm/xe/dg1: Fix power gate sequence. sub-pipe PG is not present on DG1. Setting these bits can disable other power gates and cause GPU hangs on video playbacks. VLK: 16314, 4304 Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13381 Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241219235536.454270-1-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit `2f12e9c029`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2025-01-09 10:38:56 +01:00
Lucas De Marchi	9ab4981552	drm/xe: Fix tlb invalidation when wedging If GuC fails to load, the driver wedges, but in the process it tries to do stuff that may not be initialized yet. This moves the xe_gt_tlb_invalidation_init() to be done earlier: as its own doc says, it's a software-only initialization and should had been named with the _early() suffix. Move it to be called by xe_gt_init_early(), so the locks and seqno are initialized, avoiding a NULL ptr deref when wedging: xe 0000:03:00.0: [drm] ERROR GT0: load failed: status: Reset = 0, BootROM = 0x50, UKernel = 0x00, MIA = 0x00, Auth = 0x01 xe 0000:03:00.0: [drm] ERROR GT0: firmware signature verification failed xe 0000:03:00.0: [drm] ERROR CRITICAL: Xe has declared device 0000:03:00.0 as wedged. ... BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 9 UID: 0 PID: 3908 Comm: modprobe Tainted: G U W 6.13.0-rc4-xe+ #3 Tainted: [U]=USER, [W]=WARN Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-S ADP-S DDR5 UDIMM CRB, BIOS ADLSFWI1.R00.3275.A00.2207010640 07/01/2022 RIP: 0010:xe_gt_tlb_invalidation_reset+0x75/0x110 [xe] This can be easily triggered by poking the GuC binary to force a signature failure. There will still be an extra message, xe 0000:03:00.0: [drm] ERROR GT0: GuC mmio request 0x4100: no reply 0x4100 but that's better than a NULL ptr deref. Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3956 Fixes: `c9474b726b` ("drm/xe: Wedge the entire device") Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250103001111.331684-2-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit `5001ef3af8`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2025-01-09 10:38:56 +01:00
Kory Maincent	d1bf27c4e1	dt-bindings: net: pse-pd: Fix unusual character in documentation The documentation contained an unusual character due to an issue in my personal b4 setup. Fix the problem by providing the correct PSE Pinout Alternatives table number description. Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250107142659.425877-1-kory.maincent@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-08 19:34:56 -08:00
Jakub Kicinski	4460e45700	Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2025-01-07 (ice, igc) For ice: Arkadiusz corrects mask value being used to determine DPLL phase range. Przemyslaw corrects frequency value for E823 devices. For igc: En-Wei Wu adds a check and, early, return for failed register read. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: igc: return early when failing to read EECD register ice: fix incorrect PHY settings for 100 GB/s ice: fix max values for dpll pin phase adjust ==================== Link: https://patch.msgid.link/20250107190150.1758577-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-08 19:33:26 -08:00
Jakub Kicinski	6730ee8f08	Merge tag 'for-net-2025-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Luiz Augusto von Dentz says: ==================== bluetooth pull request for net: - btmtk: Fix failed to send func ctrl for MediaTek devices. - hci_sync: Fix not setting Random Address when required - MGMT: Fix Add Device to responding before completing - btnxpuart: Fix driver sending truncated data * tag 'for-net-2025-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth: Bluetooth: btmtk: Fix failed to send func ctrl for MediaTek devices. Bluetooth: btnxpuart: Fix driver sending truncated data Bluetooth: MGMT: Fix Add Device to responding before completing Bluetooth: hci_sync: Fix not setting Random Address when required ==================== Link: https://patch.msgid.link/20250108162627.1623760-1-luiz.dentz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-08 19:08:19 -08:00
Chen Ridong	3cb97a927f	cgroup/cpuset: remove kernfs active break A warning was found: WARNING: CPU: 10 PID: 3486953 at fs/kernfs/file.c:828 CPU: 10 PID: 3486953 Comm: rmdir Kdump: loaded Tainted: G RIP: 0010:kernfs_should_drain_open_files+0x1a1/0x1b0 RSP: 0018:ffff8881107ef9e0 EFLAGS: 00010202 RAX: 0000000080000002 RBX: ffff888154738c00 RCX: dffffc0000000000 RDX: 0000000000000007 RSI: 0000000000000004 RDI: ffff888154738c04 RBP: ffff888154738c04 R08: ffffffffaf27fa15 R09: ffffed102a8e7180 R10: ffff888154738c07 R11: 0000000000000000 R12: ffff888154738c08 R13: ffff888750f8c000 R14: ffff888750f8c0e8 R15: ffff888154738ca0 FS: 00007f84cd0be740(0000) GS:ffff8887ddc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000555f9fbe00c8 CR3: 0000000153eec001 CR4: 0000000000370ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: kernfs_drain+0x15e/0x2f0 __kernfs_remove+0x165/0x300 kernfs_remove_by_name_ns+0x7b/0xc0 cgroup_rm_file+0x154/0x1c0 cgroup_addrm_files+0x1c2/0x1f0 css_clear_dir+0x77/0x110 kill_css+0x4c/0x1b0 cgroup_destroy_locked+0x194/0x380 cgroup_rmdir+0x2a/0x140 It can be explained by: rmdir echo 1 > cpuset.cpus kernfs_fop_write_iter // active=0 cgroup_rm_file kernfs_remove_by_name_ns kernfs_get_active // active=1 __kernfs_remove // active=0x80000002 kernfs_drain cpuset_write_resmask wait_event //waiting (active == 0x80000001) kernfs_break_active_protection // active = 0x80000001 // continue kernfs_unbreak_active_protection // active = 0x80000002 ... kernfs_should_drain_open_files // warning occurs kernfs_put_active This warning is caused by 'kernfs_break_active_protection' when it is writing to cpuset.cpus, and the cgroup is removed concurrently. The commit `3a5a6d0c2b` ("cpuset: don't nest cgroup_mutex inside get_online_cpus()") made cpuset_hotplug_workfn asynchronous, This change involves calling flush_work(), which can create a multiple processes circular locking dependency that involve cgroup_mutex, potentially leading to a deadlock. To avoid deadlock. the commit `76bb5ab8f6` ("cpuset: break kernfs active protection in cpuset_write_resmask()") added 'kernfs_break_active_protection' in the cpuset_write_resmask. This could lead to this warning. After the commit `2125c0034c` ("cgroup/cpuset: Make cpuset hotplug processing synchronous"), the cpuset_write_resmask no longer needs to wait the hotplug to finish, which means that concurrent hotplug and cpuset operations are no longer possible. Therefore, the deadlock doesn't exist anymore and it does not have to 'break active protection' now. To fix this warning, just remove kernfs_break_active_protection operation in the 'cpuset_write_resmask'. Fixes: `bdb2fd7fc5` ("kernfs: Skip kernfs_drain_open_files() more aggressively") Fixes: `76bb5ab8f6` ("cpuset: break kernfs active protection in cpuset_write_resmask()") Reported-by: Ji Fa <jifa@huawei.com> Signed-off-by: Chen Ridong <chenridong@huawei.com> Acked-by: Waiman Long <longman@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2025-01-08 15:54:39 -10:00
Linus Torvalds	eea6e4b4df	Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Four driver fixes in UFS, mostly to do with power management" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ufs: qcom: Power down the controller/device during system suspend for SM8550/SM8650 SoCs scsi: ufs: qcom: Allow passing platform specific OF data scsi: ufs: core: Honor runtime/system PM levels if set by host controller drivers scsi: ufs: qcom: Power off the PHY if it was already powered on in ufs_qcom_power_up_sequence()	2025-01-08 11:55:20 -08:00
Clément Léger	5cd900b8b7	riscv: use local label names instead of global ones in assembly Local labels should be prefix by '.L' or they'll be exported in the symbol table. Additionally, this messes up the backtrace by displaying an incorrect symbol: ... [ 12.751810] [<ffffffff80441628>] _copy_from_user+0x28/0xc2 [ 12.752035] [<ffffffff800152ca>] handle_misaligned_load+0x1ca/0x2fc [ 12.752310] [<ffffffff80a033e8>] do_trap_load_misaligned+0x24/0xee [ 12.752596] [<ffffffff80a0dcae>] _new_vmalloc_restore_context_a0+0xc2/0xce After: ... [ 10.243916] [<ffffffff804415e4>] _copy_from_user+0x28/0xc2 [ 10.244026] [<ffffffff800152ca>] handle_misaligned_load+0x1ca/0x2fc [ 10.244150] [<ffffffff80a033a0>] do_trap_load_misaligned+0x24/0xee [ 10.244268] [<ffffffff80a0dc66>] handle_exception+0x146/0x152 Signed-off-by: Clément Léger <cleger@rivosinc.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Fixes: `503638e0ba` ("riscv: Stop emitting preventive sfence.vma for new vmalloc mappings") Link: https://lore.kernel.org/r/20250103141814.508865-1-cleger@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2025-01-08 10:46:14 -08:00
Guo Ren	40e6073e76	riscv: qspinlock: Fixup _Q_PENDING_LOOPS definition When CONFIG_RISCV_QUEUED_SPINLOCKS=y, the _Q_PENDING_LOOPS definition is missing. Add the _Q_PENDING_LOOPS definition for pure qspinlock usage. Fixes: `ab83647fad` ("riscv: Add qspinlock support") Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20241215135252.201983-1-guoren@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2025-01-08 10:46:01 -08:00
Clément Léger	51356ce60e	riscv: stacktrace: fix backtracing through exceptions Prior to commit `5d5fc33ce5` ("riscv: Improve exception and system call latency"), backtrace through exception worked since ra was filled with ret_from_exception symbol address and the stacktrace code checked 'pc' to be equal to that symbol. Now that handle_exception uses regular 'call' instructions, this isn't working anymore and backtrace stops at handle_exception(). Since there are multiple call site to C code in the exception handling path, rather than checking multiple potential return addresses, add a new symbol at the end of exception handling and check pc to be in that range. Fixes: `5d5fc33ce5` ("riscv: Improve exception and system call latency") Signed-off-by: Clément Léger <cleger@rivosinc.com> Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20241209155714.1239665-1-cleger@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2025-01-08 10:45:49 -08:00
Xu Lu	f754f27e98	riscv: mm: Fix the out of bound issue of vmemmap address In sparse vmemmap model, the virtual address of vmemmap is calculated as: ((struct page )VMEMMAP_START - (phys_ram_base >> PAGE_SHIFT)). And the struct page's va can be calculated with an offset: (vmemmap + (pfn)). However, when initializing struct pages, kernel actually starts from the first page from the same section that phys_ram_base belongs to. If the first page's physical address is not (phys_ram_base >> PAGE_SHIFT), then we get an va below VMEMMAP_START when calculating va for it's struct page. For example, if phys_ram_base starts from 0x82000000 with pfn 0x82000, the first page in the same section is actually pfn 0x80000. During init_unavailable_range(), we will initialize struct page for pfn 0x80000 with virtual address ((struct page )VMEMMAP_START - 0x2000), which is below VMEMMAP_START as well as PCI_IO_END. This commit fixes this bug by introducing a new variable 'vmemmap_start_pfn' which is aligned with memory section size and using it to calculate vmemmap address instead of phys_ram_base. Fixes: `a11dd49dcb` ("riscv: Sparse-Memory/vmemmap out-of-bounds fix") Signed-off-by: Xu Lu <luxu.kernel@bytedance.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Tested-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20241209122617.53341-1-luxu.kernel@bytedance.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2025-01-08 10:45:34 -08:00
Javier Carrasco	7e25044b80	cpuidle: riscv-sbi: fix device node release in early exit of for_each_possible_cpu The 'np' device_node is initialized via of_cpu_device_node_get(), which requires explicit calls to of_node_put() when it is no longer required to avoid leaking the resource. Instead of adding the missing calls to of_node_put() in all execution paths, use the cleanup attribute for 'np' by means of the __free() macro, which automatically calls of_node_put() when the variable goes out of scope. Given that 'np' is only used within the for_each_possible_cpu(), reduce its scope to release the nood after every iteration of the loop. Fixes: `6abf32f1d9` ("cpuidle: Add RISC-V SBI CPU idle driver") Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Link: https://lore.kernel.org/r/20241116-cpuidle-riscv-sbi-cleanup-v3-1-a3a46372ce08@gmail.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2025-01-08 10:40:14 -08:00
Nam Cao	13134cc949	riscv: kprobes: Fix incorrect address calculation p->ainsn.api.insn is a pointer to u32, therefore arithmetic operations are multiplied by four. This is clearly undesirable for this case. Cast it to (void *) first before any calculation. Below is a sample before/after. The dumped memory is two kprobe slots, the first slot has - c.addiw a0, 0x1c (0x7125) - ebreak (0x00100073) and the second slot has: - c.addiw a0, -4 (0x7135) - ebreak (0x00100073) Before this patch: (gdb) x/16xh 0xff20000000135000 0xff20000000135000: 0x7125 0x0000 0x0000 0x0000 0x7135 0x0010 0x0000 0x0000 0xff20000000135010: 0x0073 0x0010 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 After this patch: (gdb) x/16xh 0xff20000000125000 0xff20000000125000: 0x7125 0x0073 0x0010 0x0000 0x7135 0x0073 0x0010 0x0000 0xff20000000125010: 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 Fixes: `b1756750a3` ("riscv: kprobes: Use patch_text_nosync() for insn slots") Signed-off-by: Nam Cao <namcao@linutronix.de> Cc: stable@vger.kernel.org Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20241119111056.2554419-1-namcao@linutronix.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2025-01-08 10:39:39 -08:00
Jakub Kicinski	f552b3037d	Merge branch 'there-are-some-bugfix-for-the-hns3-ethernet-driver' Jijie Shao says: ==================== There are some bugfix for the HNS3 ethernet driver There's a series of bugfix that's been accepted: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=d80a3091308491455b6501b1c4b68698c4a7cd24 However, The series is making the driver poke into IOMMU internals instead of implementing appropriate IOMMU workarounds. After discussion, the series was reverted: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=249cfa318fb1b77eb726c2ff4f74c9685f04e568 But only two patches are related to the IOMMU. Other patches involve only the modification of the driver. This series resends other patches. v2*: https://lore.kernel.org/20241217010839.1742227-1-shaojijie@huawei.com v2: https://lore.kernel.org/20241216132346.1197079-1-shaojijie@huawei.com v1: https://lore.kernel.org/20241107133023.3813095-1-shaojijie@huawei.com ==================== Link: https://patch.msgid.link/20250106143642.539698-1-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-08 10:33:17 -08:00
Jie Wang	9741e72b22	net: hns3: fix kernel crash when 1588 is sent on HIP08 devices Currently, HIP08 devices does not register the ptp devices, so the hdev->ptp is NULL. But the tx process would still try to set hardware time stamp info with SKBTX_HW_TSTAMP flag and cause a kernel crash. [ 128.087798] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000018 ... [ 128.280251] pc : hclge_ptp_set_tx_info+0x2c/0x140 [hclge] [ 128.286600] lr : hclge_ptp_set_tx_info+0x20/0x140 [hclge] [ 128.292938] sp : ffff800059b93140 [ 128.297200] x29: ffff800059b93140 x28: 0000000000003280 [ 128.303455] x27: ffff800020d48280 x26: ffff0cb9dc814080 [ 128.309715] x25: ffff0cb9cde93fa0 x24: 0000000000000001 [ 128.315969] x23: 0000000000000000 x22: 0000000000000194 [ 128.322219] x21: ffff0cd94f986000 x20: 0000000000000000 [ 128.328462] x19: ffff0cb9d2a166c0 x18: 0000000000000000 [ 128.334698] x17: 0000000000000000 x16: ffffcf1fc523ed24 [ 128.340934] x15: 0000ffffd530a518 x14: 0000000000000000 [ 128.347162] x13: ffff0cd6bdb31310 x12: 0000000000000368 [ 128.353388] x11: ffff0cb9cfbc7070 x10: ffff2cf55dd11e02 [ 128.359606] x9 : ffffcf1f85a212b4 x8 : ffff0cd7cf27dab0 [ 128.365831] x7 : 0000000000000a20 x6 : ffff0cd7cf27d000 [ 128.372040] x5 : 0000000000000000 x4 : 000000000000ffff [ 128.378243] x3 : 0000000000000400 x2 : ffffcf1f85a21294 [ 128.384437] x1 : ffff0cb9db520080 x0 : ffff0cb9db500080 [ 128.390626] Call trace: [ 128.393964] hclge_ptp_set_tx_info+0x2c/0x140 [hclge] [ 128.399893] hns3_nic_net_xmit+0x39c/0x4c4 [hns3] [ 128.405468] xmit_one.constprop.0+0xc4/0x200 [ 128.410600] dev_hard_start_xmit+0x54/0xf0 [ 128.415556] sch_direct_xmit+0xe8/0x634 [ 128.420246] __dev_queue_xmit+0x224/0xc70 [ 128.425101] dev_queue_xmit+0x1c/0x40 [ 128.429608] ovs_vport_send+0xac/0x1a0 [openvswitch] [ 128.435409] do_output+0x60/0x17c [openvswitch] [ 128.440770] do_execute_actions+0x898/0x8c4 [openvswitch] [ 128.446993] ovs_execute_actions+0x64/0xf0 [openvswitch] [ 128.453129] ovs_dp_process_packet+0xa0/0x224 [openvswitch] [ 128.459530] ovs_vport_receive+0x7c/0xfc [openvswitch] [ 128.465497] internal_dev_xmit+0x34/0xb0 [openvswitch] [ 128.471460] xmit_one.constprop.0+0xc4/0x200 [ 128.476561] dev_hard_start_xmit+0x54/0xf0 [ 128.481489] __dev_queue_xmit+0x968/0xc70 [ 128.486330] dev_queue_xmit+0x1c/0x40 [ 128.490856] ip_finish_output2+0x250/0x570 [ 128.495810] __ip_finish_output+0x170/0x1e0 [ 128.500832] ip_finish_output+0x3c/0xf0 [ 128.505504] ip_output+0xbc/0x160 [ 128.509654] ip_send_skb+0x58/0xd4 [ 128.513892] udp_send_skb+0x12c/0x354 [ 128.518387] udp_sendmsg+0x7a8/0x9c0 [ 128.522793] inet_sendmsg+0x4c/0x8c [ 128.527116] __sock_sendmsg+0x48/0x80 [ 128.531609] __sys_sendto+0x124/0x164 [ 128.536099] __arm64_sys_sendto+0x30/0x5c [ 128.540935] invoke_syscall+0x50/0x130 [ 128.545508] el0_svc_common.constprop.0+0x10c/0x124 [ 128.551205] do_el0_svc+0x34/0xdc [ 128.555347] el0_svc+0x20/0x30 [ 128.559227] el0_sync_handler+0xb8/0xc0 [ 128.563883] el0_sync+0x160/0x180 Fixes: `0bf5eb7885` ("net: hns3: add support for PTP") Signed-off-by: Jie Wang <wangjie125@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://patch.msgid.link/20250106143642.539698-8-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-08 10:33:15 -08:00
Hao Lan	7997ddd46c	net: hns3: fixed hclge_fetch_pf_reg accesses bar space out of bounds issue The TQP BAR space is divided into two segments. TQPs 0-1023 and TQPs 1024-1279 are in different BAR space addresses. However, hclge_fetch_pf_reg does not distinguish the tqp space information when reading the tqp space information. When the number of TQPs is greater than 1024, access bar space overwriting occurs. The problem of different segments has been considered during the initialization of tqp.io_base. Therefore, tqp.io_base is directly used when the queue is read in hclge_fetch_pf_reg. The error message: Unable to handle kernel paging request at virtual address ffff800037200000 pc : hclge_fetch_pf_reg+0x138/0x250 [hclge] lr : hclge_get_regs+0x84/0x1d0 [hclge] Call trace: hclge_fetch_pf_reg+0x138/0x250 [hclge] hclge_get_regs+0x84/0x1d0 [hclge] hns3_get_regs+0x2c/0x50 [hns3] ethtool_get_regs+0xf4/0x270 dev_ethtool+0x674/0x8a0 dev_ioctl+0x270/0x36c sock_do_ioctl+0x110/0x2a0 sock_ioctl+0x2ac/0x530 __arm64_sys_ioctl+0xa8/0x100 invoke_syscall+0x4c/0x124 el0_svc_common.constprop.0+0x140/0x15c do_el0_svc+0x30/0xd0 el0_svc+0x1c/0x2c el0_sync_handler+0xb0/0xb4 el0_sync+0x168/0x180 Fixes: `939ccd107f` ("net: hns3: move dump regs function to a separate file") Signed-off-by: Hao Lan <lanhao@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Link: https://patch.msgid.link/20250106143642.539698-7-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-08 10:33:14 -08:00
Jian Shen	247fd1e33e	net: hns3: initialize reset_timer before hclgevf_misc_irq_init() Currently the misc irq is initialized before reset_timer setup. But it will access the reset_timer in the irq handler. So initialize the reset_timer earlier. Fixes: `ff200099d2` ("net: hns3: remove unnecessary work in hclgevf_main") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Link: https://patch.msgid.link/20250106143642.539698-6-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-08 10:33:14 -08:00
Jian Shen	98b1e3b277	net: hns3: don't auto enable misc vector Currently, there is a time window between misc irq enabled and service task inited. If an interrupte is reported at this time, it will cause warning like below: [ 16.324639] Call trace: [ 16.324641] __queue_delayed_work+0xb8/0xe0 [ 16.324643] mod_delayed_work_on+0x78/0xd0 [ 16.324655] hclge_errhand_task_schedule+0x58/0x90 [hclge] [ 16.324662] hclge_misc_irq_handle+0x168/0x240 [hclge] [ 16.324666] __handle_irq_event_percpu+0x64/0x1e0 [ 16.324667] handle_irq_event+0x80/0x170 [ 16.324670] handle_fasteoi_edge_irq+0x110/0x2bc [ 16.324671] __handle_domain_irq+0x84/0xfc [ 16.324673] gic_handle_irq+0x88/0x2c0 [ 16.324674] el1_irq+0xb8/0x140 [ 16.324677] arch_cpu_idle+0x18/0x40 [ 16.324679] default_idle_call+0x5c/0x1bc [ 16.324682] cpuidle_idle_call+0x18c/0x1c4 [ 16.324684] do_idle+0x174/0x17c [ 16.324685] cpu_startup_entry+0x30/0x6c [ 16.324687] secondary_start_kernel+0x1a4/0x280 [ 16.324688] ---[ end trace 6aa0bff672a964aa ]--- So don't auto enable misc vector when request irq.. Fixes: `7be1b9f3e9` ("net: hns3: make hclge_service use delayed workqueue") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Link: https://patch.msgid.link/20250106143642.539698-5-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-08 10:33:14 -08:00
Hao Lan	5191a8d3c2	net: hns3: Resolved the issue that the debugfs query result is inconsistent. This patch modifies the implementation of debugfs: When the user process stops unexpectedly, not all data of the file system is read. In this case, the save_buf pointer is not released. When the user process is called next time, save_buf is used to copy the cached data to the user space. As a result, the queried data is stale. To solve this problem, this patch implements .open() and .release() handler for debugfs file_operations. moving allocation buffer and execution of the cmd to the .open() handler and freeing in to the .release() handler. Allocate separate buffer for each reader and associate the buffer with the file pointer. When different user read processes no longer share the buffer, the stale data problem is fixed. Fixes: `5e69ea7ee2` ("net: hns3: refactor the debugfs process") Signed-off-by: Hao Lan <lanhao@huawei.com> Signed-off-by: Guangwei Zhang <zhangwangwei6@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://patch.msgid.link/20250106143642.539698-4-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-08 10:33:14 -08:00
Hao Lan	ac1e2836fe	net: hns3: fix missing features due to dev->features configuration too early Currently, the netdev->features is configured in hns3_nic_set_features. As a result, __netdev_update_features considers that there is no feature difference, and the procedures of the real features are missing. Fixes: `2a7556bb2b` ("net: hns3: implement ndo_features_check ops for hns3 driver") Signed-off-by: Hao Lan <lanhao@huawei.com> Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://patch.msgid.link/20250106143642.539698-3-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-08 10:33:14 -08:00
Hao Lan	5a4b584c67	net: hns3: fixed reset failure issues caused by the incorrect reset type When a reset type that is not supported by the driver is input, a reset pending flag bit of the HNAE3_NONE_RESET type is generated in reset_pending. The driver does not have a mechanism to clear this type of error. As a result, the driver considers that the reset is not complete. This patch provides a mechanism to clear the HNAE3_NONE_RESET flag and the parameter of hnae3_ae_ops.set_default_reset_request is verified. The error message: hns3 0000:39:01.0: cmd failed -16 hns3 0000:39:01.0: hclge device re-init failed, VF is disabled! hns3 0000:39:01.0: failed to reset VF stack hns3 0000:39:01.0: failed to reset VF(4) hns3 0000:39:01.0: prepare reset(2) wait done hns3 0000:39:01.0 eth4: already uninitialized Use the crash tool to view struct hclgevf_dev: struct hclgevf_dev { ... default_reset_request = 0x20, reset_level = HNAE3_NONE_RESET, reset_pending = 0x100, reset_type = HNAE3_NONE_RESET, ... }; Fixes: `720bd5837e` ("net: hns3: add set_default_reset_request in the hnae3_ae_ops") Signed-off-by: Hao Lan <lanhao@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Link: https://patch.msgid.link/20250106143642.539698-2-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-08 10:33:14 -08:00
Nam Cao	6a97f4118a	riscv: Fix sleeping in invalid context in die() die() can be called in exception handler, and therefore cannot sleep. However, die() takes spinlock_t which can sleep with PREEMPT_RT enabled. That causes the following warning: BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48 in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 285, name: mutex preempt_count: 110001, expected: 0 RCU nest depth: 0, expected: 0 CPU: 0 UID: 0 PID: 285 Comm: mutex Not tainted 6.12.0-rc7-00022-ge19049cf7d56-dirty #234 Hardware name: riscv-virtio,qemu (DT) Call Trace: dump_backtrace+0x1c/0x24 show_stack+0x2c/0x38 dump_stack_lvl+0x5a/0x72 dump_stack+0x14/0x1c __might_resched+0x130/0x13a rt_spin_lock+0x2a/0x5c die+0x24/0x112 do_trap_insn_illegal+0xa0/0xea _new_vmalloc_restore_context_a0+0xcc/0xd8 Oops - illegal instruction [#1] Switch to use raw_spinlock_t, which does not sleep even with PREEMPT_RT enabled. Fixes: `76d2a0493a` ("RISC-V: Init and Halt Code") Signed-off-by: Nam Cao <namcao@linutronix.de> Cc: stable@vger.kernel.org Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://lore.kernel.org/r/20241118091333.1185288-1-namcao@linutronix.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2025-01-08 10:23:17 -08:00
Clément Léger	03f0b54853	riscv: module: remove relocation_head rel_entry member allocation relocation_head's list_head member, rel_entry, doesn't need to be allocated, its storage can just be part of the allocated relocation_head. Remove the pointer which allows to get rid of the allocation as well as an existing memory leak found by Kai Zhang using kmemleak. Fixes: `8fd6c51423` ("riscv: Add remaining module relocations") Reported-by: Kai Zhang <zhangkai@iscas.ac.cn> Signed-off-by: Clément Léger <cleger@rivosinc.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Tested-by: Charlie Jenkins <charlie@rivosinc.com> Link: https://lore.kernel.org/r/20241128081636.3620468-1-cleger@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2025-01-08 10:22:52 -08:00
Daniel Borkmann	80fb40baba	tcp: Annotate data-race around sk->sk_mark in tcp_v4_send_reset This is a follow-up to `3c5b4d69c3` ("net: annotate data-races around sk->sk_mark"). sk->sk_mark can be read and written without holding the socket lock. IPv6 equivalent is already covered with READ_ONCE() annotation in tcp_v6_send_response(). Fixes: `3c5b4d69c3` ("net: annotate data-races around sk->sk_mark") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/f459d1fc44f205e13f6d8bdca2c8bfb9902ffac9.1736244569.git.daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-08 10:22:02 -08:00
Jakub Kicinski	d1cacd7477	netdev: prevent accessing NAPI instances from another namespace The NAPI IDs were not fully exposed to user space prior to the netlink API, so they were never namespaced. The netlink API must ensure that at the very least NAPI instance belongs to the same netns as the owner of the genl sock. napi_by_id() can become static now, but it needs to move because of dev_get_by_napi_id(). Cc: stable@vger.kernel.org Fixes: `1287c1ae0f` ("netdev-genl: Support setting per-NAPI config values") Fixes: `27f91aaf49` ("netdev-genl: Add netlink framework functions for napi") Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250106180137.1861472-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-08 10:21:00 -08:00
Linus Torvalds	0b7958fa05	Merge tag 'for-6.13/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mikulas Patocka: - dm-array fixes - dm-verity forward error correction fixes - remove the flag DM_TARGET_PASSES_INTEGRITY from dm-ebs - dm-thin RCU list fix * tag 'for-6.13/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm thin: make get_first_thin use rcu-safe list first function dm-ebs: don't set the flag DM_TARGET_PASSES_INTEGRITY dm-verity FEC: Avoid copying RS parity bytes twice. dm-verity FEC: Fix RS FEC repair for roots unaligned to block size (take 2) dm array: fix cursor index when skipping across block boundaries dm array: fix unreleased btree blocks on closing a faulty array cursor dm array: fix releasing a faulty array block twice in dm_array_cursor_end	2025-01-08 10:12:01 -08:00
Honglei Wang	68e449d849	sched_ext: switch class when preempted by higher priority scheduler ops.cpu_release() function, if defined, must be invoked when preempted by a higher priority scheduler class task. This scenario was skipped in commit `f422316d74` ("sched_ext: Remove switch_class_scx()"). Let's fix it. Fixes: `f422316d74` ("sched_ext: Remove switch_class_scx()") Signed-off-by: Honglei Wang <jameshongleiwang@126.com> Acked-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2025-01-08 06:51:40 -10:00
Changwoo Min	6268d5bc10	sched_ext: Replace rq_lock() to raw_spin_rq_lock() in scx_ops_bypass() scx_ops_bypass() iterates all CPUs to re-enqueue all the scx tasks. For each CPU, it acquires a lock using rq_lock() regardless of whether a CPU is offline or the CPU is currently running a task in a higher scheduler class (e.g., deadline). The rq_lock() is supposed to be used for online CPUs, and the use of rq_lock() may trigger an unnecessary warning in rq_pin_lock(). Therefore, replace rq_lock() to raw_spin_rq_lock() in scx_ops_bypass(). Without this change, we observe the following warning: ===== START ===== [ 6.615205] rq->balance_callback && rq->balance_callback != &balance_push_callback [ 6.615208] WARNING: CPU: 2 PID: 0 at kernel/sched/sched.h:1730 __schedule+0x1130/0x1c90 ===== END ===== Fixes: `0e7ffff1b8` ("scx: Fix raciness in scx_ops_bypass()") Signed-off-by: Changwoo Min <changwoo@igalia.com> Acked-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2025-01-08 06:48:53 -10:00
Henry Huang	30dd3b13f9	sched_ext: keep running prev when prev->scx.slice != 0 When %SCX_OPS_ENQ_LAST is set and prev->scx.slice != 0, @prev will be dispacthed into the local DSQ in put_prev_task_scx(). However, pick_task_scx() is executed before put_prev_task_scx(), so it will not pick @prev. Set %SCX_RQ_BAL_KEEP in balance_one() to ensure that pick_task_scx() can pick @prev. Signed-off-by: Henry Huang <henry.hj@antgroup.com> Acked-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2025-01-08 06:48:33 -10:00
Chris Lu	67dba2c28f	Bluetooth: btmtk: Fix failed to send func ctrl for MediaTek devices. Use usb_autopm_get_interface() and usb_autopm_put_interface() in btmtk_usb_shutdown(), it could send func ctrl after enabling autosuspend. Bluetooth: btmtk_usb_hci_wmt_sync() hci0: Execution of wmt command timed out Bluetooth: btmtk_usb_shutdown() hci0: Failed to send wmt func ctrl (-110) Fixes: `5c5e8c52e3` ("Bluetooth: btmtk: move btusb_mtk_[setup, shutdown] to btmtk.c") Signed-off-by: Chris Lu <chris.lu@mediatek.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2025-01-08 11:14:03 -05:00
Neeraj Sanjay Kale	8023dd2204	Bluetooth: btnxpuart: Fix driver sending truncated data This fixes the apparent controller hang issue seen during stress test where the host sends a truncated payload, followed by HCI commands. The controller treats these HCI commands as a part of previously truncated payload, leading to command timeouts. Adding a serdev_device_wait_until_sent() call after serdev_device_write_buf() fixed the issue. Fixes: `689ca16e52` ("Bluetooth: NXP: Add protocol support for NXP Bluetooth chipsets") Signed-off-by: Neeraj Sanjay Kale <neeraj.sanjaykale@nxp.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2025-01-08 11:14:02 -05:00
Luiz Augusto von Dentz	a182d9c84f	Bluetooth: MGMT: Fix Add Device to responding before completing Add Device with LE type requires updating resolving/accept list which requires quite a number of commands to complete and each of them may fail, so instead of pretending it would always work this checks the return of hci_update_passive_scan_sync which indicates if everything worked as intended. Fixes: `e8907f7654` ("Bluetooth: hci_sync: Make use of hci_cmd_sync_queue set 3") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2025-01-08 11:14:01 -05:00
Luiz Augusto von Dentz	c2994b0084	Bluetooth: hci_sync: Fix not setting Random Address when required This fixes errors such as the following when Own address type is set to Random Address but it has not been programmed yet due to either be advertising or connecting: < HCI Command: LE Set Exte.. (0x08\|0x0041) plen 13 Own address type: Random (0x03) Filter policy: Ignore not in accept list (0x01) PHYs: 0x05 Entry 0: LE 1M Type: Passive (0x00) Interval: 60.000 msec (0x0060) Window: 30.000 msec (0x0030) Entry 1: LE Coded Type: Passive (0x00) Interval: 180.000 msec (0x0120) Window: 90.000 msec (0x0090) > HCI Event: Command Complete (0x0e) plen 4 LE Set Extended Scan Parameters (0x08\|0x0041) ncmd 1 Status: Success (0x00) < HCI Command: LE Set Exten.. (0x08\|0x0042) plen 6 Extended scan: Enabled (0x01) Filter duplicates: Enabled (0x01) Duration: 0 msec (0x0000) Period: 0.00 sec (0x0000) > HCI Event: Command Complete (0x0e) plen 4 LE Set Extended Scan Enable (0x08\|0x0042) ncmd 1 Status: Invalid HCI Command Parameters (0x12) Fixes: `c45074d68a` ("Bluetooth: Fix not generating RPA when required") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2025-01-08 11:14:00 -05:00
Rengarajan S	c7a5378a0f	misc: microchip: pci1xxxx: Resolve return code mismatch during GPIO set config Driver returns -EOPNOTSUPPORTED on unsupported parameters case in set config. Upper level driver checks for -ENOTSUPP. Because of the return code mismatch, the ioctls from userspace fail. Resolve the issue by passing -ENOTSUPP during unsupported case. Fixes: `7d3e4d807d` ("misc: microchip: pci1xxxx: load gpio driver for the gpio controller auxiliary device enumerated by the auxiliary bus driver.") Cc: stable <stable@kernel.org> Signed-off-by: Rengarajan S <rengarajan.s@microchip.com> Link: https://lore.kernel.org/r/20241205133626.1483499-3-rengarajan.s@microchip.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-01-08 15:47:39 +01:00
Rengarajan S	194f9f94a5	misc: microchip: pci1xxxx: Resolve kernel panic during GPIO IRQ handling Resolve kernel panic caused by improper handling of IRQs while accessing GPIO values. This is done by replacing generic_handle_irq with handle_nested_irq. Fixes: `1f4d8ae231` ("misc: microchip: pci1xxxx: Add gpio irq handler and irq helper functions irq_ack, irq_mask, irq_unmask and irq_set_type of irq_chip.") Cc: stable <stable@kernel.org> Signed-off-by: Rengarajan S <rengarajan.s@microchip.com> Link: https://lore.kernel.org/r/20241205133626.1483499-2-rengarajan.s@microchip.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-01-08 15:47:35 +01:00
Krister Johansen	80f130bfad	dm thin: make get_first_thin use rcu-safe list first function The documentation in rculist.h explains the absence of list_empty_rcu() and cautions programmers against relying on a list_empty() -> list_first() sequence in RCU safe code. This is because each of these functions performs its own READ_ONCE() of the list head. This can lead to a situation where the list_empty() sees a valid list entry, but the subsequent list_first() sees a different view of list head state after a modification. In the case of dm-thin, this author had a production box crash from a GP fault in the process_deferred_bios path. This function saw a valid list head in get_first_thin() but when it subsequently dereferenced that and turned it into a thin_c, it got the inside of the struct pool, since the list was now empty and referring to itself. The kernel on which this occurred printed both a warning about a refcount_t being saturated, and a UBSAN error for an out-of-bounds cpuid access in the queued spinlock, prior to the fault itself. When the resulting kdump was examined, it was possible to see another thread patiently waiting in thin_dtr's synchronize_rcu. The thin_dtr call managed to pull the thin_c out of the active thins list (and have it be the last entry in the active_thins list) at just the wrong moment which lead to this crash. Fortunately, the fix here is straight forward. Switch get_first_thin() function to use list_first_or_null_rcu() which performs just a single READ_ONCE() and returns NULL if the list is already empty. This was run against the devicemapper test suite's thin-provisioning suites for delete and suspend and no regressions were observed. Signed-off-by: Krister Johansen <kjlx@templeofstupid.com> Fixes: `b10ebd34cc` ("dm thin: fix rcu_read_lock being held in code that can sleep") Cc: stable@vger.kernel.org Acked-by: Ming-Hung Tsai <mtsai@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>	2025-01-08 15:29:39 +01:00
Mikulas Patocka	47f33c27fc	dm-ebs: don't set the flag DM_TARGET_PASSES_INTEGRITY dm-ebs uses dm-bufio to process requests that are not aligned on logical sector size. dm-bufio doesn't support passing integrity data (and it is unclear how should it do it), so we shouldn't set the DM_TARGET_PASSES_INTEGRITY flag. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Fixes: `d3c7b35c20` ("dm: add emulated block size target")	2025-01-08 15:28:47 +01:00
Christoph Hellwig	7ee7c9b39e	xfs: don't return an error from xfs_update_last_rtgroup_size for !XFS_RT Non-rtg file systems have a fake RT group even if they do not have a RT device, and thus an rgcount of 1. Ensure xfs_update_last_rtgroup_size doesn't fail when called for !XFS_RT to handle this case. Fixes: `87fe4c34a3` ("xfs: create incore realtime group structures") Reported-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2025-01-08 15:04:40 +01:00
Anton Kirilov	95147bb42b	arm64: dts: rockchip: Fix the SD card detection on NanoPi R6C/R6S Fix the SD card detection on FriendlyElec NanoPi R6C/R6S boards. Signed-off-by: Anton Kirilov <anton.kirilov@arm.com> Link: https://lore.kernel.org/r/20241219113145.483205-1-anton.kirilov@arm.com Signed-off-by: Heiko Stuebner <heiko@sntech.de>	2025-01-08 13:18:05 +01:00
Greg Kroah-Hartman	6f79db028e	staging: gpib: mite: remove unused global functions The mite.c file was originally copied from the COMEDI code, and now that it is in the kernel tree, along with the comedi code, on some build configurations there are errors due to duplicate symbols (specifically mite_dma_disarm). Remove all of the unused functions in the gpib mite.c and .h files as they aren't needed and cause the compiler to be confused. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/r/202501081239.BAPhfAHJ-lkp@intel.com/ Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Link: https://lore.kernel.org/r/2025010809-padding-survive-91b3@gregkh Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-01-08 13:15:52 +01:00
Michal Hrusecky	f5b435be70	USB: serial: option: add Neoway N723-EA support Update the USB serial option driver to support Neoway N723-EA. ID 2949:8700 Marvell Mobile Composite Device Bus T: Bus=02 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=2949 ProdID=8700 Rev= 1.00 S: Manufacturer=Marvell S: Product=Mobile Composite Device Bus S: SerialNumber=200806006809080000 C:* #Ifs= 5 Cfg#= 1 Atr=c0 MxPwr=500mA A: FirstIf#= 0 IfCount= 2 Cls=e0(wlcon) Sub=01 Prot=03 I:* If#= 0 Alt= 0 #EPs= 1 Cls=e0(wlcon) Sub=01 Prot=03 Driver=rndis_host E: Ad=87(I) Atr=03(Int.) MxPS= 64 Ivl=4096ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=rndis_host E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=0c(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=89(I) Atr=03(Int.) MxPS= 64 Ivl=4096ms E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=0b(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=86(I) Atr=03(Int.) MxPS= 64 Ivl=4096ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=0e(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 6 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=88(I) Atr=03(Int.) MxPS= 64 Ivl=4096ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=0a(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms Tested successfully connecting to the Internet via rndis interface after dialing via AT commands on If#=4 or If#=6. Not sure of the purpose of the other serial interface. Signed-off-by: Michal Hrusecky <michal.hrusecky@turris.com> Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold <johan@kernel.org>	2025-01-08 11:49:54 +01:00
Chukun Pan	c1947d244f	USB: serial: option: add MeiG Smart SRM815 It looks like SRM815 shares ID with SRM825L. T: Bus=03 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=480 MxCh= 0 D: Ver= 2.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=2dee ProdID=4d22 Rev= 4.14 S: Manufacturer=MEIG S: Product=LTE-A Module S: SerialNumber=123456 C:* #Ifs= 5 Cfg#= 1 Atr=80 MxPwr=500mA I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=83(I) Atr=03(Int.) MxPS= 10 Ivl=32ms E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=85(I) Atr=03(Int.) MxPS= 10 Ivl=32ms E: Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=87(I) Atr=03(Int.) MxPS= 10 Ivl=32ms E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none) E: Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=88(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms Signed-off-by: Chukun Pan <amadeus@jmu.edu.cn> Link: https://lore.kernel.org/lkml/20241215100027.1970930-1-amadeus@jmu.edu.cn/ Link: https://lore.kernel.org/all/4333b4d0-281f-439d-9944-5570cbc4971d@gmail.com/ Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold <johan@kernel.org>	2025-01-08 11:40:38 +01:00
Johan Hovold	854eee93bd	USB: serial: cp210x: add Phoenix Contact UPS Device Phoenix Contact sells UPS Quint devices [1] with a custom datacable [2] that embeds a Silicon Labs converter: Bus 001 Device 003: ID 1b93:1013 Silicon Labs Phoenix Contact UPS Device Device Descriptor: bLength 18 bDescriptorType 1 bcdUSB 2.00 bDeviceClass 0 bDeviceSubClass 0 bDeviceProtocol 0 bMaxPacketSize0 64 idVendor 0x1b93 idProduct 0x1013 bcdDevice 1.00 iManufacturer 1 Silicon Labs iProduct 2 Phoenix Contact UPS Device iSerial 3 <redacted> bNumConfigurations 1 Configuration Descriptor: bLength 9 bDescriptorType 2 wTotalLength 0x0020 bNumInterfaces 1 bConfigurationValue 1 iConfiguration 0 bmAttributes 0x80 (Bus Powered) MaxPower 100mA Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 0 bAlternateSetting 0 bNumEndpoints 2 bInterfaceClass 255 Vendor Specific Class bInterfaceSubClass 0 bInterfaceProtocol 0 iInterface 2 Phoenix Contact UPS Device Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x01 EP 1 OUT bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0040 1x 64 bytes bInterval 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x82 EP 2 IN bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0040 1x 64 bytes bInterval 0 [1] https://www.phoenixcontact.com/en-pc/products/power-supply-unit-quint-ps-1ac-24dc-10-2866763 [2] https://www.phoenixcontact.com/en-il/products/data-cable-preassembled-ifs-usb-datacable-2320500 Reported-by: Giuseppe Corbelli <giuseppe.corbelli@antaresvision.com> Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold <johan@kernel.org>	2025-01-08 11:25:47 +01:00
Binbin Zhou	e59f4c9717	gpio: loongson: Fix Loongson-2K2000 ACPI GPIO register offset Since commit `3feb70a617` ("gpio: loongson: add more gpio chip support"), the Loongson-2K2000 GPIO is supported. However, according to the firmware development specification, the Loongson-2K2000 ACPI GPIO register offsets in the driver do not match the register base addresses in the firmware, resulting in the registers not being accessed properly. Now, we fix it to ensure the GPIO function works properly. Cc: stable@vger.kernel.org Cc: Yinbo Zhu <zhuyinbo@loongson.cn> Fixes: `3feb70a617` ("gpio: loongson: add more gpio chip support") Co-developed-by: Hongliang Wang <wanghongliang@loongson.cn> Signed-off-by: Hongliang Wang <wanghongliang@loongson.cn> Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn> Link: https://lore.kernel.org/r/20250107103856.1037222-1-zhoubinbin@loongson.cn Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>	2025-01-08 09:54:20 +01:00
Suraj Kandpal	77bf21a03a	Revert "drm/i915/hdcp: Don't enable HDCP1.4 directly from check_link" This reverts commit `483f7d94a0`. This needs to be reverted since HDCP even after updating the connector state HDCP property we don't reenable HDCP until the next commit in which the CP Property is set causing compliance to fail. --v2 -Fix build issue [Dnyaneshwar] Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com> Reviewed-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250103084517.239998-1-suraj.kandpal@intel.com (cherry picked from commit `fcf73e20cd`) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>	2025-01-08 08:53:35 +00:00
Jakub Kicinski	db78475ba0	eth: gve: use appropriate helper to set xdp_features Commit `f85949f982` ("xdp: add xdp_set_features_flag utility routine") added routines to inform the core about XDP flag changes. GVE support was added around the same time and missed using them. GVE only changes the flags on error recover or resume. Presumably the flags may change during resume if VM migrated. User would not get the notification and upper devices would not get a chance to recalculate their flags. Fixes: `75eaae158b` ("gve: Add XDP DROP and TX support for GQI-QPL format") Reviewed-By: Jeroen de Borst <jeroendb@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250106180210.1861784-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-07 18:07:14 -08:00
Kuniyuki Iwashima	cb358ff941	ipvlan: Fix use-after-free in ipvlan_get_iflink(). syzbot presented an use-after-free report [0] regarding ipvlan and linkwatch. ipvlan does not hold a refcnt of the lower device unlike vlan and macvlan. If the linkwatch work is triggered for the ipvlan dev, the lower dev might have already been freed, resulting in UAF of ipvlan->phy_dev in ipvlan_get_iflink(). We can delay the lower dev unregistration like vlan and macvlan by holding the lower dev's refcnt in dev->netdev_ops->ndo_init() and releasing it in dev->priv_destructor(). Jakub pointed out calling .ndo_XXX after unregister_netdevice() has returned is error prone and suggested [1] addressing this UAF in the core by taking commit `750e516033` ("net: avoid potential UAF in default_operstate()") further. Let's assume unregistering devices DOWN and use RCU protection in default_operstate() not to race with the device unregistration. [0]: BUG: KASAN: slab-use-after-free in ipvlan_get_iflink+0x84/0x88 drivers/net/ipvlan/ipvlan_main.c:353 Read of size 4 at addr ffff0000d768c0e0 by task kworker/u8:35/6944 CPU: 0 UID: 0 PID: 6944 Comm: kworker/u8:35 Not tainted 6.13.0-rc2-g9bc5c9515b48 #12 4c3cb9e8b4565456f6a355f312ff91f4f29b3c47 Hardware name: linux,dummy-virt (DT) Workqueue: events_unbound linkwatch_event Call trace: show_stack+0x38/0x50 arch/arm64/kernel/stacktrace.c:484 (C) __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0xbc/0x108 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:378 [inline] print_report+0x16c/0x6f0 mm/kasan/report.c:489 kasan_report+0xc0/0x120 mm/kasan/report.c:602 __asan_report_load4_noabort+0x20/0x30 mm/kasan/report_generic.c:380 ipvlan_get_iflink+0x84/0x88 drivers/net/ipvlan/ipvlan_main.c:353 dev_get_iflink+0x7c/0xd8 net/core/dev.c:674 default_operstate net/core/link_watch.c:45 [inline] rfc2863_policy+0x144/0x360 net/core/link_watch.c:72 linkwatch_do_dev+0x60/0x228 net/core/link_watch.c:175 __linkwatch_run_queue+0x2f4/0x5b8 net/core/link_watch.c:239 linkwatch_event+0x64/0xa8 net/core/link_watch.c:282 process_one_work+0x700/0x1398 kernel/workqueue.c:3229 process_scheduled_works kernel/workqueue.c:3310 [inline] worker_thread+0x8c4/0xe10 kernel/workqueue.c:3391 kthread+0x2b0/0x360 kernel/kthread.c:389 ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:862 Allocated by task 9303: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x30/0x68 mm/kasan/common.c:68 kasan_save_alloc_info+0x44/0x58 mm/kasan/generic.c:568 poison_kmalloc_redzone mm/kasan/common.c:377 [inline] __kasan_kmalloc+0x84/0xa0 mm/kasan/common.c:394 kasan_kmalloc include/linux/kasan.h:260 [inline] __do_kmalloc_node mm/slub.c:4283 [inline] __kmalloc_node_noprof+0x2a0/0x560 mm/slub.c:4289 __kvmalloc_node_noprof+0x9c/0x230 mm/util.c:650 alloc_netdev_mqs+0xb4/0x1118 net/core/dev.c:11209 rtnl_create_link+0x2b8/0xb60 net/core/rtnetlink.c:3595 rtnl_newlink_create+0x19c/0x868 net/core/rtnetlink.c:3771 __rtnl_newlink net/core/rtnetlink.c:3896 [inline] rtnl_newlink+0x122c/0x15c0 net/core/rtnetlink.c:4011 rtnetlink_rcv_msg+0x61c/0x918 net/core/rtnetlink.c:6901 netlink_rcv_skb+0x1dc/0x398 net/netlink/af_netlink.c:2542 rtnetlink_rcv+0x34/0x50 net/core/rtnetlink.c:6928 netlink_unicast_kernel net/netlink/af_netlink.c:1321 [inline] netlink_unicast+0x618/0x838 net/netlink/af_netlink.c:1347 netlink_sendmsg+0x5fc/0x8b0 net/netlink/af_netlink.c:1891 sock_sendmsg_nosec net/socket.c:711 [inline] __sock_sendmsg net/socket.c:726 [inline] __sys_sendto+0x2ec/0x438 net/socket.c:2197 __do_sys_sendto net/socket.c:2204 [inline] __se_sys_sendto net/socket.c:2200 [inline] __arm64_sys_sendto+0xe4/0x110 net/socket.c:2200 __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline] invoke_syscall+0x90/0x278 arch/arm64/kernel/syscall.c:49 el0_svc_common+0x13c/0x250 arch/arm64/kernel/syscall.c:132 do_el0_svc+0x54/0x70 arch/arm64/kernel/syscall.c:151 el0_svc+0x4c/0xa8 arch/arm64/kernel/entry-common.c:744 el0t_64_sync_handler+0x78/0x108 arch/arm64/kernel/entry-common.c:762 el0t_64_sync+0x198/0x1a0 arch/arm64/kernel/entry.S:600 Freed by task 10200: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x30/0x68 mm/kasan/common.c:68 kasan_save_free_info+0x58/0x70 mm/kasan/generic.c:582 poison_slab_object mm/kasan/common.c:247 [inline] __kasan_slab_free+0x48/0x68 mm/kasan/common.c:264 kasan_slab_free include/linux/kasan.h:233 [inline] slab_free_hook mm/slub.c:2338 [inline] slab_free mm/slub.c:4598 [inline] kfree+0x140/0x420 mm/slub.c:4746 kvfree+0x4c/0x68 mm/util.c:693 netdev_release+0x94/0xc8 net/core/net-sysfs.c:2034 device_release+0x98/0x1c0 kobject_cleanup lib/kobject.c:689 [inline] kobject_release lib/kobject.c:720 [inline] kref_put include/linux/kref.h:65 [inline] kobject_put+0x2b0/0x438 lib/kobject.c:737 netdev_run_todo+0xdd8/0xf48 net/core/dev.c:10924 rtnl_unlock net/core/rtnetlink.c:152 [inline] rtnl_net_unlock net/core/rtnetlink.c:209 [inline] rtnl_dellink+0x484/0x680 net/core/rtnetlink.c:3526 rtnetlink_rcv_msg+0x61c/0x918 net/core/rtnetlink.c:6901 netlink_rcv_skb+0x1dc/0x398 net/netlink/af_netlink.c:2542 rtnetlink_rcv+0x34/0x50 net/core/rtnetlink.c:6928 netlink_unicast_kernel net/netlink/af_netlink.c:1321 [inline] netlink_unicast+0x618/0x838 net/netlink/af_netlink.c:1347 netlink_sendmsg+0x5fc/0x8b0 net/netlink/af_netlink.c:1891 sock_sendmsg_nosec net/socket.c:711 [inline] __sock_sendmsg net/socket.c:726 [inline] ____sys_sendmsg+0x410/0x708 net/socket.c:2583 ___sys_sendmsg+0x178/0x1d8 net/socket.c:2637 __sys_sendmsg net/socket.c:2669 [inline] __do_sys_sendmsg net/socket.c:2674 [inline] __se_sys_sendmsg net/socket.c:2672 [inline] __arm64_sys_sendmsg+0x12c/0x1c8 net/socket.c:2672 __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline] invoke_syscall+0x90/0x278 arch/arm64/kernel/syscall.c:49 el0_svc_common+0x13c/0x250 arch/arm64/kernel/syscall.c:132 do_el0_svc+0x54/0x70 arch/arm64/kernel/syscall.c:151 el0_svc+0x4c/0xa8 arch/arm64/kernel/entry-common.c:744 el0t_64_sync_handler+0x78/0x108 arch/arm64/kernel/entry-common.c:762 el0t_64_sync+0x198/0x1a0 arch/arm64/kernel/entry.S:600 The buggy address belongs to the object at ffff0000d768c000 which belongs to the cache kmalloc-cg-4k of size 4096 The buggy address is located 224 bytes inside of freed 4096-byte region [ffff0000d768c000, ffff0000d768d000) The buggy address belongs to the physical page: page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x117688 head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 memcg:ffff0000c77ef981 flags: 0xbfffe0000000040(head\|node=0\|zone=2\|lastcpupid=0x1ffff) page_type: f5(slab) raw: 0bfffe0000000040 ffff0000c000f500 dead000000000100 dead000000000122 raw: 0000000000000000 0000000000040004 00000001f5000000 ffff0000c77ef981 head: 0bfffe0000000040 ffff0000c000f500 dead000000000100 dead000000000122 head: 0000000000000000 0000000000040004 00000001f5000000 ffff0000c77ef981 head: 0bfffe0000000003 fffffdffc35da201 ffffffffffffffff 0000000000000000 head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff0000d768bf80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff0000d768c000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >ffff0000d768c080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff0000d768c100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff0000d768c180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb Fixes: `8c55facecd` ("net: linkwatch: only report IF_OPER_LOWERLAYERDOWN if iflink is actually down") Reported-by: syzkaller <syzkaller@googlegroups.com> Suggested-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/netdev/20250102174400.085fd8ac@kernel.org/ [1] Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250106071911.64355-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-07 17:50:49 -08:00
Benjamin Coddington	b341ca51d2	tls: Fix tls_sw_sendmsg error handling We've noticed that NFS can hang when using RPC over TLS on an unstable connection, and investigation shows that the RPC layer is stuck in a tight loop attempting to transmit, but forever getting -EBADMSG back from the underlying network. The loop begins when tcp_sendmsg_locked() returns -EPIPE to tls_tx_records(), but that error is converted to -EBADMSG when calling the socket's error reporting handler. Instead of converting errors from tcp_sendmsg_locked(), let's pass them along in this path. The RPC layer handles -EPIPE by reconnecting the transport, which prevents the endless attempts to transmit on a broken connection. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Fixes: `a42055e8d2` ("net/tls: Add support for async encryption of records for performance") Link: https://patch.msgid.link/9594185559881679d81f071b181a10eb07cd079f.1736004079.git.bcodding@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-07 17:00:19 -08:00
Namjae Jeon	e8580b4c60	ksmbd: Implement new SMB3 POSIX type As SMB3 posix extension specification, Give posix file type to posix mode. https://www.samba.org/~slow/SMB3_POSIX/fscc_posix_extensions.html#posix-file-type-definition Cc: stable@vger.kernel.org Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-01-07 18:48:49 -06:00
Daniil Stas	82163d63ae	hwmon: (drivetemp) Fix driver producing garbage data when SCSI errors occur scsi_execute_cmd() function can return both negative (linux codes) and positive (scsi_cmnd result field) error codes. Currently the driver just passes error codes of scsi_execute_cmd() to hwmon core, which is incorrect because hwmon only checks for negative error codes. This leads to hwmon reporting uninitialized data to userspace in case of SCSI errors (for example if the disk drive was disconnected). This patch checks scsi_execute_cmd() output and returns -EIO if it's error code is positive. Fixes: `5b46903d8b` ("hwmon: Driver for disk and solid state drives with temperature sensors") Signed-off-by: Daniil Stas <daniil.stas@posteo.net> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Chris Healy <cphealy@gmail.com> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Bart Van Assche <bvanassche@acm.org> Cc: linux-kernel@vger.kernel.org Cc: linux-scsi@vger.kernel.org Cc: linux-ide@vger.kernel.org Cc: linux-hwmon@vger.kernel.org Link: https://lore.kernel.org/r/20250105213618.531691-1-daniil.stas@posteo.net [groeck: Avoid inline variable declaration for portability] Signed-off-by: Guenter Roeck <linux@roeck-us.net>	2025-01-07 16:36:01 -08:00
Rick Edgecombe	a9d9c33132	x86/fpu: Ensure shadow stack is active before "getting" registers The x86 shadow stack support has its own set of registers. Those registers are XSAVE-managed, but they are "supervisor state components" which means that userspace can not touch them with XSAVE/XRSTOR. It also means that they are not accessible from the existing ptrace ABI for XSAVE state. Thus, there is a new ptrace get/set interface for it. The regset code that ptrace uses provides an ->active() handler in addition to the get/set ones. For shadow stack this ->active() handler verifies that shadow stack is enabled via the ARCH_SHSTK_SHSTK bit in the thread struct. The ->active() handler is checked from some call sites of the regset get/set handlers, but not the ptrace ones. This was not understood when shadow stack support was put in place. As a result, both the set/get handlers can be called with XFEATURE_CET_USER in its init state, which would cause get_xsave_addr() to return NULL and trigger a WARN_ON(). The ssp_set() handler luckily has an ssp_active() check to avoid surprising the kernel with shadow stack behavior when the kernel is not ready for it (ARCH_SHSTK_SHSTK==0). That check just happened to avoid the warning. But the ->get() side wasn't so lucky. It can be called with shadow stacks disabled, triggering the warning in practice, as reported by Christina Schimpe: WARNING: CPU: 5 PID: 1773 at arch/x86/kernel/fpu/regset.c:198 ssp_get+0x89/0xa0 [...] Call Trace: <TASK> ? show_regs+0x6e/0x80 ? ssp_get+0x89/0xa0 ? __warn+0x91/0x150 ? ssp_get+0x89/0xa0 ? report_bug+0x19d/0x1b0 ? handle_bug+0x46/0x80 ? exc_invalid_op+0x1d/0x80 ? asm_exc_invalid_op+0x1f/0x30 ? __pfx_ssp_get+0x10/0x10 ? ssp_get+0x89/0xa0 ? ssp_get+0x52/0xa0 __regset_get+0xad/0xf0 copy_regset_to_user+0x52/0xc0 ptrace_regset+0x119/0x140 ptrace_request+0x13c/0x850 ? wait_task_inactive+0x142/0x1d0 ? do_syscall_64+0x6d/0x90 arch_ptrace+0x102/0x300 [...] Ensure that shadow stacks are active in a thread before looking them up in the XSAVE buffer. Since ARCH_SHSTK_SHSTK and user_ssp[SHSTK_EN] are set at the same time, the active check ensures that there will be something to find in the XSAVE buffer. [ dhansen: changelog/subject tweaks ] Fixes: `2fab02b25a` ("x86: Add PTRACE interface for shadow stack") Reported-by: Christina Schimpe <christina.schimpe@intel.com> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Tested-by: Christina Schimpe <christina.schimpe@intel.com> Cc:stable@vger.kernel.org Link: https://lore.kernel.org/all/20250107233056.235536-1-rick.p.edgecombe%40intel.com	2025-01-07 15:55:51 -08:00
He Wang	2ac538e402	ksmbd: fix unexpectedly changed path in ksmbd_vfs_kern_path_locked When `ksmbd_vfs_kern_path_locked` met an error and it is not the last entry, it will exit without restoring changed path buffer. But later this buffer may be used as the filename for creation. Fixes: `c5a709f08d` ("ksmbd: handle caseless file creation") Signed-off-by: He Wang <xw897002528@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-01-07 17:29:17 -06:00
Linus Torvalds	09a0fa92e5	Merge tag 'selinux-pr-20250107' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux fix from Paul Moore: "A single SELinux patch to address a problem with a single domain using multiple xperm classes" * tag 'selinux-pr-20250107' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: match extended permissions to their base permissions	2025-01-07 14:49:48 -08:00
Su Hui	95978931d5	eth: fbnic: Revert "eth: fbnic: Add hardware monitoring support via HWMON interface" There is a garbage value problem in fbnic_mac_get_sensor_asic(). 'fw_cmpl' is uninitialized which makes 'sensor' and '*val' to be stored garbage value. Revert commit `d85ebade02` ("eth: fbnic: Add hardware monitoring support via HWMON interface") to avoid this problem. Fixes: `d85ebade02` ("eth: fbnic: Add hardware monitoring support via HWMON interface") Signed-off-by: Su Hui <suhui@nfschina.com> Suggested-by: Jakub Kicinski <kuba@kernel.org> Suggested-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://patch.msgid.link/20250106023647.47756-1-suhui@nfschina.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-07 12:13:04 -08:00
Joe Hattori	9164e0912a	thermal: of: fix OF node leak in of_thermal_zone_find() of_thermal_zone_find() calls of_parse_phandle_with_args(), but does not release the OF node reference obtained by it. Add a of_node_put() call when the call is successful. Fixes: `3fd6d6e2b4` ("thermal/of: Rework the thermal device tree initialization") Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp> Link: https://patch.msgid.link/20241224031809.950461-1-joe@pf.is.s.u-tokyo.ac.jp [ rjw: Changelog edit ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2025-01-07 20:05:32 +01:00
Hans de Goede	cd4a7b2e6a	ACPI: resource: acpi_dev_irq_override(): Check DMI match last acpi_dev_irq_override() gets called approx. 30 times during boot (15 legacy IRQs * 2 override_table entries). Of these 30 calls at max 1 will match the non DMI checks done by acpi_dev_irq_override(). The dmi_check_system() check is by far the most expensive check done by acpi_dev_irq_override(), make this call the last check done by acpi_dev_irq_override() so that it will be called at max 1 time instead of 30 times. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://patch.msgid.link/20241228165253.42584-1-hdegoede@redhat.com [ rjw: Subject edit ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2025-01-07 19:26:48 +01:00
Hans de Goede	7ed4e4a659	ACPI: resource: Add TongFang GM5HG0A to irq1_edge_low_force_override[] The TongFang GM5HG0A is a TongFang barebone design which is sold under various brand names. The ACPI IRQ override for the keyboard IRQ must be used on these AMD Zen laptops in order for the IRQ to work. At least on the SKIKK Vanaheim variant the DMI product- and board-name strings have been replaced by the OEM with "Vanaheim" so checking that board-name contains "GM5HG0A" as is usually done for TongFang barebones quirks does not work. The DMI OEM strings do contain "GM5HG0A". I have looked at the dmidecode for a few other TongFang devices and the TongFang code-name string being in the OEM strings seems to be something which is consistently true. Add a quirk checking one of the DMI_OEM_STRING(s) is "GM5HG0A" in the hope that this will work for other OEM versions of the "GM5HG0A" too. Link: https://www.skikk.eu/en/laptops/vanaheim-15-rtx-4060 Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219614 Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patch.msgid.link/20241228164845.42381-1-hdegoede@redhat.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2025-01-07 19:23:05 +01:00
Hans de Goede	66d337fede	ACPI: resource: Add Asus Vivobook X1504VAP to irq1_level_low_skip_override[] Like the Vivobook X1704VAP the X1504VAP has its keyboard IRQ (1) described as ActiveLow in the DSDT, which the kernel overrides to EdgeHigh which breaks the keyboard. Add the X1504VAP to the irq1_level_low_skip_override[] quirk table to fix this. Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219224 Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patch.msgid.link/20241220181352.25974-1-hdegoede@redhat.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2025-01-07 19:19:44 +01:00
Danilo Krummrich	b4aee757f1	MAINTAINERS: align Danilo's maintainer entries Some entries use my kernel.org address, while others use my Red Hat one. Since this is a bit of an inconvinience for me, align them to all use the same (kernel.org) address. Signed-off-by: Danilo Krummrich <dakr@kernel.org> Link: https://lore.kernel.org/r/20241204152248.8644-1-dakr@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-01-07 19:01:50 +01:00
En-Wei Wu	bd2776e39c	igc: return early when failing to read EECD register When booting with a dock connected, the igc driver may get stuck for ~40 seconds if PCIe link is lost during initialization. This happens because the driver access device after EECD register reads return all F's, indicating failed reads. Consequently, hw->hw_addr is set to NULL, which impacts subsequent rd32() reads. This leads to the driver hanging in igc_get_hw_semaphore_i225(), as the invalid hw->hw_addr prevents retrieving the expected value. To address this, a validation check and a corresponding return value catch is added for the EECD register read result. If all F's are returned, indicating PCIe link loss, the driver will return -ENXIO immediately. This avoids the 40-second hang and significantly improves boot time when using a dock with an igc NIC. Log before the patch: [ 0.911913] igc 0000:70:00.0: enabling device (0000 -> 0002) [ 0.912386] igc 0000:70:00.0: PTM enabled, 4ns granularity [ 1.571098] igc 0000:70:00.0 (unnamed net_device) (uninitialized): PCIe link lost, device now detached [ 43.449095] igc_get_hw_semaphore_i225: igc 0000:70:00.0 (unnamed net_device) (uninitialized): Driver can't access device - SMBI bit is set. [ 43.449186] igc 0000:70:00.0: probe with driver igc failed with error -13 [ 46.345701] igc 0000:70:00.0: enabling device (0000 -> 0002) [ 46.345777] igc 0000:70:00.0: PTM enabled, 4ns granularity Log after the patch: [ 1.031000] igc 0000:70:00.0: enabling device (0000 -> 0002) [ 1.032097] igc 0000:70:00.0: PTM enabled, 4ns granularity [ 1.642291] igc 0000:70:00.0 (unnamed net_device) (uninitialized): PCIe link lost, device now detached [ 5.480490] igc 0000:70:00.0: enabling device (0000 -> 0002) [ 5.480516] igc 0000:70:00.0: PTM enabled, 4ns granularity Fixes: `ab40561268` ("igc: Add NVM support") Cc: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com> Signed-off-by: En-Wei Wu <en-wei.wu@canonical.com> Reviewed-by: Vitaly Lifshits <vitaly.lifshits@intel.com> Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2025-01-07 09:01:15 -08:00
Przemyslaw Korba	6c5b989116	ice: fix incorrect PHY settings for 100 GB/s ptp4l application reports too high offset when ran on E823 device with a 100GB/s link. Those values cannot go under 100ns, like in a working case when using 100 GB/s cable. This is due to incorrect frequency settings on the PHY clocks for 100 GB/s speed. Changes are introduced to align with the internal hardware documentation, and correctly initialize frequency in PHY clocks with the frequency values that are in our HW spec. To reproduce the issue run ptp4l as a Time Receiver on E823 device, and observe the offset, which will never approach values seen in the PTP working case. Reproduction output: ptp4l -i enp137s0f3 -m -2 -s -f /etc/ptp4l_8275.conf ptp4l[5278.775]: master offset 12470 s2 freq +41288 path delay -3002 ptp4l[5278.837]: master offset 10525 s2 freq +39202 path delay -3002 ptp4l[5278.900]: master offset -24840 s2 freq -20130 path delay -3002 ptp4l[5278.963]: master offset 10597 s2 freq +37908 path delay -3002 ptp4l[5279.025]: master offset 8883 s2 freq +36031 path delay -3002 ptp4l[5279.088]: master offset 7267 s2 freq +34151 path delay -3002 ptp4l[5279.150]: master offset 5771 s2 freq +32316 path delay -3002 ptp4l[5279.213]: master offset 4388 s2 freq +30526 path delay -3002 ptp4l[5279.275]: master offset -30434 s2 freq -28485 path delay -3002 ptp4l[5279.338]: master offset -28041 s2 freq -27412 path delay -3002 ptp4l[5279.400]: master offset 7870 s2 freq +31118 path delay -3002 Fixes: `3a7496234d` ("ice: implement basic E822 PTP support") Reviewed-by: Milena Olech <milena.olech@intel.com> Signed-off-by: Przemyslaw Korba <przemyslaw.korba@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2025-01-07 09:01:15 -08:00
Arkadiusz Kubalewski	65104599b3	ice: fix max values for dpll pin phase adjust Mask admin command returned max phase adjust value for both input and output pins. Only 31 bits are relevant, last released data sheet wrongly points that 32 bits are valid - see [1] 3.2.6.4.1 Get CCU Capabilities Command for reference. Fix of the datasheet itself is in progress. Fix the min/max assignment logic, previously the value was wrongly considered as negative value due to most significant bit being set. Example of previous broken behavior: $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/dpll.yaml \ --do pin-get --json '{"id":1}'\| grep phase-adjust 'phase-adjust': 0, 'phase-adjust-max': 16723, 'phase-adjust-min': -16723, Correct behavior with the fix: $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/dpll.yaml \ --do pin-get --json '{"id":1}'\| grep phase-adjust 'phase-adjust': 0, 'phase-adjust-max': 2147466925, 'phase-adjust-min': -2147466925, [1] https://cdrdv2.intel.com/v1/dl/getContent/613875?explicitVersion=true Fixes: `90e1c90750` ("ice: dpll: implement phase related callbacks") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2025-01-07 09:01:15 -08:00
Li Huafei	cbd399f78e	topology: Keep the cpumask unchanged when printing cpumap During fuzz testing, the following warning was discovered: different return values (15 and 11) from vsnprintf("%*pbl ", ...) test:keyward is WARNING in kvasprintf WARNING: CPU: 55 PID: 1168477 at lib/kasprintf.c:30 kvasprintf+0x121/0x130 Call Trace: kvasprintf+0x121/0x130 kasprintf+0xa6/0xe0 bitmap_print_to_buf+0x89/0x100 core_siblings_list_read+0x7e/0xb0 kernfs_file_read_iter+0x15b/0x270 new_sync_read+0x153/0x260 vfs_read+0x215/0x290 ksys_read+0xb9/0x160 do_syscall_64+0x56/0x100 entry_SYSCALL_64_after_hwframe+0x78/0xe2 The call trace shows that kvasprintf() reported this warning during the printing of core_siblings_list. kvasprintf() has several steps: (1) First, calculate the length of the resulting formatted string. (2) Allocate a buffer based on the returned length. (3) Then, perform the actual string formatting. (4) Check whether the lengths of the formatted strings returned in steps (1) and (2) are consistent. If the core_cpumask is modified between steps (1) and (3), the lengths obtained in these two steps may not match. Indeed our test includes cpu hotplugging, which should modify core_cpumask while printing. To fix this issue, cache the cpumask into a temporary variable before calling cpumap_print_{list, cpumask}_to_buf(), to keep it unchanged during the printing process. Fixes: `bb9ec13d15` ("topology: use bin_attribute to break the size limitation of cpumap ABI") Cc: stable <stable@kernel.org> Signed-off-by: Li Huafei <lihuafei1@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20241114110141.94725-1-lihuafei1@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-01-07 17:58:08 +01:00
Christoph Schlameuss	e376d95887	KVM: s390: selftests: Add has device attr check to uc_attr_mem_limit selftest Fixup the uc_attr_mem_limit test case to also cover the KVM_HAS_DEVICE_ATTR ioctl. Signed-off-by: Christoph Schlameuss <schlameuss@linux.ibm.com> Tested-by: Hariharan Mari <hari55@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Link: https://lore.kernel.org/r/20241216092140.329196-7-schlameuss@linux.ibm.com Message-ID: <20241216092140.329196-7-schlameuss@linux.ibm.com> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>	2025-01-07 16:36:11 +01:00
Christoph Schlameuss	b1da33b0e3	KVM: s390: selftests: Add ucontrol gis routing test Add a selftests for the interrupt routing configuration when using ucontrol VMs. Calling the test may trigger a null pointer dereferences on kernels not containing the fixes in this patch series. Signed-off-by: Christoph Schlameuss <schlameuss@linux.ibm.com> Tested-by: Hariharan Mari <hari55@linux.ibm.com> Reviewed-by: Hariharan Mari <hari55@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Link: https://lore.kernel.org/r/20241216092140.329196-5-schlameuss@linux.ibm.com Message-ID: <20241216092140.329196-5-schlameuss@linux.ibm.com> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>	2025-01-07 16:36:11 +01:00
Christoph Schlameuss	5021fd77d6	KVM: s390: Reject KVM_SET_GSI_ROUTING on ucontrol VMs Prevent null pointer dereference when processing KVM_IRQ_ROUTING_S390_ADAPTER routing entries. The ioctl cannot be processed for ucontrol VMs. Fixes: `f65470661f` ("KVM: s390/interrupt: do not pin adapter interrupt pages") Signed-off-by: Christoph Schlameuss <schlameuss@linux.ibm.com> Tested-by: Hariharan Mari <hari55@linux.ibm.com> Reviewed-by: Hariharan Mari <hari55@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Link: https://lore.kernel.org/r/20241216092140.329196-4-schlameuss@linux.ibm.com Message-ID: <20241216092140.329196-4-schlameuss@linux.ibm.com> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>	2025-01-07 16:36:11 +01:00
Christoph Schlameuss	b07f6a30c7	KVM: s390: selftests: Add ucontrol flic attr selftests Add some superficial selftests for the floating interrupt controller when using ucontrol VMs. These tests are intended to cover very basic calls only. Some of the calls may trigger null pointer dereferences on kernels not containing the fixes in this patch series. Signed-off-by: Christoph Schlameuss <schlameuss@linux.ibm.com> Tested-by: Hariharan Mari <hari55@linux.ibm.com> Reviewed-by: Hariharan Mari <hari55@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Link: https://lore.kernel.org/r/20241216092140.329196-3-schlameuss@linux.ibm.com Message-ID: <20241216092140.329196-3-schlameuss@linux.ibm.com> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>	2025-01-07 16:36:10 +01:00
Christoph Schlameuss	df989238fa	KVM: s390: Reject setting flic pfault attributes on ucontrol VMs Prevent null pointer dereference when processing the KVM_DEV_FLIC_APF_ENABLE and KVM_DEV_FLIC_APF_DISABLE_WAIT ioctls in the interrupt controller. Fixes: `3c038e6be0` ("KVM: async_pf: Async page fault support on s390") Reported-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Christoph Schlameuss <schlameuss@linux.ibm.com> Reviewed-by: Hariharan Mari <hari55@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Link: https://lore.kernel.org/r/20241216092140.329196-2-schlameuss@linux.ibm.com Message-ID: <20241216092140.329196-2-schlameuss@linux.ibm.com> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>	2025-01-07 16:36:10 +01:00
Claudio Imbrenda	984aaf6161	KVM: s390: vsie: fix virtual/physical address in unpin_scb() In commit `77b5334115` ("KVM: s390: VSIE: sort out virtual/physical address in pin_guest_page"), only pin_scb() has been updated. This means that in unpin_scb() a virtual address was still used directly as physical address without conversion. The resulting physical address is obviously wrong and most of the time also invalid. Since commit `d0ef8d9fbe` ("KVM: s390: Use kvm_release_page_dirty() to unpin "struct page" memory"), unpin_guest_page() will directly use kvm_release_page_dirty(), instead of kvm_release_pfn_dirty(), which has since been removed. One of the checks that were performed by kvm_release_pfn_dirty() was to verify whether the page was valid at all, and silently return successfully without doing anything if the page was invalid. When kvm_release_pfn_dirty() was still used, the invalid page was thus silently ignored. Now the check is gone and the result is an Oops. This also means that when running with a V!=R kernel, the page was not released, causing a leak. The solution is simply to add the missing virt_to_phys(). Fixes: `77b5334115` ("KVM: s390: VSIE: sort out virtual/physical address in pin_guest_page") Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Nico Boehr <nrb@linux.ibm.com> Link: https://lore.kernel.org/r/20241210083948.23963-1-imbrenda@linux.ibm.com Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Message-ID: <20241210083948.23963-1-imbrenda@linux.ibm.com>	2025-01-07 16:36:10 +01:00
David E. Box	1d7461d0c8	platform/x86: intel/pmc: Fix ioremap() of bad address In pmc_core_ssram_get_pmc(), the physical addresses for hidden SSRAM devices are retrieved from the MMIO region of the primary SSRAM device. If additional devices are not present, the address returned is zero. Currently, the code does not check for this condition, resulting in ioremap() incorrectly attempting to map address 0. Add a check for a zero address and return 0 if no additional devices are found, as it is not an error for the device to be absent. Fixes: `a01486dc4b` ("platform/x86/intel/pmc: Cleanup SSRAM discovery") Signed-off-by: David E. Box <david.e.box@linux.intel.com> Link: https://lore.kernel.org/r/20250106174653.1497128-1-david.e.box@linux.intel.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>	2025-01-07 17:30:11 +02:00
Srinivas Pandruvada	cc1ff7bc1b	platform/x86: ISST: Add Clearwater Forest to support list Add Clearwater Forest (INTEL_ATOM_DARKMONT_X) to SST support list by adding to isst_cpu_ids. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Link: https://lore.kernel.org/r/20250103155255.1488139-2-srinivas.pandruvada@linux.intel.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>	2025-01-07 17:27:40 +02:00
Srinivas Pandruvada	bee9a0838f	platform/x86/intel: power-domains: Add Clearwater Forest support Add Clearwater Forest support (INTEL_ATOM_DARKMONT_X) to tpmi_cpu_ids to support domaid id mappings. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Link: https://lore.kernel.org/r/20250103155255.1488139-1-srinivas.pandruvada@linux.intel.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>	2025-01-07 17:27:38 +02:00
Maciej S. Szmigiero	dd410d7844	platform/x86/amd/pmc: Only disable IRQ1 wakeup where i8042 actually enabled it Wakeup for IRQ1 should be disabled only in cases where i8042 had actually enabled it, otherwise "wake_depth" for this IRQ will try to drop below zero and there will be an unpleasant WARN() logged: kernel: atkbd serio0: Disabling IRQ1 wakeup source to avoid platform firmware bug kernel: ------------[ cut here ]------------ kernel: Unbalanced IRQ 1 wake disable kernel: WARNING: CPU: 10 PID: 6431 at kernel/irq/manage.c:920 irq_set_irq_wake+0x147/0x1a0 The PMC driver uses DEFINE_SIMPLE_DEV_PM_OPS() to define its dev_pm_ops which sets amd_pmc_suspend_handler() to the .suspend, .freeze, and .poweroff handlers. i8042_pm_suspend(), however, is only set as the .suspend handler. Fix the issue by call PMC suspend handler only from the same set of dev_pm_ops handlers as i8042_pm_suspend(), which currently means just the .suspend handler. To reproduce this issue try hibernating (S4) the machine after a fresh boot without putting it into s2idle first. Fixes: `8e60615e89` ("platform/x86/amd: pmc: Disable IRQ1 wakeup for RN/CZN") Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name> Link: https://lore.kernel.org/r/c8f28c002ca3c66fbeeb850904a1f43118e17200.1736184606.git.mail@maciej.szmigiero.name [ij: edited the commit message.] Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>	2025-01-07 17:25:18 +02:00
Al Viro	24edfbdedf	debugfs: fix missing mutex_destroy() in short_fops case we need that in ->real_fops == NULL, ->short_fops != NULL case Fixes: `8dc6d81c6b` "debugfs: add small file operations for most files" Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Link: https://lore.kernel.org/r/20241229081223.3193228-1-viro@zeniv.linux.org.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-01-07 16:20:11 +01:00
Johannes Berg	f8f25893a4	fs: debugfs: differentiate short fops with proxy ops Geert reported that my previous short fops debugfs changes broke m68k, because it only has mandatory alignement of two, so we can't stash the "is it short" information into the pointer (as we already did with the "is it real" bit.) Instead, exploit the fact that debugfs_file_get() called on an already open file will already find that the fsdata is no longer the real fops but rather the allocated data that already distinguishes full/short ops, so only open() needs to be able to distinguish. We can achieve that by using two different open functions. Unfortunately this requires another set of full file ops, increasing the size by 536 bytes (x86-64), but that's still a reasonable trade-off given that only converting some of the wireless stack gained over 28k. This brings the total cost of this to around 1k, for wins of 28k (all x86-64). Reported-and-tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Link: https://lore.kernel.org/CAMuHMdWu_9-L2Te101w8hU7H_2yobJFPXSwwUmGHSJfaPWDKiQ@mail.gmail.com Fixes: `8dc6d81c6b` ("debugfs: add small file operations for most files") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Link: https://lore.kernel.org/r/20241129121536.30989-2-johannes@sipsolutions.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-01-07 16:20:08 +01:00
David Howells	8fd56ad6e7	afs: Fix the maximum cell name length The kafs filesystem limits the maximum length of a cell to 256 bytes, but a problem occurs if someone actually does that: kafs tries to create a directory under /proc/net/afs/ with the name of the cell, but that fails with a warning: WARNING: CPU: 0 PID: 9 at fs/proc/generic.c:405 because procfs limits the maximum filename length to 255. However, the DNS limits the maximum lookup length and, by extension, the maximum cell name, to 255 less two (length count and trailing NUL). Fix this by limiting the maximum acceptable cellname length to 253. This also allows us to be sure we can create the "/afs/.<cell>/" mountpoint too. Further, split the YFS VL record cell name maximum to be the 256 allowed by the protocol and ignore the record retrieved by YFSVL.GetCellName if it exceeds 253. Fixes: `c3e9f88826` ("afs: Implement client support for the YFSVL.GetCellName RPC op") Reported-by: syzbot+7848fee1f1e5c53f912b@syzkaller.appspotmail.com Closes: https://lore.kernel.org/r/6776d25d.050a0220.3a8527.0048.GAE@google.com/ Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/376236.1736180460@warthog.procyon.org.uk Tested-by: syzbot+7848fee1f1e5c53f912b@syzkaller.appspotmail.com cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-01-07 15:55:25 +01:00
Christian Brauner	3ff93c5935	Merge tag 'fuse-fixes-6.13-rc7' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse fixes from Miklos Szeredi <mszeredi@redhat.com>: - Fix fuse_get_user_pages() allocation failure handling. - Fix direct-io folio offset and length calculation. * tag 'fuse-fixes-6.13-rc7' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: Set *nbytesp=0 in fuse_get_user_pages on allocation failure fuse: fix direct io folio offset and length calculation Link: https://lore.kernel.org/r/CAJfpegu7o_X%3DSBWk_C47dUVUQ1mJZDEGe1MfD0N3wVJoUBWdmg@mail.gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-01-07 15:43:07 +01:00
Lukas Bulwahn	84b172cea4	staging: gpib: refer to correct config symbol in tnt4882 Makefile Commit `79d2e1919a` ("staging: gpib: fix Makefiles") uses the corresponding config symbols to let Makefiles include the driver sources appropriately in the kernel build. Unfortunately, the Makefile in the tnt4882 directory refers to the non-existing config GPIB_TNT4882. The actual config name for this driver is GPIB_NI_PCI_ISA, as can be observed in the gpib Makefile. Probably, this is caused by the subtle differences between the config names, directory names and file names in ./drivers/staging/gpib/, where often config names and directory names are identical or at least close in naming, but in this case, it is not. Change the reference in the tnt4882 Makefile from the non-existing config GPIB_TNT4882 to the existing config GPIB_NI_PCI_ISA. Fixes: `79d2e1919a` ("staging: gpib: fix Makefiles") Signed-off-by: Lukas Bulwahn <lukas.bulwahn@redhat.com> Link: https://lore.kernel.org/r/20250107135032.34424-1-lukas.bulwahn@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-01-07 15:37:54 +01:00
Pavel Begunkov	60495b08cf	io_uring: silence false positive warnings If we kill a ring and then immediately exit the task, we'll get cancellattion running by the task and a kthread in io_ring_exit_work. For DEFER_TASKRUN, we do want to limit it to only one entity executing it, however it's currently not an issue as it's protected by uring_lock. Silence lockdep assertions for now, we'll return to it later. Reported-by: syzbot+1bcb75613069ad4957fc@syzkaller.appspotmail.com Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/7e5f68281acb0f081f65fde435833c68a3b7e02f.1736257837.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-01-07 07:19:44 -07:00
Jakub Kicinski	fd48f071a3	net: don't dump Tx and uninitialized NAPIs We use NAPI ID as the key for continuing dumps. We also depend on the NAPIs being sorted by ID within the driver list. Tx NAPIs (which don't have an ID assigned) break this expectation, it's not currently possible to dump them reliably. Since Tx NAPIs are relatively rare, and can't be used in doit (GET or SET) hide them from the dump API as well. Fixes: `27f91aaf49` ("netdev-genl: Add netlink framework functions for napi") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250103183207.1216004-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-01-07 13:07:31 +01:00
GONG Ruiqi	b0e525d7a2	usb: typec: fix pm usage counter imbalance in ucsi_ccg_sync_control() The error handling for the case `con_index == 0` should involve dropping the pm usage counter, as ucsi_ccg_sync_control() gets it at the beginning. Fix it. Cc: stable <stable@kernel.org> Fixes: `e56aac6e5a` ("usb: typec: fix potential array underflow in ucsi_ccg_sync_control()") Signed-off-by: GONG Ruiqi <gongruiqi1@huawei.com> Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20250107015750.2778646-1-gongruiqi1@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-01-07 11:43:08 +01:00
Lubomir Rintel	cdef30e077	usb-storage: Add max sectors quirk for Nokia 208 This fixes data corruption when accessing the internal SD card in mass storage mode. I am actually not too sure why. I didn't figure a straightforward way to reproduce the issue, but i seem to get garbage when issuing a lot (over 50) of large reads (over 120 sectors) are done in a quick succession. That is, time seems to matter here -- larger reads are fine if they are done with some delay between them. But I'm not great at understanding this sort of things, so I'll assume the issue other, smarter, folks were seeing with similar phones is the same problem and I'll just put my quirk next to theirs. The "Software details" screen on the phone is as follows: V 04.06 07-08-13 RM-849 (c) Nokia TL;DR version of the device descriptor: idVendor 0x0421 Nokia Mobile Phones idProduct 0x06c2 bcdDevice 4.06 iManufacturer 1 Nokia iProduct 2 Nokia 208 The patch assumes older firmwares are broken too (I'm unable to test, but no biggie if they aren't I guess), and I have no idea if newer firmware exists. Signed-off-by: Lubomir Rintel <lkundrak@v3.sk> Cc: stable <stable@kernel.org> Acked-by: Alan Stern <stern@rowland.harvard.edu> Link: https://lore.kernel.org/r/20250101212206.2386207-1-lkundrak@v3.sk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-01-07 11:42:42 +01:00
Takashi Iwai	6f660ffce7	usb: gadget: midi2: Reverse-select at the right place We should do reverse selection of other components from CONFIG_USB_F_MIDI2 which is tristate, instead of CONFIG_USB_CONFIGFS_F_MIDI2 which is bool, for satisfying subtle module dependencies. Fixes: `8b645922b2` ("usb: gadget: Add support for USB MIDI 2.0 function driver") Cc: stable <stable@kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://lore.kernel.org/r/20250101131124.27599-1-tiwai@suse.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2025-01-07 11:42:22 +01:00
Anumula Murali Mohan Reddy	4c1224501e	cxgb4: Avoid removal of uninserted tid During ARP failure, tid is not inserted but _c4iw_free_ep() attempts to remove tid which results in error. This patch fixes the issue by avoiding removal of uninserted tid. Fixes: `59437d78f0` ("cxgb4/chtls: fix ULD connection failures due to wrong TID base") Signed-off-by: Anumula Murali Mohan Reddy <anumula@chelsio.com> Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Link: https://patch.msgid.link/20250103092327.1011925-1-anumula@chelsio.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-06 16:47:18 -08:00
Jakub Kicinski	3085d4b847	Merge branch 'bnxt_en-2-bug-fixes' Michael Chan says: ==================== bnxt_en: 2 Bug fixes The first patch fixes a potential memory leak when sending a FW message for the RoCE driver. The second patch fixes the potential issue of DIM modifying the coalescing parameters of a ring that has been freed. ==================== Link: https://patch.msgid.link/20250104043849.3482067-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-06 16:40:29 -08:00
Michael Chan	40452969a5	bnxt_en: Fix DIM shutdown DIM work will call the firmware to adjust the coalescing parameters on the RX rings. We should cancel DIM work before we call the firmware to free the RX rings. Otherwise, FW will reject the call from DIM work if the RX ring has been freed. This will generate an error message like this: bnxt_en 0000:21:00.1 ens2f1np1: hwrm req_type 0x53 seq id 0x6fca error 0x2 and cause unnecessary concern for the user. It is also possible to modify the coalescing parameters of the wrong ring if the ring has been re-allocated. To prevent this, cancel DIM work right before freeing the RX rings. We also have to add a check in NAPI poll to not schedule DIM if the RX rings are shutting down. Check that the VNIC is active before we schedule DIM. The VNIC is always disabled before we free the RX rings. Fixes: `0bc0b97fca` ("bnxt_en: cleanup DIM work on device shutdown") Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20250104043849.3482067-3-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-06 16:40:26 -08:00
Kalesh AP	c8dafb0e43	bnxt_en: Fix possible memory leak when hwrm_req_replace fails When hwrm_req_replace() fails, the driver is not invoking bnxt_req_drop() which could cause a memory leak. Fixes: `bbf33d1d98` ("bnxt_en: update all firmware calls to use the new APIs") Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20250104043849.3482067-2-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-06 16:40:26 -08:00
Shannon Nelson	8c817eb262	pds_core: limit loop over fw name list Add an array size limit to the for-loop to be sure we don't try to reference a fw_version string off the end of the fw info names array. We know that our firmware only has a limited number of firmware slot names, but we shouldn't leave this unchecked. Fixes: `45d76f4929` ("pds_core: set up device and adminq") Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Brett Creeley <brett.creeley@amd.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250103195147.7408-1-shannon.nelson@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-06 15:57:54 -08:00
Arunpravin Paneer Selvam	75c8b703e5	drm/amdgpu: Add a lock when accessing the buddy trim function When running YouTube videos and Steam games simultaneously, the tester found a system hang / race condition issue with the multi-display configuration setting. Adding a lock to the buddy allocator's trim function would be the solution. <log snip> [ 7197.250436] general protection fault, probably for non-canonical address 0xdead000000000108 [ 7197.250447] RIP: 0010:__alloc_range+0x8b/0x340 [amddrm_buddy] [ 7197.250470] Call Trace: [ 7197.250472] <TASK> [ 7197.250475] ? show_regs+0x6d/0x80 [ 7197.250481] ? die_addr+0x37/0xa0 [ 7197.250483] ? exc_general_protection+0x1db/0x480 [ 7197.250488] ? drm_suballoc_new+0x13c/0x93d [drm_suballoc_helper] [ 7197.250493] ? asm_exc_general_protection+0x27/0x30 [ 7197.250498] ? __alloc_range+0x8b/0x340 [amddrm_buddy] [ 7197.250501] ? __alloc_range+0x109/0x340 [amddrm_buddy] [ 7197.250506] amddrm_buddy_block_trim+0x1b5/0x260 [amddrm_buddy] [ 7197.250511] amdgpu_vram_mgr_new+0x4f5/0x590 [amdgpu] [ 7197.250682] amdttm_resource_alloc+0x46/0xb0 [amdttm] [ 7197.250689] ttm_bo_alloc_resource+0xe4/0x370 [amdttm] [ 7197.250696] amdttm_bo_validate+0x9d/0x180 [amdttm] [ 7197.250701] amdgpu_bo_pin+0x15a/0x2f0 [amdgpu] [ 7197.250831] amdgpu_dm_plane_helper_prepare_fb+0xb2/0x360 [amdgpu] [ 7197.251025] ? try_wait_for_completion+0x59/0x70 [ 7197.251030] drm_atomic_helper_prepare_planes.part.0+0x2f/0x1e0 [ 7197.251035] drm_atomic_helper_prepare_planes+0x5d/0x70 [ 7197.251037] drm_atomic_helper_commit+0x84/0x160 [ 7197.251040] drm_atomic_nonblocking_commit+0x59/0x70 [ 7197.251043] drm_mode_atomic_ioctl+0x720/0x850 [ 7197.251047] ? __pfx_drm_mode_atomic_ioctl+0x10/0x10 [ 7197.251049] drm_ioctl_kernel+0xb9/0x120 [ 7197.251053] ? srso_alias_return_thunk+0x5/0xfbef5 [ 7197.251056] drm_ioctl+0x2d4/0x550 [ 7197.251058] ? __pfx_drm_mode_atomic_ioctl+0x10/0x10 [ 7197.251063] amdgpu_drm_ioctl+0x4e/0x90 [amdgpu] [ 7197.251186] __x64_sys_ioctl+0xa0/0xf0 [ 7197.251190] x64_sys_call+0x143b/0x25c0 [ 7197.251193] do_syscall_64+0x7f/0x180 [ 7197.251197] ? srso_alias_return_thunk+0x5/0xfbef5 [ 7197.251199] ? amdgpu_display_user_framebuffer_create+0x215/0x320 [amdgpu] [ 7197.251329] ? drm_internal_framebuffer_create+0xb7/0x1a0 [ 7197.251332] ? srso_alias_return_thunk+0x5/0xfbef5 Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Fixes: `4a5ad08f53` ("drm/amdgpu: Add address alignment support to DCC buffers") Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `3318ba94e5`) Cc: stable@vger.kernel.org	2025-01-06 15:20:13 -05:00
Kun Liu	2a238b09bf	drm/amd/pm: fix BUG: scheduling while atomic atomic scheduling will be triggered in interrupt handler for AC/DC mode switch as following backtrace. Call Trace: <IRQ> dump_stack_lvl __schedule_bug __schedule schedule schedule_preempt_disabled __mutex_lock smu_cmn_send_smc_msg_with_param smu_v13_0_irq_process amdgpu_irq_dispatch amdgpu_ih_process amdgpu_irq_handler __handle_irq_event_percpu handle_irq_event handle_edge_irq __common_interrupt common_interrupt </IRQ> <TASK> asm_common_interrupt Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Kun Liu <Kun.Liu2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `03cc84b102`) Cc: stable@vger.kernel.org	2025-01-06 15:19:36 -05:00
Zhu Lingshan	a993d319ae	drm/amdkfd: wq_release signals dma_fence only when available kfd_process_wq_release() signals eviction fence by dma_fence_signal() which wanrs if dma_fence is NULL. kfd_process->ef is initialized by kfd_process_device_init_vm() through ioctl. That means the fence is NULL for a new created kfd_process, and close a kfd_process right after open it will trigger the warning. This commit conditionally signals the eviction fence in kfd_process_wq_release() only when it is available. [ 503.660882] WARNING: CPU: 0 PID: 9 at drivers/dma-buf/dma-fence.c:467 dma_fence_signal+0x74/0xa0 [ 503.782940] Workqueue: kfd_process_wq kfd_process_wq_release [amdgpu] [ 503.789640] RIP: 0010:dma_fence_signal+0x74/0xa0 [ 503.877620] Call Trace: [ 503.880066] <TASK> [ 503.882168] ? __warn+0xcd/0x260 [ 503.885407] ? dma_fence_signal+0x74/0xa0 [ 503.889416] ? report_bug+0x288/0x2d0 [ 503.893089] ? handle_bug+0x53/0xa0 [ 503.896587] ? exc_invalid_op+0x14/0x50 [ 503.900424] ? asm_exc_invalid_op+0x16/0x20 [ 503.904616] ? dma_fence_signal+0x74/0xa0 [ 503.908626] kfd_process_wq_release+0x6b/0x370 [amdgpu] [ 503.914081] process_one_work+0x654/0x10a0 [ 503.918186] worker_thread+0x6c3/0xe70 [ 503.921943] ? srso_alias_return_thunk+0x5/0xfbef5 [ 503.926735] ? srso_alias_return_thunk+0x5/0xfbef5 [ 503.931527] ? __kthread_parkme+0x82/0x140 [ 503.935631] ? __pfx_worker_thread+0x10/0x10 [ 503.939904] kthread+0x2a8/0x380 [ 503.943132] ? __pfx_kthread+0x10/0x10 [ 503.946882] ret_from_fork+0x2d/0x70 [ 503.950458] ? __pfx_kthread+0x10/0x10 [ 503.954210] ret_from_fork_asm+0x1a/0x30 [ 503.958142] </TASK> [ 503.960328] ---[ end trace 0000000000000000 ]--- Fixes: `967d226eaa` ("dma-buf: add WARN_ON() illegal dma-fence signaling") Signed-off-by: Zhu Lingshan <lingshan.zhu@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `2774ef7625`) Cc: stable@vger.kernel.org	2025-01-06 15:17:23 -05:00
Roman Li	0881fbc4fd	drm/amd/display: Add check for granularity in dml ceil/floor helpers [Why] Wrapper functions for dcn_bw_ceil2() and dcn_bw_floor2() should check for granularity is non zero to avoid assert and divide-by-zero error in dcn_bw_ functions. [How] Add check for granularity 0. Cc: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: Roman Li <Roman.Li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `f6e09701c3`) Cc: stable@vger.kernel.org	2025-01-06 15:14:43 -05:00
Jesse.zhang@amd.com	9738609449	drm/amdkfd: fixed page fault when enable MES shader debugger Initialize the process context address before setting the shader debugger. [ 260.781212] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:32 vmid:0 pasid:0) [ 260.781236] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000000000000000 from client 10 [ 260.781255] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00040A40 [ 260.781270] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CPC (0x5) [ 260.781284] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0 [ 260.781296] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0 [ 260.781308] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x4 [ 260.781320] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0 [ 260.781332] amdgpu 0000:03:00.0: amdgpu: RW: 0x1 [ 260.782017] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:32 vmid:0 pasid:0) [ 260.782039] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000000000000000 from client 10 [ 260.782058] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00040A41 [ 260.782073] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CPC (0x5) [ 260.782087] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x1 [ 260.782098] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0 [ 260.782110] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x4 [ 260.782122] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0 [ 260.782137] amdgpu 0000:03:00.0: amdgpu: RW: 0x1 [ 260.782155] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:32 vmid:0 pasid:0) [ 260.782166] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000000000000000 from client 10 Fixes: `438b39ac74` ("drm/amdkfd: pause autosuspend when creating pdd") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3849 Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `5b231f5bc9`) Cc: stable@vger.kernel.org	2025-01-06 15:13:37 -05:00
Melissa Wen	5225fd2a26	drm/amd/display: fix divide error in DM plane scale calcs dm_get_plane_scale doesn't take into account plane scaled size equal to zero, leading to a kernel oops due to division by zero. Fix by setting out-scale size as zero when the dst size is zero, similar to what is done by drm_calc_scale(). This issue started with the introduction of cursor ovelay mode that uses this function to assess cursor mode changes via dm_crtc_get_cursor_mode() before checking plane state. [Dec17 17:14] Oops: divide error: 0000 [#1] PREEMPT SMP NOPTI [ +0.000018] CPU: 5 PID: 1660 Comm: surface-DP-1 Not tainted 6.10.0+ #231 [ +0.000007] Hardware name: Valve Jupiter/Jupiter, BIOS F7A0131 01/30/2024 [ +0.000004] RIP: 0010:dm_get_plane_scale+0x3f/0x60 [amdgpu] [ +0.000553] Code: 44 0f b7 41 3a 44 0f b7 49 3e 83 e0 0f 48 0f a3 c2 73 21 69 41 28 e8 03 00 00 31 d2 41 f7 f1 31 d2 89 06 69 41 2c e8 03 00 00 <41> f7 f0 89 07 e9 d7 d8 7e e9 44 89 c8 45 89 c1 41 89 c0 eb d4 66 [ +0.000005] RSP: 0018:ffffa8df0de6b8a0 EFLAGS: 00010246 [ +0.000006] RAX: 00000000000003e8 RBX: ffff9ac65c1f6e00 RCX: ffff9ac65d055500 [ +0.000003] RDX: 0000000000000000 RSI: ffffa8df0de6b8b0 RDI: ffffa8df0de6b8b4 [ +0.000004] RBP: ffff9ac64e7a5800 R08: 0000000000000000 R09: 0000000000000a00 [ +0.000003] R10: 00000000000000ff R11: 0000000000000054 R12: ffff9ac6d0700010 [ +0.000003] R13: ffff9ac65d054f00 R14: ffff9ac65d055500 R15: ffff9ac64e7a60a0 [ +0.000004] FS: 00007f869ea00640(0000) GS:ffff9ac970080000(0000) knlGS:0000000000000000 [ +0.000004] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ +0.000003] CR2: 000055ca701becd0 CR3: 000000010e7f2000 CR4: 0000000000350ef0 [ +0.000004] Call Trace: [ +0.000007] <TASK> [ +0.000006] ? __die_body.cold+0x19/0x27 [ +0.000009] ? die+0x2e/0x50 [ +0.000007] ? do_trap+0xca/0x110 [ +0.000007] ? do_error_trap+0x6a/0x90 [ +0.000006] ? dm_get_plane_scale+0x3f/0x60 [amdgpu] [ +0.000504] ? exc_divide_error+0x38/0x50 [ +0.000005] ? dm_get_plane_scale+0x3f/0x60 [amdgpu] [ +0.000488] ? asm_exc_divide_error+0x1a/0x20 [ +0.000011] ? dm_get_plane_scale+0x3f/0x60 [amdgpu] [ +0.000593] dm_crtc_get_cursor_mode+0x33f/0x430 [amdgpu] [ +0.000562] amdgpu_dm_atomic_check+0x2ef/0x1770 [amdgpu] [ +0.000501] drm_atomic_check_only+0x5e1/0xa30 [drm] [ +0.000047] drm_mode_atomic_ioctl+0x832/0xcb0 [drm] [ +0.000050] ? __pfx_drm_mode_atomic_ioctl+0x10/0x10 [drm] [ +0.000047] drm_ioctl_kernel+0xb3/0x100 [drm] [ +0.000062] drm_ioctl+0x27a/0x4f0 [drm] [ +0.000049] ? __pfx_drm_mode_atomic_ioctl+0x10/0x10 [drm] [ +0.000055] amdgpu_drm_ioctl+0x4e/0x90 [amdgpu] [ +0.000360] __x64_sys_ioctl+0x97/0xd0 [ +0.000010] do_syscall_64+0x82/0x190 [ +0.000008] ? __pfx_drm_mode_createblob_ioctl+0x10/0x10 [drm] [ +0.000044] ? srso_return_thunk+0x5/0x5f [ +0.000006] ? drm_ioctl_kernel+0xb3/0x100 [drm] [ +0.000040] ? srso_return_thunk+0x5/0x5f [ +0.000005] ? __check_object_size+0x50/0x220 [ +0.000007] ? srso_return_thunk+0x5/0x5f [ +0.000005] ? srso_return_thunk+0x5/0x5f [ +0.000005] ? drm_ioctl+0x2a4/0x4f0 [drm] [ +0.000039] ? __pfx_drm_mode_createblob_ioctl+0x10/0x10 [drm] [ +0.000043] ? srso_return_thunk+0x5/0x5f [ +0.000005] ? srso_return_thunk+0x5/0x5f [ +0.000005] ? __pm_runtime_suspend+0x69/0xc0 [ +0.000006] ? srso_return_thunk+0x5/0x5f [ +0.000005] ? amdgpu_drm_ioctl+0x71/0x90 [amdgpu] [ +0.000366] ? srso_return_thunk+0x5/0x5f [ +0.000006] ? syscall_exit_to_user_mode+0x77/0x210 [ +0.000007] ? srso_return_thunk+0x5/0x5f [ +0.000005] ? do_syscall_64+0x8e/0x190 [ +0.000006] ? srso_return_thunk+0x5/0x5f [ +0.000006] ? do_syscall_64+0x8e/0x190 [ +0.000006] ? srso_return_thunk+0x5/0x5f [ +0.000007] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ +0.000008] RIP: 0033:0x55bb7cd962bc [ +0.000007] Code: 4c 89 6c 24 18 4c 89 64 24 20 4c 89 74 24 28 0f 57 c0 0f 11 44 24 30 89 c7 48 8d 54 24 08 b8 10 00 00 00 be bc 64 38 c0 0f 05 <49> 89 c7 48 83 3b 00 74 09 4c 89 c7 ff 15 62 64 99 00 48 83 7b 18 [ +0.000005] RSP: 002b:00007f869e9f4da0 EFLAGS: 00000217 ORIG_RAX: 0000000000000010 [ +0.000007] RAX: ffffffffffffffda RBX: 00007f869e9f4fb8 RCX: 000055bb7cd962bc [ +0.000004] RDX: 00007f869e9f4da8 RSI: 00000000c03864bc RDI: 000000000000003b [ +0.000003] RBP: 000055bb9ddcbcc0 R08: 00007f86541b9920 R09: 0000000000000009 [ +0.000004] R10: 0000000000000004 R11: 0000000000000217 R12: 00007f865406c6b0 [ +0.000003] R13: 00007f86541b5290 R14: 00007f865410b700 R15: 000055bb9ddcbc18 [ +0.000009] </TASK> Fixes: `1b04dcca4f` ("drm/amd/display: Introduce overlay cursor mode") Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3729 Reported-by: Fabio Scaccabarozzi <fsvm88@gmail.com> Co-developed-by: Fabio Scaccabarozzi <fsvm88@gmail.com> Signed-off-by: Fabio Scaccabarozzi <fsvm88@gmail.com> Signed-off-by: Melissa Wen <mwen@igalia.com> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `ab75a0d2e0`) Cc: stable@vger.kernel.org	2025-01-06 15:12:01 -05:00
Melissa Wen	21541bc6b4	drm/amd/display: increase MAX_SURFACES to the value supported by hw As the hw supports up to 4 surfaces, increase the maximum number of surfaces to prevent the DC error when trying to use more than three planes. [drm:dc_state_add_plane [amdgpu]] ERROR Surface: can not attach plane_state 000000003e2cb82c! Maximum is: 3 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3693 Signed-off-by: Melissa Wen <mwen@igalia.com> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `b8d6daffc8`) Cc: stable@vger.kernel.org	2025-01-06 15:11:25 -05:00
Melissa Wen	7de8d5c90b	drm/amd/display: fix page fault due to max surface definition mismatch DC driver is using two different values to define the maximum number of surfaces: MAX_SURFACES and MAX_SURFACE_NUM. Consolidate MAX_SURFACES as the unique definition for surface updates across DC. It fixes page fault faced by Cosmic users on AMD display versions that support two overlay planes, since the introduction of cursor overlay mode. [Nov26 21:33] BUG: unable to handle page fault for address: 0000000051d0f08b [ +0.000015] #PF: supervisor read access in kernel mode [ +0.000006] #PF: error_code(0x0000) - not-present page [ +0.000005] PGD 0 P4D 0 [ +0.000007] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI [ +0.000006] CPU: 4 PID: 71 Comm: kworker/u32:6 Not tainted 6.10.0+ #300 [ +0.000006] Hardware name: Valve Jupiter/Jupiter, BIOS F7A0131 01/30/2024 [ +0.000007] Workqueue: events_unbound commit_work [drm_kms_helper] [ +0.000040] RIP: 0010:copy_stream_update_to_stream.isra.0+0x30d/0x750 [amdgpu] [ +0.000847] Code: 8b 10 49 89 94 24 f8 00 00 00 48 8b 50 08 49 89 94 24 00 01 00 00 8b 40 10 41 89 84 24 08 01 00 00 49 8b 45 78 48 85 c0 74 0b <0f> b6 00 41 88 84 24 90 64 00 00 49 8b 45 60 48 85 c0 74 3b 48 8b [ +0.000010] RSP: 0018:ffffc203802f79a0 EFLAGS: 00010206 [ +0.000009] RAX: 0000000051d0f08b RBX: 0000000000000004 RCX: ffff9f964f0a8070 [ +0.000004] RDX: ffff9f9710f90e40 RSI: ffff9f96600c8000 RDI: ffff9f964f000000 [ +0.000004] RBP: ffffc203802f79f8 R08: 0000000000000000 R09: 0000000000000000 [ +0.000005] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9f96600c8000 [ +0.000004] R13: ffff9f9710f90e40 R14: ffff9f964f000000 R15: ffff9f96600c8000 [ +0.000004] FS: 0000000000000000(0000) GS:ffff9f9970000000(0000) knlGS:0000000000000000 [ +0.000005] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ +0.000005] CR2: 0000000051d0f08b CR3: 00000002e6a20000 CR4: 0000000000350ef0 [ +0.000005] Call Trace: [ +0.000011] <TASK> [ +0.000010] ? __die_body.cold+0x19/0x27 [ +0.000012] ? page_fault_oops+0x15a/0x2d0 [ +0.000014] ? exc_page_fault+0x7e/0x180 [ +0.000009] ? asm_exc_page_fault+0x26/0x30 [ +0.000013] ? copy_stream_update_to_stream.isra.0+0x30d/0x750 [amdgpu] [ +0.000739] ? dc_commit_state_no_check+0xd6c/0xe70 [amdgpu] [ +0.000470] update_planes_and_stream_state+0x49b/0x4f0 [amdgpu] [ +0.000450] ? srso_return_thunk+0x5/0x5f [ +0.000009] ? commit_minimal_transition_state+0x239/0x3d0 [amdgpu] [ +0.000446] update_planes_and_stream_v2+0x24a/0x590 [amdgpu] [ +0.000464] ? srso_return_thunk+0x5/0x5f [ +0.000009] ? sort+0x31/0x50 [ +0.000007] ? amdgpu_dm_atomic_commit_tail+0x159f/0x3a30 [amdgpu] [ +0.000508] ? srso_return_thunk+0x5/0x5f [ +0.000009] ? amdgpu_crtc_get_scanout_position+0x28/0x40 [amdgpu] [ +0.000377] ? srso_return_thunk+0x5/0x5f [ +0.000009] ? drm_crtc_vblank_helper_get_vblank_timestamp_internal+0x160/0x390 [drm] [ +0.000058] ? srso_return_thunk+0x5/0x5f [ +0.000005] ? dma_fence_default_wait+0x8c/0x260 [ +0.000010] ? srso_return_thunk+0x5/0x5f [ +0.000005] ? wait_for_completion_timeout+0x13b/0x170 [ +0.000006] ? srso_return_thunk+0x5/0x5f [ +0.000005] ? dma_fence_wait_timeout+0x108/0x140 [ +0.000010] ? commit_tail+0x94/0x130 [drm_kms_helper] [ +0.000024] ? process_one_work+0x177/0x330 [ +0.000008] ? worker_thread+0x266/0x3a0 [ +0.000006] ? __pfx_worker_thread+0x10/0x10 [ +0.000004] ? kthread+0xd2/0x100 [ +0.000006] ? __pfx_kthread+0x10/0x10 [ +0.000006] ? ret_from_fork+0x34/0x50 [ +0.000004] ? __pfx_kthread+0x10/0x10 [ +0.000005] ? ret_from_fork_asm+0x1a/0x30 [ +0.000011] </TASK> Fixes: `1b04dcca4f` ("drm/amd/display: Introduce overlay cursor mode") Suggested-by: Leo Li <sunpeng.li@amd.com> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3693 Signed-off-by: Melissa Wen <mwen@igalia.com> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `1c86c81a86`) Cc: stable@vger.kernel.org	2025-01-06 15:11:00 -05:00
Alex Hung	5009628d85	drm/amd/display: Remove unnecessary amdgpu_irq_get/put [WHY & HOW] commit `7fb363c575` ("drm/amd/display: Let drm_crtc_vblank_on/off manage interrupts") lets drm_crtc_vblank_* to manage interrupts in amdgpu_dm_crtc_set_vblank, and amdgpu_irq_get/put do not need to be called here. Part of that patch got lost somehow, so fix it up. Fixes: `7fb363c575` ("drm/amd/display: Let drm_crtc_vblank_on/off manage interrupts") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `3782305ce5`) Cc: stable@vger.kernel.org	2025-01-06 15:10:40 -05:00
Linus Torvalds	fbfd64d25c	Merge tag 'vfs-6.13-rc7.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: - Relax assertions on failure to encode file handles The ->encode_fh() method can fail for various reasons. None of them warrant a WARN_ON(). - Fix overlayfs file handle encoding by allowing encoding an fid from an inode without an alias - Make sure fuse_dir_open() handles FOPEN_KEEP_CACHE. If it's not specified fuse needs to invaludate the directory inode page cache - Fix qnx6 so it builds with gcc-15 - Various fixes for netfslib and ceph and nfs filesystems: - Ignore silly rename files from afs and nfs when building header archives - Fix read result collection in netfslib with multiple subrequests - Handle ENOMEM for netfslib buffered reads - Fix oops in nfs_netfs_init_request() - Parse the secctx command immediately in cachefiles - Remove a redundant smp_rmb() in netfslib - Handle recursion in read retry in netfslib - Fix clearing of folio_queue - Fix missing cancellation of copy-to_cache when the cache for a file is temporarly disabled in netfslib - Sanity check the hfs root record - Fix zero padding data issues in concurrent write scenarios - Fix is_mnt_ns_file() after converting nsfs to path_from_stashed() - Fix missing declaration of init_files - Increase I/O priority when writing revoke records in jbd2 - Flush filesystem device before updating tail sequence in jbd2 * tag 'vfs-6.13-rc7.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (23 commits) ovl: support encoding fid from inode with no alias ovl: pass realinode to ovl_encode_real_fh() instead of realdentry fuse: respect FOPEN_KEEP_CACHE on opendir netfs: Fix is-caching check in read-retry netfs: Fix the (non-)cancellation of copy when cache is temporarily disabled netfs: Fix ceph copy to cache on write-begin netfs: Work around recursion by abandoning retry if nothing read netfs: Fix missing barriers by using clear_and_wake_up_bit() netfs: Remove redundant use of smp_rmb() cachefiles: Parse the "secctx" immediately nfs: Fix oops in nfs_netfs_init_request() when copying to cache netfs: Fix enomem handling in buffered reads netfs: Fix non-contiguous donation between completed reads kheaders: Ignore silly-rename files fs: relax assertions on failure to encode file handles fs: fix missing declaration of init_files fs: fix is_mnt_ns_file() iomap: fix zero padding data issue in concurrent append writes iomap: pass byte granular end position to iomap_add_to_ioend jbd2: flush filesystem device before updating tail sequence ...	2025-01-06 10:26:39 -08:00
Mikhail Zaslonko	0ee4736c00	btrfs: zlib: fix avail_in bytes for s390 zlib HW compression path Since the input data length passed to zlib_compress_folios() can be arbitrary, always setting strm.avail_in to a multiple of PAGE_SIZE may cause read-in bytes to exceed the input range. Currently this triggers an assert in btrfs_compress_folios() on the debug kernel (see below). Fix strm.avail_in calculation for S390 hardware acceleration path. assertion failed: *total_in <= orig_len, in fs/btrfs/compression.c:1041 ------------[ cut here ]------------ kernel BUG at fs/btrfs/compression.c:1041! monitor event: 0040 ilc:2 [#1] PREEMPT SMP CPU: 16 UID: 0 PID: 325 Comm: kworker/u273:3 Not tainted 6.13.0-20241204.rc1.git6.fae3b21430ca.300.fc41.s390x+debug #1 Hardware name: IBM 3931 A01 703 (z/VM 7.4.0) Workqueue: btrfs-delalloc btrfs_work_helper Krnl PSW : 0704d00180000000 0000021761df6538 (btrfs_compress_folios+0x198/0x1a0) R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3 Krnl GPRS: 0000000080000000 0000000000000001 0000000000000047 0000000000000000 0000000000000006 ffffff01757bb000 000001976232fcc0 000000000000130c 000001976232fcd0 000001976232fcc8 00000118ff4a0e30 0000000000000001 00000111821ab400 0000011100000000 0000021761df6534 000001976232fb58 Krnl Code: 0000021761df6528: c020006f5ef4 larl %r2,0000021762be2310 0000021761df652e: c0e5ffbd09d5 brasl %r14,00000217615978d8 #0000021761df6534: af000000 mc 0,0 >0000021761df6538: 0707 bcr 0,%r7 0000021761df653a: 0707 bcr 0,%r7 0000021761df653c: 0707 bcr 0,%r7 0000021761df653e: 0707 bcr 0,%r7 0000021761df6540: c004004bb7ec brcl 0,000002176276d518 Call Trace: [<0000021761df6538>] btrfs_compress_folios+0x198/0x1a0 ([<0000021761df6534>] btrfs_compress_folios+0x194/0x1a0) [<0000021761d97788>] compress_file_range+0x3b8/0x6d0 [<0000021761dcee7c>] btrfs_work_helper+0x10c/0x160 [<0000021761645760>] process_one_work+0x2b0/0x5d0 [<000002176164637e>] worker_thread+0x20e/0x3e0 [<000002176165221a>] kthread+0x15a/0x170 [<00000217615b859c>] __ret_from_fork+0x3c/0x60 [<00000217626e72d2>] ret_from_fork+0xa/0x38 INFO: lockdep is turned off. Last Breaking-Event-Address: [<0000021761597924>] _printk+0x4c/0x58 Kernel panic - not syncing: Fatal exception: panic_on_oops Fixes: `fd1e75d010` ("btrfs: make compression path to be subpage compatible") CC: stable@vger.kernel.org # 6.12+ Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Mikhail Zaslonko <zaslonko@linux.ibm.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-01-06 16:32:43 +01:00
Christoph Hellwig	7467bc5959	btrfs: zoned: calculate max_extent_size properly on non-zoned setup Since commit `559218d43e` ("block: pre-calculate max_zone_append_sectors"), queue_limits's max_zone_append_sectors is default to be 0 and it is only updated when there is a zoned device. So, we have lim->max_zone_append_sectors = 0 when there is no zoned device in the filesystem. That leads to fs_info->max_zone_append_size and thus fs_info->max_extent_size to be 0, which is wrong and can for example lead to a divide by zero in count_max_extents(). Fix this by only capping fs_info->max_extent_size to fs_info->max_zone_append_size when it is non-zero. Based on a patch from Naohiro Aota <naohiro.aota@wdc.com>, from which much of this commit message is stolen as well. Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> Fixes: `559218d43e` ("block: pre-calculate max_zone_append_sectors") Tested-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David Sterba <dsterba@suse.com>	2025-01-06 16:32:35 +01:00
Qu Wenruo	6aecd91a5c	btrfs: avoid NULL pointer dereference if no valid extent tree [BUG] Syzbot reported a crash with the following call trace: BTRFS info (device loop0): scrub: started on devid 1 BUG: kernel NULL pointer dereference, address: 0000000000000208 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 106e70067 P4D 106e70067 PUD 107143067 PMD 0 Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 1 UID: 0 PID: 689 Comm: repro Kdump: loaded Tainted: G O 6.13.0-rc4-custom+ #206 Tainted: [O]=OOT_MODULE Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS unknown 02/02/2022 RIP: 0010:find_first_extent_item+0x26/0x1f0 [btrfs] Call Trace: <TASK> scrub_find_fill_first_stripe+0x13d/0x3b0 [btrfs] scrub_simple_mirror+0x175/0x260 [btrfs] scrub_stripe+0x5d4/0x6c0 [btrfs] scrub_chunk+0xbb/0x170 [btrfs] scrub_enumerate_chunks+0x2f4/0x5f0 [btrfs] btrfs_scrub_dev+0x240/0x600 [btrfs] btrfs_ioctl+0x1dc8/0x2fa0 [btrfs] ? do_sys_openat2+0xa5/0xf0 __x64_sys_ioctl+0x97/0xc0 do_syscall_64+0x4f/0x120 entry_SYSCALL_64_after_hwframe+0x76/0x7e </TASK> [CAUSE] The reproducer is using a corrupted image where extent tree root is corrupted, thus forcing to use "rescue=all,ro" mount option to mount the image. Then it triggered a scrub, but since scrub relies on extent tree to find where the data/metadata extents are, scrub_find_fill_first_stripe() relies on an non-empty extent root. But unfortunately scrub_find_fill_first_stripe() doesn't really expect an NULL pointer for extent root, it use extent_root to grab fs_info and triggered a NULL pointer dereference. [FIX] Add an extra check for a valid extent root at the beginning of scrub_find_fill_first_stripe(). The new error path is introduced by `42437a6386` ("btrfs: introduce mount option rescue=ignorebadroots"), but that's pretty old, and later commit `b979547513` ("btrfs: scrub: introduce helper to find and fill sector info for a scrub_stripe") changed how we do scrub. So for kernels older than 6.6, the fix will need manual backport. Reported-by: syzbot+339e9dbe3a2ca419b85d@syzkaller.appspotmail.com Link: https://lore.kernel.org/linux-btrfs/67756935.050a0220.25abdd.0a12.GAE@google.com/ Fixes: `42437a6386` ("btrfs: introduce mount option rescue=ignorebadroots") Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-01-06 16:32:31 +01:00
Linus Torvalds	13563da6ff	Merge tag 'vfio-v6.13-rc7' of https://github.com/awilliam/linux-vfio Pull vfio fix from Alex Williamson: - Fix a missed order alignment requirement of the pfn when inserting mappings through the new huge fault handler introduced in v6.12 (Alex Williamson) * tag 'vfio-v6.13-rc7' of https://github.com/awilliam/linux-vfio: vfio/pci: Fallback huge faults for unaligned pfn	2025-01-06 06:56:23 -08:00
Christian Brauner	368fcc5d3f	Merge patch series "Fix encoding overlayfs fid for fanotify delete events" Amir Goldstein <amir73il@gmail.com> says: This is a followup fix to the reported regression [1] that was introduced by overlayfs non-decodable file handles support in v6.6. The first fix posted two weeks ago [2] was a quick band aid which is justified on its own and is still queued on your vfs.fixes branch. This followup fix fixes the root cause of overlayfs file handle encoding failure and it also solves a bug with fanotify FAN_DELETE_SELF events on overlayfs, that was discovered from analysis of the first report. The fix to fanotify delete events was verified with a new LTP test [3]. [1] https://lore.kernel.org/linux-fsdevel/CAOQ4uxiie81voLZZi2zXS1BziXZCM24nXqPAxbu8kxXCUWdwOg@mail.gmail.com/ [2] https://lore.kernel.org/linux-fsdevel/20241219115301.465396-1-amir73il@gmail.com/ [3] https://github.com/amir73il/ltp/commits/ovl_encode_fid/ * patches from https://lore.kernel.org/r/20250105162404.357058-1-amir73il@gmail.com: ovl: support encoding fid from inode with no alias ovl: pass realinode to ovl_encode_real_fh() instead of realdentry Link: https://lore.kernel.org/r/20250105162404.357058-1-amir73il@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-01-06 15:43:58 +01:00
Amir Goldstein	c45beebfde	ovl: support encoding fid from inode with no alias Dmitry Safonov reported that a WARN_ON() assertion can be trigered by userspace when calling inotify_show_fdinfo() for an overlayfs watched inode, whose dentry aliases were discarded with drop_caches. The WARN_ON() assertion in inotify_show_fdinfo() was removed, because it is possible for encoding file handle to fail for other reason, but the impact of failing to encode an overlayfs file handle goes beyond this assertion. As shown in the LTP test case mentioned in the link below, failure to encode an overlayfs file handle from a non-aliased inode also leads to failure to report an fid with FAN_DELETE_SELF fanotify events. As Dmitry notes in his analyzis of the problem, ovl_encode_fh() fails if it cannot find an alias for the inode, but this failure can be fixed. ovl_encode_fh() seldom uses the alias and in the case of non-decodable file handles, as is often the case with fanotify fid info, ovl_encode_fh() never needs to use the alias to encode a file handle. Defer finding an alias until it is actually needed so ovl_encode_fh() will not fail in the common case of FAN_DELETE_SELF fanotify events. Fixes: `16aac5ad1f` ("ovl: support encoding non-decodable file handles") Reported-by: Dmitry Safonov <dima@arista.com> Closes: https://lore.kernel.org/linux-fsdevel/CAOQ4uxiie81voLZZi2zXS1BziXZCM24nXqPAxbu8kxXCUWdwOg@mail.gmail.com/ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/r/20250105162404.357058-3-amir73il@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-01-06 15:43:55 +01:00
Amir Goldstein	07aeefae7f	ovl: pass realinode to ovl_encode_real_fh() instead of realdentry We want to be able to encode an fid from an inode with no alias. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/r/20250105162404.357058-2-amir73il@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-01-06 15:43:55 +01:00
Linus Torvalds	5428dc1906	Merge tag 'exfat-for-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat Pull exfat fixes from Namjae Jeon: "All fixes are for issues reported by syzbot: - Fix wrong error return in exfat_find_empty_entry() - Fix a endless loop by self-linked chain - fix a KMSAN uninit-value issue in exfat_extend_valid_size()" * tag 'exfat-for-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat: exfat: fix the infinite loop in __exfat_free_cluster() exfat: fix the new buffer was not zeroed before writing exfat: fix the infinite loop in exfat_readdir() exfat: fix exfat_find_empty_entry() not returning error on failure	2025-01-06 06:19:36 -08:00
Linus Torvalds	cd6313beae	Revert "vmstat: disable vmstat_work on vmstat_cpu_down_prep()" This reverts commit `adcfb264c3`. It turns out this just causes a different warning splat instead that seems to be much easier to trigger, so let's revert ASAP. Reported-and-bisected-by: Borislav Petkov <bp@alien8.de> Tested-by: Breno Leitao <leitao@debian.org> Reported-by: Alexander Gordeev <agordeev@linux.ibm.com> Link: https://lore.kernel.org/all/20250106131817.GAZ3vYGVr3-hWFFPLj@fat_crate.local/ Cc: Koichiro Den <koichiro.den@canonical.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2025-01-06 06:10:24 -08:00
Manivannan Sadhasivam	907af7d6e0	regulator: Move OF_ API declarations/definitions outside CONFIG_REGULATOR Since these are hidden inside CONFIG_REGULATOR, building the consumer drivers without CONFIG_REGULATOR will result in the following build error: >> drivers/pci/pwrctrl/slot.c:39:15: error: implicit declaration of function 'of_regulator_bulk_get_all'; did you mean 'regulator_bulk_get'? [-Werror=implicit-function-declaration] 39 \| ret = of_regulator_bulk_get_all(dev, dev_of_node(dev), \| ^~~~~~~~~~~~~~~~~~~~~~~~~ \| regulator_bulk_get cc1: some warnings being treated as errors This also removes the duplicated definitions that were possibly added to fix the build issues. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202501020407.HmQQQKa0-lkp@intel.com/ Fixes: `27b9ecc7a9` ("regulator: Add of_regulator_bulk_get_all") Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://patch.msgid.link/20250104115058.19216-3-manivannan.sadhasivam@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>	2025-01-06 13:04:34 +00:00
Manivannan Sadhasivam	1156b5e8be	regulator: Guard of_regulator_bulk_get_all() with CONFIG_OF Since the definition is in drivers/regulator/of_regulator.c and compiled only if CONFIG_OF is enabled, building the consumer driver without CONFIG_OF and with CONFIG_REGULATOR will result in below build error: ERROR: modpost: "of_regulator_bulk_get_all" [drivers/pci/pwrctrl/pci-pwrctl-slot.ko] undefined! Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202412181640.12Iufkvd-lkp@intel.com/ Fixes: `27b9ecc7a9` ("regulator: Add of_regulator_bulk_get_all") Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Link: https://patch.msgid.link/20250104115058.19216-2-manivannan.sadhasivam@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>	2025-01-06 13:04:33 +00:00
Mark Harmstone	c21b89d495	btrfs: don't read from userspace twice in btrfs_uring_encoded_read() If we return -EAGAIN the first time because we need to block, btrfs_uring_encoded_read() will get called twice. Take a copy of args, the iovs, and the iter the first time, as by the time we are called the second time these may have gone out of scope. Reported-by: Jens Axboe <axboe@kernel.dk> Fixes: `34310c442e` ("btrfs: add io_uring command for encoded reads (ENCODED_READ ioctl)") Signed-off-by: Mark Harmstone <maharmstone@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-01-06 13:59:29 +01:00
Mark Harmstone	b0af20d33f	io_uring: add io_uring_cmd_get_async_data helper Add a helper function in include/linux/io_uring/cmd.h to read the async_data pointer from a struct io_uring_cmd. Signed-off-by: Mark Harmstone <maharmstone@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>	2025-01-06 13:51:08 +01:00
Jens Axboe	3347fa658a	io_uring/cmd: add per-op data to struct io_uring_cmd_data In case an op handler for ->uring_cmd() needs stable storage for user data, it can allocate io_uring_cmd_data->op_data and use it for the duration of the request. When the request gets cleaned up, uring_cmd will free it automatically. Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: David Sterba <dsterba@suse.com>	2025-01-06 13:51:06 +01:00
Jens Axboe	dadf03cfd4	io_uring/cmd: rename struct uring_cache to io_uring_cmd_data In preparation for making this more generically available for ->uring_cmd() usage that needs stable command data, rename it and move it to io_uring/cmd.h instead. Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: David Sterba <dsterba@suse.com>	2025-01-06 13:51:05 +01:00
Thorsten Blum	c7f3cd1b24	ksmbd: Remove unneeded if check in ksmbd_rdma_capable_netdev() Remove the unnecessary if check and assign the result directly. Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-01-05 20:43:37 -06:00
Wentao Liang	4c16e1cadc	ksmbd: fix a missing return value check bug In the smb2_send_interim_resp(), if ksmbd_alloc_work_struct() fails to allocate a node, it returns a NULL pointer to the in_work pointer. This can lead to an illegal memory write of in_work->response_buf when allocate_interim_rsp_buf() attempts to perform a kzalloc() on it. To address this issue, incorporating a check for the return value of ksmbd_alloc_work_struct() ensures that the function returns immediately upon allocation failure, thereby preventing the aforementioned illegal memory access. Fixes: `041bba4414` ("ksmbd: fix wrong interim response on compound") Signed-off-by: Wentao Liang <liangwentao@iscas.ac.cn> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2025-01-05 20:43:37 -06:00
Linus Torvalds	9d89551994	Linux 6.13-rc6	2025-01-05 14:13:40 -08:00
Linus Torvalds	9244696b34	Merge tag 'kbuild-fixes-v6.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - Fix escaping of '$' in scripts/mksysmap - Fix a modpost crash observed with the latest binutils - Fix 'provides' in the linux-api-headers pacman package * tag 'kbuild-fixes-v6.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kbuild: pacman-pkg: provide versioned linux-api-headers package modpost: work around unaligned data access error modpost: refactor do_vmbus_entry() modpost: fix the missed iteration for the max bit in do_input() scripts/mksysmap: Fix escape chars '$'	2025-01-05 10:52:47 -08:00
Linus Torvalds	5635d8bad2	Merge tag 'mm-hotfixes-stable-2025-01-04-18-02' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull hotfixes from Andrew Morton: "25 hotfixes. 16 are cc:stable. 18 are MM and 7 are non-MM. The usual bunch of singletons and two doubletons - please see the relevant changelogs for details" * tag 'mm-hotfixes-stable-2025-01-04-18-02' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (25 commits) MAINTAINERS: change Arınç _NAL's name and email address scripts/sorttable: fix orc_sort_cmp() to maintain symmetry and transitivity mm/util: make memdup_user_nul() similar to memdup_user() mm, madvise: fix potential workingset node list_lru leaks mm/damon/core: fix ignored quota goals and filters of newly committed schemes mm/damon/core: fix new damon_target objects leaks on damon_commit_targets() mm/list_lru: fix false warning of negative counter vmstat: disable vmstat_work on vmstat_cpu_down_prep() mm: shmem: fix the update of 'shmem_falloc->nr_unswapped' mm: shmem: fix incorrect index alignment for within_size policy percpu: remove intermediate variable in PERCPU_PTR() mm: zswap: fix race between [de]compression and CPU hotunplug ocfs2: fix slab-use-after-free due to dangling pointer dqi_priv fs/proc/task_mmu: fix pagemap flags with PMD THP entries on 32bit kcov: mark in_softirq_really() as __always_inline docs: mm: fix the incorrect 'FileHugeMapped' field mailmap: modify the entry for Mathieu Othacehe mm/kmemleak: fix sleeping function called from invalid context at print message mm: hugetlb: independent PMD page table shared count maple_tree: reload mas before the second call for mas_empty_area ...	2025-01-05 10:37:45 -08:00
Linus Torvalds	7a5b6fc8bd	Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fixes from Stephen Boyd: "A randconfig build fix and a performance fix: - Fix the CONFIG_RESET_CONTROLLER=n path signature of clk_imx8mp_audiomix_reset_controller_register() to appease randconfig - Speed up the sdhci clk on TH1520 by a factor of 4 by adding a fixed factor clk" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: clk-imx8mp-audiomix: fix function signature clk: thead: Fix TH1520 emmc and shdci clock rate	2025-01-05 10:28:34 -08:00
Thomas Weißschuh	385443057f	kbuild: pacman-pkg: provide versioned linux-api-headers package The Arch Linux glibc package contains a versioned dependency on "linux-api-headers". If the linux-api-headers package provided by pacman-pkg does not specify an explicit version this dependency is not satisfied. Fix the dependency by providing an explicit version. Fixes: `c8578539de` ("kbuild: add script and target to generate pacman package") Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>	2025-01-05 23:19:17 +09:00
Thiébaud Weksteen	5e7f0efd23	selinux: match extended permissions to their base permissions In commit `d1d991efaf` ("selinux: Add netlink xperm support") a new extended permission was added ("nlmsg"). This was the second extended permission implemented in selinux ("ioctl" being the first one). Extended permissions are associated with a base permission. It was found that, in the access vector cache (avc), the extended permission did not keep track of its base permission. This is an issue for a domain that is using both extended permissions (i.e., a domain calling ioctl() on a netlink socket). In this case, the extended permissions were overlapping. Keep track of the base permission in the cache. A new field "base_perm" is added to struct extended_perms_decision to make sure that the extended permission refers to the correct policy permission. A new field "base_perms" is added to struct extended_perms to quickly decide if extended permissions apply. While it is in theory possible to retrieve the base permission from the access vector, the same base permission may not be mapped to the same bit for each class (e.g., "nlmsg" is mapped to a different bit for "netlink_route_socket" and "netlink_audit_socket"). Instead, use a constant (AVC_EXT_IOCTL or AVC_EXT_NLMSG) provided by the caller. Fixes: `d1d991efaf` ("selinux: Add netlink xperm support") Signed-off-by: Thiébaud Weksteen <tweek@google.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2025-01-04 20:58:46 -05:00
Jiawen Wu	8ce4f28752	net: libwx: fix firmware mailbox abnormal return The existing SW-FW interaction flow on the driver is wrong. Follow this wrong flow, driver would never return error if there is a unknown command. Since firmware writes back 'firmware ready' and 'unknown command' in the mailbox message if there is an unknown command sent by driver. So reading 'firmware ready' does not timeout. Then driver would mistakenly believe that the interaction has completed successfully. It tends to happen with the use of custom firmware. Move the check for 'unknown command' out of the poll timeout for 'firmware ready'. And adjust the debug log so that mailbox messages are always printed when commands timeout. Fixes: `1efa9bfe58` ("net: libwx: Implement interaction with firmware") Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Link: https://patch.msgid.link/20250103081013.1995939-1-jiawenwu@trustnetic.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-04 17:42:15 -08:00
Jakub Kicinski	a4faa15d28	Merge tag 'ieee802154-for-net-2025-01-03' of git://git.kernel.org/pub/scm/linux/kernel/git/wpan/wpan Stefan Schmidt says: ==================== pull-request: ieee802154 for net 2025-01-03 Keisuke Nishimura provided a fix to check for kfifo_alloc() in the ca8210 driver. Lizhi Xu provided a fix a corrupted list, found by syzkaller, by checking local interfaces first. * tag 'ieee802154-for-net-2025-01-03' of git://git.kernel.org/pub/scm/linux/kernel/git/wpan/wpan: mac802154: check local interfaces before deleting sdata list ieee802154: ca8210: Add missing check for kfifo_alloc() in ca8210_probe() ==================== Link: https://patch.msgid.link/20250103160046.469363-1-stefan@datenfreihafen.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-04 17:00:44 -08:00
Linus Torvalds	ab75170520	Merge tag 'linux-watchdog-6.13-rc6' of git://www.linux-watchdog.org/linux-watchdog Pull watchdog fix from Wim Van Sebroeck: - fix error message during stm32 driver probe * tag 'linux-watchdog-6.13-rc6' of git://www.linux-watchdog.org/linux-watchdog: watchdog: stm32_iwdg: fix error message during driver probe	2025-01-04 10:59:10 -08:00
Pavel Begunkov	c83c846231	io_uring/timeout: fix multishot updates After update only the first shot of a multishot timeout request adheres to the new timeout value while all subsequent retries continue to use the old value. Don't forget to update the timeout stored in struct io_timeout_data. Cc: stable@vger.kernel.org Fixes: `ea97f6c855` ("io_uring: add support for multishot timeouts") Reported-by: Christian Mazakas <christian.mazakas@gmail.com> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/e6516c3304eb654ec234cfa65c88a9579861e597.1736015288.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-01-04 11:39:00 -07:00
Jakub Kicinski	e95274dfe8	selftests: tc-testing: reduce rshift value After previous change rshift >= 32 is no longer allowed. Modify the test to use 31, the test doesn't seem to send any traffic so the exact value shouldn't matter. Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250103182458.1213486-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-04 08:49:47 -08:00
Eric Dumazet	a039e54397	net_sched: cls_flow: validate TCA_FLOW_RSHIFT attribute syzbot found that TCA_FLOW_RSHIFT attribute was not validated. Right shitfing a 32bit integer is undefined for large shift values. UBSAN: shift-out-of-bounds in net/sched/cls_flow.c:329:23 shift exponent 9445 is too large for 32-bit type 'u32' (aka 'unsigned int') CPU: 1 UID: 0 PID: 54 Comm: kworker/u8:3 Not tainted 6.13.0-rc3-syzkaller-00180-g4f619d518db9 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 Workqueue: ipv6_addrconf addrconf_dad_work Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120 ubsan_epilogue lib/ubsan.c:231 [inline] __ubsan_handle_shift_out_of_bounds+0x3c8/0x420 lib/ubsan.c:468 flow_classify+0x24d5/0x25b0 net/sched/cls_flow.c:329 tc_classify include/net/tc_wrapper.h:197 [inline] __tcf_classify net/sched/cls_api.c:1771 [inline] tcf_classify+0x420/0x1160 net/sched/cls_api.c:1867 sfb_classify net/sched/sch_sfb.c:260 [inline] sfb_enqueue+0x3ad/0x18b0 net/sched/sch_sfb.c:318 dev_qdisc_enqueue+0x4b/0x290 net/core/dev.c:3793 __dev_xmit_skb net/core/dev.c:3889 [inline] __dev_queue_xmit+0xf0e/0x3f50 net/core/dev.c:4400 dev_queue_xmit include/linux/netdevice.h:3168 [inline] neigh_hh_output include/net/neighbour.h:523 [inline] neigh_output include/net/neighbour.h:537 [inline] ip_finish_output2+0xd41/0x1390 net/ipv4/ip_output.c:236 iptunnel_xmit+0x55d/0x9b0 net/ipv4/ip_tunnel_core.c:82 udp_tunnel_xmit_skb+0x262/0x3b0 net/ipv4/udp_tunnel_core.c:173 geneve_xmit_skb drivers/net/geneve.c:916 [inline] geneve_xmit+0x21dc/0x2d00 drivers/net/geneve.c:1039 __netdev_start_xmit include/linux/netdevice.h:5002 [inline] netdev_start_xmit include/linux/netdevice.h:5011 [inline] xmit_one net/core/dev.c:3590 [inline] dev_hard_start_xmit+0x27a/0x7d0 net/core/dev.c:3606 __dev_queue_xmit+0x1b73/0x3f50 net/core/dev.c:4434 Fixes: `e5dfb81518` ("[NET_SCHED]: Add flow classifier") Reported-by: syzbot+1dbb57d994e54aaa04d2@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/6777bf49.050a0220.178762.0040.GAE@google.com/T/#u Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250103104546.3714168-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-04 08:49:36 -08:00
Zhongqiu Duan	3479c7549f	tcp/dccp: allow a connection when sk_max_ack_backlog is zero If the backlog of listen() is set to zero, sk_acceptq_is_full() allows one connection to be made, but inet_csk_reqsk_queue_is_full() does not. When the net.ipv4.tcp_syncookies is zero, inet_csk_reqsk_queue_is_full() will cause an immediate drop before the sk_acceptq_is_full() check in tcp_conn_request(), resulting in no connection can be made. This patch tries to keep consistent with `64a146513f` ("[NET]: Revert incorrect accept queue backlog changes."). Link: https://lore.kernel.org/netdev/20250102080258.53858-1-kuniyu@amazon.com/ Fixes: `ef547f2ac1` ("tcp: remove max_qlen_log") Signed-off-by: Zhongqiu Duan <dzq.aishenghu0@gmail.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250102171426.915276-1-dzq.aishenghu0@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-04 08:17:23 -08:00
Antonio Pastor	1e9b0e1c55	net: 802: LLC+SNAP OID:PID lookup on start of skb data 802.2+LLC+SNAP frames received by napi_complete_done() with GRO and DSA have skb->transport_header set two bytes short, or pointing 2 bytes before network_header & skb->data. This was an issue as snap_rcv() expected offset to point to SNAP header (OID:PID), causing packet to be dropped. A fix at llc_fixup_skb() (`a024e377ef`) resets transport_header for any LLC consumers that may care about it, and stops SNAP packets from being dropped, but doesn't fix the problem which is that LLC and SNAP should not use transport_header offset. Ths patch eliminates the use of transport_header offset for SNAP lookup of OID:PID so that SNAP does not rely on the offset at all. The offset is reset after pull for any SNAP packet consumers that may (but shouldn't) use it. Fixes: `fda55eca5a` ("net: introduce skb_transport_header_was_set()") Signed-off-by: Antonio Pastor <antonio.pastor@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250103012303.746521-1-antonio.pastor@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-04 08:06:24 -08:00
AngeloGioacchino Del Regno	f563dd9ca6	drm/mediatek: Initialize pointer in mtk_drm_of_ddp_path_build_one() The struct device_node next pointer is not initialized, and it is used in an error path in which it may have never been modified by function mtk_drm_of_get_ddp_ep_cid(). Since the error path is relying on that pointer being NULL for the OVL Adaptor and/or invalid component check and since said pointer is being used in prints for %pOF, in the case that it points to a bogus address, the print may cause a KP. To resolve that, initialize the next pointer to NULL before usage. Fixes: `4c932840db` ("drm/mediatek: Implement OF graphs support for display paths") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/dri-devel/633f3c6d-d09f-447c-95f1-dfb4114c50e6@stanley.mountain/ Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: CK Hu <ck.hu@mediatek.com> Reviewed-by: Alexandre Mergnat <amergnat@baylibre.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20241112105030.93337-1-angelogioacchino.delregno@collabora.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>	2025-01-04 12:15:19 +00:00
Liankun Yang	5229081406	drm/mediatek: Add return value check when reading DPCD Check the return value of drm_dp_dpcd_readb() to confirm that AUX communication is successful. To simplify the code, replace drm_dp_dpcd_readb() and DP_GET_SINK_COUNT() with drm_dp_read_sink_count(). Fixes: `f70ac097a2` ("drm/mediatek: Add MT8195 Embedded DisplayPort driver") Signed-off-by: Liankun Yang <liankun.yang@mediatek.com> Reviewed-by: Guillaume Ranquet <granquet@baylibre.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20241218113448.2992-1-liankun.yang@mediatek.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>	2025-01-04 12:15:08 +00:00
Amir Goldstein	03f275adb8	fuse: respect FOPEN_KEEP_CACHE on opendir The re-factoring of fuse_dir_open() missed the need to invalidate directory inode page cache with open flag FOPEN_KEEP_CACHE. Fixes: `7de64d521b` ("fuse: break up fuse_open_common()") Reported-by: Prince Kumar <princer@google.com> Closes: https://lore.kernel.org/linux-fsdevel/CAEW=TRr7CYb4LtsvQPLj-zx5Y+EYBmGfM24SuzwyDoGVNoKm7w@mail.gmail.com/ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/r/20250101130037.96680-1-amir73il@gmail.com Reviewed-by: Bernd Schubert <bernd.schubert@fastmail.fm> Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-01-04 09:58:14 +01:00
Linus Torvalds	63676eefb7	Merge tag 'sched_ext-for-6.13-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext Pull sched_ext fixes from Tejun Heo: - Fix a bug where bpf_iter_scx_dsq_new() was not initializing the iterator's flags and could inadvertently enable e.g. reverse iteration - Fix a bug where scx_ops_bypass() could call irq_restore twice - Add Andrea and Changwoo as maintainers for better review coverage - selftests and tools/sched_ext build and other fixes * tag 'sched_ext-for-6.13-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: sched_ext: Fix dsq_local_on selftest sched_ext: initialize kit->cursor.flags sched_ext: Fix invalid irq restore in scx_ops_bypass() MAINTAINERS: add me as reviewer for sched_ext MAINTAINERS: add self as reviewer for sched_ext scx: Fix maximal BPF selftest prog sched_ext: fix application of sizeof to pointer selftests/sched_ext: fix build after renames in sched_ext API sched_ext: Add __weak to fix the build errors	2025-01-03 15:09:12 -08:00
Linus Torvalds	f9aa1fb9f8	Merge tag 'wq-for-6.13-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq Pull workqueue fixes from Tejun Heo: - Suppress a corner case spurious flush dependency warning - Two trivial changes * tag 'wq-for-6.13-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: add printf attribute to __alloc_workqueue() workqueue: Do not warn when cancelling WQ_MEM_RECLAIM work from !WQ_MEM_RECLAIM worker rust: add safety comment in workqueue traits	2025-01-03 15:03:56 -08:00
Linus Torvalds	2ae3aab557	Merge tag 'block-6.13-20250103' of git://git.kernel.dk/linux Pull block fixes from Jens Axboe: "Collection of fixes for block. Particularly the target name overflow has been a bit annoying, as it results in overwriting random memory and hence shows up as triggering various other bugs. - NVMe pull request via Keith: - Fix device specific quirk for PRP list alignment (Robert) - Fix target name overflow (Leo) - Fix target write granularity (Luis) - Fix target sleeping in atomic context (Nilay) - Remove unnecessary tcp queue teardown (Chunguang) - Simple cdrom typo fix" * tag 'block-6.13-20250103' of git://git.kernel.dk/linux: cdrom: Fix typo, 'devicen' to 'device' nvme-tcp: remove nvme_tcp_destroy_io_queues() nvmet-loop: avoid using mutex in IO hotpath nvmet: propagate npwg topology nvmet: Don't overflow subsysnqn nvme-pci: 512 byte aligned dma pool segment quirk	2025-01-03 14:58:57 -08:00
Linus Torvalds	a984e234fc	Merge tag 'io_uring-6.13-20250103' of git://git.kernel.dk/linux Pull io_uring fixes from Jens Axboe: - Fix an issue with the read multishot support and posting of CQEs from io-wq context - Fix a regression introduced in this cycle, where making the timeout lock a raw one uncovered another locking dependency. As a result, move the timeout flushing outside of the timeout lock, punting them to a local list first - Fix use of an uninitialized variable in io_async_msghdr. Doesn't really matter functionally, but silences a valid KMSAN complaint that it's not always initialized - Fix use of incrementally provided buffers for read on non-pollable files, where the buffer always gets committed upfront. Unfortunately the buffer address isn't resolved first, so the read ends up using the updated rather than the current value * tag 'io_uring-6.13-20250103' of git://git.kernel.dk/linux: io_uring/kbuf: use pre-committed buffer address for non-pollable file io_uring/net: always initialize kmsg->msg.msg_inq upfront io_uring/timeout: flush timeouts outside of the timeout lock io_uring/rw: fix downgraded mshot read	2025-01-03 14:45:59 -08:00
Linus Torvalds	aba74e639f	Merge tag 'net-6.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from wireles and netfilter. Nothing major here. Over the last two weeks we gathered only around two-thirds of our normal weekly fix count, but delaying sending these until -rc7 seemed like a really bad idea. AFAIK we have no bugs under investigation. One or two reverts for stuff for which we haven't gotten a proper fix will likely come in the next PR. Current release - fix to a fix: - netfilter: nft_set_hash: unaligned atomic read on struct nft_set_ext - eth: gve: trigger RX NAPI instead of TX NAPI in gve_xsk_wakeup Previous releases - regressions: - net: reenable NETIF_F_IPV6_CSUM offload for BIG TCP packets - mptcp: - fix sleeping rcvmsg sleeping forever after bad recvbuffer adjust - fix TCP options overflow - prevent excessive coalescing on receive, fix throughput - net: fix memory leak in tcp_conn_request() if map insertion fails - wifi: cw1200: fix potential NULL dereference after conversion to GPIO descriptors - phy: micrel: dynamically control external clock of KSZ PHY, fix suspend behavior Previous releases - always broken: - af_packet: fix VLAN handling with MSG_PEEK - net: restrict SO_REUSEPORT to inet sockets - netdev-genl: avoid empty messages in NAPI get - dsa: microchip: fix set_ageing_time function on KSZ9477 and LAN937X - eth: - gve: XDP fixes around transmit, queue wakeup etc. - ti: icssg-prueth: fix firmware load sequence to prevent time jump which breaks timesync related operations Misc: - netlink: specs: mptcp: add missing attr and improve documentation" * tag 'net-6.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (50 commits) net: ti: icssg-prueth: Fix clearing of IEP_CMP_CFG registers during iep_init net: ti: icssg-prueth: Fix firmware load sequence. mptcp: prevent excessive coalescing on receive mptcp: don't always assume copied data in mptcp_cleanup_rbuf() mptcp: fix recvbuffer adjust on sleeping rcvmsg ila: serialize calls to nf_register_net_hooks() af_packet: fix vlan_get_protocol_dgram() vs MSG_PEEK af_packet: fix vlan_get_tci() vs MSG_PEEK net: wwan: iosm: Properly check for valid exec stage in ipc_mmio_init() net: restrict SO_REUSEPORT to inet sockets net: reenable NETIF_F_IPV6_CSUM offload for BIG TCP packets net: sfc: Correct key_len for efx_tc_ct_zone_ht_params net: wwan: t7xx: Fix FSM command timeout issue sky2: Add device ID 11ab:4373 for Marvell 88E8075 mptcp: fix TCP options overflow. net: mv643xx_eth: fix an OF node reference leak gve: trigger RX NAPI instead of TX NAPI in gve_xsk_wakeup eth: bcmsysport: fix call balance of priv->clk handling routines net: llc: reset skb->transport_header netlink: specs: mptcp: fix missing doc ...	2025-01-03 14:36:54 -08:00
Linus Torvalds	ee063c23e4	Merge tag 'nios2_update_for_v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux Pull nios2 fixlet from Dinh Nguyen: - Use str_yes_no() helper function * tag 'nios2_update_for_v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux: nios2: Use str_yes_no() helper in show_cpuinfo()	2025-01-03 14:16:25 -08:00
Linus Torvalds	dea3165f98	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma Pull rdma fixes from Jason Gunthorpe: "A lot of fixes accumulated over the holiday break: - Static tool fixes, value is already proven to be NULL, possible integer overflow - Many bnxt_re fixes: - Crashes due to a mismatch in the maximum SGE list size - Don't waste memory for user QPs by creating kernel-only structures - Fix compatability issues with older HW in some of the new HW features recently introduced: RTS->RTS feature, work around 9096 - Do not allow destroy_qp to fail - Validate QP MTU against device limits - Add missing validation on madatory QP attributes for RTR->RTS - Report port_num in query_qp as required by the spec - Fix creation of QPs of the maximum queue size, and in the variable mode - Allow all QPs to be used on newer HW by limiting a work around only to HW it affects - Use the correct MSN table size for variable mode QPs - Add missing locking in create_qp() accessing the qp_tbl - Form WQE buffers correctly when some of the buffers are 0 hop - Don't crash on QP destroy if the userspace doesn't setup the dip_ctx - Add the missing QP flush handler call on the DWQE path to avoid hanging on error recovery - Consistently use ENXIO for return codes if the devices is fatally errored - Try again to fix VLAN support on iwarp, previous fix was reverted due to breaking other cards - Correct error path return code for rdma netlink events - Remove the seperate net_device pointer in siw and rxe which syzkaller found a way to UAF - Fix a UAF of a stack ib_sge in rtrs - Fix a regression where old mlx5 devices and FW were wrongly activing new device features and failing" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (28 commits) RDMA/mlx5: Enable multiplane mode only when it is supported RDMA/bnxt_re: Fix error recovery sequence RDMA/rtrs: Ensure 'ib_sge list' is accessible RDMA/rxe: Remove the direct link to net_device RDMA/hns: Fix missing flush CQE for DWQE RDMA/hns: Fix warning storm caused by invalid input in IO path RDMA/hns: Fix accessing invalid dip_ctx during destroying QP RDMA/hns: Fix mapping error of zero-hop WQE buffer RDMA/bnxt_re: Fix the locking while accessing the QP table RDMA/bnxt_re: Fix MSN table size for variable wqe mode RDMA/bnxt_re: Add send queue size check for variable wqe RDMA/bnxt_re: Disable use of reserved wqes RDMA/bnxt_re: Fix max_qp_wrs reported RDMA/siw: Remove direct link to net_device RDMA/nldev: Set error code in rdma_nl_notify_event RDMA/bnxt_re: Fix reporting hw_ver in query_device RDMA/bnxt_re: Fix to export port num to ib_query_qp RDMA/bnxt_re: Fix setting mandatory attributes for modify_qp RDMA/bnxt_re: Add check for path mtu in modify_qp RDMA/bnxt_re: Fix the check for 9060 condition ...	2025-01-03 11:09:35 -08:00
Linus Torvalds	f274fffbc2	Merge tag 'pinctrl-v6.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control fixes from Linus Walleij: - A small Kconfig fixup for the i.MX. In principle this could come in from the SoC tree but the bug was introduced from the pin control tree so let's fix it from here. - Fix a sleep in atomic context in the MCP23xxx GPIO expander by disabling the regmap locking and using explicit mutex locks. * tag 'pinctrl-v6.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: mcp23s08: Fix sleeping in atomic context due to regmap locking ARM: imx: Re-introduce the PINCTRL selection	2025-01-03 10:57:57 -08:00
Linus Torvalds	4f5d3da619	Merge tag 'sound-6.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "The first new year pull request: no surprises, all small fixes, including: - Follow-up fixes for the new compress-offload API extension - A couple of fixes for MIDI 2.0 UMP handling - A trivial race fix for OSS sequencer emulation ioctls - USB-audio and HD-audio fixes / quirks" * tag 'sound-6.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: seq: Check UMP support for midi_version change ALSA hda/realtek: Add quirk for Framework F111:000C Revert "ALSA: ump: Don't enumeration invalid groups for legacy rawmidi" ALSA: seq: oss: Fix races at processing SysEx messages ALSA: compress_offload: fix remaining descriptor races in sound/core/compress_offload.c ALSA: compress_offload: Drop unneeded no_free_ptr() ALSA: hda/tas2781: Ignore SUBSYS_ID not found for tas2563 projects ALSA: usb-audio: US16x08: Initialize array before use	2025-01-03 10:54:51 -08:00
Linus Torvalds	92c3bb3d2e	Merge tag 'drm-fixes-2025-01-03' of https://gitlab.freedesktop.org/drm/kernel Pull drm fixes from Dave Airlie: "Happy New Year. It was fairly quiet for holidays period, certainly nothing that worth getting off the couch before I needed to, this is for the past two weeks, i915, xe and some adv7511, I expect we will see some amdgpu etc happening next week, but otherwise all quiet. i915: - Fix C10 pll programming sequence [cx0_phy] - Fix power gate sequence. [dg1] xe: - uapi: Revert some devcoredump file format changes breaking a mesa debug tool - Fixes around waits when moving to system - Fix a typo when checking for LMEM provisioning - Fix a fault on fd close after unbind - A couple of OA fixes squashed for stable backporting adv7511: - fix UAF - drop single lane support - audio infoframe fix" * tag 'drm-fixes-2025-01-03' of https://gitlab.freedesktop.org/drm/kernel: xe/oa: Fix query mode of operation for OAR/OAC drm/i915/dg1: Fix power gate sequence. drm/i915/cx0_phy: Fix C10 pll programming sequence drm/xe: Fix fault on fd close after unbind drm/xe/pf: Use correct function to check LMEM provisioning drm/xe: Wait for migration job before unmapping pages drm/xe: Use non-interruptible wait when moving BO to system drm/xe: Revert some changes that break a mesa debug tool drm: adv7511: Drop dsi single lane support dt-bindings: display: adi,adv7533: Drop single lane support drm: adv7511: Fix use-after-free in adv7533_attach_dsi() drm/bridge: adv7511_audio: Update Audio InfoFrame properly	2025-01-03 10:06:44 -08:00
Linus Torvalds	e30dd219c7	Merge tag 'ftrace-v6.13-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull ftrace fixes from Steven Rostedt: - Add needed READ_ONCE() around access to the fgraph array element The updates to the fgraph array can happen when callbacks are registered and unregistered. The __ftrace_return_to_handler() can handle reading either the old value or the new value. But once it reads that value it must stay consistent otherwise the check that looks to see if the value is a stub may show false, but if the compiler decides to re-read after that check, it can be true which can cause the code to crash later on. - Make function profiler use the top level ops for filtering again When function graph became available for instances, its filter ops became independent from the top level set_ftrace_filter. In the process the function profiler received its own filter ops as well. But the function profiler uses the top level set_ftrace_filter file and does not have one of its own. In giving it its own filter ops, it lost any user interface it once had. Make it use the top level set_ftrace_filter file again. This fixes a regression. * tag 'ftrace-v6.13-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: ftrace: Fix function profiler's filtering functionality fgraph: Add READ_ONCE() when accessing fgraph_array[]	2025-01-03 10:04:43 -08:00
Jens Axboe	ed123c948d	io_uring/kbuf: use pre-committed buffer address for non-pollable file For non-pollable files, buffer ring consumption will commit upfront. This is fine, but io_ring_buffer_select() will return the address of the buffer after having committed it. For incrementally consumed buffers, this is incorrect as it will modify the buffer address. Store the pre-committed value and return that. If that isn't done, then the initial part of the buffer is not used and the application will correctly assume the content arrived at the start of the userspace buffer, but the kernel will have put it later in the buffer. Or it can cause a spurious -EFAULT returned in the CQE, depending on the buffer size. As bounds are suitably checked for doing the actual IO, no adverse side effects are possible - it's just a data misplacement within the existing buffer. Reported-by: Gwendal Fernet <gwendalfernet@gmail.com> Cc: stable@vger.kernel.org Fixes: `ae98dbf43d` ("io_uring/kbuf: add support for incremental buffer consumption") Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-01-03 09:38:37 -07:00
Koichiro Den	8bd76b3d3f	gpio: sim: lock up configfs that an instantiated device depends on Once a sim device is instantiated and actively used, allowing rmdir for its configfs serves no purpose and can be confusing. Effectively, arbitrary users start depending on its existence. Make the subsystem itself depend on the configfs entry for a sim device while it is in active use. Signed-off-by: Koichiro Den <koichiro.den@canonical.com> Link: https://lore.kernel.org/r/20250103141829.430662-5-koichiro.den@canonical.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>	2025-01-03 17:15:04 +01:00
Koichiro Den	c7c434c1db	gpio: virtuser: lock up configfs that an instantiated device depends on Once a virtuser device is instantiated and actively used, allowing rmdir for its configfs serves no purpose and can be confusing. Userspace interacts with the virtual consumer at arbitrary times, meaning it depends on its existence. Make the subsystem itself depend on the configfs entry for a virtuser device while it is in active use. Signed-off-by: Koichiro Den <koichiro.den@canonical.com> Link: https://lore.kernel.org/r/20250103141829.430662-4-koichiro.den@canonical.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>	2025-01-03 17:15:04 +01:00
Koichiro Den	656cc2e892	gpio: virtuser: fix handling of multiple conn_ids in lookup table Creating a virtuser device via configfs with multiple conn_ids fails due to incorrect indexing of lookup entries. Correct the indexing logic to ensure proper functionality when multiple gpio_virtuser_lookup are created. Fixes: `91581c4b3f` ("gpio: virtuser: new virtual testing driver for the GPIO API") Signed-off-by: Koichiro Den <koichiro.den@canonical.com> Link: https://lore.kernel.org/r/20250103141829.430662-3-koichiro.den@canonical.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>	2025-01-03 17:15:03 +01:00
Koichiro Den	a619cba8c6	gpio: virtuser: fix missing lookup table cleanups When a virtuser device is created via configfs and the probe fails due to an incorrect lookup table, the table is not removed. This prevents subsequent probe attempts from succeeding, even if the issue is corrected, unless the device is released. Additionally, cleanup is also needed in the less likely case of platform_device_register_full() failure. Besides, a consistent memory leak in lookup_table->dev_id was spotted using kmemleak by toggling the live state between 0 and 1 with a correct lookup table. Introduce gpio_virtuser_remove_lookup_table() as the counterpart to the existing gpio_virtuser_make_lookup_table() and call it from all necessary points to ensure proper cleanup. Fixes: `91581c4b3f` ("gpio: virtuser: new virtual testing driver for the GPIO API") Signed-off-by: Koichiro Den <koichiro.den@canonical.com> Link: https://lore.kernel.org/r/20250103141829.430662-2-koichiro.den@canonical.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>	2025-01-03 17:15:03 +01:00
Milan Broz	548c6edbed	dm-verity FEC: Avoid copying RS parity bytes twice. Caching RS parity bytes is already done in fec_decode_bufs() now, no need to use yet another buffer for conversion to uint16_t. This patch removes that double copy of RS parity bytes. Signed-off-by: Milan Broz <gmazyland@gmail.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>	2025-01-03 17:08:49 +01:00
Milan Broz	6df90c02ba	dm-verity FEC: Fix RS FEC repair for roots unaligned to block size (take 2) This patch fixes an issue that was fixed in the commit `df7b59ba92` ("dm verity: fix FEC for RS roots unaligned to block size") but later broken again in the commit `8ca7cab82b` ("dm verity fec: fix misaligned RS roots IO") If the Reed-Solomon roots setting spans multiple blocks, the code does not use proper parity bytes and randomly fails to repair even trivial errors. This bug cannot happen if the sector size is multiple of RS roots setting (Android case with roots 2). The previous solution was to find a dm-bufio block size that is multiple of the device sector size and roots size. Unfortunately, the optimization in commit `8ca7cab82b` ("dm verity fec: fix misaligned RS roots IO") is incorrect and uses data block size for some roots (for example, it uses 4096 block size for roots = 20). This patch uses a different approach: - It always uses a configured data block size for dm-bufio to avoid possible misaligned IOs. - and it caches the processed parity bytes, so it can join it if it spans two blocks. As the RS calculation is called only if an error is detected and the process is computationally intensive, copying a few more bytes should not introduce performance issues. The issue was reported to cryptsetup with trivial reproducer https://gitlab.com/cryptsetup/cryptsetup/-/issues/923 Reproducer (with roots=20): # create verity device with RS FEC dd if=/dev/urandom of=data.img bs=4096 count=8 status=none veritysetup format data.img hash.img --fec-device=fec.img --fec-roots=20 \| \ awk '/^Root hash/{ print $3 }' >roothash # create an erasure that should always be repairable with this roots setting dd if=/dev/zero of=data.img conv=notrunc bs=1 count=4 seek=4 status=none # try to read it through dm-verity veritysetup open data.img test hash.img --fec-device=fec.img --fec-roots=20 $(cat roothash) dd if=/dev/mapper/test of=/dev/null bs=4096 status=noxfer Even now the log says it cannot repair it: : verity-fec: 7:1: FEC 0: failed to correct: -74 : device-mapper: verity: 7:1: data block 0 is corrupted ... With this fix, errors are properly repaired. : verity-fec: 7:1: FEC 0: corrected 4 errors Signed-off-by: Milan Broz <gmazyland@gmail.com> Fixes: `8ca7cab82b` ("dm verity fec: fix misaligned RS roots IO") Cc: stable@vger.kernel.org Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>	2025-01-03 17:08:25 +01:00
Alex Williamson	09dfc8a5f2	vfio/pci: Fallback huge faults for unaligned pfn The PFN must also be aligned to the fault order to insert a huge pfnmap. Test the alignment and fallback when unaligned. Fixes: `f9e54c3a2f` ("vfio/pci: implement huge_fault support") Link: https://bugzilla.kernel.org/show_bug.cgi?id=219619 Reported-by: Athul Krishna <athul.krishna.kr@protonmail.com> Reported-by: Precific <precification@posteo.de> Reviewed-by: Peter Xu <peterx@redhat.com> Tested-by: Precific <precification@posteo.de> Link: https://lore.kernel.org/r/20250102183416.1841878-1-alex.williamson@redhat.com Cc: stable@vger.kernel.org Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2025-01-03 08:49:05 -07:00
Mark Zhang	45d339fefa	RDMA/mlx5: Enable multiplane mode only when it is supported Driver queries vport_cxt.num_plane and enables multiplane when it is greater then 0, but some old FWs (versions from x.40.1000 till x.42.1000), report vport_cxt.num_plane = 1 unexpectedly. Fix it by querying num_plane only when HCA_CAP2.multiplane bit is set. Fixes: `2a5db20fa5` ("RDMA/mlx5: Add support to multi-plane device and port") Link: https://patch.msgid.link/r/1ef901acdf564716fcf550453cf5e94f343777ec.1734610916.git.leon@kernel.org Cc: stable@vger.kernel.org Reported-by: Francesco Poli <invernomuto@paranoici.org> Closes: https://lore.kernel.org/all/nvs4i2v7o6vn6zhmtq4sgazy2hu5kiulukxcntdelggmznnl7h@so3oul6uwgbl/ Signed-off-by: Mark Zhang <markzhang@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2025-01-03 09:17:19 -04:00
David S. Miller	ce21419b55	Merge branch 'net-iep-clock-module-fixes' Meghana Malladi says: ==================== IEP clock module bug fixes This series has some bug fixes for IEP module needed by PPS and timesync operations. Patch 1/2 fixes firmware load sequence to run all the firmwares when either of the ethernet interfaces is up. Move all the code common for firmware bringup under common functions. Patch 2/2 fixes distorted PPS signal when the ethernet interfaces are brough down and up. This patch also fixes enabling PPS signal after bringing the interface up, without disabling PPS. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2025-01-03 11:54:06 +00:00
Meghana Malladi	9b11536124	net: ti: icssg-prueth: Fix clearing of IEP_CMP_CFG registers during iep_init When ICSSG interfaces are brought down and brought up again, the pru cores are shut down and booted again, flushing out all the memories and start again in a clean state. Hence it is expected that the IEP_CMP_CFG register needs to be flushed during iep_init() to ensure that the existing residual configuration doesn't cause any unusual behavior. If the register is not cleared, existing IEP_CMP_CFG set for CMP1 will result in SYNC0_OUT signal based on the SYNC_OUT register values. After bringing the interface up, calling PPS enable doesn't work as the driver believes PPS is already enabled, (iep->pps_enabled is not cleared during interface bring down) and driver will just return true even though there is no signal. Fix this by disabling pps and perout. Fixes: `c1e0230eea` ("net: ti: icss-iep: Add IEP driver") Signed-off-by: Meghana Malladi <m-malladi@ti.com> Reviewed-by: Roger Quadros <rogerq@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2025-01-03 11:54:06 +00:00
MD Danish Anwar	9facce84f4	net: ti: icssg-prueth: Fix firmware load sequence. Timesync related operations are ran in PRU0 cores for both ICSSG SLICE0 and SLICE1. Currently whenever any ICSSG interface comes up we load the respective firmwares to PRU cores and whenever interface goes down, we stop the resective cores. Due to this, when SLICE0 goes down while SLICE1 is still active, PRU0 firmwares are unloaded and PRU0 core is stopped. This results in clock jump for SLICE1 interface as the timesync related operations are no longer running. As there are interdependencies between SLICE0 and SLICE1 firmwares, fix this by running both PRU0 and PRU1 firmwares as long as at least 1 ICSSG interface is up. Add new flag in prueth struct to check if all firmwares are running and remove the old flag (fw_running). Use emacs_initialized as reference count to load the firmwares for the first and last interface up/down. Moving init_emac_mode and fw_offload_mode API outside of icssg_config to icssg_common_start API as they need to be called only once per firmware boot. Change prueth_emac_restart() to return error code and add error prints inside the caller of this functions in case of any failures. Move prueth_emac_stop() from common to sr1 driver. sr1 and sr2 drivers have different logic handling for stopping the firmwares. While sr1 driver is dependent on emac structure to stop the corresponding pru cores for that slice, for sr2 all the pru cores of both the slices are stopped and is not dependent on emac. So the prueth_emac_stop() function is no longer common and can be moved to sr1 driver. Fixes: `c1e0230eea` ("net: ti: icss-iep: Add IEP driver") Signed-off-by: MD Danish Anwar <danishanwar@ti.com> Signed-off-by: Meghana Malladi <m-malladi@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2025-01-03 11:54:06 +00:00
Jakub Kicinski	3473020d0f	Merge branch 'mptcp-rx-path-fixes' Matthieu Baerts says: ==================== mptcp: rx path fixes Here are 3 different fixes, all related to the MPTCP receive buffer: - Patch 1: fix receive buffer space when recvmsg() blocks after receiving some data. For a fix introduced in v6.12, backported to v6.1. - Patch 2: mptcp_cleanup_rbuf() can be called when no data has been copied. For 5.11. - Patch 3: prevent excessive coalescing on receive, which can affect the throughput badly. It looks better to wait a bit before backporting this one to stable versions, to get more results. For 5.10. ==================== Link: https://patch.msgid.link/20241230-net-mptcp-rbuf-fixes-v1-0-8608af434ceb@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-02 18:44:06 -08:00
Paolo Abeni	56b824eb49	mptcp: prevent excessive coalescing on receive Currently the skb size after coalescing is only limited by the skb layout (the skb must not carry frag_list). A single coalesced skb covering several MSS can potentially fill completely the receive buffer. In such a case, the snd win will zero until the receive buffer will be empty again, affecting tput badly. Fixes: `8268ed4c9d` ("mptcp: introduce and use mptcp_try_coalesce()") Cc: stable@vger.kernel.org # please delay 2 weeks after 6.13-final release Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20241230-net-mptcp-rbuf-fixes-v1-3-8608af434ceb@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-02 18:44:03 -08:00
Paolo Abeni	551844f26d	mptcp: don't always assume copied data in mptcp_cleanup_rbuf() Under some corner cases the MPTCP protocol can end-up invoking mptcp_cleanup_rbuf() when no data has been copied, but such helper assumes the opposite condition. Explicitly drop such assumption and performs the costly call only when strictly needed - before releasing the msk socket lock. Fixes: `fd8976790a` ("mptcp: be careful on MPTCP-level ack.") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20241230-net-mptcp-rbuf-fixes-v1-2-8608af434ceb@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-02 18:44:03 -08:00
Paolo Abeni	449e6912a2	mptcp: fix recvbuffer adjust on sleeping rcvmsg If the recvmsg() blocks after receiving some data - i.e. due to SO_RCVLOWAT - the MPTCP code will attempt multiple times to adjust the receive buffer size, wrongly accounting every time the cumulative of received data - instead of accounting only for the delta. Address the issue moving mptcp_rcv_space_adjust just after the data reception and passing it only the just received bytes. This also removes an unneeded difference between the TCP and MPTCP RX code path implementation. Fixes: `5813022985` ("mptcp: error out earlier on disconnect") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20241230-net-mptcp-rbuf-fixes-v1-1-8608af434ceb@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-02 18:44:03 -08:00
Eric Dumazet	260466b576	ila: serialize calls to nf_register_net_hooks() syzbot found a race in ila_add_mapping() [1] commit `031ae72825` ("ila: call nf_unregister_net_hooks() sooner") attempted to fix a similar issue. Looking at the syzbot repro, we have concurrent ILA_CMD_ADD commands. Add a mutex to make sure at most one thread is calling nf_register_net_hooks(). [1] BUG: KASAN: slab-use-after-free in rht_key_hashfn include/linux/rhashtable.h:159 [inline] BUG: KASAN: slab-use-after-free in __rhashtable_lookup.constprop.0+0x426/0x550 include/linux/rhashtable.h:604 Read of size 4 at addr ffff888028f40008 by task dhcpcd/5501 CPU: 1 UID: 0 PID: 5501 Comm: dhcpcd Not tainted 6.13.0-rc4-syzkaller-00054-gd6ef8b40d075 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:378 [inline] print_report+0xc3/0x620 mm/kasan/report.c:489 kasan_report+0xd9/0x110 mm/kasan/report.c:602 rht_key_hashfn include/linux/rhashtable.h:159 [inline] __rhashtable_lookup.constprop.0+0x426/0x550 include/linux/rhashtable.h:604 rhashtable_lookup include/linux/rhashtable.h:646 [inline] rhashtable_lookup_fast include/linux/rhashtable.h:672 [inline] ila_lookup_wildcards net/ipv6/ila/ila_xlat.c:127 [inline] ila_xlat_addr net/ipv6/ila/ila_xlat.c:652 [inline] ila_nf_input+0x1ee/0x620 net/ipv6/ila/ila_xlat.c:185 nf_hook_entry_hookfn include/linux/netfilter.h:154 [inline] nf_hook_slow+0xbb/0x200 net/netfilter/core.c:626 nf_hook.constprop.0+0x42e/0x750 include/linux/netfilter.h:269 NF_HOOK include/linux/netfilter.h:312 [inline] ipv6_rcv+0xa4/0x680 net/ipv6/ip6_input.c:309 __netif_receive_skb_one_core+0x12e/0x1e0 net/core/dev.c:5672 __netif_receive_skb+0x1d/0x160 net/core/dev.c:5785 process_backlog+0x443/0x15f0 net/core/dev.c:6117 __napi_poll.constprop.0+0xb7/0x550 net/core/dev.c:6883 napi_poll net/core/dev.c:6952 [inline] net_rx_action+0xa94/0x1010 net/core/dev.c:7074 handle_softirqs+0x213/0x8f0 kernel/softirq.c:561 __do_softirq kernel/softirq.c:595 [inline] invoke_softirq kernel/softirq.c:435 [inline] __irq_exit_rcu+0x109/0x170 kernel/softirq.c:662 irq_exit_rcu+0x9/0x30 kernel/softirq.c:678 instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1049 [inline] sysvec_apic_timer_interrupt+0xa4/0xc0 arch/x86/kernel/apic/apic.c:1049 Fixes: `7f00feaf10` ("ila: Add generic ILA translation facility") Reported-by: syzbot+47e761d22ecf745f72b9@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/6772c9ae.050a0220.2f3838.04c7.GAE@google.com/T/#u Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Florian Westphal <fw@strlen.de> Cc: Tom Herbert <tom@herbertland.com> Link: https://patch.msgid.link/20241230162849.2795486-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-02 18:42:32 -08:00
Eric Dumazet	f91a5b8089	af_packet: fix vlan_get_protocol_dgram() vs MSG_PEEK Blamed commit forgot MSG_PEEK case, allowing a crash [1] as found by syzbot. Rework vlan_get_protocol_dgram() to not touch skb at all, so that it can be used from many cpus on the same skb. Add a const qualifier to skb argument. [1] skbuff: skb_under_panic: text:ffffffff8a8ccd05 len:29 put:14 head:ffff88807fc8e400 data:ffff88807fc8e3f4 tail:0x11 end:0x140 dev:<NULL> ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:206 ! Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI CPU: 1 UID: 0 PID: 5892 Comm: syz-executor883 Not tainted 6.13.0-rc4-syzkaller-00054-gd6ef8b40d075 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 RIP: 0010:skb_panic net/core/skbuff.c:206 [inline] RIP: 0010:skb_under_panic+0x14b/0x150 net/core/skbuff.c:216 Code: 0b 8d 48 c7 c6 86 d5 25 8e 48 8b 54 24 08 8b 0c 24 44 8b 44 24 04 4d 89 e9 50 41 54 41 57 41 56 e8 5a 69 79 f7 48 83 c4 20 90 <0f> 0b 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 RSP: 0018:ffffc900038d7638 EFLAGS: 00010282 RAX: 0000000000000087 RBX: dffffc0000000000 RCX: 609ffd18ea660600 RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000 RBP: ffff88802483c8d0 R08: ffffffff817f0a8c R09: 1ffff9200071ae60 R10: dffffc0000000000 R11: fffff5200071ae61 R12: 0000000000000140 R13: ffff88807fc8e400 R14: ffff88807fc8e3f4 R15: 0000000000000011 FS: 00007fbac5e006c0(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fbac5e00d58 CR3: 000000001238e000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> skb_push+0xe5/0x100 net/core/skbuff.c:2636 vlan_get_protocol_dgram+0x165/0x290 net/packet/af_packet.c:585 packet_recvmsg+0x948/0x1ef0 net/packet/af_packet.c:3552 sock_recvmsg_nosec net/socket.c:1033 [inline] sock_recvmsg+0x22f/0x280 net/socket.c:1055 ____sys_recvmsg+0x1c6/0x480 net/socket.c:2803 ___sys_recvmsg net/socket.c:2845 [inline] do_recvmmsg+0x426/0xab0 net/socket.c:2940 __sys_recvmmsg net/socket.c:3014 [inline] __do_sys_recvmmsg net/socket.c:3037 [inline] __se_sys_recvmmsg net/socket.c:3030 [inline] __x64_sys_recvmmsg+0x199/0x250 net/socket.c:3030 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f Fixes: `79eecf631c` ("af_packet: Handle outgoing VLAN packets without hardware offloading") Reported-by: syzbot+74f70bb1cb968bf09e4f@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/6772c485.050a0220.2f3838.04c5.GAE@google.com/T/#u Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Chengen Du <chengen.du@canonical.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20241230161004.2681892-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-02 18:40:59 -08:00
Eric Dumazet	77ee7a6d16	af_packet: fix vlan_get_tci() vs MSG_PEEK Blamed commit forgot MSG_PEEK case, allowing a crash [1] as found by syzbot. Rework vlan_get_tci() to not touch skb at all, so that it can be used from many cpus on the same skb. Add a const qualifier to skb argument. [1] skbuff: skb_under_panic: text:ffffffff8a8da482 len:32 put:14 head:ffff88807a1d5800 data:ffff88807a1d5810 tail:0x14 end:0x140 dev:<NULL> ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:206 ! Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI CPU: 0 UID: 0 PID: 5880 Comm: syz-executor172 Not tainted 6.13.0-rc3-syzkaller-00762-g9268abe611b0 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 RIP: 0010:skb_panic net/core/skbuff.c:206 [inline] RIP: 0010:skb_under_panic+0x14b/0x150 net/core/skbuff.c:216 Code: 0b 8d 48 c7 c6 9e 6c 26 8e 48 8b 54 24 08 8b 0c 24 44 8b 44 24 04 4d 89 e9 50 41 54 41 57 41 56 e8 3a 5a 79 f7 48 83 c4 20 90 <0f> 0b 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 RSP: 0018:ffffc90003baf5b8 EFLAGS: 00010286 RAX: 0000000000000087 RBX: dffffc0000000000 RCX: 8565c1eec37aa000 RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000 RBP: ffff88802616fb50 R08: ffffffff817f0a4c R09: 1ffff92000775e50 R10: dffffc0000000000 R11: fffff52000775e51 R12: 0000000000000140 R13: ffff88807a1d5800 R14: ffff88807a1d5810 R15: 0000000000000014 FS: 00007fa03261f6c0(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffd65753000 CR3: 0000000031720000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> skb_push+0xe5/0x100 net/core/skbuff.c:2636 vlan_get_tci+0x272/0x550 net/packet/af_packet.c:565 packet_recvmsg+0x13c9/0x1ef0 net/packet/af_packet.c:3616 sock_recvmsg_nosec net/socket.c:1044 [inline] sock_recvmsg+0x22f/0x280 net/socket.c:1066 ____sys_recvmsg+0x1c6/0x480 net/socket.c:2814 ___sys_recvmsg net/socket.c:2856 [inline] do_recvmmsg+0x426/0xab0 net/socket.c:2951 __sys_recvmmsg net/socket.c:3025 [inline] __do_sys_recvmmsg net/socket.c:3048 [inline] __se_sys_recvmmsg net/socket.c:3041 [inline] __x64_sys_recvmmsg+0x199/0x250 net/socket.c:3041 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 Fixes: `79eecf631c` ("af_packet: Handle outgoing VLAN packets without hardware offloading") Reported-by: syzbot+8400677f3fd43f37d3bc@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/6772c485.050a0220.2f3838.04c6.GAE@google.com/T/#u Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Chengen Du <chengen.du@canonical.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20241230161004.2681892-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-02 18:40:54 -08:00
Maciej S. Szmigiero	a7af435df0	net: wwan: iosm: Properly check for valid exec stage in ipc_mmio_init() ipc_mmio_init() used the post-decrement operator in its loop continuing condition of "retries" counter being "> 0", which meant that when this condition caused loop exit "retries" counter reached -1. But the later valid exec stage failure check only tests for "retries" counter being exactly zero, so it didn't trigger in this case (but would wrongly trigger if the code reaches a valid exec stage in the very last loop iteration). Fix this by using the pre-decrement operator instead, so the loop counter is exactly zero on valid exec stage failure. Fixes: `dc0514f5d8` ("net: iosm: mmio scratchpad") Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name> Link: https://patch.msgid.link/8b19125a825f9dcdd81c667c1e5c48ba28d505a6.1735490770.git.mail@maciej.szmigiero.name Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-02 18:37:50 -08:00
Eric Dumazet	5b0af621c3	net: restrict SO_REUSEPORT to inet sockets After blamed commit, crypto sockets could accidentally be destroyed from RCU call back, as spotted by zyzbot [1]. Trying to acquire a mutex in RCU callback is not allowed. Restrict SO_REUSEPORT socket option to inet sockets. v1 of this patch supported TCP, UDP and SCTP sockets, but fcnal-test.sh test needed RAW and ICMP support. [1] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:562 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 24, name: ksoftirqd/1 preempt_count: 100, expected: 0 RCU nest depth: 0, expected: 0 1 lock held by ksoftirqd/1/24: #0: ffffffff8e937ba0 (rcu_callback){....}-{0:0}, at: rcu_lock_acquire include/linux/rcupdate.h:337 [inline] #0: ffffffff8e937ba0 (rcu_callback){....}-{0:0}, at: rcu_do_batch kernel/rcu/tree.c:2561 [inline] #0: ffffffff8e937ba0 (rcu_callback){....}-{0:0}, at: rcu_core+0xa37/0x17a0 kernel/rcu/tree.c:2823 Preemption disabled at: [<ffffffff8161c8c8>] softirq_handle_begin kernel/softirq.c:402 [inline] [<ffffffff8161c8c8>] handle_softirqs+0x128/0x9b0 kernel/softirq.c:537 CPU: 1 UID: 0 PID: 24 Comm: ksoftirqd/1 Not tainted 6.13.0-rc3-syzkaller-00174-ga024e377efed #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120 __might_resched+0x5d4/0x780 kernel/sched/core.c:8758 __mutex_lock_common kernel/locking/mutex.c:562 [inline] __mutex_lock+0x131/0xee0 kernel/locking/mutex.c:735 crypto_put_default_null_skcipher+0x18/0x70 crypto/crypto_null.c:179 aead_release+0x3d/0x50 crypto/algif_aead.c:489 alg_do_release crypto/af_alg.c:118 [inline] alg_sock_destruct+0x86/0xc0 crypto/af_alg.c:502 __sk_destruct+0x58/0x5f0 net/core/sock.c:2260 rcu_do_batch kernel/rcu/tree.c:2567 [inline] rcu_core+0xaaa/0x17a0 kernel/rcu/tree.c:2823 handle_softirqs+0x2d4/0x9b0 kernel/softirq.c:561 run_ksoftirqd+0xca/0x130 kernel/softirq.c:950 smpboot_thread_fn+0x544/0xa30 kernel/smpboot.c:164 kthread+0x2f0/0x390 kernel/kthread.c:389 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 </TASK> Fixes: `8c7138b33e` ("net: Unpublish sk from sk_reuseport_cb before call_rcu") Reported-by: syzbot+b3e02953598f447d4d2a@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/6772f2f4.050a0220.2f3838.04cb.GAE@google.com/T/#u Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Martin KaFai Lau <kafai@fb.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20241231160527.3994168-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-02 17:16:58 -08:00
Willem de Bruijn	68e068cabd	net: reenable NETIF_F_IPV6_CSUM offload for BIG TCP packets The blamed commit disabled hardware offoad of IPv6 packets with extension headers on devices that advertise NETIF_F_IPV6_CSUM, based on the definition of that feature in skbuff.h: * * - %NETIF_F_IPV6_CSUM * - Driver (device) is only able to checksum plain * TCP or UDP packets over IPv6. These are specifically * unencapsulated packets of the form IPv6\|TCP or * IPv6\|UDP where the Next Header field in the IPv6 * header is either TCP or UDP. IPv6 extension headers * are not supported with this feature. This feature * cannot be set in features for a device with * NETIF_F_HW_CSUM also set. This feature is being * DEPRECATED (see below). The change causes skb_warn_bad_offload to fire for BIG TCP packets. [ 496.310233] WARNING: CPU: 13 PID: 23472 at net/core/dev.c:3129 skb_warn_bad_offload+0xc4/0xe0 [ 496.310297] ? skb_warn_bad_offload+0xc4/0xe0 [ 496.310300] skb_checksum_help+0x129/0x1f0 [ 496.310303] skb_csum_hwoffload_help+0x150/0x1b0 [ 496.310306] validate_xmit_skb+0x159/0x270 [ 496.310309] validate_xmit_skb_list+0x41/0x70 [ 496.310312] sch_direct_xmit+0x5c/0x250 [ 496.310317] __qdisc_run+0x388/0x620 BIG TCP introduced an IPV6_TLV_JUMBO IPv6 extension header to communicate packet length, as this is an IPv6 jumbogram. But, the feature is only enabled on devices that support BIG TCP TSO. The header is only present for PF_PACKET taps like tcpdump, and not transmitted by physical devices. For this specific case of extension headers that are not transmitted, return to the situation before the blamed commit and support hardware offload. ipv6_has_hopopt_jumbo() tests not only whether this header is present, but also that it is the only extension header before a terminal (L4) header. Fixes: `04c20a9356` ("net: skip offload for NETIF_F_IPV6_CSUM if ipv6 header contains extension") Reported-by: syzbot <syzkaller@googlegroups.com> Reported-by: Eric Dumazet <edumazet@google.com> Closes: https://lore.kernel.org/netdev/CANn89iK1hdC3Nt8KPhOtTF8vCPc1AHDCtse_BTNki1pWxAByTQ@mail.gmail.com/ Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250101164909.1331680-1-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-02 17:13:44 -08:00
Liang Jie	a8620de72e	net: sfc: Correct key_len for efx_tc_ct_zone_ht_params In efx_tc_ct_zone_ht_params, the key_len was previously set to offsetof(struct efx_tc_ct_zone, linkage). This calculation is incorrect because it includes any padding between the zone field and the linkage field due to structure alignment, which can vary between systems. This patch updates key_len to use sizeof_field(struct efx_tc_ct_zone, zone) , ensuring that the hash table correctly uses the zone as the key. This fix prevents potential hash lookup errors and improves connection tracking reliability. Fixes: `c3bb5c6acd` ("sfc: functions to register for conntrack zone offload") Signed-off-by: Liang Jie <liangjie@lixiang.com> Acked-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://patch.msgid.link/20241230093709.3226854-1-buaajxlj@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-02 17:11:41 -08:00
Dave Airlie	273b3eb600	Merge tag 'drm-xe-fixes-2025-01-02' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes Driver Changes: - A couple of OA fixes squashed for stable backporting (Umesh) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/Z3bur0RmH6-70YSh@fedora	2025-01-03 10:57:31 +10:00
Dave Airlie	198c653edf	Merge tag 'drm-misc-fixes-2025-01-02' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes drm-misc-fixes for v6.13-rc6: - Only fixes for adv7511 driver, including a use-after-free. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/f58429b7-5f11-4b78-b577-de32b41299ea@linux.intel.com	2025-01-03 10:43:36 +10:00
Dave Airlie	48fc4378de	Merge tag 'drm-intel-fixes-2024-12-25' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes - Fix C10 pll programming sequence [cx0_phy] (Suraj Kandpal) - Fix power gate sequence. [dg1] (Rodrigo Vivi) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Tvrtko Ursulin <tursulin@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/Z2wKf7tmElKFdnoP@linux	2025-01-03 10:40:43 +10:00
Dave Airlie	3bce3cc644	Merge tag 'drm-xe-fixes-2024-12-23' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes UAPI Changes: - Revert some devcoredump file format changes breaking a mesa debug tool (John) Driver Changes: - Fixes around waits when moving to system (Nirmoy) - Fix a typo when checking for LMEM provisioning (Michal) - Fix a fault on fd close after unbind (Lucas) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/Z2mjt7OTfH76cgua@fedora	2025-01-03 10:28:48 +10:00
Jens Axboe	c6e60a0a68	io_uring/net: always initialize kmsg->msg.msg_inq upfront syzbot reports that ->msg_inq may get used uinitialized from the following path: BUG: KMSAN: uninit-value in io_recv_buf_select io_uring/net.c:1094 [inline] BUG: KMSAN: uninit-value in io_recv+0x930/0x1f90 io_uring/net.c:1158 io_recv_buf_select io_uring/net.c:1094 [inline] io_recv+0x930/0x1f90 io_uring/net.c:1158 io_issue_sqe+0x420/0x2130 io_uring/io_uring.c:1740 io_queue_sqe io_uring/io_uring.c:1950 [inline] io_req_task_submit+0xfa/0x1d0 io_uring/io_uring.c:1374 io_handle_tw_list+0x55f/0x5c0 io_uring/io_uring.c:1057 tctx_task_work_run+0x109/0x3e0 io_uring/io_uring.c:1121 tctx_task_work+0x6d/0xc0 io_uring/io_uring.c:1139 task_work_run+0x268/0x310 kernel/task_work.c:239 io_run_task_work+0x43a/0x4a0 io_uring/io_uring.h:343 io_cqring_wait io_uring/io_uring.c:2527 [inline] __do_sys_io_uring_enter io_uring/io_uring.c:3439 [inline] __se_sys_io_uring_enter+0x204f/0x4ce0 io_uring/io_uring.c:3330 __x64_sys_io_uring_enter+0x11f/0x1a0 io_uring/io_uring.c:3330 x64_sys_call+0xce5/0x3c30 arch/x86/include/generated/asm/syscalls_64.h:427 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xcd/0x1e0 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f and it is correct, as it's never initialized upfront. Hence the first submission can end up using it uninitialized, if the recv wasn't successful and the networking stack didn't honor ->msg_get_inq being set and filling in the output value of ->msg_inq as requested. Set it to 0 upfront when it's allocated, just to silence this KMSAN warning. There's no side effect of using it uninitialized, it'll just potentially cause the next receive to use a recv value hint that's not accurate. Fixes: `c6f32c7d9e` ("io_uring/net: get rid of ->prep_async() for receive side") Reported-by: syzbot+068ff190354d2f74892f@syzkaller.appspotmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-01-02 16:40:08 -07:00
Kohei Enju	789a8cff8d	ftrace: Fix function profiler's filtering functionality Commit `c132be2c4f` ("function_graph: Have the instances use their own ftrace_ops for filtering"), function profiler (enabled via function_profile_enabled) has been showing statistics for all functions, ignoring set_ftrace_filter settings. While tracers are instantiated, the function profiler is not. Therefore, it should use the global set_ftrace_filter for consistency. This patch modifies the function profiler to use the global filter, fixing the filtering functionality. Before (filtering not working): ``` root@localhost:~# echo 'vfs' > /sys/kernel/tracing/set_ftrace_filter root@localhost:~# echo 1 > /sys/kernel/tracing/function_profile_enabled root@localhost:~# sleep 1 root@localhost:~# echo 0 > /sys/kernel/tracing/function_profile_enabled root@localhost:~# head /sys/kernel/tracing/trace_stat/ Function Hit Time Avg s^2 -------- --- ---- --- --- schedule 314 22290594 us 70989.15 us 40372231 us x64_sys_call 1527 8762510 us 5738.382 us 3414354 us schedule_hrtimeout_range 176 8665356 us 49234.98 us 405618876 us __x64_sys_ppoll 324 5656635 us 17458.75 us 19203976 us do_sys_poll 324 5653747 us 17449.83 us 19214945 us schedule_timeout 67 5531396 us 82558.15 us 2136740827 us __x64_sys_pselect6 12 3029540 us 252461.7 us 63296940171 us do_pselect.constprop.0 12 3029532 us 252461.0 us 63296952931 us ``` After (filtering working): ``` root@localhost:~# echo 'vfs' > /sys/kernel/tracing/set_ftrace_filter root@localhost:~# echo 1 > /sys/kernel/tracing/function_profile_enabled root@localhost:~# sleep 1 root@localhost:~# echo 0 > /sys/kernel/tracing/function_profile_enabled root@localhost:~# head /sys/kernel/tracing/trace_stat/ Function Hit Time Avg s^2 -------- --- ---- --- --- vfs_write 462 68476.43 us 148.217 us 25874.48 us vfs_read 641 9611.356 us 14.994 us 28868.07 us vfs_fstat 890 878.094 us 0.986 us 1.667 us vfs_fstatat 227 757.176 us 3.335 us 18.928 us vfs_statx 226 610.610 us 2.701 us 17.749 us vfs_getattr_nosec 1187 460.919 us 0.388 us 0.326 us vfs_statx_path 297 343.287 us 1.155 us 11.116 us vfs_rename 6 291.575 us 48.595 us 9889.236 us ``` Cc: stable@vger.kernel.org Link: https://lore.kernel.org/20250101190820.72534-1-enjuk@amazon.com Fixes: `c132be2c4f` ("function_graph: Have the instances use their own ftrace_ops for filtering") Signed-off-by: Kohei Enju <enjuk@amazon.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-01-02 17:21:33 -05:00
Zilin Guan	d654740337	fgraph: Add READ_ONCE() when accessing fgraph_array[] In __ftrace_return_to_handler(), a loop iterates over the fgraph_array[] elements, which are fgraph_ops. The loop checks if an element is a fgraph_stub to prevent using a fgraph_stub afterward. However, if the compiler reloads fgraph_array[] after this check, it might race with an update to fgraph_array[] that introduces a fgraph_stub. This could result in the stub being processed, but the stub contains a null "func_hash" field, leading to a NULL pointer dereference. To ensure that the gops compared against the fgraph_stub matches the gops processed later, add a READ_ONCE(). A similar patch appears in commit `63a8dfb` ("function_graph: Add READ_ONCE() when accessing fgraph_array[]"). Cc: stable@vger.kernel.org Fixes: `37238abe3c` ("ftrace/function_graph: Pass fgraph_ops to function graph callbacks") Link: https://lore.kernel.org/20241231113731.277668-1-zilin@seu.edu.cn Signed-off-by: Zilin Guan <zilin@seu.edu.cn> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-01-02 17:21:18 -05:00
Olof Johansson	0bc21e701a	MAINTAINERS: Remove Olof from SoC maintainers I haven't been an active participant for a couple of years now, and after discussions at Linux Plumbers in 2024, Arnd is getting fresh help from a few more participants. It's time to remove myself, and spare myself from patches and pull requests in my inbox. Signed-off-by: Olof Johansson <olof@lixom.net> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2025-01-02 10:43:12 -08:00
Linus Torvalds	e6b7c8c5a1	Merge tag 'pmdomain-v6.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm Pull pmdomain fixes from Ulf Hansson: - Silence warning by adding a dummy release function - imx: Fix an OF node reference leak in imx_gpcv2_probe() * tag 'pmdomain-v6.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm: pmdomain: core: add dummy release function to genpd device pmdomain: imx: gpcv2: fix an OF node reference leak in imx_gpcv2_probe()	2025-01-02 10:40:40 -08:00
Linus Torvalds	8c2d370154	Merge tag 'mmc-v6.13-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC fix from Ulf Hansson: - sdhci-msm: Fix crypto key eviction * tag 'mmc-v6.13-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: sdhci-msm: fix crypto key eviction	2025-01-02 10:33:07 -08:00
Umesh Nerlige Ramappa	f0ed39830e	xe/oa: Fix query mode of operation for OAR/OAC This is a set of squashed commits to facilitate smooth applying to stable. Each commit message is retained for reference. 1) Allow a GGTT mapped batch to be submitted to user exec queue For a OA use case, one of the HW registers needs to be modified by submitting an MI_LOAD_REGISTER_IMM command to the users exec queue, so that the register is modified in the user's hardware context. In order to do this a batch that is mapped in GGTT, needs to be submitted to the user exec queue. Since all user submissions use q->vm and hence PPGTT, add some plumbing to enable submission of batches mapped in GGTT. v2: ggtt is zero-initialized, so no need to set it false (Matt Brost) 2) xe/oa: Use MI_LOAD_REGISTER_IMMEDIATE to enable OAR/OAC To enable OAR/OAC, a bit in RING_CONTEXT_CONTROL needs to be set. Setting this bit cause the context image size to change and if not done correct, can cause undesired hangs. Current code uses a separate exec_queue to modify this bit and is error-prone. As per HW recommendation, submit MI_LOAD_REGISTER_IMM to the target hardware context to modify the relevant bit. In v2 version, an attempt to submit everything to the user-queue was made, but it failed the unprivileged-single-ctx-counters test. It appears that the OACTXCONTROL must be modified from a remote context. In v3 version, all context specific register configurations were moved to use LOAD_REGISTER_IMMEDIATE and that seems to work well. This is a cleaner way, since we can now submit all configuration to user exec_queue and the fence handling is simplified. v2: (Matt) - set job->ggtt to true if create job is successful - unlock vm on job error (Ashutosh) - don't wait on job submission - use kernel exec queue where possible v3: (Ashutosh) - Fix checkpatch issues - Remove extra spaces/new-lines - Add Fixes: and Cc: tags - Reset context control bit when OA stream is closed - Submit all config via MI_LOAD_REGISTER_IMMEDIATE (Umesh) - Update commit message for v3 experiment - Squash patches for easier port to stable v4: (Ashutosh) - No need to pass q to xe_oa_submit_bb - Do not support exec queues with width > 1 - Fix disabling of CTX_CTRL_OAC_CONTEXT_ENABLE v5: (Ashutosh) - Drop reg_lri related comments - Use XE_OA_SUBMIT_NO_DEPS in xe_oa_load_with_lri Fixes: `8135f1c09d` ("drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close") Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> # commit 1 Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Cc: stable@vger.kernel.org Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241220171919.571528-2-umesh.nerlige.ramappa@intel.com (cherry picked from commit `55039832f9`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2025-01-02 19:01:21 +01:00
Manivannan Sadhasivam	3b2f56860b	scsi: ufs: qcom: Power down the controller/device during system suspend for SM8550/SM8650 SoCs SM8550 and SM8650 SoCs doesn't support UFS PHY retention. So once these SoCs reaches the low power state (CX power collapse) during system suspend, all the PHY hardware state gets lost. This leads to the UFS resume failure: ufshcd-qcom 1d84000.ufs: ufshcd_uic_hibern8_exit: hibern8 exit failed. ret = 5 ufshcd-qcom 1d84000.ufs: __ufshcd_wl_resume: hibern8 exit failed 5 ufs_device_wlun 0:0:0:49488: ufshcd_wl_resume failed: 5 ufs_device_wlun 0:0:0:49488: PM: dpm_run_callback(): scsi_bus_resume+0x0/0x84 returns 5 ufs_device_wlun 0:0:0:49488: PM: failed to resume async: error 5 With the default system suspend level of UFS_PM_LVL_3, the power domain for UFS PHY needs to be kept always ON to retain the state. But this would prevent these SoCs from reaching the CX power collapse state, leading to poor power saving during system suspend. So to fix this issue without affecting the power saving, set 'ufs_qcom_drvdata::no_phy_retention' to true which sets 'hba->spm_lvl' to UFS_PM_LVL_5 to allow both the controller and device (in turn the PHY) to be powered down during system suspend for these SoCs by default. Cc: stable@vger.kernel.org # 6.3 Fixes: `35cf1aaab1` ("arm64: dts: qcom: sm8550: Add UFS host controller and phy nodes") Fixes: `10e0246712` ("arm64: dts: qcom: sm8650: add interconnect dependent device nodes") Reported-by: Neil Armstrong <neil.armstrong@linaro.org> Tested-by: Amit Pundir <amit.pundir@linaro.org> # on SM8550-HDK Reviewed-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20241219-ufs-qcom-suspend-fix-v3-4-63c4b95a70b9@linaro.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2025-01-02 13:00:16 -05:00
Manivannan Sadhasivam	4f78a56af4	scsi: ufs: qcom: Allow passing platform specific OF data In order to allow platform specific flags and configurations, introduce the platform specific OF data and move the existing quirk UFSHCD_QUIRK_BROKEN_LSDBS_CAP for SM8550 and SM8650 SoCs. Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Tested-by: Amit Pundir <amit.pundir@linaro.org> # on SM8550-HDK Reviewed-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20241219-ufs-qcom-suspend-fix-v3-3-63c4b95a70b9@linaro.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2025-01-02 13:00:16 -05:00
Manivannan Sadhasivam	bb9850704c	scsi: ufs: core: Honor runtime/system PM levels if set by host controller drivers Otherwise, the default levels will override the levels set by the host controller drivers. Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20241219-ufs-qcom-suspend-fix-v3-2-63c4b95a70b9@linaro.org Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2025-01-02 13:00:16 -05:00
Manivannan Sadhasivam	7bac656875	scsi: ufs: qcom: Power off the PHY if it was already powered on in ufs_qcom_power_up_sequence() PHY might already be powered on during ufs_qcom_power_up_sequence() in a couple of cases: 1. During UFSHCD_QUIRK_REINIT_AFTER_MAX_GEAR_SWITCH quirk 2. Resuming from spm_lvl = 5 suspend In those cases, it is necessary to call phy_power_off() and phy_exit() in ufs_qcom_power_up_sequence() function to power off the PHY before calling phy_init() and phy_power_on(). Case (1) is doing it via ufs_qcom_reinit_notify() callback, but case (2) is not handled. So to satisfy both cases, call phy_power_off() and phy_exit() if the phy_count is non-zero. And with this change, the reinit_notify() callback is no longer needed. This fixes the below UFS resume failure with spm_lvl = 5: ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5 ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5 ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5 ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5 ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5 ufs_device_wlun 0:0:0:49488: ufshcd_wl_resume failed: -5 ufs_device_wlun 0:0:0:49488: PM: dpm_run_callback(): scsi_bus_resume returns -5 ufs_device_wlun 0:0:0:49488: PM: failed to resume async: error -5 Cc: stable@vger.kernel.org # 6.3 Fixes: `baf5ddac90` ("scsi: ufs: ufs-qcom: Add support for reinitializing the UFS device") Reported-by: Ram Kumar Dwivedi <quic_rdwivedi@quicinc.com> Tested-by: Amit Pundir <amit.pundir@linaro.org> # on SM8550-HDK Reviewed-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20241219-ufs-qcom-suspend-fix-v3-1-63c4b95a70b9@linaro.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2025-01-02 13:00:16 -05:00
Andrew Cooper	5cc2db3712	x86/static-call: Remove early_boot_irqs_disabled check to fix Xen PVH dom0 __static_call_update_early() has a check for early_boot_irqs_disabled, but is used before early_boot_irqs_disabled is set up in start_kernel(). Xen PV has always special cased early_boot_irqs_disabled, but Xen PVH does not and falls over the BUG when booting as dom0. It is very suspect that early_boot_irqs_disabled starts as 0, becomes 1 for a time, then becomes 0 again, but as this needs backporting to fix a breakage in a security fix, dropping the BUG_ON() is the far safer option. Fixes: `0ef8047b73` ("x86/static-call: provide a way to do very early static-call updates") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219620 Reported-by: Alex Zenla <alex@edera.dev> Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Juergen Gross <jgross@suse.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Alex Zenla <alex@edera.dev> Link: https://lore.kernel.org/r/20241221211046.6475-1-andrew.cooper3@citrix.com	2025-01-02 17:11:29 +01:00
Greg Kroah-Hartman	997bb2d756	Merge tag 'icc-6.13-rc6' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/djakov/icc into char-misc-next Georgi writes: interconnect fixes for v6.13-rc This contains two fixes. One fixing a boot error on db410c board when UBSAN is enabled with clang-19 builds. The other one adds a missing return value check after devm_kasprintf. - interconnect: qcom: icc-rpm: Set the count member before accessing the flex array - interconnect: icc-clk: check return values of devm_kasprintf() Signed-off-by: Georgi Djakov <djakov@kernel.org> * tag 'icc-6.13-rc6' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/djakov/icc: interconnect: icc-clk: check return values of devm_kasprintf() interconnect: qcom: icc-rpm: Set the count member before accessing the flex array	2025-01-02 16:53:11 +01:00
Chun-Kuang Hu	c4bd13be19	drm/mediatek: Remove unneeded semicolon cocci warnings: (new ones prefixed by >>) >> drivers/gpu/drm/mediatek/mtk_drm_drv.c:1092:2-3: Unneeded semicolon Fixes: `4c932840db` ("drm/mediatek: Implement OF graphs support for display paths") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202412022048.kY2ZhxZ4-lkp@intel.com/ Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20241230135314.5419-1-chunkuang.hu@kernel.org/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>	2025-01-02 13:36:18 +00:00
Linus Torvalds	56e6a3499e	Merge tag 'trace-v6.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fix from Steven Rostedt: "Fix trace event string check when dealing with array of strings The xe_bo_move event has a field that indexes into an array of strings. The TP_fast_assign() added the index into the ring buffer and the TP_printk() had a "%s" that referenced the array using the index in the ring buffer. This is a legitimate use of "%s" in trace events. But this triggered a false positive in the test_event_printk() at boot saying that the string was dangerous. Change the check to allow arrays using fields in the ring buffer as an index to be considered a safe string" * tag 'trace-v6.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Have process_string() also allow arrays	2025-01-01 11:22:07 -08:00
Jens Axboe	cc0331e29f	Merge tag 'nvme-6.13-2024-12-31' of git://git.infradead.org/nvme into block-6.13 Pull NVMe fixes from Keith: "nvme fixes for Linux 6.13 - Fix device specific quirk for PRP list alignment (Robert) - Fix target name overflow (Leo) - Fix target write granularity (Luis) - Fix target sleeping in atomic context (Nilay) - Remove unnecessary tcp queue teardown (Chunguang)" * tag 'nvme-6.13-2024-12-31' of git://git.infradead.org/nvme: nvme-tcp: remove nvme_tcp_destroy_io_queues() nvmet-loop: avoid using mutex in IO hotpath nvmet: propagate npwg topology nvmet: Don't overflow subsysnqn nvme-pci: 512 byte aligned dma pool segment quirk	2024-12-31 10:41:58 -07:00
Takashi Iwai	8765429279	ALSA: seq: Check UMP support for midi_version change When the kernel is built without UMP support but a user-space app requires the midi_version > 0, the kernel should return an error. Otherwise user-space assumes as if it were possible to deal, eventually hitting serious errors later. Fixes: `46397622a3` ("ALSA: seq: Add UMP support") Cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/20241231145358.21946-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-12-31 15:57:01 +01:00
Kalesh AP	e6178bf78d	RDMA/bnxt_re: Fix error recovery sequence Fixed to return ENXIO from __send_message_basic_sanity() to indicate that device is in error state. In the case of ERR_DEVICE_DETACHED state, the driver should not post the commands to the firmware as it will time out eventually. Removed bnxt_re_modify_qp() call from bnxt_re_dev_stop() as it is a no-op. Fixes: `cc5b9b48d4` ("RDMA/bnxt_re: Recover the device when FW error is detected") Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Link: https://patch.msgid.link/20241231025008.2267162-1-kalesh-anakkur.purayil@broadcom.com Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-12-31 08:28:36 -05:00
Li Zhijian	fb514b3139	RDMA/rtrs: Ensure 'ib_sge list' is accessible Move the declaration of the 'ib_sge list' variable outside the 'always_invalidate' block to ensure it remains accessible for use throughout the function. Previously, 'ib_sge list' was declared within the 'always_invalidate' block, limiting its accessibility, then caused a 'BUG: kernel NULL pointer dereference'[1]. ? __die_body.cold+0x19/0x27 ? page_fault_oops+0x15a/0x2d0 ? search_module_extables+0x19/0x60 ? search_bpf_extables+0x5f/0x80 ? exc_page_fault+0x7e/0x180 ? asm_exc_page_fault+0x26/0x30 ? memcpy_orig+0xd5/0x140 rxe_mr_copy+0x1c3/0x200 [rdma_rxe] ? rxe_pool_get_index+0x4b/0x80 [rdma_rxe] copy_data+0xa5/0x230 [rdma_rxe] rxe_requester+0xd9b/0xf70 [rdma_rxe] ? finish_task_switch.isra.0+0x99/0x2e0 rxe_sender+0x13/0x40 [rdma_rxe] do_task+0x68/0x1e0 [rdma_rxe] process_one_work+0x177/0x330 worker_thread+0x252/0x390 ? __pfx_worker_thread+0x10/0x10 This change ensures the variable is available for subsequent operations that require it. [1] https://lore.kernel.org/linux-rdma/6a1f3e8f-deb0-49f9-bc69-a9b03ecfcda7@fujitsu.com/ Fixes: `9cb8374804` ("RDMA/rtrs: server: main functionality") Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Link: https://patch.msgid.link/20241231013416.1290920-1-lizhijian@fujitsu.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-12-31 08:28:26 -05:00
Daniel Schaefer	7b509910b3	ALSA hda/realtek: Add quirk for Framework F111:000C Similar to commit `eb91c456f3` ("ALSA: hda/realtek: Add Framework Laptop 13 (Intel Core Ultra) to quirks") and previous quirks for Framework systems with Realtek codecs. 000C is a new platform that will also have an ALC285 codec and needs the same quirk. Cc: Jaroslav Kysela <perex@perex.cz> Cc: Takashi Iwai <tiwai@suse.com> Cc: linux@frame.work Cc: Dustin L. Howett <dustin@howett.net> Signed-off-by: Daniel Schaefer <dhs@frame.work> Cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/20241231045958.14545-1-dhs@frame.work Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-12-31 12:43:15 +01:00
Yuezhang Mo	a5324b3a48	exfat: fix the infinite loop in __exfat_free_cluster() In __exfat_free_cluster(), the cluster chain is traversed until the EOF cluster. If the cluster chain includes a loop due to file system corruption, the EOF cluster cannot be traversed, resulting in an infinite loop. This commit uses the total number of clusters to prevent this infinite loop. Reported-by: syzbot+1de5a37cb85a2d536330@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=1de5a37cb85a2d536330 Tested-by: syzbot+1de5a37cb85a2d536330@syzkaller.appspotmail.com Fixes: `31023864e6` ("exfat: add fat entry operations") Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>	2024-12-31 17:51:21 +09:00
Yuezhang Mo	98e2fb26d1	exfat: fix the new buffer was not zeroed before writing Before writing, if a buffer_head marked as new, its data must be zeroed, otherwise uninitialized data in the page cache will be written. So this commit uses folio_zero_new_buffers() to zero the new buffers before ->write_end(). Fixes: `6630ea4910` ("exfat: move extend valid_size into ->page_mkwrite()") Reported-by: syzbot+91ae49e1c1a2634d20c0@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=91ae49e1c1a2634d20c0 Tested-by: syzbot+91ae49e1c1a2634d20c0@syzkaller.appspotmail.com Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>	2024-12-31 17:51:16 +09:00
Yuezhang Mo	fee873761b	exfat: fix the infinite loop in exfat_readdir() If the file system is corrupted so that a cluster is linked to itself in the cluster chain, and there is an unused directory entry in the cluster, 'dentry' will not be incremented, causing condition 'dentry < max_dentries' unable to prevent an infinite loop. This infinite loop causes s_lock not to be released, and other tasks will hang, such as exfat_sync_fs(). This commit stops traversing the cluster chain when there is unused directory entry in the cluster to avoid this infinite loop. Reported-by: syzbot+205c2644abdff9d3f9fc@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=205c2644abdff9d3f9fc Tested-by: syzbot+205c2644abdff9d3f9fc@syzkaller.appspotmail.com Fixes: `ca06197382` ("exfat: add directory operations") Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>	2024-12-31 17:49:40 +09:00
Steven Rostedt	afc6717628	tracing: Have process_string() also allow arrays In order to catch a common bug where a TRACE_EVENT() TP_fast_assign() assigns an address of an allocated string to the ring buffer and then references it in TP_printk(), which can be executed hours later when the string is free, the function test_event_printk() runs on all events as they are registered to make sure there's no unwanted dereferencing. It calls process_string() to handle cases in TP_printk() format that has "%s". It returns whether or not the string is safe. But it can have some false positives. For instance, xe_bo_move() has: TP_printk("move_lacks_source:%s, migrate object %p [size %zu] from %s to %s device_id:%s", __entry->move_lacks_source ? "yes" : "no", __entry->bo, __entry->size, xe_mem_type_to_name[__entry->old_placement], xe_mem_type_to_name[__entry->new_placement], __get_str(device_id)) Where the "%s" references into xe_mem_type_to_name[]. This is an array of pointers that should be safe for the event to access. Instead of flagging this as a bad reference, if a reference points to an array, where the record field is the index, consider it safe. Link: https://lore.kernel.org/all/9dee19b6185d325d0e6fa5f7cbba81d007d99166.camel@sapience.com/ Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20241231000646.324fb5f7@gandalf.local.home Fixes: `65a25d9f7a` ("tracing: Add "%s" check in test_event_printk()") Reported-by: Genes Lists <lists@sapience.com> Tested-by: Gene C <arch@sapience.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2024-12-31 00:10:32 -05:00
Jinjian Song	4f619d518d	net: wwan: t7xx: Fix FSM command timeout issue When driver processes the internal state change command, it use an asynchronous thread to process the command operation. If the main thread detects that the task has timed out, the asynchronous thread will panic when executing the completion notification because the main thread completion object has been released. BUG: unable to handle page fault for address: fffffffffffffff8 PGD 1f283a067 P4D 1f283a067 PUD 1f283c067 PMD 0 Oops: 0000 [#1] PREEMPT SMP NOPTI RIP: 0010:complete_all+0x3e/0xa0 [...] Call Trace: <TASK> ? __die_body+0x68/0xb0 ? page_fault_oops+0x379/0x3e0 ? exc_page_fault+0x69/0xa0 ? asm_exc_page_fault+0x22/0x30 ? complete_all+0x3e/0xa0 fsm_main_thread+0xa3/0x9c0 [mtk_t7xx (HASH:1400 5)] ? __pfx_autoremove_wake_function+0x10/0x10 kthread+0xd8/0x110 ? __pfx_fsm_main_thread+0x10/0x10 [mtk_t7xx (HASH:1400 5)] ? __pfx_kthread+0x10/0x10 ret_from_fork+0x38/0x50 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1b/0x30 </TASK> [...] CR2: fffffffffffffff8 ---[ end trace 0000000000000000 ]--- Use the reference counter to ensure safe release as Sergey suggests: https://lore.kernel.org/all/da90f64c-260a-4329-87bf-1f9ff20a5951@gmail.com/ Fixes: `13e920d93e` ("net: wwan: t7xx: Add core components") Signed-off-by: Jinjian Song <jinjian.song@fibocom.com> Acked-by: Sergey Ryazanov <ryazanov.s.a@gmail.com> Link: https://patch.msgid.link/20241224041552.8711-1-jinjian.song@fibocom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-30 18:00:32 -08:00
Chester A. Unal	e740492181	MAINTAINERS: change Arınç _NAL's name and email address My legal name now includes Chester. Change the name and the email address sections to reflect that. Link: https://lkml.kernel.org/r/20241225-for-unknown-upstream-v1-1-3e35e4d5e161@arinc9.com Signed-off-by: Chester A. Unal <chester.a.unal@arinc9.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-30 17:59:12 -08:00
Kuan-Wei Chiu	0210d25116	scripts/sorttable: fix orc_sort_cmp() to maintain symmetry and transitivity The orc_sort_cmp() function, used with qsort(), previously violated the symmetry and transitivity rules required by the C standard. Specifically, when both entries are ORC_TYPE_UNDEFINED, it could result in both a < b and b < a, which breaks the required symmetry and transitivity. This can lead to undefined behavior and incorrect sorting results, potentially causing memory corruption in glibc implementations [1]. Symmetry: If x < y, then y > x. Transitivity: If x < y and y < z, then x < z. Fix the comparison logic to return 0 when both entries are ORC_TYPE_UNDEFINED, ensuring compliance with qsort() requirements. Link: https://www.qualys.com/2024/01/30/qsort.txt [1] Link: https://lkml.kernel.org/r/20241226140332.2670689-1-visitorckw@gmail.com Fixes: `57fa189942` ("scripts/sorttable: Implement build-time ORC unwind table sorting") Fixes: `fb799447ae` ("x86,objtool: Split UNWIND_HINT_EMPTY in two") Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw> Cc: <chuang@cs.nycu.edu.tw> Cc: Ingo Molnar <mingo@kernel.org> Cc: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shile Zhang <shile.zhang@linux.alibaba.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-30 17:59:11 -08:00
Tetsuo Handa	dd2a5b5514	mm/util: make memdup_user_nul() similar to memdup_user() Since the string data to copy from userspace is likely less than PAGE_SIZE bytes, replace GFP_KERNEL with GFP_USER like commit `6c2c97a24f` ("memdup_user(): switch to GFP_USER") does and add __GFP_NOWARN like commit `6c8fcc096b` ("mm: don't let userspace spam allocations warnings") does. Also, use dedicated slab buckets like commit `d73778e4b8` ("mm/util: Use dedicated slab buckets for memdup_user()") does. Link: https://lkml.kernel.org/r/014cd694-cc27-4a07-a34a-2ae95d744515@I-love.SAKURA.ne.jp Reported-by: syzbot+7e12e97b36154c54414b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=7e12e97b36154c54414b Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-30 17:59:11 -08:00
Kairui Song	62e72d2cf7	mm, madvise: fix potential workingset node list_lru leaks Since commit `5abc1e37af` ("mm: list_lru: allocate list_lru_one only when needed"), all list_lru users need to allocate the items using the new infrastructure that provides list_lru info for slab allocation, ensuring that the corresponding memcg list_lru is allocated before use. For workingset shadow nodes (which are xa_node), users are converted to use the new infrastructure by commit `9bbdc0f324` ("xarray: use kmem_cache_alloc_lru to allocate xa_node"). The xas->xa_lru will be set correctly for filemap users. However, there is a missing case: xa_node allocations caused by madvise(..., MADV_COLLAPSE). madvise(..., MADV_COLLAPSE) will also read in the absent parts of file map, and there will be xa_nodes allocated for the caller's memcg (assuming it's not rootcg). However, these allocations won't trigger memcg list_lru allocation because the proper xas info was not set. If nothing else has allocated other xa_nodes for that memcg to trigger list_lru creation, and memory pressure starts to evict file pages, workingset_update_node will try to add these xa_nodes to their corresponding memcg list_lru, and it does not exist (NULL). So they will be added to rootcg's list_lru instead. This shouldn't be a significant issue in practice, but it is indeed unexpected behavior, and these xa_nodes will not be reclaimed effectively. And may lead to incorrect counting of the list_lru->nr_items counter. This problem wasn't exposed until recent commit `28e98022b3` ("mm/list_lru: simplify reparenting and initial allocation") added a sanity check: only dying memcg could have a NULL list_lru when list_lru_{add,del} is called. This problem triggered this WARNING. So make madvise(..., MADV_COLLAPSE) also call xas_set_lru() to pass the list_lru which we may want to insert xa_node into later. And move mapping_set_update to mm/internal.h, and turn into a macro to avoid including extra headers in mm/internal.h. Link: https://lkml.kernel.org/r/20241222122936.67501-1-ryncsn@gmail.com Fixes: `9bbdc0f324` ("xarray: use kmem_cache_alloc_lru to allocate xa_node") Reported-by: syzbot+38a0cbd267eff2d286ff@syzkaller.appspotmail.com Closes: https://lore.kernel.org/lkml/675d01e9.050a0220.37aaf.00be.GAE@google.com/ Signed-off-by: Kairui Song <kasong@tencent.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Sasha Levin <sashal@kernel.org> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Yu Zhao <yuzhao@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-30 17:59:11 -08:00
SeongJae Park	7d390b5306	mm/damon/core: fix ignored quota goals and filters of newly committed schemes damon_commit_schemes() ignores quota goals and filters of the newly committed schemes. This makes users confused about the behaviors. Correctly handle those inputs. Link: https://lkml.kernel.org/r/20241222231222.85060-3-sj@kernel.org Fixes: `9cb3d0b9df` ("mm/damon/core: implement DAMON context commit function") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-30 17:59:11 -08:00
SeongJae Park	8debfc5b1a	mm/damon/core: fix new damon_target objects leaks on damon_commit_targets() Patch series "mm/damon/core: fix memory leaks and ignored inputs from damon_commit_ctx()". Due to two bugs in damon_commit_targets() and damon_commit_schemes(), which are called from damon_commit_ctx(), some user inputs can be ignored, and some mmeory objects can be leaked. Fix those. Note that only DAMON sysfs interface users are affected. Other DAMON core API user modules that more focused more on simple and dedicated production usages, including DAMON_RECLAIM and DAMON_LRU_SORT are not using the buggy function in the way, so not affected. This patch (of 2): When new DAMON targets are added via damon_commit_targets(), the newly created targets are not deallocated when updating the internal data (damon_commit_target()) is failed. Worse yet, even if the setup is successfully done, the new target is not linked to the context. Hence, the new targets are always leaked regardless of the internal data setup failure. Fix the leaks. Link: https://lkml.kernel.org/r/20241222231222.85060-2-sj@kernel.org Fixes: `9cb3d0b9df` ("mm/damon/core: implement DAMON context commit function") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-30 17:59:10 -08:00
Kairui Song	98a6abc6ce	mm/list_lru: fix false warning of negative counter commit `2788cf0c40` ("memcg: reparent list_lrus and free kmemcg_id on css offline") removed sanity checks for the nr_items counter's value because it implemented list_lru re-parenting in a way that will redirect children's list_lru to the parent before re-parenting the items in list_lru. This will make item counter uncharging happen in the parent while the item is still being held by the child. As a result, the parent's counter value may become negative. This is acceptable because re-parenting will sum up the children's counter values, and the parent's counter will be fixed. Later commit `fb56fdf8b9` ("mm/list_lru: split the lock to per-cgroup scope") reworked the re-parenting process, and removed the redirect. So it added the sanity check back, assuming that as long as items are still in the children's list_lru, parent's counter will not be uncharged. But that assumption is incorrect. The xas_store in memcg_reparent_list_lrus will set children's list_lru to NULL before re-parenting the items, it redirects list_lru helpers to use parent's list_lru just like before. But still, it's not a problem as re-parenting will fix the counter. Therefore, remove this sanity check, but add a new check to ensure that the counter won't go negative in a different way: the child's list_lru being re-parented should never have a negative counter, since re-parenting should occur in order and fixes counters. Link: https://lkml.kernel.org/r/20241223150907.1591-1-ryncsn@gmail.com Fixes: `fb56fdf8b9` ("mm/list_lru: split the lock to per-cgroup scope") Signed-off-by: Kairui Song <kasong@tencent.com> Closes: https://lore.kernel.org/lkml/Z2Bz9t92Be9l1xqj@lappy/ Cc: Chengming Zhou <zhouchengming@bytedance.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Sasha Levin <sashal@kernel.org> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Yu Zhao <yuzhao@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-30 17:59:10 -08:00
Koichiro Den	adcfb264c3	vmstat: disable vmstat_work on vmstat_cpu_down_prep() Even after mm/vmstat:online teardown, shepherd may still queue work for the dying cpu until the cpu is removed from online mask. While it's quite rare, this means that after unbind_workers() unbinds a per-cpu kworker, it potentially runs vmstat_update for the dying CPU on an irrelevant cpu before entering atomic AP states. When CONFIG_DEBUG_PREEMPT=y, it results in the following error with the backtrace. BUG: using smp_processor_id() in preemptible [00000000] code: \ kworker/7:3/1702 caller is refresh_cpu_vm_stats+0x235/0x5f0 CPU: 0 UID: 0 PID: 1702 Comm: kworker/7:3 Tainted: G Tainted: [N]=TEST Workqueue: mm_percpu_wq vmstat_update Call Trace: <TASK> dump_stack_lvl+0x8d/0xb0 check_preemption_disabled+0xce/0xe0 refresh_cpu_vm_stats+0x235/0x5f0 vmstat_update+0x17/0xa0 process_one_work+0x869/0x1aa0 worker_thread+0x5e5/0x1100 kthread+0x29e/0x380 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 </TASK> So, for mm/vmstat:online, disable vmstat_work reliably on teardown and symmetrically enable it on startup. Link: https://lkml.kernel.org/r/20241221033321.4154409-1-koichiro.den@canonical.com Signed-off-by: Koichiro Den <koichiro.den@canonical.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-30 17:59:10 -08:00
Baolin Wang	d77b90d2b2	mm: shmem: fix the update of 'shmem_falloc->nr_unswapped' The 'shmem_falloc->nr_unswapped' is used to record how many writepage refused to swap out because fallocate() is allocating, but after shmem supports large folio swap out, the update of 'shmem_falloc->nr_unswapped' does not use the correct number of pages in the large folio, which may lead to fallocate() not exiting as soon as possible. Anyway, this is found through code inspection, and I am not sure whether it would actually cause serious issues. Link: https://lkml.kernel.org/r/f66a0119d0564c2c37c84f045835b870d1b2196f.1734593154.git.baolin.wang@linux.alibaba.com Fixes: `809bc86517` ("mm: shmem: support large folio swap out") Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-30 17:59:10 -08:00
Baolin Wang	d0e6983a6d	mm: shmem: fix incorrect index alignment for within_size policy With enabling the shmem per-size within_size policy, using an incorrect 'order' size to round_up() the index can lead to incorrect i_size checks, resulting in an inappropriate large orders being returned. Changing to use '1 << order' to round_up() the index to fix this issue. Additionally, adding an 'aligned_index' variable to avoid affecting the index checks. Link: https://lkml.kernel.org/r/77d8ef76a7d3d646e9225e9af88a76549a68aab1.1734593154.git.baolin.wang@linux.alibaba.com Fixes: `e7a2ab7b3b` ("mm: shmem: add mTHP support for anonymous shmem") Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-30 17:59:09 -08:00
Gal Pressman	1167324770	percpu: remove intermediate variable in PERCPU_PTR() The intermediate variable in the PERCPU_PTR() macro results in a kernel panic on boot [1] due to a compiler bug seen when compiling the kernel (+ KASAN) with gcc 11.3.1, but not when compiling with latest gcc (v14.2)/clang(v18.1). To solve it, remove the intermediate variable (which is not needed) and keep the casting that resolves the address space checks. [1] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000003: 0000 [#1] SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f] CPU: 0 UID: 0 PID: 547 Comm: iptables Not tainted 6.13.0-rc1_external_tested-master #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:nf_ct_netns_do_get+0x139/0x540 Code: 03 00 00 48 81 c4 88 00 00 00 5b 5d 41 5c 41 5d 41 5e 41 5f c3 4d 8d 75 08 48 b8 00 00 00 00 00 fc ff df 4c 89 f2 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e 27 03 00 00 41 8b 45 08 83 c0 RSP: 0018:ffff888116df75e8 EFLAGS: 00010207 RAX: dffffc0000000000 RBX: 1ffff11022dbeebe RCX: ffffffff839a2382 RDX: 0000000000000003 RSI: 0000000000000008 RDI: ffff88842ec46d10 RBP: 0000000000000002 R08: 0000000000000000 R09: fffffbfff0b0860c R10: ffff888116df75e8 R11: 0000000000000001 R12: ffffffff879d6a80 R13: 0000000000000016 R14: 000000000000001e R15: ffff888116df7908 FS: 00007fba01646740(0000) GS:ffff88842ec00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055bd901800d8 CR3: 00000001205f0003 CR4: 0000000000172eb0 Call Trace: <TASK> ? die_addr+0x3d/0xa0 ? exc_general_protection+0x144/0x220 ? asm_exc_general_protection+0x22/0x30 ? __mutex_lock+0x2c2/0x1d70 ? nf_ct_netns_do_get+0x139/0x540 ? nf_ct_netns_do_get+0xb5/0x540 ? net_generic+0x1f0/0x1f0 ? __create_object+0x5e/0x80 xt_check_target+0x1f0/0x930 ? textify_hooks.constprop.0+0x110/0x110 ? pcpu_alloc_noprof+0x7cd/0xcf0 ? xt_find_target+0x148/0x1e0 find_check_entry.constprop.0+0x6c0/0x920 ? get_info+0x380/0x380 ? __virt_addr_valid+0x1df/0x3b0 ? kasan_quarantine_put+0xe3/0x200 ? kfree+0x13e/0x3d0 ? translate_table+0xaf5/0x1750 translate_table+0xbd8/0x1750 ? ipt_unregister_table_exit+0x30/0x30 ? __might_fault+0xbb/0x170 do_ipt_set_ctl+0x408/0x1340 ? nf_sockopt_find.constprop.0+0x17b/0x1f0 ? lock_downgrade+0x680/0x680 ? lockdep_hardirqs_on_prepare+0x284/0x400 ? ipt_register_table+0x440/0x440 ? bit_wait_timeout+0x160/0x160 nf_setsockopt+0x6f/0xd0 raw_setsockopt+0x7e/0x200 ? raw_bind+0x590/0x590 ? do_user_addr_fault+0x812/0xd20 do_sock_setsockopt+0x1e2/0x3f0 ? move_addr_to_user+0x90/0x90 ? lock_downgrade+0x680/0x680 __sys_setsockopt+0x9e/0x100 __x64_sys_setsockopt+0xb9/0x150 ? do_syscall_64+0x33/0x140 do_syscall_64+0x6d/0x140 entry_SYSCALL_64_after_hwframe+0x4b/0x53 RIP: 0033:0x7fba015134ce Code: 0f 1f 40 00 48 8b 15 59 69 0e 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b1 0f 1f 00 f3 0f 1e fa 49 89 ca b8 36 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 0a c3 66 0f 1f 84 00 00 00 00 00 48 8b 15 21 RSP: 002b:00007ffd9de6f388 EFLAGS: 00000246 ORIG_RAX: 0000000000000036 RAX: ffffffffffffffda RBX: 000055bd9017f490 RCX: 00007fba015134ce RDX: 0000000000000040 RSI: 0000000000000000 RDI: 0000000000000004 RBP: 0000000000000500 R08: 0000000000000560 R09: 0000000000000052 R10: 000055bd901800e0 R11: 0000000000000246 R12: 000055bd90180140 R13: 000055bd901800e0 R14: 000055bd9017f498 R15: 000055bd9017ff10 </TASK> Modules linked in: xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_registry overlay zram zsmalloc mlx4_ib mlx4_en mlx4_core rpcrdma rdma_ucm ib_uverbs ib_iser libiscsi scsi_transport_iscsi fuse ib_umad rdma_cm ib_ipoib iw_cm ib_cm ib_core ---[ end trace 0000000000000000 ]--- [akpm@linux-foundation.org: simplification, per Uros] Link: https://lkml.kernel.org/r/20241219121828.2120780-1-gal@nvidia.com Fixes: `dabddd687c` ("percpu: cast percpu pointer in PERCPU_PTR() via unsigned long") Signed-off-by: Gal Pressman <gal@nvidia.com> Closes: https://lore.kernel.org/all/7590f546-4021-4602-9252-0d525de35b52@nvidia.com Cc: Uros Bizjak <ubizjak@gmail.com> Cc: Bill Wendling <morbo@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Dennis Zhou <dennis@kernel.org> Cc: Justin Stitt <justinstitt@google.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-30 17:59:09 -08:00
Yosry Ahmed	eaebeb9392	mm: zswap: fix race between [de]compression and CPU hotunplug In zswap_compress() and zswap_decompress(), the per-CPU acomp_ctx of the current CPU at the beginning of the operation is retrieved and used throughout. However, since neither preemption nor migration are disabled, it is possible that the operation continues on a different CPU. If the original CPU is hotunplugged while the acomp_ctx is still in use, we run into a UAF bug as the resources attached to the acomp_ctx are freed during hotunplug in zswap_cpu_comp_dead(). The problem was introduced in commit `1ec3b5fe6e` ("mm/zswap: move to use crypto_acomp API for hardware acceleration") when the switch to the crypto_acomp API was made. Prior to that, the per-CPU crypto_comp was retrieved using get_cpu_ptr() which disables preemption and makes sure the CPU cannot go away from under us. Preemption cannot be disabled with the crypto_acomp API as a sleepable context is needed. Commit `8ba2f844f0` ("mm/zswap: change per-cpu mutex and buffer to per-acomp_ctx") increased the UAF surface area by making the per-CPU buffers dynamic, adding yet another resource that can be freed from under zswap compression/decompression by CPU hotunplug. There are a few ways to fix this: (a) Add a refcount for acomp_ctx. (b) Disable migration while using the per-CPU acomp_ctx. (c) Disable CPU hotunplug while using the per-CPU acomp_ctx by holding the CPUs read lock. Implement (c) since it's simpler than (a), and (b) involves using migrate_disable() which is apparently undesired (see huge comment in include/linux/preempt.h). Link: https://lkml.kernel.org/r/20241219212437.2714151-1-yosryahmed@google.com Fixes: `1ec3b5fe6e` ("mm/zswap: move to use crypto_acomp API for hardware acceleration") Signed-off-by: Yosry Ahmed <yosryahmed@google.com> Reported-by: Johannes Weiner <hannes@cmpxchg.org> Closes: https://lore.kernel.org/lkml/20241113213007.GB1564047@cmpxchg.org/ Reported-by: Sam Sun <samsun1006219@gmail.com> Closes: https://lore.kernel.org/lkml/CAEkJfYMtSdM5HceNsXUDf5haghD5+o2e7Qv4OcuruL4tPg6OaQ@mail.gmail.com/ Reviewed-by: Chengming Zhou <chengming.zhou@linux.dev> Acked-by: Barry Song <baohua@kernel.org> Reviewed-by: Nhat Pham <nphamcs@gmail.com> Cc: Vitaly Wool <vitalywool@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-30 17:59:09 -08:00
Dennis Lam	5f3fd772d1	ocfs2: fix slab-use-after-free due to dangling pointer dqi_priv When mounting ocfs2 and then remounting it as read-only, a slab-use-after-free occurs after the user uses a syscall to quota_getnextquota. Specifically, sb_dqinfo(sb, type)->dqi_priv is the dangling pointer. During the remounting process, the pointer dqi_priv is freed but is never set as null leaving it to be accessed. Additionally, the read-only option for remounting sets the DQUOT_SUSPENDED flag instead of setting the DQUOT_USAGE_ENABLED flags. Moreover, later in the process of getting the next quota, the function ocfs2_get_next_id is called and only checks the quota usage flags and not the quota suspended flags. To fix this, I set dqi_priv to null when it is freed after remounting with read-only and put a check for DQUOT_SUSPENDED in ocfs2_get_next_id. [akpm@linux-foundation.org: coding-style cleanups] Link: https://lkml.kernel.org/r/20241218023924.22821-2-dennis.lamerice@gmail.com Fixes: `8f9e8f5fcc` ("ocfs2: Fix Q_GETNEXTQUOTA for filesystem without quotas") Signed-off-by: Dennis Lam <dennis.lamerice@gmail.com> Reported-by: syzbot+d173bf8a5a7faeede34c@syzkaller.appspotmail.com Tested-by: syzbot+d173bf8a5a7faeede34c@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/6731d26f.050a0220.1fb99c.014b.GAE@google.com/T/ Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-30 17:59:09 -08:00
David Hildenbrand	3754137d26	fs/proc/task_mmu: fix pagemap flags with PMD THP entries on 32bit Entries (including flags) are u64, even on 32bit. So right now we are cutting of the flags on 32bit. This way, for example the cow selftest complains about: # ./cow ... Bail Out! read and ioctl return unmatched results for populated: 0 1 Link: https://lkml.kernel.org/r/20241217195000.1734039-1-david@redhat.com Fixes: `2c1f057e5b` ("fs/proc/task_mmu: properly detect PM_MMAP_EXCLUSIVE per page of PMD-mapped THPs") Signed-off-by: David Hildenbrand <david@redhat.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-30 17:59:08 -08:00
Arnd Bergmann	cb0ca08b32	kcov: mark in_softirq_really() as __always_inline If gcc decides not to inline in_softirq_really(), objtool warns about a function call with UACCESS enabled: kernel/kcov.o: warning: objtool: __sanitizer_cov_trace_pc+0x1e: call to in_softirq_really() with UACCESS enabled kernel/kcov.o: warning: objtool: check_kcov_mode+0x11: call to in_softirq_really() with UACCESS enabled Mark this as __always_inline to avoid the problem. Link: https://lkml.kernel.org/r/20241217071814.2261620-1-arnd@kernel.org Fixes: `7d4df2dad3` ("kcov: properly check for softirq context") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Marco Elver <elver@google.com> Cc: Aleksandr Nogikh <nogikh@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-30 17:59:08 -08:00
Baolin Wang	472098f233	docs: mm: fix the incorrect 'FileHugeMapped' field The '/proc/PID/smaps' does not have the 'FileHugeMapped' field to count the file transparent huge pages, instead, the 'FilePmdMapped' field should be used. Fix it. Link: https://lkml.kernel.org/r/d520ce3aba2b03b088be30bece732426a939049a.1734425264.git.baolin.wang@linux.alibaba.com Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> Acked-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-30 17:59:08 -08:00
Mathieu Othacehe	4d9b90df2e	mailmap: modify the entry for Mathieu Othacehe Set my gnu address as the main one. Link: https://lkml.kernel.org/r/20241217100924.7821-1-othacehe@gnu.org Signed-off-by: Mathieu Othacehe <othacehe@gnu.org> Cc: Alex Elder <elder@kernel.org> Cc: David S. Miller <davem@davemloft.net> Cc: Geliang Tang <geliang@kernel.org> Cc: Kees Cook <kees@kernel.org> Cc: Matthieu Baerts (NGI0) <matttbe@kernel.org> Cc: Neeraj Upadhyay <neeraj.upadhyay@kernel.org> Cc: Quentin Monnet <qmo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-30 17:59:08 -08:00
Alessandro Carminati	cddc76b165	mm/kmemleak: fix sleeping function called from invalid context at print message Address a bug in the kernel that triggers a "sleeping function called from invalid context" warning when /sys/kernel/debug/kmemleak is printed under specific conditions: - CONFIG_PREEMPT_RT=y - Set SELinux as the LSM for the system - Set kptr_restrict to 1 - kmemleak buffer contains at least one item BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48 in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 136, name: cat preempt_count: 1, expected: 0 RCU nest depth: 2, expected: 2 6 locks held by cat/136: #0: ffff32e64bcbf950 (&p->lock){+.+.}-{3:3}, at: seq_read_iter+0xb8/0xe30 #1: ffffafe6aaa9dea0 (scan_mutex){+.+.}-{3:3}, at: kmemleak_seq_start+0x34/0x128 #3: ffff32e6546b1cd0 (&object->lock){....}-{2:2}, at: kmemleak_seq_show+0x3c/0x1e0 #4: ffffafe6aa8d8560 (rcu_read_lock){....}-{1:2}, at: has_ns_capability_noaudit+0x8/0x1b0 #5: ffffafe6aabbc0f8 (notif_lock){+.+.}-{2:2}, at: avc_compute_av+0xc4/0x3d0 irq event stamp: 136660 hardirqs last enabled at (136659): [<ffffafe6a80fd7a0>] _raw_spin_unlock_irqrestore+0xa8/0xd8 hardirqs last disabled at (136660): [<ffffafe6a80fd85c>] _raw_spin_lock_irqsave+0x8c/0xb0 softirqs last enabled at (0): [<ffffafe6a5d50b28>] copy_process+0x11d8/0x3df8 softirqs last disabled at (0): [<0000000000000000>] 0x0 Preemption disabled at: [<ffffafe6a6598a4c>] kmemleak_seq_show+0x3c/0x1e0 CPU: 1 UID: 0 PID: 136 Comm: cat Tainted: G E 6.11.0-rt7+ #34 Tainted: [E]=UNSIGNED_MODULE Hardware name: linux,dummy-virt (DT) Call trace: dump_backtrace+0xa0/0x128 show_stack+0x1c/0x30 dump_stack_lvl+0xe8/0x198 dump_stack+0x18/0x20 rt_spin_lock+0x8c/0x1a8 avc_perm_nonode+0xa0/0x150 cred_has_capability.isra.0+0x118/0x218 selinux_capable+0x50/0x80 security_capable+0x7c/0xd0 has_ns_capability_noaudit+0x94/0x1b0 has_capability_noaudit+0x20/0x30 restricted_pointer+0x21c/0x4b0 pointer+0x298/0x760 vsnprintf+0x330/0xf70 seq_printf+0x178/0x218 print_unreferenced+0x1a4/0x2d0 kmemleak_seq_show+0xd0/0x1e0 seq_read_iter+0x354/0xe30 seq_read+0x250/0x378 full_proxy_read+0xd8/0x148 vfs_read+0x190/0x918 ksys_read+0xf0/0x1e0 __arm64_sys_read+0x70/0xa8 invoke_syscall.constprop.0+0xd4/0x1d8 el0_svc+0x50/0x158 el0t_64_sync+0x17c/0x180 %pS and %pK, in the same back trace line, are redundant, and %pS can void %pK service in certain contexts. %pS alone already provides the necessary information, and if it cannot resolve the symbol, it falls back to printing the raw address voiding the original intent behind the %pK. Additionally, %pK requires a privilege check CAP_SYSLOG enforced through the LSM, which can trigger a "sleeping function called from invalid context" warning under RT_PREEMPT kernels when the check occurs in an atomic context. This issue may also affect other LSMs. This change avoids the unnecessary privilege check and resolves the sleeping function warning without any loss of information. Link: https://lkml.kernel.org/r/20241217142032.55793-1-acarmina@redhat.com Fixes: `3a6f33d86b` ("mm/kmemleak: use %pK to display kernel pointers in backtrace") Signed-off-by: Alessandro Carminati <acarmina@redhat.com> Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Clément Léger <clement.leger@bootlin.com> Cc: Alessandro Carminati <acarmina@redhat.com> Cc: Eric Chanudet <echanude@redhat.com> Cc: Gabriele Paoloni <gpaoloni@redhat.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-30 17:59:07 -08:00
Liu Shixin	59d9094df3	mm: hugetlb: independent PMD page table shared count The folio refcount may be increased unexpectly through try_get_folio() by caller such as split_huge_pages. In huge_pmd_unshare(), we use refcount to check whether a pmd page table is shared. The check is incorrect if the refcount is increased by the above caller, and this can cause the page table leaked: BUG: Bad page state in process sh pfn:109324 page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x66 pfn:0x109324 flags: 0x17ffff800000000(node=0\|zone=2\|lastcpupid=0xfffff) page_type: f2(table) raw: 017ffff800000000 0000000000000000 0000000000000000 0000000000000000 raw: 0000000000000066 0000000000000000 00000000f2000000 0000000000000000 page dumped because: nonzero mapcount ... CPU: 31 UID: 0 PID: 7515 Comm: sh Kdump: loaded Tainted: G B 6.13.0-rc2master+ #7 Tainted: [B]=BAD_PAGE Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 Call trace: show_stack+0x20/0x38 (C) dump_stack_lvl+0x80/0xf8 dump_stack+0x18/0x28 bad_page+0x8c/0x130 free_page_is_bad_report+0xa4/0xb0 free_unref_page+0x3cc/0x620 __folio_put+0xf4/0x158 split_huge_pages_all+0x1e0/0x3e8 split_huge_pages_write+0x25c/0x2d8 full_proxy_write+0x64/0xd8 vfs_write+0xcc/0x280 ksys_write+0x70/0x110 __arm64_sys_write+0x24/0x38 invoke_syscall+0x50/0x120 el0_svc_common.constprop.0+0xc8/0xf0 do_el0_svc+0x24/0x38 el0_svc+0x34/0x128 el0t_64_sync_handler+0xc8/0xd0 el0t_64_sync+0x190/0x198 The issue may be triggered by damon, offline_page, page_idle, etc, which will increase the refcount of page table. 1. The page table itself will be discarded after reporting the "nonzero mapcount". 2. The HugeTLB page mapped by the page table miss freeing since we treat the page table as shared and a shared page table will not be unmapped. Fix it by introducing independent PMD page table shared count. As described by comment, pt_index/pt_mm/pt_frag_refcount are used for s390 gmap, x86 pgds and powerpc, pt_share_count is used for x86/arm64/riscv pmds, so we can reuse the field as pt_share_count. Link: https://lkml.kernel.org/r/20241216071147.3984217-1-liushixin2@huawei.com Fixes: `39dde65c99` ("[PATCH] shared page table for hugetlb page") Signed-off-by: Liu Shixin <liushixin2@huawei.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Ken Chen <kenneth.w.chen@intel.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nanyong Sun <sunnanyong@huawei.com> Cc: Jane Chu <jane.chu@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-30 17:59:07 -08:00
Yang Erkun	1fd8bc7cd8	maple_tree: reload mas before the second call for mas_empty_area Change the LONG_MAX in simple_offset_add to 1024, and do latter: [root@fedora ~]# mkdir /tmp/dir [root@fedora ~]# for i in {1..1024}; do touch /tmp/dir/$i; done touch: cannot touch '/tmp/dir/1024': Device or resource busy [root@fedora ~]# rm /tmp/dir/123 [root@fedora ~]# touch /tmp/dir/1024 [root@fedora ~]# rm /tmp/dir/100 [root@fedora ~]# touch /tmp/dir/1025 touch: cannot touch '/tmp/dir/1025': Device or resource busy After we delete file 100, actually this is a empty entry, but the latter create failed unexpected. mas_alloc_cyclic has two chance to find empty entry. First find the entry with range range_lo and range_hi, if no empty entry exist, and range_lo > min, retry find with range min and range_hi. However, the first call mas_empty_area may mark mas as EBUSY, and the second call for mas_empty_area will return false directly. Fix this by reload mas before second call for mas_empty_area. [Liam.Howlett@Oracle.com: fix mas_alloc_cyclic() second search] Link: https://lore.kernel.org/all/20241216060600.287B4C4CED0@smtp.kernel.org/ Link: https://lkml.kernel.org/r/20241216190113.1226145-2-Liam.Howlett@oracle.com Link: https://lkml.kernel.org/r/20241214093005.72284-1-yangerkun@huaweicloud.com Fixes: `9b6713cc75` ("maple_tree: Add mtree_alloc_cyclic()") Signed-off-by: Yang Erkun <yangerkun@huawei.com> Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Chuck Lever <chuck.lever@oracle.com> says: Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-30 17:59:07 -08:00
Yafang Shao	158cdce87c	mm/readahead: fix large folio support in async readahead When testing large folio support with XFS on our servers, we observed that only a few large folios are mapped when reading large files via mmap. After a thorough analysis, I identified it was caused by the `/sys/block//queue/read_ahead_kb` setting. On our test servers, this parameter is set to 128KB. After I tune it to 2MB, the large folio can work as expected. However, I believe the large folio behavior should not be dependent on the value of read_ahead_kb. It would be more robust if the kernel can automatically adopt to it. With /sys/block//queue/read_ahead_kb set to 128KB and performing a sequential read on a 1GB file using MADV_HUGEPAGE, the differences in /proc/meminfo are as follows: - before this patch FileHugePages: 18432 kB FilePmdMapped: 4096 kB - after this patch FileHugePages: 1067008 kB FilePmdMapped: 1048576 kB This shows that after applying the patch, the entire 1GB file is mapped to huge pages. The stable list is CCed, as without this patch, large folios don't function optimally in the readahead path. It's worth noting that if read_ahead_kb is set to a larger value that isn't aligned with huge page sizes (e.g., 4MB + 128KB), it may still fail to map to hugepages. Link: https://lkml.kernel.org/r/20241108141710.9721-1-laoar.shao@gmail.com Link: https://lkml.kernel.org/r/20241206083025.3478-1-laoar.shao@gmail.com Fixes: `4687fdbb80` ("mm/filemap: Support VM_HUGEPAGE for file mappings") Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Tested-by: kernel test robot <oliver.sang@intel.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: David Hildenbrand <david@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-30 17:59:07 -08:00
Kefeng Wang	34d7cf637c	mm: don't try THP alignment for FS without get_unmapped_area Commit `ed48e87c7d` ("thp: add thp_get_unmapped_area_vmflags()") changes thp_get_unmapped_area() to thp_get_unmapped_area_vmflags() in __get_unmapped_area(), which doesn't initialize local get_area for anonymous mappings. This leads to us always trying THP alignment even for file_operations which have a NULL ->get_unmapped_area() callback. Since commit `efa7df3e3b` ("mm: align larger anonymous mappings on THP boundaries") we only want to enable THP alignment for anonymous mappings, so add a !file check to avoid attempting THP alignment for file mappings. Found issue by code inspection. THP alignment is used for easy or more pmd mappings, from vma side. This may cause unnecessary VMA fragmentation and potentially worse performance on filesystems that do not actually support THPs and thus cannot benefit from the alignment. Link: https://lkml.kernel.org/r/20241206070345.2526501-1-wangkefeng.wang@huawei.com Fixes: `ed48e87c7d` ("thp: add thp_get_unmapped_area_vmflags()") Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Yang Shi <shy828301@gmail.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: David Hildenbrand <david@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-30 17:59:06 -08:00
Seiji Nishikawa	6aaced5abd	mm: vmscan: account for free pages to prevent infinite Loop in throttle_direct_reclaim() The task sometimes continues looping in throttle_direct_reclaim() because allow_direct_reclaim(pgdat) keeps returning false. #0 [ffff80002cb6f8d0] __switch_to at ffff8000080095ac #1 [ffff80002cb6f900] __schedule at ffff800008abbd1c #2 [ffff80002cb6f990] schedule at ffff800008abc50c #3 [ffff80002cb6f9b0] throttle_direct_reclaim at ffff800008273550 #4 [ffff80002cb6fa20] try_to_free_pages at ffff800008277b68 #5 [ffff80002cb6fae0] __alloc_pages_nodemask at ffff8000082c4660 #6 [ffff80002cb6fc50] alloc_pages_vma at ffff8000082e4a98 #7 [ffff80002cb6fca0] do_anonymous_page at ffff80000829f5a8 #8 [ffff80002cb6fce0] __handle_mm_fault at ffff8000082a5974 #9 [ffff80002cb6fd90] handle_mm_fault at ffff8000082a5bd4 At this point, the pgdat contains the following two zones: NODE: 4 ZONE: 0 ADDR: ffff00817fffe540 NAME: "DMA32" SIZE: 20480 MIN/LOW/HIGH: 11/28/45 VM_STAT: NR_FREE_PAGES: 359 NR_ZONE_INACTIVE_ANON: 18813 NR_ZONE_ACTIVE_ANON: 0 NR_ZONE_INACTIVE_FILE: 50 NR_ZONE_ACTIVE_FILE: 0 NR_ZONE_UNEVICTABLE: 0 NR_ZONE_WRITE_PENDING: 0 NR_MLOCK: 0 NR_BOUNCE: 0 NR_ZSPAGES: 0 NR_FREE_CMA_PAGES: 0 NODE: 4 ZONE: 1 ADDR: ffff00817fffec00 NAME: "Normal" SIZE: 8454144 PRESENT: 98304 MIN/LOW/HIGH: 68/166/264 VM_STAT: NR_FREE_PAGES: 146 NR_ZONE_INACTIVE_ANON: 94668 NR_ZONE_ACTIVE_ANON: 3 NR_ZONE_INACTIVE_FILE: 735 NR_ZONE_ACTIVE_FILE: 78 NR_ZONE_UNEVICTABLE: 0 NR_ZONE_WRITE_PENDING: 0 NR_MLOCK: 0 NR_BOUNCE: 0 NR_ZSPAGES: 0 NR_FREE_CMA_PAGES: 0 In allow_direct_reclaim(), while processing ZONE_DMA32, the sum of inactive/active file-backed pages calculated in zone_reclaimable_pages() based on the result of zone_page_state_snapshot() is zero. Additionally, since this system lacks swap, the calculation of inactive/ active anonymous pages is skipped. crash> p nr_swap_pages nr_swap_pages = $1937 = { counter = 0 } As a result, ZONE_DMA32 is deemed unreclaimable and skipped, moving on to the processing of the next zone, ZONE_NORMAL, despite ZONE_DMA32 having free pages significantly exceeding the high watermark. The problem is that the pgdat->kswapd_failures hasn't been incremented. crash> px ((struct pglist_data *) 0xffff00817fffe540)->kswapd_failures $1935 = 0x0 This is because the node deemed balanced. The node balancing logic in balance_pgdat() evaluates all zones collectively. If one or more zones (e.g., ZONE_DMA32) have enough free pages to meet their watermarks, the entire node is deemed balanced. This causes balance_pgdat() to exit early before incrementing the kswapd_failures, as it considers the overall memory state acceptable, even though some zones (like ZONE_NORMAL) remain under significant pressure. The patch ensures that zone_reclaimable_pages() includes free pages (NR_FREE_PAGES) in its calculation when no other reclaimable pages are available (e.g., file-backed or anonymous pages). This change prevents zones like ZONE_DMA32, which have sufficient free pages, from being mistakenly deemed unreclaimable. By doing so, the patch ensures proper node balancing, avoids masking pressure on other zones like ZONE_NORMAL, and prevents infinite loops in throttle_direct_reclaim() caused by allow_direct_reclaim(pgdat) repeatedly returning false. The kernel hangs due to a task stuck in throttle_direct_reclaim(), caused by a node being incorrectly deemed balanced despite pressure in certain zones, such as ZONE_NORMAL. This issue arises from zone_reclaimable_pages() returning 0 for zones without reclaimable file- backed or anonymous pages, causing zones like ZONE_DMA32 with sufficient free pages to be skipped. The lack of swap or reclaimable pages results in ZONE_DMA32 being ignored during reclaim, masking pressure in other zones. Consequently, pgdat->kswapd_failures remains 0 in balance_pgdat(), preventing fallback mechanisms in allow_direct_reclaim() from being triggered, leading to an infinite loop in throttle_direct_reclaim(). This patch modifies zone_reclaimable_pages() to account for free pages (NR_FREE_PAGES) when no other reclaimable pages exist. This ensures zones with sufficient free pages are not skipped, enabling proper balancing and reclaim behavior. [akpm@linux-foundation.org: coding-style cleanups] Link: https://lkml.kernel.org/r/20241130164346.436469-1-snishika@redhat.com Link: https://lkml.kernel.org/r/20241130161236.433747-2-snishika@redhat.com Fixes: `5a1c84b404` ("mm: remove reclaim and compaction retry approximations") Signed-off-by: Seiji Nishikawa <snishika@redhat.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-30 17:59:06 -08:00
Lorenzo Stoakes	ea0916e01d	selftests/memfd: add test for mapping write-sealed memfd read-only Now we have reinstated the ability to map F_SEAL_WRITE mappings read-only, assert that we are able to do this in a test to ensure that we do not regress this again. Link: https://lkml.kernel.org/r/a6377ec470b14c0539b4600cf8fa24bf2e4858ae.1732804776.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Jann Horn <jannh@google.com> Cc: Julian Orth <ju.orth@gmail.com> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-30 17:59:06 -08:00
Lorenzo Stoakes	8ec396d05d	mm: reinstate ability to map write-sealed memfd mappings read-only Patch series "mm: reinstate ability to map write-sealed memfd mappings read-only". In commit `158978945f` ("mm: perform the mapping_map_writable() check after call_mmap()") (and preceding changes in the same series) it became possible to mmap() F_SEAL_WRITE sealed memfd mappings read-only. Commit `5de195060b` ("mm: resolve faulty mmap_region() error path behaviour") unintentionally undid this logic by moving the mapping_map_writable() check before the shmem_mmap() hook is invoked, thereby regressing this change. This series reworks how we both permit write-sealed mappings being mapped read-only and disallow mprotect() from undoing the write-seal, fixing this regression. We also add a regression test to ensure that we do not accidentally regress this in future. Thanks to Julian Orth for reporting this regression. This patch (of 2): In commit `158978945f` ("mm: perform the mapping_map_writable() check after call_mmap()") (and preceding changes in the same series) it became possible to mmap() F_SEAL_WRITE sealed memfd mappings read-only. This was previously unnecessarily disallowed, despite the man page documentation indicating that it would be, thereby limiting the usefulness of F_SEAL_WRITE logic. We fixed this by adapting logic that existed for the F_SEAL_FUTURE_WRITE seal (one which disallows future writes to the memfd) to also be used for F_SEAL_WRITE. For background - the F_SEAL_FUTURE_WRITE seal clears VM_MAYWRITE for a read-only mapping to disallow mprotect() from overriding the seal - an operation performed by seal_check_write(), invoked from shmem_mmap(), the f_op->mmap() hook used by shmem mappings. By extending this to F_SEAL_WRITE and critically - checking mapping_map_writable() to determine if we may map the memfd AFTER we invoke shmem_mmap() - the desired logic becomes possible. This is because mapping_map_writable() explicitly checks for VM_MAYWRITE, which we will have cleared. Commit `5de195060b` ("mm: resolve faulty mmap_region() error path behaviour") unintentionally undid this logic by moving the mapping_map_writable() check before the shmem_mmap() hook is invoked, thereby regressing this change. We reinstate this functionality by moving the check out of shmem_mmap() and instead performing it in do_mmap() at the point at which VMA flags are being determined, which seems in any case to be a more appropriate place in which to make this determination. In order to achieve this we rework memfd seal logic to allow us access to this information using existing logic and eliminate the clearing of VM_MAYWRITE from seal_check_write() which we are performing in do_mmap() instead. Link: https://lkml.kernel.org/r/99fc35d2c62bd2e05571cf60d9f8b843c56069e0.1732804776.git.lorenzo.stoakes@oracle.com Fixes: `5de195060b` ("mm: resolve faulty mmap_region() error path behaviour") Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reported-by: Julian Orth <ju.orth@gmail.com> Closes: https://lore.kernel.org/all/CAHijbEUMhvJTN9Xw1GmbM266FXXv=U7s4L_Jem5x3AaPZxrYpQ@mail.gmail.com/ Cc: Jann Horn <jannh@google.com> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-30 17:59:06 -08:00
Pascal Hambourg	03c8d0af2e	sky2: Add device ID 11ab:4373 for Marvell 88E8075 A Marvell 88E8075 ethernet controller has this device ID instead of 11ab:4370 and works fine with the sky2 driver. Signed-off-by: Pascal Hambourg <pascal@plouf.fr.eu.org> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/10165a62-99fb-4be6-8c64-84afd6234085@plouf.fr.eu.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-30 17:57:47 -08:00
Paolo Abeni	cbb26f7d84	mptcp: fix TCP options overflow. Syzbot reported the following splat: Oops: general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN PTI KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f] CPU: 1 UID: 0 PID: 5836 Comm: sshd Not tainted 6.13.0-rc3-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/25/2024 RIP: 0010:_compound_head include/linux/page-flags.h:242 [inline] RIP: 0010:put_page+0x23/0x260 include/linux/mm.h:1552 Code: 90 90 90 90 90 90 90 55 41 57 41 56 53 49 89 fe 48 bd 00 00 00 00 00 fc ff df e8 f8 5e 12 f8 49 8d 5e 08 48 89 d8 48 c1 e8 03 <80> 3c 28 00 74 08 48 89 df e8 8f c7 78 f8 48 8b 1b 48 89 de 48 83 RSP: 0000:ffffc90003916c90 EFLAGS: 00010202 RAX: 0000000000000001 RBX: 0000000000000008 RCX: ffff888030458000 RDX: 0000000000000100 RSI: 0000000000000000 RDI: 0000000000000000 RBP: dffffc0000000000 R08: ffffffff898ca81d R09: 1ffff110054414ac R10: dffffc0000000000 R11: ffffed10054414ad R12: 0000000000000007 R13: ffff88802a20a542 R14: 0000000000000000 R15: 0000000000000000 FS: 00007f34f496e800(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f9d6ec9ec28 CR3: 000000004d260000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> skb_page_unref include/linux/skbuff_ref.h:43 [inline] __skb_frag_unref include/linux/skbuff_ref.h:56 [inline] skb_release_data+0x483/0x8a0 net/core/skbuff.c:1119 skb_release_all net/core/skbuff.c:1190 [inline] __kfree_skb+0x55/0x70 net/core/skbuff.c:1204 tcp_clean_rtx_queue net/ipv4/tcp_input.c:3436 [inline] tcp_ack+0x2442/0x6bc0 net/ipv4/tcp_input.c:4032 tcp_rcv_state_process+0x8eb/0x44e0 net/ipv4/tcp_input.c:6805 tcp_v4_do_rcv+0x77d/0xc70 net/ipv4/tcp_ipv4.c:1939 tcp_v4_rcv+0x2dc0/0x37f0 net/ipv4/tcp_ipv4.c:2351 ip_protocol_deliver_rcu+0x22e/0x440 net/ipv4/ip_input.c:205 ip_local_deliver_finish+0x341/0x5f0 net/ipv4/ip_input.c:233 NF_HOOK+0x3a4/0x450 include/linux/netfilter.h:314 NF_HOOK+0x3a4/0x450 include/linux/netfilter.h:314 __netif_receive_skb_one_core net/core/dev.c:5672 [inline] __netif_receive_skb+0x2bf/0x650 net/core/dev.c:5785 process_backlog+0x662/0x15b0 net/core/dev.c:6117 __napi_poll+0xcb/0x490 net/core/dev.c:6883 napi_poll net/core/dev.c:6952 [inline] net_rx_action+0x89b/0x1240 net/core/dev.c:7074 handle_softirqs+0x2d4/0x9b0 kernel/softirq.c:561 __do_softirq kernel/softirq.c:595 [inline] invoke_softirq kernel/softirq.c:435 [inline] __irq_exit_rcu+0xf7/0x220 kernel/softirq.c:662 irq_exit_rcu+0x9/0x30 kernel/softirq.c:678 instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1049 [inline] sysvec_apic_timer_interrupt+0x57/0xc0 arch/x86/kernel/apic/apic.c:1049 asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702 RIP: 0033:0x7f34f4519ad5 Code: 85 d2 74 0d 0f 10 02 48 8d 54 24 20 0f 11 44 24 20 64 8b 04 25 18 00 00 00 85 c0 75 27 41 b8 08 00 00 00 b8 0f 01 00 00 0f 05 <48> 3d 00 f0 ff ff 76 75 48 8b 15 24 73 0d 00 f7 d8 64 89 02 48 83 RSP: 002b:00007ffec5b32ce0 EFLAGS: 00000246 RAX: 0000000000000001 RBX: 00000000000668a0 RCX: 00007f34f4519ad5 RDX: 00007ffec5b32d00 RSI: 0000000000000004 RDI: 0000564f4bc6cae0 RBP: 0000564f4bc6b5a0 R08: 0000000000000008 R09: 0000000000000000 R10: 00007ffec5b32de8 R11: 0000000000000246 R12: 0000564f48ea8aa4 R13: 0000000000000001 R14: 0000564f48ea93e8 R15: 00007ffec5b32d68 </TASK> Eric noted a probable shinfo->nr_frags corruption, which indeed occurs. The root cause is a buggy MPTCP option len computation in some circumstances: the ADD_ADDR option should be mutually exclusive with DSS since the blamed commit. Still, mptcp_established_options_add_addr() tries to set the relevant info in mptcp_out_options, if the remaining space is large enough even when DSS is present. Since the ADD_ADDR infos and the DSS share the same union fields, adding first corrupts the latter. In the worst-case scenario, such corruption increases the DSS binary layout, exceeding the computed length and possibly overwriting the skb shared info. Address the issue by enforcing mutual exclusion in mptcp_established_options_add_addr(), too. Cc: stable@vger.kernel.org Reported-by: syzbot+38a095a81f30d82884c1@syzkaller.appspotmail.com Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/538 Fixes: `1bff1e43a3` ("mptcp: optimize out option generation") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/025d9df8cde3c9a557befc47e9bc08fbbe3476e5.1734771049.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-30 17:47:13 -08:00
Joe Hattori	ad5c318086	net: mv643xx_eth: fix an OF node reference leak Current implementation of mv643xx_eth_shared_of_add_port() calls of_parse_phandle(), but does not release the refcount on error. Call of_node_put() in the error path and in mv643xx_eth_shared_of_remove(). This bug was found by an experimental verification tool that I am developing. Fixes: `76723bca28` ("net: mv643xx_eth: add DT parsing support") Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp> Link: https://patch.msgid.link/20241221081448.3313163-1-joe@pf.is.s.u-tokyo.ac.jp Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-30 17:40:34 -08:00
Joshua Washington	fb3a9a1165	gve: trigger RX NAPI instead of TX NAPI in gve_xsk_wakeup Commit `ba0925c34e` ("gve: process XSK TX descriptors as part of RX NAPI") moved XSK TX processing to be part of the RX NAPI. However, that commit did not include triggering the RX NAPI in gve_xsk_wakeup. This is necessary because the TX NAPI only processes TX completions, meaning that a TX wakeup would not actually trigger XSK descriptor processing. Also, the branch on XDP_WAKEUP_TX was supposed to have been removed, as the NAPI should be scheduled whether the wakeup is for RX or TX. Fixes: `ba0925c34e` ("gve: process XSK TX descriptors as part of RX NAPI") Cc: stable@vger.kernel.org Signed-off-by: Joshua Washington <joshwash@google.com> Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Link: https://patch.msgid.link/20241221032807.302244-1-pkaligineedi@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-30 17:35:46 -08:00
Vitalii Mordan	b255ef45fc	eth: bcmsysport: fix call balance of priv->clk handling routines Check the return value of clk_prepare_enable to ensure that priv->clk has been successfully enabled. If priv->clk was not enabled during bcm_sysport_probe, bcm_sysport_resume, or bcm_sysport_open, it must not be disabled in any subsequent execution paths. Fixes: `31bc72d976` ("net: systemport: fetch and use clock resources") Signed-off-by: Vitalii Mordan <mordan@ispras.ru> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20241227123007.2333397-1-mordan@ispras.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-30 17:33:46 -08:00
Jens Axboe	a9c83a0ab6	io_uring/timeout: flush timeouts outside of the timeout lock syzbot reports that a recent fix causes nesting issues between the (now) raw timeoutlock and the eventfd locking: ============================= [ BUG: Invalid wait context ] 6.13.0-rc4-00080-g9828a4c0901f #29 Not tainted ----------------------------- kworker/u32:0/68094 is trying to lock: ffff000014d7a520 (&ctx->wqh#2){..-.}-{3:3}, at: eventfd_signal_mask+0x64/0x180 other info that might help us debug this: context-{5:5} 6 locks held by kworker/u32:0/68094: #0: ffff0000c1d98148 ((wq_completion)iou_exit){+.+.}-{0:0}, at: process_one_work+0x4e8/0xfc0 #1: ffff80008d927c78 ((work_completion)(&ctx->exit_work)){+.+.}-{0:0}, at: process_one_work+0x53c/0xfc0 #2: ffff0000c59bc3d8 (&ctx->completion_lock){+.+.}-{3:3}, at: io_kill_timeouts+0x40/0x180 #3: ffff0000c59bc358 (&ctx->timeout_lock){-.-.}-{2:2}, at: io_kill_timeouts+0x48/0x180 #4: ffff800085127aa0 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire+0x8/0x38 #5: ffff800085127aa0 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire+0x8/0x38 stack backtrace: CPU: 7 UID: 0 PID: 68094 Comm: kworker/u32:0 Not tainted 6.13.0-rc4-00080-g9828a4c0901f #29 Hardware name: linux,dummy-virt (DT) Workqueue: iou_exit io_ring_exit_work Call trace: show_stack+0x1c/0x30 (C) __dump_stack+0x24/0x30 dump_stack_lvl+0x60/0x80 dump_stack+0x14/0x20 __lock_acquire+0x19f8/0x60c8 lock_acquire+0x1a4/0x540 _raw_spin_lock_irqsave+0x90/0xd0 eventfd_signal_mask+0x64/0x180 io_eventfd_signal+0x64/0x108 io_req_local_work_add+0x294/0x430 __io_req_task_work_add+0x1c0/0x270 io_kill_timeout+0x1f0/0x288 io_kill_timeouts+0xd4/0x180 io_uring_try_cancel_requests+0x2e8/0x388 io_ring_exit_work+0x150/0x550 process_one_work+0x5e8/0xfc0 worker_thread+0x7ec/0xc80 kthread+0x24c/0x300 ret_from_fork+0x10/0x20 because after the preempt-rt fix for the timeout lock nesting inside the io-wq lock, we now have the eventfd spinlock nesting inside the raw timeout spinlock. Rather than play whack-a-mole with other nesting on the timeout lock, split the deletion and killing of timeouts so queueing the task_work for the timeout cancelations can get done outside of the timeout lock. Reported-by: syzbot+b1fc199a40b65d601b65@syzkaller.appspotmail.com Fixes: `020b40f356` ("io_uring: make ctx->timeout_lock a raw spinlock") Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-30 14:15:17 -07:00
Steven Davis	de30d74f58	cdrom: Fix typo, 'devicen' to 'device' Fix typo in cd_dbg line to add trailing newline character. Signed-off-by: Steven Davis <goldside000@outlook.com> Link: https://lore.kernel.org/lkml/20241229165744.21725-1-goldside000@outlook.com Reviewed-by: Phillip Potter <phil@philpotter.co.uk> Link: https://lore.kernel.org/lkml/Z3GV2W_MUOw5BrtR@equinox Signed-off-by: Phillip Potter <phil@philpotter.co.uk> Link: https://lore.kernel.org/r/20241230193431.441120-2-phil@philpotter.co.uk Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-30 12:56:06 -07:00
Linus Torvalds	ccb98ccef0	Merge tag 'platform-drivers-x86-v6.13-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform drivers fixes from Ilpo Järvinen: "hp-wmi: - mark 8A15 board for timed OMEN thermal profile mlx-platform: - call pci_dev_put() to balance the refcount thinkpad-acpi: - Add support for hotkey 0x1401" * tag 'platform-drivers-x86-v6.13-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86: thinkpad-acpi: Add support for hotkey 0x1401 platform/x86: hp-wmi: mark 8A15 board for timed OMEN thermal profile platform/x86: mlx-platform: call pci_dev_put() to balance the refcount	2024-12-30 11:20:42 -08:00
AngeloGioacchino Del Regno	76aed5e00f	drm/mediatek: mtk_dsi: Add registers to pdata to fix MT8186/MT8188 Registers DSI_VM_CMD and DSI_SHADOW_DEBUG start at different addresses in both MT8186 and MT8188 compared to the older IPs. Add two members in struct mtk_dsi_driver_data to specify the offsets for these two registers on a per-SoC basis, then do specify those in all of the currently present SoC driver data. This fixes writes to the Video Mode Command Packet Control register, fixing enablement of command packet transmission (VM_CMD_EN) and allowance of this transmission during the VFP period (TS_VFP_EN) on both MT8186 and MT8188. Fixes: `03d7adc410` ("drm/mediatek: Add mt8186 dsi compatible to mtk_dsi.c") Fixes: `814d5341f3` ("drm/mediatek: Add mt8188 dsi compatible to mtk_dsi.c") Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: CK Hu <ck.hu@mediatek.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20241219112733.47907-1-angelogioacchino.delregno@collabora.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>	2024-12-30 14:50:38 +00:00
Fei Shao	8fe3ee95da	dt-bindings: display: mediatek: dp: Reference common DAI properties The MediaTek DP hardware supports audio and exposes a DAI, so the '#sound-dai-cells' property is needed for describing the DAI links. Reference the dai-common.yaml schema to allow '#sound-dai-cells' to be used, and filter out non-DP compatibles as MediaTek eDP in the same binding doesn't support audio. This fixes dtbs_check error: '#sound-dai-cells' does not match any of the regexes: 'pinctrl-[0-9]+' Signed-off-by: Fei Shao <fshao@chromium.org> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patchwork.kernel.org/project/dri-devel/patch/20241105090207.3892242-1-fshao@chromium.org/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>	2024-12-30 14:38:05 +00:00
Clément Le Goffic	cc0dc9e871	watchdog: stm32_iwdg: fix error message during driver probe The commit `3ab1663af6` ("watchdog: stm32_iwdg: Add pretimeout support") introduces the support for the pre-timeout interrupt. The support for this interrupt is optional but the driver uses the platform_get_irq() which produces an error message during the driver probe if we don't have any `interrupts` property in the DT. Use the platform_get_irq_optional() API to get rid of the error message as this property is optional. Fixes: `3ab1663af6` ("watchdog: stm32_iwdg: Add pretimeout support") Signed-off-by: Clément Le Goffic <clement.legoffic@foss.st.com> Reviewed-by: Marek Vasut <marex@denx.de> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20241218092227.771133-1-clement.legoffic@foss.st.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>	2024-12-30 15:28:02 +01:00
Liankun Yang	0d68b55887	drm/mediatek: Fix mode valid issue for dp Fix dp mode valid issue to avoid abnormal display of limit state. After DP passes link training, it can express the lane count of the current link status is good. Calculate the maximum bandwidth supported by DP using the current lane count. The color format will select the best one based on the bandwidth requirements of the current timing mode. If the current timing mode uses RGB and meets the DP link bandwidth requirements, RGB will be used. If the timing mode uses RGB but does not meet the DP link bandwidthi requirements, it will continue to check whether YUV422 meets the DP link bandwidth. FEC overhead is approximately 2.4% from DP 1.4a spec 2.2.1.4.2. The down-spread amplitude shall either be disabled (0.0%) or up to 0.5% from 1.4a 3.5.2.6. Add up to approximately 3% total overhead. Because rate is already divided by 10, mode->clock does not need to be multiplied by 10. Fixes: `f70ac097a2` ("drm/mediatek: Add MT8195 Embedded DisplayPort driver") Signed-off-by: Liankun Yang <liankun.yang@mediatek.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20241025083036.8829-3-liankun.yang@mediatek.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>	2024-12-30 14:25:03 +00:00
Liankun Yang	ef24fbd8f1	drm/mediatek: Fix YCbCr422 color format issue for DP Setting up misc0 for Pixel Encoding Format. According to the definition of YCbCr in spec 1.2a Table 2-96, 0x1 << 1 should be written to the register. Use switch case to distinguish RGB, YCbCr422, and unsupported color formats. Fixes: `f70ac097a2` ("drm/mediatek: Add MT8195 Embedded DisplayPort driver") Signed-off-by: Liankun Yang <liankun.yang@mediatek.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20241025083036.8829-2-liankun.yang@mediatek.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>	2024-12-30 14:10:27 +00:00
Chun-Kuang Hu	a10f26062a	Revert "drm/mediatek: Switch to for_each_child_of_node_scoped()" This reverts commit `fd620fc25d`. Boot failures reported by KernelCI: [ 4.395400] mediatek-drm mediatek-drm.5.auto: bound 1c014000.merge (ops 0xffffd35fd12975f8) [ 4.396155] mediatek-drm mediatek-drm.5.auto: bound 1c000000.ovl (ops 0xffffd35fd12977b8) [ 4.411951] mediatek-drm mediatek-drm.5.auto: bound 1c002000.rdma (ops 0xffffd35fd12989c0) [ 4.536837] mediatek-drm mediatek-drm.5.auto: bound 1c004000.ccorr (ops 0xffffd35fd1296cf0) [ 4.545181] mediatek-drm mediatek-drm.5.auto: bound 1c005000.aal (ops 0xffffd35fd1296a80) [ 4.553344] mediatek-drm mediatek-drm.5.auto: bound 1c006000.gamma (ops 0xffffd35fd12972b0) [ 4.561680] mediatek-drm mediatek-drm.5.auto: bound 1c014000.merge (ops 0xffffd35fd12975f8) [ 4.570025] ------------[ cut here ]------------ [ 4.574630] refcount_t: underflow; use-after-free. [ 4.579416] WARNING: CPU: 6 PID: 81 at lib/refcount.c:28 refcount_warn_saturate+0xf4/0x148 [ 4.587670] Modules linked in: [ 4.590714] CPU: 6 UID: 0 PID: 81 Comm: kworker/u32:3 Tainted: G W 6.12.0 #1 cab58e2e59020ebd4be8ada89a65f465a316c742 [ 4.602695] Tainted: [W]=WARN [ 4.605649] Hardware name: Acer Tomato (rev2) board (DT) [ 4.610947] Workqueue: events_unbound deferred_probe_work_func [ 4.616768] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 4.623715] pc : refcount_warn_saturate+0xf4/0x148 [ 4.628493] lr : refcount_warn_saturate+0xf4/0x148 [ 4.633270] sp : ffff8000807639c0 [ 4.636571] x29: ffff8000807639c0 x28: ffff34ff4116c640 x27: ffff34ff4368e080 [ 4.643693] x26: ffffd35fd1299ac8 x25: ffff34ff46c8c410 x24: 0000000000000000 [ 4.650814] x23: ffff34ff4368e080 x22: 00000000fffffdfb x21: 0000000000000002 [ 4.657934] x20: ffff34ff470c6000 x19: ffff34ff410c7c10 x18: 0000000000000006 [ 4.665055] x17: 666678302073706f x16: 2820656772656d2e x15: ffff800080763440 [ 4.672176] x14: 0000000000000000 x13: 2e656572662d7265 x12: ffffd35fd2ed14f0 [ 4.679297] x11: 0000000000000001 x10: 0000000000000001 x9 : ffffd35fd0342150 [ 4.686418] x8 : c0000000ffffdfff x7 : ffffd35fd2e21450 x6 : 00000000000affa8 [ 4.693539] x5 : ffffd35fd2ed1498 x4 : 0000000000000000 x3 : 0000000000000000 [ 4.700660] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff34ff40932580 [ 4.707781] Call trace: [ 4.710216] refcount_warn_saturate+0xf4/0x148 (P) [ 4.714993] refcount_warn_saturate+0xf4/0x148 (L) [ 4.719772] kobject_put+0x110/0x118 [ 4.723335] put_device+0x1c/0x38 [ 4.726638] mtk_drm_bind+0x294/0x5c0 [ 4.730289] try_to_bring_up_aggregate_device+0x16c/0x1e0 [ 4.735673] __component_add+0xbc/0x1c0 [ 4.739495] component_add+0x1c/0x30 [ 4.743058] mtk_disp_rdma_probe+0x140/0x210 [ 4.747314] platform_probe+0x70/0xd0 [ 4.750964] really_probe+0xc4/0x2a8 [ 4.754527] __driver_probe_device+0x80/0x140 [ 4.758870] driver_probe_device+0x44/0x120 [ 4.763040] __device_attach_driver+0xc0/0x108 [ 4.767470] bus_for_each_drv+0x8c/0xf0 [ 4.771294] __device_attach+0xa4/0x198 [ 4.775117] device_initial_probe+0x1c/0x30 [ 4.779286] bus_probe_device+0xb4/0xc0 [ 4.783109] deferred_probe_work_func+0xb0/0x100 [ 4.787714] process_one_work+0x18c/0x420 [ 4.791712] worker_thread+0x30c/0x418 [ 4.795449] kthread+0x128/0x138 [ 4.798665] ret_from_fork+0x10/0x20 [ 4.802229] ---[ end trace 0000000000000000 ]--- Fixes: `fd620fc25d` ("drm/mediatek: Switch to for_each_child_of_node_scoped()") Cc: stable@vger.kernel.org Cc: Javier Carrasco <javier.carrasco.cruz@gmail.com> Reported-by: Sasha Levin <sashal@kernel.org> Closes: https://lore.kernel.org/lkml/Z0lNHdwQ3rODHQ2c@sashalap/T/#mfaa6343cfd4d59aae5912b095c0693c0553e746c Link: https://patchwork.kernel.org/project/dri-devel/patch/20241223151218.7958-1-chunkuang.hu@kernel.org/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>	2024-12-30 13:21:16 +00:00
Takashi Iwai	abbff41b69	Revert "ALSA: ump: Don't enumeration invalid groups for legacy rawmidi" This reverts commit `c2d188e137`. Although it's fine to filter the invalid UMP groups at the first probe time, this will become a problem when UMP groups are updated and (re-)activated. Then there is no way to re-add the substreams properly for the legacy rawmidi, and the new active groups will be still invisible. So let's revert the change. This will move back to showing the full 16 groups, but it's better than forever lost. Link: https://patch.msgid.link/20241230114023.3787-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-12-30 12:40:53 +01:00
Takashi Iwai	0179488ca9	ALSA: seq: oss: Fix races at processing SysEx messages OSS sequencer handles the SysEx messages split in 6 bytes packets, and ALSA sequencer OSS layer tries to combine those. It stores the data in the internal buffer and this access is racy as of now, which may lead to the out-of-bounds access. As a temporary band-aid fix, introduce a mutex for serializing the process of the SysEx message packets. Reported-by: Kun Hu <huk23@m.fudan.edu.cn> Closes: https://lore.kernel.org/2B7E93E4-B13A-4AE4-8E87-306A8EE9BBB7@m.fudan.edu.cn Cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/20241230110543.32454-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-12-30 12:06:28 +01:00
Al Viro	7439b39521	ALSA: compress_offload: fix remaining descriptor races in sound/core/compress_offload.c `3d3f43fab4` ("ALSA: compress_offload: improve file descriptors installation for dma-buf") fixed some of descriptor races in snd_compr_task_new(), but there's a couple more left. We need to grab the references to dmabuf before moving them into descriptor table - trying to do that by descriptor afterwards might end up getting a different object, with a dangling reference left in task->{input,output} Fixes: `3d3f43fab4` ("ALSA: compress_offload: improve file descriptors installation for dma-buf") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Link: https://patch.msgid.link/20241229185232.GA1977892@ZenIV Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-12-30 11:49:38 +01:00
Takashi Iwai	ac9fae799e	ALSA: compress_offload: Drop unneeded no_free_ptr() The error path for memdup_user() no longer needs the tricky wrap with no_free_ptr() and we can safely return the error pointer directly. Fixes: `04177158cf` ("ALSA: compress_offload: introduce accel operation mode") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202412290846.cncnpGaw-lkp@intel.com/ Link: https://patch.msgid.link/20241229083917.14912-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-12-30 11:40:41 +01:00
Baojun Xu	6a451e2c5c	ALSA: hda/tas2781: Ignore SUBSYS_ID not found for tas2563 projects Driver will return error if no SUBSYS_ID found in BIOS(acpi). It will cause error in tas2563 projects, which have no SUBSYS_ID. Fixes: `4e7035a75d` ("ALSA: hda/tas2781: Add speaker id check for ASUS projects") Signed-off-by: Baojun Xu <baojun.xu@ti.com> Link: https://lore.kernel.org/20241223225442.1358491-1-stuart.a.hayhurst@gmail.com Link: https://patch.msgid.link/20241230064910.1583-1-baojun.xu@ti.com Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-12-30 11:39:39 +01:00
Linus Torvalds	fc033cf25e	Linux 6.13-rc5	2024-12-29 13:15:45 -08:00
Linus Torvalds	4099a71718	Merge tag 'sched-urgent-2024-12-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fix from Ingo Molnar: "Fix a procfs task state reporting regression when freezing sleeping tasks" * tag 'sched-urgent-2024-12-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: freezer, sched: Report frozen tasks as 'D' instead of 'R'	2024-12-29 10:19:54 -08:00
Linus Torvalds	6cbc4b29eb	Merge tag 'x86-urgent-2024-12-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: - Fix a hang in the "kernel IBT no ENDBR" self-test that may trigger on FRED systems, caused by incomplete FRED state cleanup in the #CP fault handler - Improve TDX (Coco VM) guest unrecoverable error handling to not potentially leak decrypted memory * tag 'x86-urgent-2024-12-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: virt: tdx-guest: Just leak decrypted memory on unrecoverable errors x86/fred: Clear WFE in missing-ENDBRANCH #CPs	2024-12-29 10:16:41 -08:00
Linus Torvalds	f65832a32f	Merge tag 'perf-urgent-2024-12-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 perf fixes from Ingo Molnar: - Fix Intel Lunar Lake build-in event definitions - Fall back to (compatible) legacy features on new Intel PEBS format v6 hardware - Enable uncore support on Intel Clearwater Forest CPUs, which is the same as the existing Sierra Forest uncore driver * tag 'perf-urgent-2024-12-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel: Fix bitmask of OCR and FRONTEND events for LNC perf/x86/intel/ds: Add PEBS format 6 perf/x86/intel/uncore: Add Clearwater Forest support	2024-12-29 10:14:08 -08:00
Linus Torvalds	bcfac5530a	Merge tag 'objtool-urgent-2024-12-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool fix from Ingo Molnar: "Fix false positive objtool build warning related to a noreturn function in the bcachefs code" * tag 'objtool-urgent-2024-12-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool: Add bch2_trans_unlocked_error() to bcachefs noreturns	2024-12-29 10:07:40 -08:00
Linus Torvalds	bf7a281b80	Merge tag 'locking-urgent-2024-12-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fix from Ingo Molnar: "Fix missed rtmutex wakeups causing sporadic boot hangs and other misbehavior" * tag 'locking-urgent-2024-12-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/rtmutex: Make sure we wake anything on the wake_q when we release the lock->wait_lock	2024-12-29 10:04:47 -08:00
Linus Torvalds	feffd35a03	Merge tag 'irq-urgent-2024-12-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fix from Ingo Molnar: "Fix bogus MSI IRQ setup warning on RISC-V" * tag 'irq-urgent-2024-12-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: PCI/MSI: Handle lack of irqdomain gracefully	2024-12-29 10:03:01 -08:00
Linus Torvalds	c059361673	Merge tag 'for-6.13-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "A few more fixes that accumulated over the last two weeks, fixing some user reported problems: - swapfile fixes: - conditional reschedule in the activation loop - fix race with memory mapped file when activating - make activation loop interruptible - rework and fix extent sharing checks - folio fixes: - in send, recheck folio mapping after unlock - in relocation, recheck folio mapping after unlock - fix waiting for encoded read io_uring requests - fix transaction atomicity when enabling simple quotas - move COW block trace point before the block gets freed - print various sizes in sysfs with correct endianity" * tag 'for-6.13-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: sysfs: fix direct super block member reads btrfs: fix transaction atomicity bug when enabling simple quotas btrfs: avoid monopolizing a core when activating a swap file btrfs: allow swap activation to be interruptible btrfs: fix swap file activation failure due to extents that used to be shared btrfs: fix race with memory mapped writes when activating swap file btrfs: check folio mapping after unlock in put_file_data() btrfs: check folio mapping after unlock in relocate_one_folio() btrfs: fix use-after-free when COWing tree bock and tracing is enabled btrfs: fix use-after-free waiting for encoded read endios	2024-12-29 09:34:34 -08:00
Linus Torvalds	e1d9326608	Merge tag 'i2c-for-6.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: - IMX: fix stop condition in single master mode and add compatible string for errata adherence - Microchip: Add support for proper repeated sends and fix unnecessary NAKs on empty messages, which caused false bus detection * tag 'i2c-for-6.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: microchip-core: fix "ghost" detections i2c: microchip-core: actually use repeated sends i2c: imx: add imx7d compatible string for applying erratum ERR007805 i2c: imx: fix missing stop condition in single-master mode	2024-12-29 09:31:55 -08:00
Vishnu Sankar	7e16ae558a	platform/x86: thinkpad-acpi: Add support for hotkey 0x1401 F8 mode key on Lenovo 2025 platforms use a different key code. Adding support for the new keycode 0x1401. Tested on X1 Carbon Gen 13 and X1 2-in-1 Gen 10. Signed-off-by: Vishnu Sankar <vishnuocv@gmail.com> Reviewed-by: Mark Pearson <mpearson-lenovo@squebb.ca> Link: https://lore.kernel.org/r/20241227231840.21334-1-vishnuocv@gmail.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>	2024-12-29 16:55:10 +02:00
Mingcong Bai	032fe9b051	platform/x86: hp-wmi: mark 8A15 board for timed OMEN thermal profile The HP OMEN 8 (2022), corresponding to a board ID of 8A15, supports OMEN thermal profile and requires the timed profile quirk. Upon adding this ID to both the omen_thermal_profile_boards and omen_timed_thermal_profile_boards, significant bump in performance can be observed. For instance, SilverBench (https://silver.urih.com/) results improved from ~56,000 to ~69,000, as a result of higher power draws (and thus core frequencies) whilst under load: Package Power: - Before the patch: ~65W (dropping to about 55W under sustained load). - After the patch: ~115W (dropping to about 105W under sustained load). Core Power: - Before: ~60W (ditto above). - After: ~108W (ditto above). Add 8A15 to omen_thermal_profile_boards and omen_timed_thermal_profile_boards to improve performance. Signed-off-by: Xi Xiao <1577912515@qq.com> Signed-off-by: Mingcong Bai <jeffbai@aosc.io> Link: https://lore.kernel.org/r/20241226062207.3352629-1-jeffbai@aosc.io Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>	2024-12-29 16:48:53 +02:00
Li RongQing	27834971f6	virt: tdx-guest: Just leak decrypted memory on unrecoverable errors In CoCo VMs it is possible for the untrusted host to cause set_memory_decrypted() to fail such that an error is returned and the resulting memory is shared. Callers need to take care to handle these errors to avoid returning decrypted (shared) memory to the page allocator, which could lead to functional or security issues. Leak the decrypted memory when set_memory_decrypted() fails, and don't need to print an error since set_memory_decrypted() will call WARN_ONCE(). Fixes: `f4738f56d1` ("virt: tdx-guest: Add Quote generation support using TSM_REPORTS") Signed-off-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20240619111801.25630-1-lirongqing%40baidu.com	2024-12-29 10:18:44 +01:00
Xin Li (Intel)	dc81e556f2	x86/fred: Clear WFE in missing-ENDBRANCH #CPs An indirect branch instruction sets the CPU indirect branch tracker (IBT) into WAIT_FOR_ENDBRANCH (WFE) state and WFE stays asserted across the instruction boundary. When the decoder finds an inappropriate instruction while WFE is set ENDBR, the CPU raises a #CP fault. For the "kernel IBT no ENDBR" selftest where #CPs are deliberately triggered, the WFE state of the interrupted context needs to be cleared to let execution continue. Otherwise when the CPU resumes from the instruction that just caused the previous #CP, another missing-ENDBRANCH #CP is raised and the CPU enters a dead loop. This is not a problem with IDT because it doesn't preserve WFE and IRET doesn't set WFE. But FRED provides space on the entry stack (in an expanded CS area) to save and restore the WFE state, thus the WFE state is no longer clobbered, so software must clear it. Clear WFE to avoid dead looping in ibt_clear_fred_wfe() and the !ibt_fatal code path when execution is allowed to continue. Clobbering WFE in any other circumstance is a security-relevant bug. [ dhansen: changelog rewording ] Fixes: `a5f6c2ace9` ("x86/shstk: Add user control-protection fault handler") Signed-off-by: Xin Li (Intel) <xin@zytor.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20241113175934.3897541-1-xin%40zytor.com	2024-12-29 10:18:10 +01:00
Chen Ridong	f718faf394	freezer, sched: Report frozen tasks as 'D' instead of 'R' Before commit: `f5d39b0208` ("freezer,sched: Rewrite core freezer logic") the frozen task stat was reported as 'D' in cgroup v1. However, after rewriting the core freezer logic, the frozen task stat is reported as 'R'. This is confusing, especially when a task with stat of 'S' is frozen. This bug can be reproduced with these steps: $ cd /sys/fs/cgroup/freezer/ $ mkdir test $ sleep 1000 & [1] 739 // task whose stat is 'S' $ echo 739 > test/cgroup.procs $ echo FROZEN > test/freezer.state $ ps -aux \| grep 739 root 739 0.1 0.0 8376 1812 pts/0 R 10:56 0:00 sleep 1000 As shown above, a task whose stat is 'S' was changed to 'R' when it was frozen. To solve this regression, simply maintain the same reported state as before the rewrite. [ mingo: Enhanced the changelog and comments ] Fixes: `f5d39b0208` ("freezer,sched: Rewrite core freezer logic") Signed-off-by: Chen Ridong <chenridong@huawei.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Michal Koutný <mkoutny@suse.com> Link: https://lore.kernel.org/r/20241217004818.3200515-1-chenridong@huaweicloud.com	2024-12-29 10:14:20 +01:00
chenchangcheng	31ad36a271	objtool: Add bch2_trans_unlocked_error() to bcachefs noreturns Fix the following objtool warning during build time: fs/bcachefs/btree_trans_commit.o: warning: objtool: bch2_trans_commit_write_locked.isra.0() falls through to next function do_bch2_trans_commit.isra.0() fs/bcachefs/btree_trans_commit.o: warning: objtool: .text: unexpected end of section ...... fs/bcachefs/btree_update.o: warning: objtool: bch2_trans_update_get_key_cache() falls through to next function flush_new_cached_update() fs/bcachefs/btree_update.o: warning: objtool: flush_new_cached_update() falls through to next function bch2_trans_update_by_path() bch2_trans_unlocked_error() is an Obviously Correct (tm) panic() wrapper, add it to the list of known noreturns. [ mingo: Improved the changelog ] Fixes: `fd104e2967` ("bcachefs: bch2_trans_verify_not_unlocked()") Signed-off-by: chenchangcheng <chenchangcheng@kylinos.cn> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lkml.kernel.org/r/20241220074847.3418134-1-ccc194101@163.com	2024-12-29 09:52:21 +01:00
Tanya Agarwal	b06a6187ef	ALSA: usb-audio: US16x08: Initialize array before use Initialize meter_urb array before use in mixer_us16x08.c. CID 1410197: (#1 of 1): Uninitialized scalar variable (UNINIT) uninit_use_in_call: Using uninitialized value *meter_urb when calling get_meter_levels_from_urb. Coverity Link: https://scan7.scan.coverity.com/#/project-view/52849/11354?selectedIssue=1410197 Fixes: `d2bb390a20` ("ALSA: usb-audio: Tascam US-16x08 DSP mixer quirk") Signed-off-by: Tanya Agarwal <tanyaagarwal25699@gmail.com> Link: https://patch.msgid.link/20241229060240.1642-1-tanyaagarwal25699@gmail.com Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-12-29 09:27:21 +01:00
Pavel Begunkov	38fc96a58c	io_uring/rw: fix downgraded mshot read The io-wq path can downgrade a multishot request to oneshot mode, however io_read_mshot() doesn't handle that and would still post multiple CQEs. That's not allowed, because io_req_post_cqe() requires stricter context requirements. The described can only happen with pollable files that don't support FMODE_NOWAIT, which is an odd combination, so if even allowed it should be fairly rare. Cc: stable@vger.kernel.org Reported-by: chase xd <sl1589472800@gmail.com> Fixes: `bee1d5becd` ("io_uring: disable io-wq execution of multishot NOWAIT requests") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/c5c8c4a50a882fd581257b81bf52eee260ac29fd.1735407848.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-28 13:13:22 -07:00
Linus Torvalds	059dd502b2	Merge tag 'block-6.13-20241228' of git://git.kernel.dk/linux Pull block fix from Jens Axboe: "Just a single fix for ublk setup error handling" * tag 'block-6.13-20241228' of git://git.kernel.dk/linux: ublk: detach gendisk from ublk device if add_disk() fails	2024-12-28 11:02:35 -08:00
Linus Torvalds	d19a3ee573	Merge tag 'io_uring-6.13-20241228' of git://git.kernel.dk/linux Pull io_uring fix from Jens Axboe: "Just a single fix for a theoretical issue with SQPOLL setup" * tag 'io_uring-6.13-20241228' of git://git.kernel.dk/linux: io_uring/sqpoll: fix sqpoll error handling races	2024-12-28 11:00:29 -08:00
Linus Torvalds	e51da4a232	Merge tag '6.13-rc4-SMB3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6 Pull smb client fixes from Steve French: - fix caching of files that will be reused for write - minor cleanup * tag '6.13-rc4-SMB3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: Remove unused is_server_using_iface() smb: enable reuse of deferred file handles for write operations	2024-12-28 10:58:01 -08:00
Masahiro Yamada	8fe1a63d3d	modpost: work around unaligned data access error With the latest binutils, modpost fails with a bus error on some architectures such as ARM and sparc64. Since binutils commit 1f1b5e506bf0 ("bfd/ELF: restrict file alignment for object files"), the byte offset to each section (sh_offset) in relocatable ELF is no longer guaranteed to be aligned. modpost parses MODULE_DEVICE_TABLE() data structures, which are usually located in the .rodata section. If it is not properly aligned, unaligned access errors may occur. To address the issue, this commit imports the get_unaligned() helper from include/linux/unaligned.h. The get_unaligned_native() helper caters to the endianness in addition to handling the unaligned access. I slightly refactored do_pcmcia_entry() and do_input() to avoid writing back to an unaligned address. (We would need the put_unaligned() helper to do that.) The addend__rel() functions need similar adjustments because the .text sections are not aligned either. It seems that the .symtab, .rel. and .rela.* sections are still aligned. Keep normal pointer access for these sections to avoid unnecessary performance costs. Reported-by: Paulo Pisati <paolo.pisati@canonical.com> Reported-by: Matthias Klose <doko@debian.org> Closes: https://sourceware.org/bugzilla/show_bug.cgi?id=32435 Reported-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Closes: https://sourceware.org/bugzilla/show_bug.cgi?id=32493 Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>	2024-12-28 23:31:09 +09:00
Masahiro Yamada	e1352d7ead	modpost: refactor do_vmbus_entry() Optimize the size of guid_name[], as it only requires 1 additional byte for '\0' instead of 2. Simplify the loop by incrementing the iterator by 1 instead of 2. Remove the unnecessary TO_NATIVE() call, as the guid is represented as a byte stream. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>	2024-12-28 23:31:03 +09:00
Masahiro Yamada	bf36b4bf1b	modpost: fix the missed iteration for the max bit in do_input() This loop should iterate over the range from 'min' to 'max' inclusively. The last interation is missed. Fixes: `1d8f430c15` ("[PATCH] Input: add modalias support") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>	2024-12-28 23:30:56 +09:00
Mostafa Saleh	7a6c355b55	scripts/mksysmap: Fix escape chars '$' Commit `b18b047002` ("kbuild: change scripts/mksysmap into sed script") changed the invocation of the script, to call sed directly without shell. That means, the current extra escape that was added in: commit `ec336aa831` ("scripts/mksysmap: Fix badly escaped '$'") for the shell is not correct any more, at the moment the stack traces for nvhe are corrupted: [ 22.840904] kvm [190]: [<ffff80008116dd54>] __kvm_nvhe_$x.220+0x58/0x9c [ 22.842913] kvm [190]: [<ffff8000811709bc>] __kvm_nvhe_$x.9+0x44/0x50 [ 22.844112] kvm [190]: [<ffff80008116f8fc>] __kvm_nvhe___skip_pauth_save+0x4/0x4 With this patch: [ 25.793513] kvm [192]: nVHE call trace: [ 25.794141] kvm [192]: [<ffff80008116dd54>] __kvm_nvhe_hyp_panic+0xb0/0xf4 [ 25.796590] kvm [192]: [<ffff8000811709bc>] __kvm_nvhe_handle_trap+0xe4/0x188 [ 25.797553] kvm [192]: [<ffff80008116f8fc>] __kvm_nvhe___skip_pauth_save+0x4/0x4 Fixes: `b18b047002` ("kbuild: change scripts/mksysmap into sed script") Signed-off-by: Mostafa Saleh <smostafa@google.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>	2024-12-28 23:23:52 +09:00
Linus Torvalds	fd0584d220	Merge tag 'trace-tools-v6.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing tool fix from Steven Rostedt: - Fix rtla divide by zero when the count is zero in histograms * tag 'trace-tools-v6.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: rtla/timerlat: Fix histogram ALL for zero samples	2024-12-27 15:31:52 -08:00
Wolfram Sang	f802f11b23	Merge tag 'i2c-host-fixes-6.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-current i2c-host-fixes for v6.13-rc5 - IMX: fixed stop condition in single master mode and added compatible string for errata adherence. - Microchip: Added support for proper repeated sends and fixed unnecessary NAKs on empty messages, which caused false bus detection.	2024-12-28 00:25:04 +01:00
Chunguang.xu	36e3b1f9ab	nvme-tcp: remove nvme_tcp_destroy_io_queues() Now when destroying the IO queue we call nvme_tcp_stop_io_queues() twice, nvme_tcp_destroy_io_queues() has an unnecessary call. Here we try to remove nvme_tcp_destroy_io_queues() and merge it into nvme_tcp_teardown_io_queues(), simplify the code and align with nvme-rdma, make it easy to maintaince. Signed-off-by: Chunguang.xu <chunguang.xu@shopee.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>	2024-12-27 13:33:48 -08:00
Nilay Shroff	74d16965d7	nvmet-loop: avoid using mutex in IO hotpath Using mutex lock in IO hot path causes the kernel BUG sleeping while atomic. Shinichiro[1], first encountered this issue while running blktest nvme/052 shown below: BUG: sleeping function called from invalid context at kernel/locking/mutex.c:585 in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 996, name: (udev-worker) preempt_count: 0, expected: 0 RCU nest depth: 1, expected: 0 2 locks held by (udev-worker)/996: #0: ffff8881004570c8 (mapping.invalidate_lock){.+.+}-{3:3}, at: page_cache_ra_unbounded+0x155/0x5c0 #1: ffffffff8607eaa0 (rcu_read_lock){....}-{1:2}, at: blk_mq_flush_plug_list+0xa75/0x1950 CPU: 2 UID: 0 PID: 996 Comm: (udev-worker) Not tainted 6.12.0-rc3+ #339 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x6a/0x90 __might_resched.cold+0x1f7/0x23d ? __pfx___might_resched+0x10/0x10 ? vsnprintf+0xdeb/0x18f0 __mutex_lock+0xf4/0x1220 ? nvmet_subsys_nsid_exists+0xb9/0x150 [nvmet] ? __pfx_vsnprintf+0x10/0x10 ? __pfx___mutex_lock+0x10/0x10 ? snprintf+0xa5/0xe0 ? xas_load+0x1ce/0x3f0 ? nvmet_subsys_nsid_exists+0xb9/0x150 [nvmet] nvmet_subsys_nsid_exists+0xb9/0x150 [nvmet] ? __pfx_nvmet_subsys_nsid_exists+0x10/0x10 [nvmet] nvmet_req_find_ns+0x24e/0x300 [nvmet] nvmet_req_init+0x694/0xd40 [nvmet] ? blk_mq_start_request+0x11c/0x750 ? nvme_setup_cmd+0x369/0x990 [nvme_core] nvme_loop_queue_rq+0x2a7/0x7a0 [nvme_loop] ? __pfx___lock_acquire+0x10/0x10 ? __pfx_nvme_loop_queue_rq+0x10/0x10 [nvme_loop] __blk_mq_issue_directly+0xe2/0x1d0 ? __pfx___blk_mq_issue_directly+0x10/0x10 ? blk_mq_request_issue_directly+0xc2/0x140 blk_mq_plug_issue_direct+0x13f/0x630 ? lock_acquire+0x2d/0xc0 ? blk_mq_flush_plug_list+0xa75/0x1950 blk_mq_flush_plug_list+0xa9d/0x1950 ? __pfx_blk_mq_flush_plug_list+0x10/0x10 ? __pfx_mpage_readahead+0x10/0x10 __blk_flush_plug+0x278/0x4d0 ? __pfx___blk_flush_plug+0x10/0x10 ? lock_release+0x460/0x7a0 blk_finish_plug+0x4e/0x90 read_pages+0x51b/0xbc0 ? __pfx_read_pages+0x10/0x10 ? lock_release+0x460/0x7a0 page_cache_ra_unbounded+0x326/0x5c0 force_page_cache_ra+0x1ea/0x2f0 filemap_get_pages+0x59e/0x17b0 ? __pfx_filemap_get_pages+0x10/0x10 ? lock_is_held_type+0xd5/0x130 ? __pfx___might_resched+0x10/0x10 ? find_held_lock+0x2d/0x110 filemap_read+0x317/0xb70 ? up_write+0x1ba/0x510 ? __pfx_filemap_read+0x10/0x10 ? inode_security+0x54/0xf0 ? selinux_file_permission+0x36d/0x420 blkdev_read_iter+0x143/0x3b0 vfs_read+0x6ac/0xa20 ? __pfx_vfs_read+0x10/0x10 ? __pfx_vm_mmap_pgoff+0x10/0x10 ? __pfx___seccomp_filter+0x10/0x10 ksys_read+0xf7/0x1d0 ? __pfx_ksys_read+0x10/0x10 do_syscall_64+0x93/0x180 ? lockdep_hardirqs_on_prepare+0x16d/0x400 ? do_syscall_64+0x9f/0x180 ? lockdep_hardirqs_on+0x78/0x100 ? do_syscall_64+0x9f/0x180 ? lockdep_hardirqs_on_prepare+0x16d/0x400 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7f565bd1ce11 Code: 00 48 8b 15 09 90 0d 00 f7 d8 64 89 02 b8 ff ff ff ff eb bd e8 d0 ad 01 00 f3 0f 1e fa 80 3d 35 12 0e 00 00 74 13 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 4f c3 66 0f 1f 44 00 00 55 48 89 e5 48 83 ec RSP: 002b:00007ffd6e7a20c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 RAX: ffffffffffffffda RBX: 0000000000001000 RCX: 00007f565bd1ce11 RDX: 0000000000001000 RSI: 00007f565babb000 RDI: 0000000000000014 RBP: 00007ffd6e7a2130 R08: 00000000ffffffff R09: 0000000000000000 R10: 0000556000bfa610 R11: 0000000000000246 R12: 000000003ffff000 R13: 0000556000bfa5b0 R14: 0000000000000e00 R15: 0000556000c07328 </TASK> Apparently, the above issue is caused due to using mutex lock while we're in IO hot path. It's a regression caused with commit `505363957f` ("nvmet: fix nvme status code when namespace is disabled"). The mutex ->su_mutex is used to find whether a disabled nsid exists in the config group or not. This is to differentiate between a nsid that is disabled vs non-existent. To mitigate the above issue, we've worked upon a fix[2] where we now insert nsid in subsys Xarray as soon as it's created under config group and later when that nsid is enabled, we add an Xarray mark on it and set ns->enabled to true. The Xarray mark is useful while we need to loop through all enabled namepsaces under a subsystem using xa_for_each_marked() API. If later a nsid is disabled then we clear Xarray mark from it and also set ns->enabled to false. It's only when nsid is deleted from the config group we delete it from the Xarray. So with this change, now we could easily differentiate a nsid is disabled (i.e. Xarray entry for ns exists but ns->enabled is set to false) vs non- existent (i.e.Xarray entry for ns doesn't exist). Link: https://lore.kernel.org/linux-nvme/20241022070252.GA11389@lst.de/ [2] Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> Closes: https://lore.kernel.org/linux-nvme/tqcy3sveity7p56v7ywp7ssyviwcb3w4623cnxj3knoobfcanq@yxgt2mjkbkam/ [1] Fixes: `505363957f` ("nvmet: fix nvme status code when namespace is disabled") Fix-suggested-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nilay Shroff <nilay@linux.ibm.com> Signed-off-by: Keith Busch <kbusch@kernel.org>	2024-12-27 13:24:00 -08:00
Luis Chamberlain	b579d6fdc3	nvmet: propagate npwg topology Ensure we propagate npwg to the target as well instead of assuming its the same logical blocks per physical block. This ensures devices with large IUs information properly propagated on the target. Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>	2024-12-27 13:18:01 -08:00
Leo Stone	4db3d750ac	nvmet: Don't overflow subsysnqn nvmet_root_discovery_nqn_store treats the subsysnqn string like a fixed size buffer, even though it is dynamically allocated to the size of the string. Create a new string with kstrndup instead of using the old buffer. Reported-by: syzbot+ff4aab278fa7e27e0f9e@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=ff4aab278fa7e27e0f9e Fixes: `95409e277d` ("nvmet: implement unique discovery NQN") Signed-off-by: Leo Stone <leocstone@gmail.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>	2024-12-27 13:14:31 -08:00
Antonio Pastor	a024e377ef	net: llc: reset skb->transport_header 802.2+LLC+SNAP frames received by napi_complete_done with GRO and DSA have skb->transport_header set two bytes short, or pointing 2 bytes before network_header & skb->data. As snap_rcv expects transport_header to point to SNAP header (OID:PID) after LLC processing advances offset over LLC header (llc_rcv & llc_fixup_skb), code doesn't find a match and packet is dropped. Between napi_complete_done and snap_rcv, transport_header is not used until __netif_receive_skb_core, where originally it was being reset. Commit `fda55eca5a` ("net: introduce skb_transport_header_was_set()") only does so if not set, on the assumption the value was set correctly by GRO (and also on assumption that "network stacks usually reset the transport header anyway"). Afterwards it is moved forward by llc_fixup_skb. Locally generated traffic shows up at __netif_receive_skb_core with no transport_header set and is processed without issue. On a setup with GRO but no DSA, transport_header and network_header are both set to point to skb->data which is also correct. As issue is LLC specific, to avoid impacting non-LLC traffic, and to follow up on original assumption made on previous code change, llc_fixup_skb to reset the offset after skb pull. llc_fixup_skb assumes the LLC header is at skb->data, and by definition SNAP header immediately follows. Fixes: `fda55eca5a` ("net: introduce skb_transport_header_was_set()") Signed-off-by: Antonio Pastor <antonio.pastor@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241225010723.2830290-1-antonio.pastor@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-27 11:23:37 -08:00
Jakub Kicinski	daefdbb5cc	Merge tag 'nf-24-12-25' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following batch contains one Netfilter fix for net: 1) Fix unaligned atomic read on struct nft_set_ext in nft_set_hash backend that causes an alignment failure splat on aarch64. This is related to a recent fix and it has been reported via the regressions mailing list. * tag 'nf-24-12-25' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nft_set_hash: unaligned atomic read on struct nft_set_ext ==================== Link: https://patch.msgid.link/20241224233109.361755-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-27 11:21:32 -08:00
Jakub Kicinski	1fa9d91e9f	Merge branch 'netlink-specs-mptcp-fixes-for-some-descriptions' Matthieu Baerts says: ==================== netlink: specs: mptcp: fixes for some descriptions When looking at the MPTCP PM Netlink specs rendered version [1], a few small issues have been found with the descriptions, and fixed here: - Patch 1: add a missing attribute for two events. For >= v5.19. - Patch 2: clearly mention the attributes. For >= v6.7. - Patch 3: fix missing descriptions and replace a wrong one. For >= v6.7. Link: https://docs.kernel.org/networking/netlink_spec/mptcp_pm.html [1] v1: https://lore.kernel.org/20241219-net-mptcp-netlink-specs-pm-doc-fixes-v1-0-825d3b45f27b@kernel.org ==================== Link: https://patch.msgid.link/20241221-net-mptcp-netlink-specs-pm-doc-fixes-v2-0-e54f2db3f844@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-27 11:16:26 -08:00
Matthieu Baerts (NGI0)	4f363fe9f6	netlink: specs: mptcp: fix missing doc Two operations didn't have a small description. It looks like something that has been missed in the original commit introducing this file. Replace the two "todo" by a small and simple description: Create/Destroy subflow. While at it, also uniform the capital letters, avoid double spaces, and fix the "announce" event description: a new "address" has been announced, not a new "subflow". Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20241221-net-mptcp-netlink-specs-pm-doc-fixes-v2-3-e54f2db3f844@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-27 11:16:21 -08:00
Matthieu Baerts (NGI0)	bea87657b5	netlink: specs: mptcp: clearly mention attributes The rendered version of the MPTCP events [1] looked strange, because the whole content of the 'doc' was displayed in the same block. It was then not clear that the first words, not even ended by a period, were the attributes that are defined when such events are emitted. These attributes have now been moved to the end, prefixed by 'Attributes:' and ended with a period. Note that '>-' has been added after 'doc:' to allow ':' in the text below. The documentation in the UAPI header has been auto-generated by: ./tools/net/ynl/ynl-regen.sh Link: https://docs.kernel.org/networking/netlink_spec/mptcp_pm.html#event-type [1] Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20241221-net-mptcp-netlink-specs-pm-doc-fixes-v2-2-e54f2db3f844@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-27 11:16:21 -08:00
Matthieu Baerts (NGI0)	6b830c6a02	netlink: specs: mptcp: add missing 'server-side' attr This attribute is added with the 'created' and 'established' events, but the documentation didn't mention it. The documentation in the UAPI header has been auto-generated by: ./tools/net/ynl/ynl-regen.sh Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20241221-net-mptcp-netlink-specs-pm-doc-fixes-v2-1-e54f2db3f844@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-27 11:16:21 -08:00
Linus Torvalds	8379578b11	Merge tag 'for-v6.13-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply Pull power supply fixes from Sebastian Reichel: - fix potential array out of bounds access in gpio-charger - cros_charge-control: - fix concurrent sysfs access - allow start_threshold == end_threshold - workaround limited v2 charge threshold API - bq24296: fix vbus regulator handling * tag 'for-v6.13-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply: power: supply: bq24190: Fix BQ24296 Vbus regulator support power: supply: cros_charge-control: hide start threshold on v2 cmd power: supply: cros_charge-control: allow start_threshold == end_threshold power: supply: cros_charge-control: add mutex for driver data power: supply: gpio-charger: Fix set charge current limits	2024-12-27 11:10:56 -08:00
Linus Torvalds	eff4f67583	Merge tag 'powerpc-6.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fix from Madhavan Srinivasan: - Add close() callback in vas_vm_ops struct for proper cleanup Thanks to Haren Myneni. * tag 'powerpc-6.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/pseries/vas: Add close() callback in vas_vm_ops struct	2024-12-27 11:06:29 -08:00
Linus Torvalds	411a678d30	Merge tag 'probes-fixes-v6.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull probes fix from Masami Hiramatsu: "Change the priority of the module callback of kprobe events so that it is called after the jump label list on the module is updated. This ensures the kprobe can check whether it is not on the jump label address correctly" * tag 'probes-fixes-v6.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing/kprobe: Make trace_kprobe's module callback called after jump_label update	2024-12-27 11:03:15 -08:00
Linus Torvalds	f0bc704f46	Merge tag 'hardening-v6.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening fix from Kees Cook: - stddef: make __struct_group() UAPI C++-friendly (Alexander Lobakin) * tag 'hardening-v6.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: stddef: make __struct_group() UAPI C++-friendly	2024-12-27 10:39:05 -08:00
Linus Torvalds	2c2b3d906c	Merge tag 'trace-v6.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: "Two minor tracing fixes: - Add "const" to "char " in event structure field that gets assigned literals. - Check size of input passed into the tracing cpumask file. If a too large of an input gets passed into the cpumask file, it could trigger a warning in the bitmask parsing code" tag 'trace-v6.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Prevent bad count for tracing_cpumask_write tracing: Constify string literal data member in struct trace_event_call	2024-12-27 10:33:21 -08:00
Tomas Glozar	6cc45f8c1f	rtla/timerlat: Fix histogram ALL for zero samples rtla timerlat hist currently computers the minimum, maximum and average latency even in cases when there are zero samples. This leads to nonsensical values being calculated for maximum and minimum, and to divide by zero for average. A similar bug is fixed by `01b05fc0e5` ("rtla/timerlat: Fix histogram report when a cpu count is 0") but the bug still remains for printing the sum over all CPUs in timerlat_print_stats_all. The issue can be reproduced with this command: $ rtla timerlat hist -U -d 1s Index over: count: min: avg: max: Floating point exception (core dumped) (There are always no samples with -U unless the user workload is created.) Fix the bug by omitting max/min/avg when sample count is zero, displaying a dash instead, just like we already do for the individual CPUs. The logic is moved into a new function called format_summary_value, which is used for both the individual CPUs and for the overall summary. Cc: stable@vger.kernel.org Link: https://lore.kernel.org/20241127134130.51171-1-tglozar@redhat.com Fixes: `1462501c7a` ("rtla/timerlat: Add a summary for hist mode") Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2024-12-27 11:21:46 -05:00
Linus Torvalds	d6ef8b40d0	Merge tag 'sound-6.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "A collection of small fixes. Nothing really stands out, fortunately. - Follow-up fixes for the new compress offload API extension - A few ASoC SOF, AMD and Mediatek quirks and fixes - A regression fix in legacy SH driver cleanup - Fix DMA mapping error handling in the helper code - Fix kselftest dependency" * tag 'sound-6.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: sh: Fix wrong argument order for copy_from_iter() selftests/alsa: Fix circular dependency involving global-timer ALSA: memalloc: prefer dma_mapping_error() over explicit address checking ALSA: compress_offload: improve file descriptors installation for dma-buf ALSA: compress_offload: use safe list iteration in snd_compr_task_seq() ALSA: compress_offload: avoid 64-bit get_user() ALSA: compress_offload: import DMA_BUF namespace ASoC: mediatek: disable buffer pre-allocation ASoC: rt722: add delay time to wait for the calibration procedure ASoC: SOF: Intel: hda-dai: Do not release the link DMA on STOP ASoC: dt-bindings: realtek,rt5645: Fix CPVDD voltage comment ASoC: Intel: sof_sdw: Fix DMI match for Lenovo 21QA and 21QB ASoC: Intel: sof_sdw: Fix DMI match for Lenovo 21Q6 and 21Q7 ASoC: amd: ps: Fix for enabling DMIC on acp63 platform via _DSD entry	2024-12-26 10:49:02 -08:00
Linus Torvalds	23db0ed34f	Merge tag 'dmaengine-fix-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine Pull dmaengine fixes from Vinod Koul: "Bunch of minor driver fixes for drivers in this cycle: - Kernel doc warning documentation fixes - apple driver fix for register access - amd driver dropping private dma_ops - freescale cleanup path fix - refcount fix for mv_xor driver - null pointer deref fix for at_xdmac driver - GENMASK to GENMASK_ULL fix for loongson2 apb driver - Tegra driver fix for correcting dma status" * tag 'dmaengine-fix-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: dmaengine: tegra: Return correct DMA status when paused dmaengine: mv_xor: fix child node refcount handling in early exit dmaengine: fsl-edma: implement the cleanup path of fsl_edma3_attach_pd() dmaengine: amd: qdma: Remove using the private get and set dma_ops APIs dmaengine: apple-admac: Avoid accessing registers in probe linux/dmaengine.h: fix a few kernel-doc warnings dmaengine: loongson2-apb: Change GENMASK to GENMASK_ULL dmaengine: dw: Select only supported masters for ACPI devices dmaengine: at_xdmac: avoid null_prt_deref in at_xdmac_prep_dma_memset	2024-12-26 10:43:25 -08:00
Linus Torvalds	6fcb22ef50	Merge tag 'phy-fixes-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy Pull phy fixes from Vinod Koul: "A few core API fixes for devm calls and bunch of driver fixes as usual: - devm_phy_xxx fixes for few APIs in the phy core - qmp driver register name config - init sequence fix for usb driver - rockchip driver setting drvdata correctly in samsung hdptx and reset fix for naneng combophy - regulator dependency fix for mediatek hdmi driver - overflow assertion fix for stm32 driver" * tag 'phy-fixes-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: phy: mediatek: phy-mtk-hdmi: add regulator dependency phy: freescale: fsl-samsung-hdmi: Fix 64-by-32 division cocci warnings phy: core: Fix an OF node refcount leakage in of_phy_provider_lookup() phy: core: Fix an OF node refcount leakage in _of_phy_get() phy: core: Fix that API devm_phy_destroy() fails to destroy the phy phy: core: Fix that API devm_of_phy_provider_unregister() fails to unregister the phy provider phy: core: Fix that API devm_phy_put() fails to release the phy phy: rockchip: samsung-hdptx: Set drvdata before enabling runtime PM phy: stm32: work around constant-value overflow assertion phy: qcom-qmp: Fix register name in RX Lane config of SC8280XP phy: rockchip: naneng-combphy: fix phy reset phy: usb: Toggle the PHY power during init	2024-12-26 10:39:57 -08:00
Linus Torvalds	ab8beb2047	Merge tag 'chrome-platform-for-6.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux Pull chrome platform fix from Tzung-Bi Shih: - Fix wrong product names for early Framework Laptops * tag 'chrome-platform-for-6.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux: platform/chrome: cros_ec_lpc: fix product identity for early Framework Laptops	2024-12-26 10:35:13 -08:00
Pavel Begunkov	e33ac68e5e	io_uring/sqpoll: fix sqpoll error handling races BUG: KASAN: slab-use-after-free in __lock_acquire+0x370b/0x4a10 kernel/locking/lockdep.c:5089 Call Trace: <TASK> ... _raw_spin_lock_irqsave+0x3d/0x60 kernel/locking/spinlock.c:162 class_raw_spinlock_irqsave_constructor include/linux/spinlock.h:551 [inline] try_to_wake_up+0xb5/0x23c0 kernel/sched/core.c:4205 io_sq_thread_park+0xac/0xe0 io_uring/sqpoll.c:55 io_sq_thread_finish+0x6b/0x310 io_uring/sqpoll.c:96 io_sq_offload_create+0x162/0x11d0 io_uring/sqpoll.c:497 io_uring_create io_uring/io_uring.c:3724 [inline] io_uring_setup+0x1728/0x3230 io_uring/io_uring.c:3806 ... Kun Hu reports that the SQPOLL creating error path has UAF, which happens if io_uring_alloc_task_context() fails and then io_sq_thread() manages to run and complete before the rest of error handling code, which means io_sq_thread_finish() is looking at already killed task. Note that this is mostly theoretical, requiring fault injection on the allocation side to trigger in practice. Cc: stable@vger.kernel.org Reported-by: Kun Hu <huk23@m.fudan.edu.cn> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/0f2f1aa5729332612bd01fe0f2f385fd1f06ce7c.1735231717.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-26 10:02:40 -07:00
Ming Lei	75cd4005da	ublk: detach gendisk from ublk device if add_disk() fails Inside ublk_abort_requests(), gendisk is grabbed for aborting all inflight requests. And ublk_abort_requests() is called when exiting the uring context or handling timeout. If add_disk() fails, the gendisk may have been freed when calling ublk_abort_requests(), so use-after-free can be caused when getting disk's reference in ublk_abort_requests(). Fixes the bug by detaching gendisk from ublk device if add_disk() fails. Fixes: `bd23f6c2c2` ("ublk: quiesce request queue when aborting queue") Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20241225110640.351531-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-26 06:42:55 -07:00
Jie Gan	8a6442ec34	arm64: dts: qcom: sa8775p: fix the secure device bootup issue The secure device(fused) cannot bootup with TPDM_DCC device. So disable it in DT. Fixes: `6596118ccd` ("arm64: dts: qcom: Add coresight nodes for SA8775p") Signed-off-by: Jie Gan <quic_jiegan@quicinc.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Link: https://lore.kernel.org/r/20241219025216.3463527-1-quic_jiegan@quicinc.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>	2024-12-25 22:05:40 -06:00
Conor Dooley	49e1f0fd0d	i2c: microchip-core: fix "ghost" detections Running i2c-detect currently produces an output akin to: 0 1 2 3 4 5 6 7 8 9 a b c d e f 00: 08 -- 0a -- 0c -- 0e -- 10: 10 -- 12 -- 14 -- 16 -- UU 19 -- 1b -- 1d -- 1f 20: -- 21 -- 23 -- 25 -- 27 -- 29 -- 2b -- 2d -- 2f 30: -- -- -- -- -- -- -- -- 38 -- 3a -- 3c -- 3e -- 40: 40 -- 42 -- 44 -- 46 -- 48 -- 4a -- 4c -- 4e -- 50: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 60: 60 -- 62 -- 64 -- 66 -- 68 -- 6a -- 6c -- 6e -- 70: 70 -- 72 -- 74 -- 76 -- This happens because for an i2c_msg with a len of 0 the driver will mark the transmission of the message as a success once the START has been sent, without waiting for the devices on the bus to respond with an ACK/NAK. Since i2cdetect seems to run in a tight loop over all addresses the NAK is treated as part of the next test for the next address. Delete the fast path that marks a message as complete when idev->msg_len is zero after sending a START/RESTART since this isn't a valid scenario. CC: stable@vger.kernel.org Fixes: `64a6f1c498` ("i2c: add support for microchip fpga i2c controllers") Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Andi Shyti <andi.shyti@kernel.org> Link: https://lore.kernel.org/r/20241218-outbid-encounter-b2e78b1cc707@spud Signed-off-by: Andi Shyti <andi.shyti@kernel.org>	2024-12-26 01:54:47 +01:00
Conor Dooley	9a8f9320d6	i2c: microchip-core: actually use repeated sends At present, where repeated sends are intended to be used, the i2c-microchip-core driver sends a stop followed by a start. Lots of i2c devices must not malfunction in the face of this behaviour, because the driver has operated like this for years! Try to keep track of whether or not a repeated send is required, and suppress sending a stop in these cases. CC: stable@vger.kernel.org Fixes: `64a6f1c498` ("i2c: add support for microchip fpga i2c controllers") Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Andi Shyti <andi.shyti@kernel.org> Link: https://lore.kernel.org/r/20241218-football-composure-e56df2461461@spud Signed-off-by: Andi Shyti <andi.shyti@kernel.org>	2024-12-26 01:54:47 +01:00
Carlos Song	e0cec36319	i2c: imx: add imx7d compatible string for applying erratum ERR007805 Compatible string "fsl,imx7d-i2c" is not exited at i2c-imx driver compatible string table, at the result, "fsl,imx21-i2c" will be matched, but it will cause erratum ERR007805 not be applied in fact. So Add "fsl,imx7d-i2c" compatible string in i2c-imx driver to apply the erratum ERR007805(https://www.nxp.com/docs/en/errata/IMX7DS_3N09P.pdf). " ERR007805 I2C: When the I2C clock speed is configured for 400 kHz, the SCL low period violates the I2C spec of 1.3 uS min Description: When the I2C module is programmed to operate at the maximum clock speed of 400 kHz (as defined by the I2C spec), the SCL clock low period violates the I2C spec of 1.3 uS min. The user must reduce the clock speed to obtain the SCL low time to meet the 1.3us I2C minimum required. This behavior means the SoC is not compliant to the I2C spec at 400kHz. Workaround: To meet the clock low period requirement in fast speed mode, SCL must be configured to 384KHz or less. " "fsl,imx7d-i2c" already is documented in binding doc. This erratum fix has been included in imx6_i2c_hwdata and it is the same in all I.MX6/7/8, so just reuse it. Fixes: `39c025721d` ("i2c: imx: Implement errata ERR007805 or e7805 bus frequency limit") Cc: stable@vger.kernel.org # v5.18+ Signed-off-by: Carlos Song <carlos.song@nxp.com> Signed-off-by: Haibo Chen <haibo.chen@nxp.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Fixes: `39c025721d` ("i2c: imx: Implement errata ERR007805 or e7805 bus frequency limit") Acked-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://lore.kernel.org/r/20241218044238.143414-1-carlos.song@nxp.com Signed-off-by: Andi Shyti <andi.shyti@kernel.org>	2024-12-25 23:45:05 +01:00
Stefan Eichenberger	768776dd4e	i2c: imx: fix missing stop condition in single-master mode A regression was introduced with the implementation of single-master mode, preventing proper stop conditions from being generated. Devices that require a valid stop condition, such as EEPROMs, fail to function correctly as a result. The issue only affects devices with the single-master property enabled. This commit resolves the issue by re-enabling I2C bus busy bit (IBB) polling for single-master mode when generating a stop condition. The fix further ensures that the i2c_imx->stopped flag is cleared at the start of each transfer, allowing the stop condition to be correctly generated in i2c_imx_stop(). According to the reference manual (IMX8MMRM, Rev. 2, 09/2019, page 5270), polling the IBB bit to determine if the bus is free is only necessary in multi-master mode. Consequently, the IBB bit is not polled for the start condition in single-master mode. Fixes: `6692694aca` ("i2c: imx: do not poll for bus busy in single master mode") Signed-off-by: Stefan Eichenberger <stefan.eichenberger@toradex.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Reviewed-by: Francesco Dolcini <francesco.dolcini@toradex.com> Link: https://lore.kernel.org/r/20241216151829.74056-1-eichest@gmail.com Signed-off-by: Andi Shyti <andi.shyti@kernel.org>	2024-12-25 23:45:04 +01:00
Arnd Bergmann	924d66011f	drm/mediatek: stop selecting foreign drivers The PHY portion of the mediatek hdmi driver was originally part of the driver it self and later split out into drivers/phy, which a 'select' to keep the prior behavior. However, this leads to build failures when the PHY driver cannot be built: WARNING: unmet direct dependencies detected for PHY_MTK_HDMI Depends on [n]: (ARCH_MEDIATEK \|\| COMPILE_TEST [=y]) && COMMON_CLK [=y] && OF [=y] && REGULATOR [=n] Selected by [m]: - DRM_MEDIATEK_HDMI [=m] && HAS_IOMEM [=y] && DRM [=m] && DRM_MEDIATEK [=m] ERROR: modpost: "devm_regulator_register" [drivers/phy/mediatek/phy-mtk-hdmi-drv.ko] undefined! ERROR: modpost: "rdev_get_drvdata" [drivers/phy/mediatek/phy-mtk-hdmi-drv.ko] undefined! The best option here is to just not select the phy driver and leave that up to the defconfig. Do the same for the other PHY and memory drivers selected here as well for consistency. Fixes: `a481bf2f0c` ("drm/mediatek: Separate mtk_hdmi_phy to an independent module") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: CK Hu <ck.hu@mediatek.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20241218085837.2670434-1-arnd@kernel.org/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>	2024-12-25 15:29:57 +00:00
Jason-JH.Lin	5c9d7e79ba	drm/mediatek: Add support for 180-degree rotation in the display driver mediatek-drm driver reported the capability of 180-degree rotation by adding `DRM_MODE_ROTATE_180` to the plane property, as flip-x combined with flip-y equals a 180-degree rotation. However, we did not handle the rotation property in the driver and lead to rotation issues. Fixes: `74608d8fee` ("drm/mediatek: Add DRM_MODE_ROTATE_0 to rotation property") Signed-off-by: Jason-JH.Lin <jason-jh.lin@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: CK Hu <ck.hu@mediatek.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20241118025126.30808-1-jason-jh.lin@mediatek.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>	2024-12-25 15:27:32 +00:00
Daniel Golle	f8d9b91739	drm/mediatek: Only touch DISP_REG_OVL_PITCH_MSB if AFBC is supported Touching DISP_REG_OVL_PITCH_MSB leads to video overlay on MT2701, MT7623N and probably other older SoCs being broken. Move setting up AFBC layer configuration into a separate function only being called on hardware which actually supports AFBC which restores the behavior as it was before commit `c410fa9b07` ("drm/mediatek: Add AFBC support to Mediatek DRM driver") on non-AFBC hardware. Fixes: `c410fa9b07` ("drm/mediatek: Add AFBC support to Mediatek DRM driver") Cc: stable@vger.kernel.org Signed-off-by: Daniel Golle <daniel@makrotopia.org> Reviewed-by: CK Hu <ck.hu@mediatek.com> Link: `c7fbd3c3e6`.1734397800.git.daniel@makrotopia.org/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>	2024-12-25 15:19:27 +00:00
Jason-JH.Lin	da03801ad0	drm/mediatek: Move mtk_crtc_finish_page_flip() to ddp_cmdq_cb() mtk_crtc_finish_page_flip() is used to notify userspace that a page flip has been completed, allowing userspace to free the frame buffer of the last frame and commit the next frame. In MediaTek's hardware design for configuring display hardware by using GCE, `DRM_EVENT_FLIP_COMPLETE` should be notified to userspace after GCE has finished configuring all display hardware settings for each atomic_commit(). Currently, mtk_crtc_finish_page_flip() cannot guarantee that GCE has configured all the display hardware settings of the last frame. Therefore, to increase the accuracy of the timing for notifying `DRM_EVENT_FLIP_COMPLETE` to userspace, mtk_crtc_finish_page_flip() should be moved to ddp_cmdq_cb(). Fixes: `7f82d9c438` ("drm/mediatek: Clear pending flag when cmdq packet is done") Signed-off-by: Jason-JH.Lin <jason-jh.lin@mediatek.com> Reviewed-by: CK Hu <ck.hu@mediatek.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20241211034716.29241-1-jason-jh.lin@mediatek.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>	2024-12-25 15:18:55 +00:00
Guoqing Jiang	36684e9d88	drm/mediatek: Set private->all_drm_private[i]->drm to NULL if mtk_drm_bind returns err The pointer need to be set to NULL, otherwise KASAN complains about use-after-free. Because in mtk_drm_bind, all private's drm are set as follows. private->all_drm_private[i]->drm = drm; And drm will be released by drm_dev_put in case mtk_drm_kms_init returns failure. However, the shutdown path still accesses the previous allocated memory in drm_atomic_helper_shutdown. [ 84.874820] watchdog: watchdog0: watchdog did not stop! [ 86.512054] ================================================================== [ 86.513162] BUG: KASAN: use-after-free in drm_atomic_helper_shutdown+0x33c/0x378 [ 86.514258] Read of size 8 at addr ffff0000d46fc068 by task shutdown/1 [ 86.515213] [ 86.515455] CPU: 1 UID: 0 PID: 1 Comm: shutdown Not tainted 6.13.0-rc1-mtk+gfa1a78e5d24b-dirty #55 [ 86.516752] Hardware name: Unknown Product/Unknown Product, BIOS 2022.10 10/01/2022 [ 86.517960] Call trace: [ 86.518333] show_stack+0x20/0x38 (C) [ 86.518891] dump_stack_lvl+0x90/0xd0 [ 86.519443] print_report+0xf8/0x5b0 [ 86.519985] kasan_report+0xb4/0x100 [ 86.520526] __asan_report_load8_noabort+0x20/0x30 [ 86.521240] drm_atomic_helper_shutdown+0x33c/0x378 [ 86.521966] mtk_drm_shutdown+0x54/0x80 [ 86.522546] platform_shutdown+0x64/0x90 [ 86.523137] device_shutdown+0x260/0x5b8 [ 86.523728] kernel_restart+0x78/0xf0 [ 86.524282] __do_sys_reboot+0x258/0x2f0 [ 86.524871] __arm64_sys_reboot+0x90/0xd8 [ 86.525473] invoke_syscall+0x74/0x268 [ 86.526041] el0_svc_common.constprop.0+0xb0/0x240 [ 86.526751] do_el0_svc+0x4c/0x70 [ 86.527251] el0_svc+0x4c/0xc0 [ 86.527719] el0t_64_sync_handler+0x144/0x168 [ 86.528367] el0t_64_sync+0x198/0x1a0 [ 86.528920] [ 86.529157] The buggy address belongs to the physical page: [ 86.529972] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0xffff0000d46fd4d0 pfn:0x1146fc [ 86.531319] flags: 0xbfffc0000000000(node=0\|zone=2\|lastcpupid=0xffff) [ 86.532267] raw: 0bfffc0000000000 0000000000000000 dead000000000122 0000000000000000 [ 86.533390] raw: ffff0000d46fd4d0 0000000000000000 00000000ffffffff 0000000000000000 [ 86.534511] page dumped because: kasan: bad access detected [ 86.535323] [ 86.535559] Memory state around the buggy address: [ 86.536265] ffff0000d46fbf00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 86.537314] ffff0000d46fbf80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 86.538363] >ffff0000d46fc000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 86.544733] ^ [ 86.551057] ffff0000d46fc080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 86.557510] ffff0000d46fc100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 86.563928] ================================================================== [ 86.571093] Disabling lock debugging due to kernel taint [ 86.577642] Unable to handle kernel paging request at virtual address e0e9c0920000000b [ 86.581834] KASAN: maybe wild-memory-access in range [0x0752049000000058-0x075204900000005f] ... Fixes: `1ef7ed4835` ("drm/mediatek: Modify mediatek-drm for mt8195 multi mmsys support") Signed-off-by: Guoqing Jiang <guoqing.jiang@canonical.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20241223023227.1258112-1-guoqing.jiang@canonical.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>	2024-12-25 15:18:55 +00:00
Dustin L. Howett	dcd59d0d7d	platform/chrome: cros_ec_lpc: fix product identity for early Framework Laptops The product names for the Framework Laptop (12th and 13th Generation Intel Core) are incorrect as of `62be134abf`. Fixes: `62be134abf` ("platform/chrome: cros_ec_lpc: switch primary DMI data for Framework Laptop") Cc: stable@vger.kernel.org # 6.12.x Signed-off-by: Dustin L. Howett <dustin@howett.net> Reviewed-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20241224-platform-chrome-cros_ec_lpc-fix-product-identity-for-early-framework-laptops-v1-1-0d31d6e1d22c@howett.net Signed-off-by: Tzung-Bi Shih <tzungbi@kernel.org>	2024-12-25 01:47:35 +00:00
Tejun Heo	ce2b93fc1d	sched_ext: Fix dsq_local_on selftest The dsp_local_on selftest expects the scheduler to fail by trying to schedule an e.g. CPU-affine task to the wrong CPU. However, this isn't guaranteed to happen in the 1 second window that the test is running. Besides, it's odd to have this particular exception path tested when there are no other tests that verify that the interface is working at all - e.g. the test would pass if dsp_local_on interface is completely broken and fails on any attempt. Flip the test so that it verifies that the feature works. While at it, fix a typo in the info message. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Ihor Solodrai <ihor.solodrai@pm.me> Link: http://lkml.kernel.org/r/Z1n9v7Z6iNJ-wKmq@slm.duckdns.org Signed-off-by: Tejun Heo <tj@kernel.org>	2024-12-24 14:09:50 -10:00
Pablo Neira Ayuso	542ed8145e	netfilter: nft_set_hash: unaligned atomic read on struct nft_set_ext Access to genmask field in struct nft_set_ext results in unaligned atomic read: [ 72.130109] Unable to handle kernel paging request at virtual address ffff0000c2bb708c [ 72.131036] Mem abort info: [ 72.131213] ESR = 0x0000000096000021 [ 72.131446] EC = 0x25: DABT (current EL), IL = 32 bits [ 72.132209] SET = 0, FnV = 0 [ 72.133216] EA = 0, S1PTW = 0 [ 72.134080] FSC = 0x21: alignment fault [ 72.135593] Data abort info: [ 72.137194] ISV = 0, ISS = 0x00000021, ISS2 = 0x00000000 [ 72.142351] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [ 72.145989] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 72.150115] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000237d27000 [ 72.154893] [ffff0000c2bb708c] pgd=0000000000000000, p4d=180000023ffff403, pud=180000023f84b403, pmd=180000023f835403, +pte=0068000102bb7707 [ 72.163021] Internal error: Oops: 0000000096000021 [#1] SMP [...] [ 72.170041] CPU: 7 UID: 0 PID: 54 Comm: kworker/7:0 Tainted: G E 6.13.0-rc3+ #2 [ 72.170509] Tainted: [E]=UNSIGNED_MODULE [ 72.170720] Hardware name: QEMU QEMU Virtual Machine, BIOS edk2-stable202302-for-qemu 03/01/2023 [ 72.171192] Workqueue: events_power_efficient nft_rhash_gc [nf_tables] [ 72.171552] pstate: 21400005 (nzCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--) [ 72.171915] pc : nft_rhash_gc+0x200/0x2d8 [nf_tables] [ 72.172166] lr : nft_rhash_gc+0x128/0x2d8 [nf_tables] [ 72.172546] sp : ffff800081f2bce0 [ 72.172724] x29: ffff800081f2bd40 x28: ffff0000c2bb708c x27: 0000000000000038 [ 72.173078] x26: ffff0000c6780ef0 x25: ffff0000c643df00 x24: ffff0000c6778f78 [ 72.173431] x23: 000000000000001a x22: ffff0000c4b1f000 x21: ffff0000c6780f78 [ 72.173782] x20: ffff0000c2bb70dc x19: ffff0000c2bb7080 x18: 0000000000000000 [ 72.174135] x17: ffff0000c0a4e1c0 x16: 0000000000003000 x15: 0000ac26d173b978 [ 72.174485] x14: ffffffffffffffff x13: 0000000000000030 x12: ffff0000c6780ef0 [ 72.174841] x11: 0000000000000000 x10: ffff800081f2bcf8 x9 : ffff0000c3000000 [ 72.175193] x8 : 00000000000004be x7 : 0000000000000000 x6 : 0000000000000000 [ 72.175544] x5 : 0000000000000040 x4 : ffff0000c3000010 x3 : 0000000000000000 [ 72.175871] x2 : 0000000000003a98 x1 : ffff0000c2bb708c x0 : 0000000000000004 [ 72.176207] Call trace: [ 72.176316] nft_rhash_gc+0x200/0x2d8 [nf_tables] (P) [ 72.176653] process_one_work+0x178/0x3d0 [ 72.176831] worker_thread+0x200/0x3f0 [ 72.176995] kthread+0xe8/0xf8 [ 72.177130] ret_from_fork+0x10/0x20 [ 72.177289] Code: 54fff984 d503201f d2800080 91003261 (f820303f) [ 72.177557] ---[ end trace 0000000000000000 ]--- Align struct nft_set_ext to word size to address this and documentation it. pahole reports that this increases the size of elements for rhash and pipapo in 8 bytes on x86_64. Fixes: `7ffc748115` ("netfilter: nft_set_hash: skip duplicated elements pending gc run") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2024-12-25 00:27:49 +01:00
Henry Huang	35bf430e08	sched_ext: initialize kit->cursor.flags struct bpf_iter_scx_dsq *it maybe not initialized. If we didn't call scx_bpf_dsq_move_set_vtime and scx_bpf_dsq_move_set_slice before scx_bpf_dsq_move, it would cause unexpected behaviors: 1. Assign a huge slice into p->scx.slice 2. Assign a invalid vtime into p->scx.dsq_vtime Signed-off-by: Henry Huang <henry.hj@antgroup.com> Fixes: `6462dd53a2` ("sched_ext: Compact struct bpf_iter_scx_dsq_kern") Cc: stable@vger.kernel.org # v6.12 Signed-off-by: Tejun Heo <tj@kernel.org>	2024-12-24 10:56:08 -10:00
Su Hui	d57212f281	workqueue: add printf attribute to __alloc_workqueue() Fix a compiler warning with W=1: kernel/workqueue.c: error: function ‘__alloc_workqueue’ might be a candidate for ‘gnu_printf’ format attribute[-Werror=suggest-attribute=format] 5657 \| name_len = vsnprintf(wq->name, sizeof(wq->name), fmt, args); \| ^~~~~~~~ Fixes: `9b59a85a84` ("workqueue: Don't call va_start / va_end twice") Signed-off-by: Su Hui <suhui@nfschina.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2024-12-24 09:50:38 -10:00
Linus Torvalds	9b2ffa6148	Merge tag 'mtd/fixes-for-6.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux Pull mtd fixes from Miquel Raynal: "Four minor fixes for NAND controller drivers (cleanup path, double actions, and W=1 warning) as well as a cast to avoid overflows in an mtd device driver" * tag 'mtd/fixes-for-6.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: mtd: rawnand: omap2: Fix build warnings with W=1 mtd: rawnand: arasan: Fix missing de-registration of NAND mtd: rawnand: arasan: Fix double assertion of chip-select mtd: diskonchip: Cast an operand to prevent potential overflow mtd: rawnand: fix double free in atmel_pmecc_create_user()	2024-12-24 09:08:45 -08:00
Arnd Bergmann	17194c2998	phy: mediatek: phy-mtk-hdmi: add regulator dependency The driver no longer builds when regulator support is unavailable: arm-linux-gnueabi-ld: drivers/phy/mediatek/phy-mtk-hdmi.o: in function `mtk_hdmi_phy_register_regulators': phy-mtk-hdmi.c:(.text.unlikely+0x3e): undefined reference to `devm_regulator_register' arm-linux-gnueabi-ld: drivers/phy/mediatek/phy-mtk-hdmi-mt8195.o: in function `mtk_hdmi_phy_pwr5v_is_enabled': phy-mtk-hdmi-mt8195.c:(.text+0x326): undefined reference to `rdev_get_drvdata' arm-linux-gnueabi-ld: drivers/phy/mediatek/phy-mtk-hdmi-mt8195.o: in function `mtk_hdmi_phy_pwr5v_disable': phy-mtk-hdmi-mt8195.c:(.text+0x346): undefined reference to `rdev_get_drvdata' arm-linux-gnueabi-ld: drivers/phy/mediatek/phy-mtk-hdmi-mt8195.o: in function `mtk_hdmi_phy_pwr5v_enable': Fixes: `49393b2da1` ("phy: mediatek: phy-mtk-hdmi: Register PHY provided regulator") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20241213083056.2596499-1-arnd@kernel.org Signed-off-by: Vinod Koul <vkoul@kernel.org>	2024-12-24 20:38:53 +05:30
Adam Ford	739214dd1c	phy: freescale: fsl-samsung-hdmi: Fix 64-by-32 division cocci warnings The Kernel test robot returns the following warning: do_div() does a 64-by-32 division, please consider using div64_ul instead. To prevent the 64-by-32 divsion, consolidate both the multiplication and the do_div into one line which explicitly uses u64 sizes. Fixes: `1951dbb41d` ("phy: freescale: fsl-samsung-hdmi: Support dynamic integer") Signed-off-by: Adam Ford <aford173@gmail.com> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202412091243.fSObwwPi-lkp@intel.com/ Link: https://lore.kernel.org/r/20241215220555.99113-1-aford173@gmail.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2024-12-24 20:37:56 +05:30
Zijun Hu	a2d633cb14	phy: core: Fix an OF node refcount leakage in of_phy_provider_lookup() For macro for_each_child_of_node(parent, child), refcount of @child has been increased before entering its loop body, so normally needs to call of_node_put(@child) before returning from the loop body to avoid refcount leakage. of_phy_provider_lookup() has such usage but does not call of_node_put() before returning, so cause leakage of the OF node refcount. Fix by simply calling of_node_put() before returning from the loop body. The APIs affected by this issue are shown below since they indirectly invoke problematic of_phy_provider_lookup(). phy_get() of_phy_get() devm_phy_get() devm_of_phy_get() devm_of_phy_get_by_index() Fixes: `2a4c37016c` ("phy: core: Fix of_phy_provider_lookup to return PHY provider for sub node") Cc: stable@vger.kernel.org Reviewed-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Link: https://lore.kernel.org/r/20241213-phy_core_fix-v6-5-40ae28f5015a@quicinc.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2024-12-24 19:55:37 +05:30
Zijun Hu	5ebdc6be16	phy: core: Fix an OF node refcount leakage in _of_phy_get() _of_phy_get() will directly return when suffers of_device_is_compatible() error, but it forgets to decrease refcount of OF node @args.np before error return, the refcount was increased by previous of_parse_phandle_with_args() so causes the OF node's refcount leakage. Fix by decreasing the refcount via of_node_put() before the error return. Fixes: `b7563e2796` ("phy: work around 'phys' references to usb-nop-xceiv devices") Cc: stable@vger.kernel.org Reviewed-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Link: https://lore.kernel.org/r/20241213-phy_core_fix-v6-4-40ae28f5015a@quicinc.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2024-12-24 19:55:37 +05:30
Zijun Hu	4dc48c88fc	phy: core: Fix that API devm_phy_destroy() fails to destroy the phy For devm_phy_destroy(), its comment says it needs to invoke phy_destroy() to destroy the phy, but it will not actually invoke the function since devres_destroy() does not call devm_phy_consume(), and the missing phy_destroy() call will cause that the phy fails to be destroyed. Fortunately, the faulty API has not been used by current kernel tree. Fix by using devres_release() instead of devres_destroy() within the API. Fixes: `ff76496347` ("drivers: phy: add generic PHY framework") Reviewed-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Link: https://lore.kernel.org/r/20241213-phy_core_fix-v6-3-40ae28f5015a@quicinc.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2024-12-24 19:55:37 +05:30
Zijun Hu	c0b82ab95b	phy: core: Fix that API devm_of_phy_provider_unregister() fails to unregister the phy provider For devm_of_phy_provider_unregister(), its comment says it needs to invoke of_phy_provider_unregister() to unregister the phy provider, but it will not actually invoke the function since devres_destroy() does not call devm_phy_provider_release(), and the missing of_phy_provider_unregister() call will cause: - The phy provider fails to be unregistered. - Leak both memory and the OF node refcount. Fortunately, the faulty API has not been used by current kernel tree. Fix by using devres_release() instead of devres_destroy() within the API. Fixes: `ff76496347` ("drivers: phy: add generic PHY framework") Reviewed-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Link: https://lore.kernel.org/stable/20241213-phy_core_fix-v6-2-40ae28f5015a%40quicinc.com Link: https://lore.kernel.org/r/20241213-phy_core_fix-v6-2-40ae28f5015a@quicinc.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2024-12-24 19:55:37 +05:30
Zijun Hu	fe4bfa9b6d	phy: core: Fix that API devm_phy_put() fails to release the phy For devm_phy_put(), its comment says it needs to invoke phy_put() to release the phy, but it will not actually invoke the function since devres_destroy() does not call devm_phy_release(), and the missing phy_put() call will cause: - The phy fails to be released. - devm_phy_put() can not fully undo what API devm_phy_get() does. - Leak refcount of both the module and device for below typical usage: devm_phy_get(); // or its variant ... err = do_something(); if (err) goto err_out; ... err_out: devm_phy_put(); // leak refcount here The file(s) affected by this issue are shown below since they have such typical usage. drivers/pci/controller/cadence/pcie-cadence.c drivers/net/ethernet/ti/am65-cpsw-nuss.c Fix by using devres_release() instead of devres_destroy() within the API. Fixes: `ff76496347` ("drivers: phy: add generic PHY framework") Cc: stable@vger.kernel.org Cc: Lorenzo Pieralisi <lpieralisi@kernel.org> Cc: Krzysztof Wilczyński <kw@linux.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Link: https://lore.kernel.org/r/20241213-phy_core_fix-v6-1-40ae28f5015a@quicinc.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2024-12-24 19:55:37 +05:30
Akhil R	ebc008699f	dmaengine: tegra: Return correct DMA status when paused Currently, the driver does not return the correct DMA status when a DMA pause is issued by the client drivers. This causes GPCDMA users to assume that DMA is still running, while in reality, the DMA is paused. Return DMA_PAUSED for tx_status() if the channel is paused in the middle of a transfer. Fixes: `ee17028009` ("dmaengine: tegra: Add tegra gpcdma driver") Cc: stable@vger.kernel.org Signed-off-by: Akhil R <akhilrajeev@nvidia.com> Signed-off-by: Kartik Rajput <kkartik@nvidia.com> Link: https://lore.kernel.org/r/20241212124412.5650-1-kkartik@nvidia.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2024-12-24 15:49:30 +05:30
Javier Carrasco	362f1bf98a	dmaengine: mv_xor: fix child node refcount handling in early exit The for_each_child_of_node() loop requires explicit calls to of_node_put() to decrement the child's refcount upon early exits (break, goto, return). Add the missing calls in the two early exits before the goto instructions. Cc: stable@vger.kernel.org Fixes: `f7d12ef53d` ("dma: mv_xor: add Device Tree binding") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Link: https://lore.kernel.org/r/20241011-dma_mv_xor_of_node_put-v1-1-3c2de819f463@gmail.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2024-12-24 15:45:01 +05:30
Rodrigo Vivi	20e7c5313f	drm/i915/dg1: Fix power gate sequence. sub-pipe PG is not present on DG1. Setting these bits can disable other power gates and cause GPU hangs on video playbacks. VLK: 16314, 4304 Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13381 Fixes: `85a12d7eb8` ("drm/i915/tgl: Fix Media power gate sequence.") Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241219210019.70532-1-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit `de7061947b`) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>	2024-12-24 09:41:03 +00:00
Suraj Kandpal	385a95cc72	drm/i915/cx0_phy: Fix C10 pll programming sequence According to spec VDR_CUSTOM_WIDTH register gets programmed after pll specific VDR registers and TX Lane programming registers are done. Moreover we only program into C10_VDR_CONTROL1 to update config and setup master lane once all VDR registers are written into. Bspec: 67636 Fixes: `51390cc0e0` ("drm/i915/mtl: Add Support for C10 PHY message bus and pll programming") Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com> Reviewed-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241216181554.2861381-1-suraj.kandpal@intel.com (cherry picked from commit `f9d418552b`) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>	2024-12-24 09:41:00 +00:00
Zhu Yanjun	2ac5415022	RDMA/rxe: Remove the direct link to net_device The similar patch in siw is in the link: https://git.kernel.org/rdma/rdma/c/16b87037b48889 This problem also occurred in RXE. The following analyze this problem. In the following Call Traces: " BUG: KASAN: slab-use-after-free in dev_get_flags+0x188/0x1d0 net/core/dev.c:8782 Read of size 4 at addr ffff8880554640b0 by task kworker/1:4/5295 CPU: 1 UID: 0 PID: 5295 Comm: kworker/1:4 Not tainted 6.12.0-rc3-syzkaller-00399-g9197b73fd7bb #0 Hardware name: Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 Workqueue: infiniband ib_cache_event_task Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:377 [inline] print_report+0x169/0x550 mm/kasan/report.c:488 kasan_report+0x143/0x180 mm/kasan/report.c:601 dev_get_flags+0x188/0x1d0 net/core/dev.c:8782 rxe_query_port+0x12d/0x260 drivers/infiniband/sw/rxe/rxe_verbs.c:60 __ib_query_port drivers/infiniband/core/device.c:2111 [inline] ib_query_port+0x168/0x7d0 drivers/infiniband/core/device.c:2143 ib_cache_update+0x1a9/0xb80 drivers/infiniband/core/cache.c:1494 ib_cache_event_task+0xf3/0x1e0 drivers/infiniband/core/cache.c:1568 process_one_work kernel/workqueue.c:3229 [inline] process_scheduled_works+0xa65/0x1850 kernel/workqueue.c:3310 worker_thread+0x870/0xd30 kernel/workqueue.c:3391 kthread+0x2f2/0x390 kernel/kthread.c:389 ret_from_fork+0x4d/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 </TASK> " 1). In the link [1], " infiniband syz2: set down " This means that on 839.350575, the event ib_cache_event_task was sent andi queued in ib_wq. 2). In the link [1], " team0 (unregistering): Port device team_slave_0 removed " It indicates that before 843.251853, the net device should be freed. 3). In the link [1], " BUG: KASAN: slab-use-after-free in dev_get_flags+0x188/0x1d0 " This means that on 850.559070, this slab-use-after-free problem occurred. In all, on 839.350575, the event ib_cache_event_task was sent and queued in ib_wq, before 843.251853, the net device veth was freed. on 850.559070, this event was executed, and the mentioned freed net device was called. Thus, the above call trace occurred. [1] https://syzkaller.appspot.com/x/log.txt?x=12e7025f980000 Reported-by: syzbot+4b87489410b4efd181bf@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=4b87489410b4efd181bf Fixes: `8700e3e7c4` ("Soft RoCE driver") Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> Link: https://patch.msgid.link/20241220222325.2487767-1-yanjun.zhu@linux.dev Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-12-24 04:36:40 -05:00
Joe Hattori	ccfa3131d4	dmaengine: fsl-edma: implement the cleanup path of fsl_edma3_attach_pd() Current implementation of fsl_edma3_attach_pd() does not provide a cleanup path, resulting in a memory leak. For example, dev_pm_domain_detach() is not called after dev_pm_domain_attach_by_id(), and the device link created with the DL_FLAG_STATELESS is not released explicitly. Therefore, provide a cleanup function fsl_edma3_detach_pd() and call it upon failure. Also add a devm_add_action_or_reset() call with this function after a successful fsl_edma3_attach_pd(). Fixes: `72f5801a4e` ("dmaengine: fsl-edma: integrate v3 support") Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp> Link: https://lore.kernel.org/r/20241221075712.3297200-1-joe@pf.is.s.u-tokyo.ac.jp Signed-off-by: Vinod Koul <vkoul@kernel.org>	2024-12-24 14:55:57 +05:30
Lizhi Xu	98feccbf32	tracing: Prevent bad count for tracing_cpumask_write If a large count is provided, it will trigger a warning in bitmap_parse_user. Also check zero for it. Cc: stable@vger.kernel.org Fixes: `9e01c1b74c` ("cpumask: convert kernel trace functions") Link: https://lore.kernel.org/20241216073238.2573704-1-lizhi.xu@windriver.com Reported-by: syzbot+0aecfd34fb878546f3fd@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=0aecfd34fb878546f3fd Tested-by: syzbot+0aecfd34fb878546f3fd@syzkaller.appspotmail.com Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2024-12-23 21:59:15 -05:00
Christian Göttsche	452f4b31e3	tracing: Constify string literal data member in struct trace_event_call The name member of the struct trace_event_call is assigned with generated string literals; declare them pointer to read-only. Reported by clang: security/landlock/syscalls.c:179:1: warning: initializing 'char ' with an expression of type 'const char[34]' discards qualifiers [-Wincompatible-pointer-types-discards-qualifiers] 179 \| SYSCALL_DEFINE3(landlock_create_ruleset, \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 180 \| const struct landlock_ruleset_attr __user const, attr, \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 181 \| const size_t, size, const __u32, flags) \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ./include/linux/syscalls.h:226:36: note: expanded from macro 'SYSCALL_DEFINE3' 226 \| #define SYSCALL_DEFINE3(name, ...) SYSCALL_DEFINEx(3, _##name, __VA_ARGS__) \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ./include/linux/syscalls.h:234:2: note: expanded from macro 'SYSCALL_DEFINEx' 234 \| SYSCALL_METADATA(sname, x, __VA_ARGS__) \ \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ./include/linux/syscalls.h:184:2: note: expanded from macro 'SYSCALL_METADATA' 184 \| SYSCALL_TRACE_ENTER_EVENT(sname); \ \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ./include/linux/syscalls.h:151:30: note: expanded from macro 'SYSCALL_TRACE_ENTER_EVENT' 151 \| .name = "sys_enter"#sname, \ \| ^~~~~~~~~~~~~~~~~ Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Mickaël Salaün <mic@digikod.net> Cc: Günther Noack <gnoack@google.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Bill Wendling <morbo@google.com> Cc: Justin Stitt <justinstitt@google.com> Link: https://lore.kernel.org/20241125105028.42807-1-cgoettsche@seltendoof.de Fixes: `b77e38aa24` ("tracing: add event trace infrastructure") Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2024-12-23 21:53:43 -05:00
Linus Torvalds	ef49c460ab	Merge tag 'modules-6.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux Pull modules fix from Petr Pavlu: "A single fix is present to correct the module vermagic for PREEMPT_RT" * tag 'modules-6.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux: preempt: Move PREEMPT_RT before PREEMPT in vermagic.	2024-12-23 13:06:48 -08:00
Qu Wenruo	fca432e73d	btrfs: sysfs: fix direct super block member reads The following sysfs entries are reading super block member directly, which can have a different endian and cause wrong values: - sys/fs/btrfs/<uuid>/nodesize - sys/fs/btrfs/<uuid>/sectorsize - sys/fs/btrfs/<uuid>/clone_alignment Thankfully those values (nodesize and sectorsize) are always aligned inside the btrfs_super_block, so it won't trigger unaligned read errors, just endian problems. Fix them by using the native cached members instead. Fixes: `df93589a17` ("btrfs: export more from FS_INFO to sysfs") CC: stable@vger.kernel.org Reviewed-by: Naohiro Aota <naohiro.aota@wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2024-12-23 22:06:44 +01:00
Julian Sun	f2363e6fcc	btrfs: fix transaction atomicity bug when enabling simple quotas Set squota incompat bit before committing the transaction that enables the feature. With the config CONFIG_BTRFS_ASSERT enabled, an assertion failure occurs regarding the simple quota feature. [5.596534] assertion failed: btrfs_fs_incompat(fs_info, SIMPLE_QUOTA), in fs/btrfs/qgroup.c:365 [5.597098] ------------[ cut here ]------------ [5.597371] kernel BUG at fs/btrfs/qgroup.c:365! [5.597946] CPU: 1 UID: 0 PID: 268 Comm: mount Not tainted 6.13.0-rc2-00031-gf92f4749861b #146 [5.598450] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 [5.599008] RIP: 0010:btrfs_read_qgroup_config+0x74d/0x7a0 [5.604303] <TASK> [5.605230] ? btrfs_read_qgroup_config+0x74d/0x7a0 [5.605538] ? exc_invalid_op+0x56/0x70 [5.605775] ? btrfs_read_qgroup_config+0x74d/0x7a0 [5.606066] ? asm_exc_invalid_op+0x1f/0x30 [5.606441] ? btrfs_read_qgroup_config+0x74d/0x7a0 [5.606741] ? btrfs_read_qgroup_config+0x74d/0x7a0 [5.607038] ? try_to_wake_up+0x317/0x760 [5.607286] open_ctree+0xd9c/0x1710 [5.607509] btrfs_get_tree+0x58a/0x7e0 [5.608002] vfs_get_tree+0x2e/0x100 [5.608224] fc_mount+0x16/0x60 [5.608420] btrfs_get_tree+0x2f8/0x7e0 [5.608897] vfs_get_tree+0x2e/0x100 [5.609121] path_mount+0x4c8/0xbc0 [5.609538] __x64_sys_mount+0x10d/0x150 The issue can be easily reproduced using the following reproducer: root@q:linux# cat repro.sh set -e mkfs.btrfs -q -f /dev/sdb mount /dev/sdb /mnt/btrfs btrfs quota enable -s /mnt/btrfs umount /mnt/btrfs mount /dev/sdb /mnt/btrfs The issue is that when enabling quotas, at btrfs_quota_enable(), we set BTRFS_QGROUP_STATUS_FLAG_SIMPLE_MODE at fs_info->qgroup_flags and persist it in the quota root in the item with the key BTRFS_QGROUP_STATUS_KEY, but we only set the incompat bit BTRFS_FEATURE_INCOMPAT_SIMPLE_QUOTA after we commit the transaction used to enable simple quotas. This means that if after that transaction commit we unmount the filesystem without starting and committing any other transaction, or we have a power failure, the next time we mount the filesystem we will find the flag BTRFS_QGROUP_STATUS_FLAG_SIMPLE_MODE set in the item with the key BTRFS_QGROUP_STATUS_KEY but we will not find the incompat bit BTRFS_FEATURE_INCOMPAT_SIMPLE_QUOTA set in the superblock, triggering an assertion failure at: btrfs_read_qgroup_config() -> qgroup_read_enable_gen() To fix this issue, set the BTRFS_FEATURE_INCOMPAT_SIMPLE_QUOTA flag immediately after setting the BTRFS_QGROUP_STATUS_FLAG_SIMPLE_MODE. This ensures that both flags are flushed to disk within the same transaction. Fixes: `182940f4f4` ("btrfs: qgroup: add new quota mode for simple quotas") CC: stable@vger.kernel.org # 6.6+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Julian Sun <sunjunchao2870@gmail.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2024-12-23 22:05:05 +01:00
Filipe Manana	2c8507c63f	btrfs: avoid monopolizing a core when activating a swap file During swap activation we iterate over the extents of a file and we can have many thousands of them, so we can end up in a busy loop monopolizing a core. Avoid this by doing a voluntary reschedule after processing each extent. CC: stable@vger.kernel.org # 5.4+ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2024-12-23 22:04:48 +01:00
Filipe Manana	9a45022a0e	btrfs: allow swap activation to be interruptible During swap activation we iterate over the extents of a file, then do several checks for each extent, some of which may take some significant time such as checking if an extent is shared. Since a file can have many thousands of extents, this can be a very slow operation and it's currently not interruptible. I had a bug during development of a previous patch that resulted in an infinite loop when iterating the extents, so a core was busy looping and I couldn't cancel the operation, which is very annoying and requires a reboot. So make the loop interruptible by checking for fatal signals at the end of each iteration and stopping immediately if there is one. CC: stable@vger.kernel.org # 5.4+ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2024-12-23 22:04:38 +01:00
Filipe Manana	03018e5d85	btrfs: fix swap file activation failure due to extents that used to be shared When activating a swap file, to determine if an extent is shared we use can_nocow_extent(), which ends up at btrfs_cross_ref_exist(). That helper is meant to be quick because it's used in the NOCOW write path, when flushing delalloc and when doing a direct IO write, however it does return some false positives, meaning it may indicate that an extent is shared even if it's no longer the case. For the write path this is fine, we just do a unnecessary COW operation instead of doing a more rigorous check which would be too heavy (calling btrfs_is_data_extent_shared()). However when activating a swap file, the false positives simply result in a failure, which is confusing for users/applications. One particular case where this happens is when a data extent only has 1 reference but that reference is not inlined in the extent item located in the extent tree - this happens when we create more than 33 references for an extent and then delete those 33 references plus every other non-inline reference except one. The function check_committed_ref() assumes that if the size of an extent item doesn't match the size of struct btrfs_extent_item plus the size of an inline reference (plus an owner reference in case simple quotas are enabled), then the extent is shared - that is not the case however, we can have a single reference but it's not inlined - the reason we do this is to be fast and avoid inspecting non-inline references which may be located in another leaf of the extent tree, slowing down write paths. The following test script reproduces the bug: $ cat test.sh #!/bin/bash DEV=/dev/sdi MNT=/mnt/sdi NUM_CLONES=50 umount $DEV &> /dev/null run_test() { local sync_after_add_reflinks=$1 local sync_after_remove_reflinks=$2 mkfs.btrfs -f $DEV > /dev/null #mkfs.xfs -f $DEV > /dev/null mount $DEV $MNT touch $MNT/foo chmod 0600 $MNT/foo # On btrfs the file must be NOCOW. chattr +C $MNT/foo &> /dev/null xfs_io -s -c "pwrite -b 1M 0 1M" $MNT/foo mkswap $MNT/foo for ((i = 1; i <= $NUM_CLONES; i++)); do touch $MNT/foo_clone_$i chmod 0600 $MNT/foo_clone_$i # On btrfs the file must be NOCOW. chattr +C $MNT/foo_clone_$i &> /dev/null cp --reflink=always $MNT/foo $MNT/foo_clone_$i done if [ $sync_after_add_reflinks -ne 0 ]; then # Flush delayed refs and commit current transaction. sync -f $MNT fi # Remove the original file and all clones except the last. rm -f $MNT/foo for ((i = 1; i < $NUM_CLONES; i++)); do rm -f $MNT/foo_clone_$i done if [ $sync_after_remove_reflinks -ne 0 ]; then # Flush delayed refs and commit current transaction. sync -f $MNT fi # Now use the last clone as a swap file. It should work since # its extent are not shared anymore. swapon $MNT/foo_clone_${NUM_CLONES} swapoff $MNT/foo_clone_${NUM_CLONES} umount $MNT } echo -e "\nTest without sync after creating and removing clones" run_test 0 0 echo -e "\nTest with sync after creating clones" run_test 1 0 echo -e "\nTest with sync after removing clones" run_test 0 1 echo -e "\nTest with sync after creating and removing clones" run_test 1 1 Running the test: $ ./test.sh Test without sync after creating and removing clones wrote 1048576/1048576 bytes at offset 0 1 MiB, 1 ops; 0.0017 sec (556.793 MiB/sec and 556.7929 ops/sec) Setting up swapspace version 1, size = 1020 KiB (1044480 bytes) no label, UUID=a6b9c29e-5ef4-4689-a8ac-bc199c750f02 swapon: /mnt/sdi/foo_clone_50: swapon failed: Invalid argument swapoff: /mnt/sdi/foo_clone_50: swapoff failed: Invalid argument Test with sync after creating clones wrote 1048576/1048576 bytes at offset 0 1 MiB, 1 ops; 0.0036 sec (271.739 MiB/sec and 271.7391 ops/sec) Setting up swapspace version 1, size = 1020 KiB (1044480 bytes) no label, UUID=5e9008d6-1f7a-4948-a1b4-3f30aba20a33 swapon: /mnt/sdi/foo_clone_50: swapon failed: Invalid argument swapoff: /mnt/sdi/foo_clone_50: swapoff failed: Invalid argument Test with sync after removing clones wrote 1048576/1048576 bytes at offset 0 1 MiB, 1 ops; 0.0103 sec (96.665 MiB/sec and 96.6651 ops/sec) Setting up swapspace version 1, size = 1020 KiB (1044480 bytes) no label, UUID=916c2740-fa9f-4385-9f06-29c3f89e4764 Test with sync after creating and removing clones wrote 1048576/1048576 bytes at offset 0 1 MiB, 1 ops; 0.0031 sec (314.268 MiB/sec and 314.2678 ops/sec) Setting up swapspace version 1, size = 1020 KiB (1044480 bytes) no label, UUID=06aab1dd-4d90-49c0-bd9f-3a8db4e2f912 swapon: /mnt/sdi/foo_clone_50: swapon failed: Invalid argument swapoff: /mnt/sdi/foo_clone_50: swapoff failed: Invalid argument Fix this by reworking btrfs_swap_activate() to instead of using extent maps and checking for shared extents with can_nocow_extent(), iterate over the inode's file extent items and use the accurate btrfs_is_data_extent_shared(). CC: stable@vger.kernel.org # 5.4+ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2024-12-23 22:04:17 +01:00
Filipe Manana	0525064bb8	btrfs: fix race with memory mapped writes when activating swap file When activating the swap file we flush all delalloc and wait for ordered extent completion, so that we don't miss any delalloc and extents before we check that the file's extent layout is usable for a swap file and activate the swap file. We are called with the inode's VFS lock acquired, so we won't race with buffered and direct IO writes, however we can still race with memory mapped writes since they don't acquire the inode's VFS lock. The race window is between flushing all delalloc and locking the whole file's extent range, since memory mapped writes lock an extent range with the length of a page. Fix this by acquiring the inode's mmap lock before we flush delalloc. CC: stable@vger.kernel.org # 5.4+ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2024-12-23 22:03:43 +01:00
Boris Burkov	0fba7be1ca	btrfs: check folio mapping after unlock in put_file_data() When we call btrfs_read_folio() we get an unlocked folio, so it is possible for a different thread to concurrently modify folio->mapping. We must check that this hasn't happened once we do have the lock. CC: stable@vger.kernel.org # 6.12+ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Boris Burkov <boris@bur.io> Signed-off-by: David Sterba <dsterba@suse.com>	2024-12-23 22:00:07 +01:00
Boris Burkov	3e74859ee3	btrfs: check folio mapping after unlock in relocate_one_folio() When we call btrfs_read_folio() to bring a folio uptodate, we unlock the folio. The result of that is that a different thread can modify the mapping (like remove it with invalidate) before we call folio_lock(). This results in an invalid page and we need to try again. In particular, if we are relocating concurrently with aborting a transaction, this can result in a crash like the following: BUG: kernel NULL pointer dereference, address: 0000000000000000 PGD 0 P4D 0 Oops: 0000 [#1] SMP CPU: 76 PID: 1411631 Comm: kworker/u322:5 Workqueue: events_unbound btrfs_reclaim_bgs_work RIP: 0010:set_page_extent_mapped+0x20/0xb0 RSP: 0018:ffffc900516a7be8 EFLAGS: 00010246 RAX: ffffea009e851d08 RBX: ffffea009e0b1880 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffffc900516a7b90 RDI: ffffea009e0b1880 RBP: 0000000003573000 R08: 0000000000000001 R09: ffff88c07fd2f3f0 R10: 0000000000000000 R11: 0000194754b575be R12: 0000000003572000 R13: 0000000003572fff R14: 0000000000100cca R15: 0000000005582fff FS: 0000000000000000(0000) GS:ffff88c07fd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000407d00f002 CR4: 00000000007706f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> ? __die+0x78/0xc0 ? page_fault_oops+0x2a8/0x3a0 ? __switch_to+0x133/0x530 ? wq_worker_running+0xa/0x40 ? exc_page_fault+0x63/0x130 ? asm_exc_page_fault+0x22/0x30 ? set_page_extent_mapped+0x20/0xb0 relocate_file_extent_cluster+0x1a7/0x940 relocate_data_extent+0xaf/0x120 relocate_block_group+0x20f/0x480 btrfs_relocate_block_group+0x152/0x320 btrfs_relocate_chunk+0x3d/0x120 btrfs_reclaim_bgs_work+0x2ae/0x4e0 process_scheduled_works+0x184/0x370 worker_thread+0xc6/0x3e0 ? blk_add_timer+0xb0/0xb0 kthread+0xae/0xe0 ? flush_tlb_kernel_range+0x90/0x90 ret_from_fork+0x2f/0x40 ? flush_tlb_kernel_range+0x90/0x90 ret_from_fork_asm+0x11/0x20 </TASK> This occurs because cleanup_one_transaction() calls destroy_delalloc_inodes() which calls invalidate_inode_pages2() which takes the folio_lock before setting mapping to NULL. We fail to check this, and subsequently call set_extent_mapping(), which assumes that mapping != NULL (in fact it asserts that in debug mode) Note that the "fixes" patch here is not the one that introduced the race (the very first iteration of this code from 2009) but a more recent change that made this particular crash happen in practice. Fixes: `e7f1326cc2` ("btrfs: set page extent mapped after read_folio in relocate_one_page") CC: stable@vger.kernel.org # 6.1+ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Boris Burkov <boris@bur.io> Signed-off-by: David Sterba <dsterba@suse.com>	2024-12-23 22:00:07 +01:00
Filipe Manana	44f52bbe96	btrfs: fix use-after-free when COWing tree bock and tracing is enabled When a COWing a tree block, at btrfs_cow_block(), and we have the tracepoint trace_btrfs_cow_block() enabled and preemption is also enabled (CONFIG_PREEMPT=y), we can trigger a use-after-free in the COWed extent buffer while inside the tracepoint code. This is because in some paths that call btrfs_cow_block(), such as btrfs_search_slot(), we are holding the last reference on the extent buffer @buf so btrfs_force_cow_block() drops the last reference on the @buf extent buffer when it calls free_extent_buffer_stale(buf), which schedules the release of the extent buffer with RCU. This means that if we are on a kernel with preemption, the current task may be preempted before calling trace_btrfs_cow_block() and the extent buffer already released by the time trace_btrfs_cow_block() is called, resulting in a use-after-free. Fix this by moving the trace_btrfs_cow_block() from btrfs_cow_block() to btrfs_force_cow_block() before the COWed extent buffer is freed. This also has a side effect of invoking the tracepoint in the tree defrag code, at defrag.c:btrfs_realloc_node(), since btrfs_force_cow_block() is called there, but this is fine and it was actually missing there. Reported-by: syzbot+8517da8635307182c8a5@syzkaller.appspotmail.com Link: https://lore.kernel.org/linux-btrfs/6759a9b9.050a0220.1ac542.000d.GAE@google.com/ CC: stable@vger.kernel.org # 5.4+ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2024-12-23 21:59:32 +01:00
Johannes Thumshirn	d29662695e	btrfs: fix use-after-free waiting for encoded read endios Fix a use-after-free in the I/O completion path for encoded reads by using a completion instead of a wait_queue for synchronizing the destruction of 'struct btrfs_encoded_read_private'. Fixes: `1881fba89b` ("btrfs: add BTRFS_IOC_ENCODED_READ ioctl") CC: stable@vger.kernel.org # 6.1+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2024-12-23 21:55:06 +01:00
Linus Torvalds	f07044dd0d	Merge tag 'nfsd-6.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever:: - Revert one v6.13 fix at the author's request (to be done differently) - Fix a minor problem with recent NFSv4.2 COPY enhancements - Fix an NFSv4.0 callback bug introduced in the v6.13 merge window * tag 'nfsd-6.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: nfsd: restore callback functionality for NFSv4.0 NFSD: fix management of pending async copies nfsd: Revert "nfsd: release svc_expkey/svc_export with rcu_work"	2024-12-23 12:16:15 -08:00
Jakub Kicinski	b3a69c5598	Merge branch 'mlx5-misc-fixes-2024-12-20' Tariq Toukan says: ==================== mlx5 misc fixes 2024-12-20 This small patchset provides misc bug fixes from the team to the mlx5 core and Eth drivers. ==================== Link: https://patch.msgid.link/20241220081505.1286093-1-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-23 10:54:07 -08:00
Jianbo Liu	2a4f56fbcc	net/mlx5e: Keep netdev when leave switchdev for devlink set legacy only In the cited commit, when changing from switchdev to legacy mode, uplink representor's netdev is kept, and its profile is replaced with nic profile, so netdev is detached from old profile, then attach to new profile. During profile change, the hardware resources allocated by the old profile will be cleaned up. However, the cleanup is relying on the related kernel modules. And they may need to flush themselves first, which is triggered by netdev events, for example, NETDEV_UNREGISTER. However, netdev is kept, or netdev_register is called after the cleanup, which may cause troubles because the resources are still referred by kernel modules. The same process applies to all the caes when uplink is leaving switchdev mode, including devlink eswitch mode set legacy, driver unload and devlink reload. For the first one, it can be blocked and returns failure to users, whenever possible. But it's hard for the others. Besides, the attachment to nic profile is unnecessary as the netdev will be unregistered anyway for such cases. So in this patch, the original behavior is kept only for devlink eswitch set mode legacy. For the others, moves netdev unregistration before the profile change. Fixes: `7a9fb35e8c` ("net/mlx5e: Do not reload ethernet ports when changing eswitch mode") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241220081505.1286093-5-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-23 10:54:02 -08:00
Jianbo Liu	5a03b36856	net/mlx5e: Skip restore TC rules for vport rep without loaded flag During driver unload, unregister_netdev is called after unloading vport rep. So, the mlx5e_rep_priv is already freed while trying to get rpriv->netdev, or walk rpriv->tc_ht, which results in use-after-free. So add the checking to make sure access the data of vport rep which is still loaded. Fixes: `d1569537a8` ("net/mlx5e: Modify and restore TC rules for IPSec TX rules") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241220081505.1286093-4-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-23 10:54:02 -08:00
Dragos Tatulea	8c6254479b	net/mlx5e: macsec: Maintain TX SA from encoding_sa In MACsec, it is possible to create multiple active TX SAs on a SC, but only one such SA can be used at a time for transmission. This SA is selected through the encoding_sa link parameter. When there are 2 or more active TX SAs configured (encoding_sa=0): ip macsec add macsec0 tx sa 0 pn 1 on key 00 <KEY1> ip macsec add macsec0 tx sa 1 pn 1 on key 00 <KEY2> ... the traffic should be still sent via TX SA 0 as the encoding_sa was not changed. However, the driver ignores the encoding_sa and overrides it to SA 1 by installing the flow steering id of the newly created TX SA into the SCI -> flow steering id hash map. The future packet tx descriptors will point to the incorrect flow steering rule (SA 1). This patch fixes the issue by avoiding the creation of the flow steering rule for an active TX SA that is not the encoding_sa. The driver side tx_sa object and the FW side macsec object are still created. When the encoding_sa link parameter is changed to another active TX SA, only the new flow steering rule will be created in the mlx5e_macsec_upd_txsa() handler. Fixes: `8ff0ac5be1` ("net/mlx5: Add MACsec offload Tx command support") Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Lior Nahmanson <liorna@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241220081505.1286093-3-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-23 10:54:01 -08:00
Shahar Shitrit	050a4c011b	net/mlx5: DR, select MSIX vector 0 for completion queue creation When creating a software steering completion queue (CQ), an arbitrary MSIX vector n is selected. This results in the CQ sharing the same Ethernet traffic channel n associated with the chosen vector. However, the value of n is often unpredictable, which can introduce complications for interrupt monitoring and verification tools. Moreover, SW steering uses polling rather than event-driven interrupts. Therefore, there is no need to select any MSIX vector other than the existing vector 0 for CQ creation. In light of these factors, and to enhance predictability, we modify the code to consistently select MSIX vector 0 for CQ creation. Fixes: `297cccebdc` ("net/mlx5: DR, Expose an internal API to issue RDMA operations") Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241220081505.1286093-2-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-23 10:54:01 -08:00
Kory Maincent	75221e9610	net: pse-pd: tps23881: Fix power on/off issue An issue was present in the initial driver implementation. The driver read the power status of all channels before toggling the bit of the desired one. Using the power status register as a base value introduced a problem, because only the bit corresponding to the concerned channel ID should be set in the write-only power enable register. This led to cases where disabling power for one channel also powered off other channels. This patch removes the power status read and ensures the value is limited to the bit matching the channel index of the PI. Fixes: `20e6d190ff` ("net: pse-pd: Add TI TPS23881 PSE controller driver") Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Acked-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://patch.msgid.link/20241220170400.291705-1-kory.maincent@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-23 10:40:34 -08:00
Siddharth Vadapalli	4a4d38ace1	net: ethernet: ti: am65-cpsw: default to round-robin for host port receive The Host Port (i.e. CPU facing port) of CPSW receives traffic from Linux via TX DMA Channels which are Hardware Queues consisting of traffic categorized according to their priority. The Host Port is configured to dequeue traffic from these Hardware Queues on the basis of priority i.e. as long as traffic exists on a Hardware Queue of a higher priority, the traffic on Hardware Queues of lower priority isn't dequeued. An alternate operation is also supported wherein traffic can be dequeued by the Host Port in a Round-Robin manner. Until commit under Fixes, the am65-cpsw driver enabled a single TX DMA Channel, due to which, unless modified by user via "ethtool", all traffic from Linux is transmitted on DMA Channel 0. Therefore, configuring the Host Port for priority based dequeuing or Round-Robin operation is identical since there is a single DMA Channel. Since commit under Fixes, all 8 TX DMA Channels are enabled by default. Additionally, the default "tc mapping" doesn't take into account the possibility of different traffic profiles which various users might have. This results in traffic starvation at the Host Port due to the priority based dequeuing which has been enabled by default since the inception of the driver. The traffic starvation triggers NETDEV WATCHDOG timeout for all TX DMA Channels that haven't been serviced due to the presence of traffic on the higher priority TX DMA Channels. Fix this by defaulting to Round-Robin dequeuing at the Host Port, which shall ensure that traffic is dequeued from all TX DMA Channels irrespective of the traffic profile. This will address the NETDEV WATCHDOG timeouts. At the same time, users can still switch from Round-Robin to Priority based dequeuing at the Host Port with the help of the "p0-rx-ptype-rrobin" private flag of "ethtool". Users are expected to setup an appropriate "tc mapping" that suits their traffic profile when switching to priority based dequeuing at the Host Port. Fixes: `be397ea347` ("net: ethernet: am65-cpsw: Set default TX channels to maximum") Cc: <stable@vger.kernel.org> Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com> Link: https://patch.msgid.link/20241220075618.228202-1-s-vadapalli@ti.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-23 10:37:29 -08:00
Nikolay Kuratov	4e86729d1f	net/sctp: Prevent autoclose integer overflow in sctp_association_init() While by default max_autoclose equals to INT_MAX / HZ, one may set net.sctp.max_autoclose to UINT_MAX. There is code in sctp_association_init() that can consequently trigger overflow. Cc: stable@vger.kernel.org Fixes: `9f70f46bd4` ("sctp: properly latch and use autoclose value from sock to association") Signed-off-by: Nikolay Kuratov <kniv@yandex-team.ru> Acked-by: Xin Long <lucien.xin@gmail.com> Link: https://patch.msgid.link/20241219162114.2863827-1-kniv@yandex-team.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-23 10:10:13 -08:00
Ilya Shchipletsov	a4fd163aed	netrom: check buffer length before accessing it Syzkaller reports an uninit value read from ax25cmp when sending raw message through ieee802154 implementation. ===================================================== BUG: KMSAN: uninit-value in ax25cmp+0x3a5/0x460 net/ax25/ax25_addr.c:119 ax25cmp+0x3a5/0x460 net/ax25/ax25_addr.c:119 nr_dev_get+0x20e/0x450 net/netrom/nr_route.c:601 nr_route_frame+0x1a2/0xfc0 net/netrom/nr_route.c:774 nr_xmit+0x5a/0x1c0 net/netrom/nr_dev.c:144 __netdev_start_xmit include/linux/netdevice.h:4940 [inline] netdev_start_xmit include/linux/netdevice.h:4954 [inline] xmit_one net/core/dev.c:3548 [inline] dev_hard_start_xmit+0x247/0xa10 net/core/dev.c:3564 __dev_queue_xmit+0x33b8/0x5130 net/core/dev.c:4349 dev_queue_xmit include/linux/netdevice.h:3134 [inline] raw_sendmsg+0x654/0xc10 net/ieee802154/socket.c:299 ieee802154_sock_sendmsg+0x91/0xc0 net/ieee802154/socket.c:96 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg net/socket.c:745 [inline] ____sys_sendmsg+0x9c2/0xd60 net/socket.c:2584 ___sys_sendmsg+0x28d/0x3c0 net/socket.c:2638 __sys_sendmsg net/socket.c:2667 [inline] __do_sys_sendmsg net/socket.c:2676 [inline] __se_sys_sendmsg net/socket.c:2674 [inline] __x64_sys_sendmsg+0x307/0x490 net/socket.c:2674 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0x44/0x110 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x63/0x6b Uninit was created at: slab_post_alloc_hook+0x129/0xa70 mm/slab.h:768 slab_alloc_node mm/slub.c:3478 [inline] kmem_cache_alloc_node+0x5e9/0xb10 mm/slub.c:3523 kmalloc_reserve+0x13d/0x4a0 net/core/skbuff.c:560 __alloc_skb+0x318/0x740 net/core/skbuff.c:651 alloc_skb include/linux/skbuff.h:1286 [inline] alloc_skb_with_frags+0xc8/0xbd0 net/core/skbuff.c:6334 sock_alloc_send_pskb+0xa80/0xbf0 net/core/sock.c:2780 sock_alloc_send_skb include/net/sock.h:1884 [inline] raw_sendmsg+0x36d/0xc10 net/ieee802154/socket.c:282 ieee802154_sock_sendmsg+0x91/0xc0 net/ieee802154/socket.c:96 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg net/socket.c:745 [inline] ____sys_sendmsg+0x9c2/0xd60 net/socket.c:2584 ___sys_sendmsg+0x28d/0x3c0 net/socket.c:2638 __sys_sendmsg net/socket.c:2667 [inline] __do_sys_sendmsg net/socket.c:2676 [inline] __se_sys_sendmsg net/socket.c:2674 [inline] __x64_sys_sendmsg+0x307/0x490 net/socket.c:2674 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0x44/0x110 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x63/0x6b CPU: 0 PID: 5037 Comm: syz-executor166 Not tainted 6.7.0-rc7-syzkaller-00003-gfbafc3e621c3 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023 ===================================================== This issue occurs because the skb buffer is too small, and it's actual allocation is aligned. This hides an actual issue, which is that nr_route_frame does not validate the buffer size before using it. Fix this issue by checking skb->len before accessing any fields in skb->data. Found by Linux Verification Center (linuxtesting.org) with Syzkaller. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Co-developed-by: Nikita Marushkin <hfggklm@gmail.com> Signed-off-by: Nikita Marushkin <hfggklm@gmail.com> Signed-off-by: Ilya Shchipletsov <rabbelkin@mail.ru> Link: https://patch.msgid.link/20241219082308.3942-1-rabbelkin@mail.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-23 10:04:55 -08:00
Xiao Liang	b5a7b661a0	net: Fix netns for ip_tunnel_init_flow() The device denoted by tunnel->parms.link resides in the underlay net namespace. Therefore pass tunnel->net to ip_tunnel_init_flow(). Fixes: `db53cd3d88` ("net: Handle l3mdev in ip_tunnel_init_flow") Signed-off-by: Xiao Liang <shaw.leon@gmail.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20241219130336.103839-1-shaw.leon@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-23 10:04:41 -08:00
Wang Liang	4f4aa4aa28	net: fix memory leak in tcp_conn_request() If inet_csk_reqsk_queue_hash_add() return false, tcp_conn_request() will return without free the dst memory, which allocated in af_ops->route_req. Here is the kmemleak stack: unreferenced object 0xffff8881198631c0 (size 240): comm "softirq", pid 0, jiffies 4299266571 (age 1802.392s) hex dump (first 32 bytes): 00 10 9b 03 81 88 ff ff 80 98 da bc ff ff ff ff ................ 81 55 18 bb ff ff ff ff 00 00 00 00 00 00 00 00 .U.............. backtrace: [<ffffffffb93e8d4c>] kmem_cache_alloc+0x60c/0xa80 [<ffffffffba11b4c5>] dst_alloc+0x55/0x250 [<ffffffffba227bf6>] rt_dst_alloc+0x46/0x1d0 [<ffffffffba23050a>] __mkroute_output+0x29a/0xa50 [<ffffffffba23456b>] ip_route_output_key_hash+0x10b/0x240 [<ffffffffba2346bd>] ip_route_output_flow+0x1d/0x90 [<ffffffffba254855>] inet_csk_route_req+0x2c5/0x500 [<ffffffffba26b331>] tcp_conn_request+0x691/0x12c0 [<ffffffffba27bd08>] tcp_rcv_state_process+0x3c8/0x11b0 [<ffffffffba2965c6>] tcp_v4_do_rcv+0x156/0x3b0 [<ffffffffba299c98>] tcp_v4_rcv+0x1cf8/0x1d80 [<ffffffffba239656>] ip_protocol_deliver_rcu+0xf6/0x360 [<ffffffffba2399a6>] ip_local_deliver_finish+0xe6/0x1e0 [<ffffffffba239b8e>] ip_local_deliver+0xee/0x360 [<ffffffffba239ead>] ip_rcv+0xad/0x2f0 [<ffffffffba110943>] __netif_receive_skb_one_core+0x123/0x140 Call dst_release() to free the dst memory when inet_csk_reqsk_queue_hash_add() return false in tcp_conn_request(). Fixes: `ff46e3b442` ("Fix race for duplicate reqsk on identical SYN") Signed-off-by: Wang Liang <wangliang74@huawei.com> Link: https://patch.msgid.link/20241219072859.3783576-1-wangliang74@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-23 10:03:12 -08:00
Ben Wolsieffer	0cfc36ea51	serial: stm32: use port lock wrappers for break control Commit `30e945861f` ("serial: stm32: add support for break control") added another usage of the port lock, but was merged on the same day as `c5d0666255` ("serial: stm32: Use port lock wrappers"), therefore the latter did not update this usage to use the port lock wrappers. Fixes: `c5d0666255` ("serial: stm32: Use port lock wrappers") Cc: stable <stable@kernel.org> Signed-off-by: Ben Wolsieffer <ben.wolsieffer@hefring.com> Reviewed-by: John Ogness <john.ogness@linutronix.de> Link: https://lore.kernel.org/r/20241216145323.111612-1-ben.wolsieffer@hefring.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-23 19:01:54 +01:00
Xiaolei Wang	fbd22c4fa7	serial: imx: Use uart_port_lock_irq() instead of uart_port_lock() When executing 'echo mem > /sys/power/state', the following deadlock occurs. Since there is output during the serial port entering the suspend process, the suspend will be interrupted, resulting in the nesting of locks. Therefore, use uart_port_lock_irq() instead of uart_port_unlock(). WARNING: inconsistent lock state 6.12.0-rc2-00002-g3c199ed5bd64-dirty #23 Not tainted -------------------------------- inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage. sh/494 [HC0[0]:SC0[0]:HE1:SE1] takes: c4db5850 (&port_lock_key){?.-.}-{3:3}, at: imx_uart_enable_wakeup+0x14/0x254 {IN-HARDIRQ-W} state was registered at: lock_acquire+0x104/0x348 _raw_spin_lock+0x48/0x84 imx_uart_int+0x14/0x4dc __handle_irq_event_percpu+0xac/0x2fc handle_irq_event_percpu+0xc/0x40 handle_irq_event+0x38/0x8c handle_fasteoi_irq+0xb4/0x1b8 handle_irq_desc+0x1c/0x2c gic_handle_irq+0x6c/0xa0 generic_handle_arch_irq+0x2c/0x64 call_with_stack+0x18/0x20 __irq_svc+0x9c/0xbc _raw_spin_unlock_irqrestore+0x2c/0x48 uart_write+0xd8/0x3a0 do_output_char+0x1a8/0x1e4 n_tty_write+0x224/0x440 file_tty_write.constprop.0+0x124/0x250 do_iter_readv_writev+0x100/0x1e0 vfs_writev+0xc4/0x448 do_writev+0x68/0xf8 ret_fast_syscall+0x0/0x1c irq event stamp: 31593 hardirqs last enabled at (31593): [<c1150e48>] _raw_spin_unlock_irqrestore+0x44/0x48 hardirqs last disabled at (31592): [<c07f32f0>] clk_enable_lock+0x60/0x120 softirqs last enabled at (30334): [<c012d1d4>] handle_softirqs+0x2cc/0x478 softirqs last disabled at (30325): [<c012d510>] __irq_exit_rcu+0x120/0x15c other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&port_lock_key); <Interrupt> lock(&port_lock_key); Fixes: `3c199ed5bd` ("serial: imx: Grab port lock in imx_uart_enable_wakeup()") Signed-off-by: Xiaolei Wang <xiaolei.wang@windriver.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Fabio Estevam <festevam@gmail.com> Link: https://lore.kernel.org/r/20241210233613.2881264-1-xiaolei.wang@windriver.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-23 19:01:40 +01:00
Ilpo Järvinen	ed2761958a	tty: serial: 8250: Fix another runtime PM usage counter underflow The commit `f9b11229b7` ("serial: 8250: Fix PM usage_count for console handover") fixed one runtime PM usage counter balance problem that occurs because .dev is not set during univ8250 setup preventing call to pm_runtime_get_sync(). Later, univ8250_console_exit() will trigger the runtime PM usage counter underflow as .dev is already set at that time. Call pm_runtime_get_sync() to balance the RPM usage counter also in serial8250_register_8250_port() before trying to add the port. Reported-by: Borislav Petkov (AMD) <bp@alien8.de> Fixes: `bedb404e91` ("serial: 8250_port: Don't use power management for kernel console") Cc: stable <stable@kernel.org> Tested-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20241210170120.2231-1-ilpo.jarvinen@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-23 19:01:18 +01:00
Akash M	dfc51e48bc	usb: gadget: f_fs: Remove WARN_ON in functionfs_bind This commit addresses an issue related to below kernel panic where panic_on_warn is enabled. It is caused by the unnecessary use of WARN_ON in functionsfs_bind, which easily leads to the following scenarios. 1.adb_write in adbd 2. UDC write via configfs ================= ===================== ->usb_ffs_open_thread() ->UDC write ->open_functionfs() ->configfs_write_iter() ->adb_open() ->gadget_dev_desc_UDC_store() ->adb_write() ->usb_gadget_register_driver_owner ->driver_register() ->StartMonitor() ->bus_add_driver() ->adb_read() ->gadget_bind_driver() <times-out without BIND event> ->configfs_composite_bind() ->usb_add_function() ->open_functionfs() ->ffs_func_bind() ->adb_open() ->functionfs_bind() <ffs->state !=FFS_ACTIVE> The adb_open, adb_read, and adb_write operations are invoked from the daemon, but trying to bind the function is a process that is invoked by UDC write through configfs, which opens up the possibility of a race condition between the two paths. In this race scenario, the kernel panic occurs due to the WARN_ON from functionfs_bind when panic_on_warn is enabled. This commit fixes the kernel panic by removing the unnecessary WARN_ON. Kernel panic - not syncing: kernel: panic_on_warn set ... [ 14.542395] Call trace: [ 14.542464] ffs_func_bind+0x1c8/0x14a8 [ 14.542468] usb_add_function+0xcc/0x1f0 [ 14.542473] configfs_composite_bind+0x468/0x588 [ 14.542478] gadget_bind_driver+0x108/0x27c [ 14.542483] really_probe+0x190/0x374 [ 14.542488] __driver_probe_device+0xa0/0x12c [ 14.542492] driver_probe_device+0x3c/0x220 [ 14.542498] __driver_attach+0x11c/0x1fc [ 14.542502] bus_for_each_dev+0x104/0x160 [ 14.542506] driver_attach+0x24/0x34 [ 14.542510] bus_add_driver+0x154/0x270 [ 14.542514] driver_register+0x68/0x104 [ 14.542518] usb_gadget_register_driver_owner+0x48/0xf4 [ 14.542523] gadget_dev_desc_UDC_store+0xf8/0x144 [ 14.542526] configfs_write_iter+0xf0/0x138 Fixes: `ddf8abd259` ("USB: f_fs: the FunctionFS driver") Cc: stable <stable@kernel.org> Signed-off-by: Akash M <akash.m5@samsung.com> Link: https://lore.kernel.org/r/20241219125221.1679-1-akash.m5@samsung.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-23 18:54:34 +01:00
Kai-Heng Feng	59bfeaf545	USB: core: Disable LPM only for non-suspended ports There's USB error when tegra board is shutting down: [ 180.919315] usb 2-3: Failed to set U1 timeout to 0x0,error code -113 [ 180.919995] usb 2-3: Failed to set U1 timeout to 0xa,error code -113 [ 180.920512] usb 2-3: Failed to set U2 timeout to 0x4,error code -113 [ 186.157172] tegra-xusb 3610000.usb: xHCI host controller not responding, assume dead [ 186.157858] tegra-xusb 3610000.usb: HC died; cleaning up [ 186.317280] tegra-xusb 3610000.usb: Timeout while waiting for evaluate context command The issue is caused by disabling LPM on already suspended ports. For USB2 LPM, the LPM is already disabled during port suspend. For USB3 LPM, port won't transit to U1/U2 when it's already suspended in U3, hence disabling LPM is only needed for ports that are not suspended. Cc: Wayne Chang <waynec@nvidia.com> Cc: stable <stable@kernel.org> Fixes: `d920a2ed86` ("usb: Disable USB3 LPM at shutdown") Signed-off-by: Kai-Heng Feng <kaihengf@nvidia.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Tested-by: Jon Hunter <jonathanh@nvidia.com> Link: https://lore.kernel.org/r/20241206074817.89189-1-kaihengf@nvidia.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-23 18:54:20 +01:00
Ma Ke	0df11fa8ce	usb: fix reference leak in usb_new_device() When device_add(&udev->dev) succeeds and a later call fails, usb_new_device() does not properly call device_del(). As comment of device_add() says, 'if device_add() succeeds, you should call device_del() when you want to get rid of it. If device_add() has not succeeded, use only put_device() to drop the reference count'. Found by code review. Cc: stable <stable@kernel.org> Fixes: `9f8b17e643` ("USB: make usbdevices export their device nodes instead of using a separate class") Signed-off-by: Ma Ke <make_ruc2021@163.com> Reviewed-by: Alan Stern <stern@rowland.harvard.edu> Link: https://lore.kernel.org/r/20241218071346.2973980-1-make_ruc2021@163.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-23 18:54:04 +01:00
Xu Yang	862a9c0f68	usb: typec: tcpci: fix NULL pointer issue on shared irq case The tcpci_irq() may meet below NULL pointer dereference issue: [ 2.641851] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000010 [ 2.641951] status 0x1, 0x37f [ 2.650659] Mem abort info: [ 2.656490] ESR = 0x0000000096000004 [ 2.660230] EC = 0x25: DABT (current EL), IL = 32 bits [ 2.665532] SET = 0, FnV = 0 [ 2.668579] EA = 0, S1PTW = 0 [ 2.671715] FSC = 0x04: level 0 translation fault [ 2.676584] Data abort info: [ 2.679459] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 [ 2.684936] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [ 2.689980] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 2.695284] [0000000000000010] user address but active_mm is swapper [ 2.701632] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP [ 2.707883] Modules linked in: [ 2.710936] CPU: 1 UID: 0 PID: 87 Comm: irq/111-2-0051 Not tainted 6.12.0-rc6-06316-g7f63786ad3d1-dirty #4 [ 2.720570] Hardware name: NXP i.MX93 11X11 EVK board (DT) [ 2.726040] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 2.732989] pc : tcpci_irq+0x38/0x318 [ 2.736647] lr : _tcpci_irq+0x14/0x20 [ 2.740295] sp : ffff80008324bd30 [ 2.743597] x29: ffff80008324bd70 x28: ffff800080107894 x27: ffff800082198f70 [ 2.750721] x26: ffff0000050e6680 x25: ffff000004d172ac x24: ffff0000050f0000 [ 2.757845] x23: ffff000004d17200 x22: 0000000000000001 x21: ffff0000050f0000 [ 2.764969] x20: ffff000004d17200 x19: 0000000000000000 x18: 0000000000000001 [ 2.772093] x17: 0000000000000000 x16: ffff80008183d8a0 x15: ffff00007fbab040 [ 2.779217] x14: ffff00007fb918c0 x13: 0000000000000000 x12: 000000000000017a [ 2.786341] x11: 0000000000000001 x10: 0000000000000a90 x9 : ffff80008324bd00 [ 2.793465] x8 : ffff0000050f0af0 x7 : ffff00007fbaa840 x6 : 0000000000000031 [ 2.800589] x5 : 000000000000017a x4 : 0000000000000002 x3 : 0000000000000002 [ 2.807713] x2 : ffff80008324bd3a x1 : 0000000000000010 x0 : 0000000000000000 [ 2.814838] Call trace: [ 2.817273] tcpci_irq+0x38/0x318 [ 2.820583] _tcpci_irq+0x14/0x20 [ 2.823885] irq_thread_fn+0x2c/0xa8 [ 2.827456] irq_thread+0x16c/0x2f4 [ 2.830940] kthread+0x110/0x114 [ 2.834164] ret_from_fork+0x10/0x20 [ 2.837738] Code: f9426420 f9001fe0 d2800000 52800201 (f9400a60) This may happen on shared irq case. Such as two Type-C ports share one irq. After the first port finished tcpci_register_port(), it may trigger interrupt. However, if the interrupt comes by chance the 2nd port finishes devm_request_threaded_irq(), the 2nd port interrupt handler will run at first. Then the above issue happens due to tcpci is still a NULL pointer in tcpci_irq() when dereference to regmap. devm_request_threaded_irq() <-- port1 irq comes disable_irq(client->irq); tcpci_register_port() This will restore the logic to the state before commit (`77e85107a7` "usb: typec: tcpci: support edge irq"). However, moving tcpci_register_port() earlier creates a problem when use edge irq because tcpci_init() will be called before devm_request_threaded_irq(). The tcpci_init() writes the ALERT_MASK to the hardware to tell it to start generating interrupts but we're not ready to deal with them yet, then the ALERT events may be missed and ALERT line will not recover to high level forever. To avoid the issue, this will also set ALERT_MASK register after devm_request_threaded_irq() return. Fixes: `77e85107a7` ("usb: typec: tcpci: support edge irq") Cc: stable <stable@kernel.org> Tested-by: Emanuele Ghidoli <emanuele.ghidoli@toradex.com> Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Reviewed-by: Francesco Dolcini <francesco.dolcini@toradex.com> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://lore.kernel.org/r/20241218095328.2604607-1-xu.yang_2@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-23 18:53:46 +01:00
Lianqin Hu	13014969cb	usb: gadget: u_serial: Disable ep before setting port to null to fix the crash caused by port being null Considering that in some extreme cases, when performing the unbinding operation, gserial_disconnect has cleared gser->ioport, which triggers gadget reconfiguration, and then calls gs_read_complete, resulting in access to a null pointer. Therefore, ep is disabled before gserial_disconnect sets port to null to prevent this from happening. Call trace: gs_read_complete+0x58/0x240 usb_gadget_giveback_request+0x40/0x160 dwc3_remove_requests+0x170/0x484 dwc3_ep0_out_start+0xb0/0x1d4 __dwc3_gadget_start+0x25c/0x720 kretprobe_trampoline.cfi_jt+0x0/0x8 kretprobe_trampoline.cfi_jt+0x0/0x8 udc_bind_to_driver+0x1d8/0x300 usb_gadget_probe_driver+0xa8/0x1dc gadget_dev_desc_UDC_store+0x13c/0x188 configfs_write_iter+0x160/0x1f4 vfs_write+0x2d0/0x40c ksys_write+0x7c/0xf0 __arm64_sys_write+0x20/0x30 invoke_syscall+0x60/0x150 el0_svc_common+0x8c/0xf8 do_el0_svc+0x28/0xa0 el0_svc+0x24/0x84 Fixes: `c1dca562be` ("usb gadget: split out serial core") Cc: stable <stable@kernel.org> Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Lianqin Hu <hulianqin@vivo.com> Link: https://lore.kernel.org/r/TYUPR06MB621733B5AC690DBDF80A0DCCD2042@TYUPR06MB6217.apcprd06.prod.outlook.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-23 18:53:31 +01:00
Joe Hattori	2b6ffcd787	net: stmmac: restructure the error path of stmmac_probe_config_dt() Current implementation of stmmac_probe_config_dt() does not release the OF node reference obtained by of_parse_phandle() in some error paths. The problem is that some error paths call stmmac_remove_config_dt() to clean up but others use and unwind ladder. These two types of error handling have not kept in sync and have been a recurring source of bugs. Re-write the error handling in stmmac_probe_config_dt() to use an unwind ladder. Consequently, stmmac_remove_config_dt() is not needed anymore, thus remove it. This bug was found by an experimental verification tool that I am developing. Fixes: `4838a54050` ("net: stmmac: Fix wrapper drivers not detecting PHY") Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp> Link: https://patch.msgid.link/20241219024119.2017012-1-joe@pf.is.s.u-tokyo.ac.jp Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-23 09:52:47 -08:00
Joe Hattori	74adad5003	usb: chipidea: ci_hdrc_imx: decrement device's refcount in .remove() and in the error path of .probe() Current implementation of ci_hdrc_imx_driver does not decrement the refcount of the device obtained in usbmisc_get_init_data(). Add a put_device() call in .remove() and in .probe() before returning an error. This bug was found by an experimental static analysis tool that I am developing. Cc: stable <stable@kernel.org> Fixes: `f40017e0f3` ("chipidea: usbmisc_imx: Add USB support for VF610 SoCs") Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp> Acked-by: Peter Chen <peter.chen@kernel.org> Link: https://lore.kernel.org/r/20241216015539.352579-1-joe@pf.is.s.u-tokyo.ac.jp Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-23 18:51:27 +01:00
Abel Vesa	f47eba045e	usb: typec: ucsi: Set orientation as none when connector is unplugged The current implementation of the ucsi glink client connector_status() callback is only relying on the state of the gpio. This means that even when the cable is unplugged, the orientation propagated to the switches along the graph is "orientation normal", instead of "orientation none", which would be the correct one in this case. One of the Qualcomm DP-USB PHY combo drivers, which needs to be aware of the orientation change, is relying on the "orientation none" to skip the reinitialization of the entire PHY. Since the ucsi glink client advertises "orientation normal" even when the cable is unplugged, the mentioned PHY is taken down and reinitialized when in fact it should be left as-is. This triggers a crash within the displayport controller driver in turn, which brings the whole system down on some Qualcomm platforms. Propagating "orientation none" from the ucsi glink client on the connector_status() callback hides the problem of the mentioned PHY driver away for now. But the "orientation none" is nonetheless the correct one to be used in this case. So propagate the "orientation none" instead when the connector status flags says cable is disconnected. Fixes: `76716fd5bf` ("usb: typec: ucsi: glink: move GPIO reading into connector_status callback") Cc: stable <stable@kernel.org> # 6.10 Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Abel Vesa <abel.vesa@linaro.org> Reviewed-by: Johan Hovold <johan+linaro@kernel.org> Tested-by: Johan Hovold <johan+linaro@kernel.org> Link: https://lore.kernel.org/r/20241212-usb-typec-ucsi-glink-add-orientation-none-v2-1-db5a50498a77@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-23 18:51:03 +01:00
Ingo Rohloff	9466545720	usb: gadget: configfs: Ignore trailing LF for user strings to cdev Since commit `c033563220` ("usb: gadget: configfs: Attach arbitrary strings to cdev") a user can provide extra string descriptors to a USB gadget via configfs. For "manufacturer", "product", "serialnumber", setting the string via configfs ignores a trailing LF. For the arbitrary strings the LF was not ignored. This patch ignores a trailing LF to make this consistent with the existing behavior for "manufacturer", ... string descriptors. Fixes: `c033563220` ("usb: gadget: configfs: Attach arbitrary strings to cdev") Cc: stable <stable@kernel.org> Signed-off-by: Ingo Rohloff <ingo.rohloff@lauterbach.com> Link: https://lore.kernel.org/r/20241212154114.29295-1-ingo.rohloff@lauterbach.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-23 18:50:53 +01:00
Jun Yan	7a3d76a0b6	USB: usblp: return error when setting unsupported protocol Fix the regression introduced by commit `d8c6edfa3f` ("USB: usblp: don't call usb_set_interface if there's a single alt"), which causes that unsupported protocols can also be set via ioctl when the num_altsetting of the device is 1. Move the check for protocol support to the earlier stage. Fixes: `d8c6edfa3f` ("USB: usblp: don't call usb_set_interface if there's a single alt") Cc: stable <stable@kernel.org> Signed-off-by: Jun Yan <jerrysteve1101@gmail.com> Link: https://lore.kernel.org/r/20241212143852.671889-1-jerrysteve1101@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-23 18:50:41 +01:00
Prashanth K	057bd54dfc	usb: gadget: f_uac2: Fix incorrect setting of bNumEndpoints Currently afunc_bind sets std_ac_if_desc.bNumEndpoints to 1 if controls (mute/volume) are enabled. During next afunc_bind call, bNumEndpoints would be unchanged and incorrectly set to 1 even if the controls aren't enabled. Fix this by resetting the value of bNumEndpoints to 0 on every afunc_bind call. Fixes: `eaf6cbe099` ("usb: gadget: f_uac2: add volume and mute support") Cc: stable <stable@kernel.org> Signed-off-by: Prashanth K <quic_prashk@quicinc.com> Link: https://lore.kernel.org/r/20241211115915.159864-1-quic_prashk@quicinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-23 18:50:28 +01:00
Mohsin Bashir	a072ffd896	eth: fbnic: fix csr boundary for RPM RAM section The CSR dump support leverages the FBNIC_BOUNDS macro, which pads the end condition for each section by adding an offset of 1. However, the RPC RAM section, which is dumped differently from other sections, does not rely on this macro and instead directly uses end boundary address. Hence, subtracting 1 from the end address results in skipping a register. Fixes `3d12862b21` (“eth: fbnic: Add support to dump registers”) Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://patch.msgid.link/20241218232614.439329-1-mohsin.bashr@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-23 09:50:14 -08:00
Dan Carpenter	b9711ff7cd	usb: typec: tcpm/tcpci_maxim: fix error code in max_contaminant_read_resistance_kohm() If max_contaminant_read_adc_mv() fails, then return the error code. Don't return zero. Fixes: `02b332a063` ("usb: typec: maxim_contaminant: Implement check_contaminant callback") Cc: stable <stable@kernel.org> Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: André Draszik <andre.draszik@linaro.org> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/f1bf3768-419e-40dd-989c-f7f455d6c824@stanley.mountain Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-23 18:50:09 +01:00
Xu Yang	e19852d0bf	usb: host: xhci-plat: set skip_phy_initialization if software node has XHCI_SKIP_PHY_INIT property The source of quirk XHCI_SKIP_PHY_INIT comes from xhci_plat_priv.quirks or software node property. This will set skip_phy_initialization if software node also has XHCI_SKIP_PHY_INIT property. Fixes: `a6cd2b3fa8` ("usb: host: xhci-plat: Parse xhci-missing_cas_quirk and apply quirk") Cc: stable <stable@kernel.org> Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Link: https://lore.kernel.org/r/20241209111423.4085548-1-xu.yang_2@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-23 18:50:05 +01:00
Prashanth K	625e70ccb7	usb: dwc3-am62: Disable autosuspend during remove Runtime PM documentation (Section 5) mentions, during remove() callbacks, drivers should undo the runtime PM changes done in probe(). Usually this means calling pm_runtime_disable(), pm_runtime_dont_use_autosuspend() etc. Hence add missing function to disable autosuspend on dwc3-am62 driver unbind. Fixes: `e8784c0aec` ("drivers: usb: dwc3: Add AM62 USB wrapper driver") Cc: stable <stable@kernel.org> Signed-off-by: Prashanth K <quic_prashk@quicinc.com> Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Link: https://lore.kernel.org/r/20241209105728.3216872-1-quic_prashk@quicinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-23 18:49:51 +01:00
André Draszik	01ea6bf5cb	usb: dwc3: gadget: fix writing NYET threshold Before writing a new value to the register, the old value needs to be masked out for the new value to be programmed as intended, because at least in some cases the reset value of that field is 0xf (max value). At the moment, the dwc3 core initialises the threshold to the maximum value (0xf), with the option to override it via a DT. No upstream DTs seem to override it, therefore this commit doesn't change behaviour for any upstream platform. Nevertheless, the code should be fixed to have the desired outcome. Do so. Fixes: `80caf7d21a` ("usb: dwc3: add lpm erratum support") Cc: stable@vger.kernel.org # 5.10+ (needs adjustment for 5.4) Signed-off-by: André Draszik <andre.draszik@linaro.org> Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Link: https://lore.kernel.org/r/20241209-dwc3-nyet-fix-v2-1-02755683345b@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-23 18:47:13 +01:00
Lucas De Marchi	fe39b222a4	drm/xe: Fix fault on fd close after unbind If userspace holds an fd open, unbinds the device and then closes it, the driver shouldn't try to access the hardware. Protect it by using drm_dev_enter()/drm_dev_exit(). This fixes the following page fault: <6> [IGT] xe_wedged: exiting, ret=98 <1> BUG: unable to handle page fault for address: ffffc901bc5e508c <1> #PF: supervisor read access in kernel mode <1> #PF: error_code(0x0000) - not-present page ... <4> xe_lrc_update_timestamp+0x1c/0xd0 [xe] <4> xe_exec_queue_update_run_ticks+0x50/0xb0 [xe] <4> xe_exec_queue_fini+0x16/0xb0 [xe] <4> __guc_exec_queue_fini_async+0xc4/0x190 [xe] <4> guc_exec_queue_fini_async+0xa0/0xe0 [xe] <4> guc_exec_queue_fini+0x23/0x40 [xe] <4> xe_exec_queue_destroy+0xb3/0xf0 [xe] <4> xe_file_close+0xd4/0x1a0 [xe] <4> drm_file_free+0x210/0x280 [drm] <4> drm_close_helper.isra.0+0x6d/0x80 [drm] <4> drm_release_noglobal+0x20/0x90 [drm] Fixes: `514447a121` ("drm/xe: Stop accumulating LRC timestamp on job_free") Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3421 Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241218053122.2730195-1-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit `4ca1fd4183`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2024-12-23 16:19:52 +01:00
Michal Wajdeczko	af12ba67d0	drm/xe/pf: Use correct function to check LMEM provisioning There is a typo in function call and instead of VF LMEM we were looking at VF GGTT provisioning. Fix that. Fixes: `234670cea9` ("drm/xe/pf: Skip fair VFs provisioning if already provisioned") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Piotr Piórkowski <piotr.piorkowski@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241216223253.819-1-michal.wajdeczko@intel.com (cherry picked from commit `a8d0aa0e7f`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2024-12-23 16:19:52 +01:00
Nirmoy Das	5e0a67fdb8	drm/xe: Wait for migration job before unmapping pages Fix a potential GPU page fault during tt -> system moves by waiting for migration jobs to complete before unmapping SG. This ensures that IOMMU mappings are not prematurely torn down while a migration job is still in progress. v2: Use intr=false(Matt A) v3: Update commit message(Matt A) v4: s/DMA_RESV_USAGE_BOOKKEEP/DMA_RESV_USAGE_KERNEL(Thomas) Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3466 Fixes: `75521e8b56` ("drm/xe: Perform dma_map when moving system buffer objects to TT") Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: stable@vger.kernel.org # v6.11+ Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241213122415.3880017-2-nirmoy.das@intel.com Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> (cherry picked from commit `cda06412c0`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2024-12-23 16:19:52 +01:00
Nirmoy Das	528cef1b41	drm/xe: Use non-interruptible wait when moving BO to system Ensure a non-interruptible wait is used when moving a bo to XE_PL_SYSTEM. This prevents dma_mappings from being removed prematurely while a GPU job is still in progress, even if the CPU receives a signal during the operation. Fixes: `75521e8b56` ("drm/xe: Perform dma_map when moving system buffer objects to TT") Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: stable@vger.kernel.org # v6.11+ Suggested-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241213122415.3880017-1-nirmoy.das@intel.com Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> (cherry picked from commit `dc5e20ae1f`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2024-12-23 16:19:52 +01:00
John Harrison	a53da2fb25	drm/xe: Revert some changes that break a mesa debug tool There is a mesa debug tool for decoding devcoredump files. Recent changes to improve the devcoredump output broke that tool. So revert the changes until the tool can be extended to support the new fields. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Fixes: `c28fd6c358` ("drm/xe/devcoredump: Improve section headings and add tile info") Fixes: `ec1455ce7e` ("drm/xe/devcoredump: Add ASCII85 dump helper function") Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Julia Filipchuk <julia.filipchuk@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: intel-xe@lists.freedesktop.org Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241213172833.1733376-1-John.C.Harrison@Intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit `70fb86a85d`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2024-12-23 16:19:52 +01:00
Masami Hiramatsu (Google)	d685d55dfc	tracing/kprobe: Make trace_kprobe's module callback called after jump_label update Make sure the trace_kprobe's module notifer callback function is called after jump_label's callback is called. Since the trace_kprobe's callback eventually checks jump_label address during registering new kprobe on the loading module, jump_label must be updated before this registration happens. Link: https://lore.kernel.org/all/173387585556.995044.3157941002975446119.stgit@devnote2/ Fixes: `6142431810` ("tracing/kprobes: Support module init function probing") Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>	2024-12-24 00:08:13 +09:00
Chengchang Tang	e3debdd484	RDMA/hns: Fix missing flush CQE for DWQE Flush CQE handler has not been called if QP state gets into errored mode in DWQE path. So, the new added outstanding WQEs will never be flushed. It leads to a hung task timeout when using NFS over RDMA: __switch_to+0x7c/0xd0 __schedule+0x350/0x750 schedule+0x50/0xf0 schedule_timeout+0x2c8/0x340 wait_for_common+0xf4/0x2b0 wait_for_completion+0x20/0x40 __ib_drain_sq+0x140/0x1d0 [ib_core] ib_drain_sq+0x98/0xb0 [ib_core] rpcrdma_xprt_disconnect+0x68/0x270 [rpcrdma] xprt_rdma_close+0x20/0x60 [rpcrdma] xprt_autoclose+0x64/0x1cc [sunrpc] process_one_work+0x1d8/0x4e0 worker_thread+0x154/0x420 kthread+0x108/0x150 ret_from_fork+0x10/0x18 Fixes: `01584a5edc` ("RDMA/hns: Add support of direct wqe") Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://patch.msgid.link/20241220055249.146943-5-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-12-23 09:58:30 -05:00
Chengchang Tang	fa5c4ba8cd	RDMA/hns: Fix warning storm caused by invalid input in IO path WARN_ON() is called in the IO path. And it could lead to a warning storm. Use WARN_ON_ONCE() instead of WARN_ON(). Fixes: `12542f1de1` ("RDMA/hns: Refactor process about opcode in post_send()") Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://patch.msgid.link/20241220055249.146943-4-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-12-23 09:58:30 -05:00
Chengchang Tang	0572eccf23	RDMA/hns: Fix accessing invalid dip_ctx during destroying QP If it fails to modify QP to RTR, dip_ctx will not be attached. And during detroying QP, the invalid dip_ctx pointer will be accessed. Fixes: `faa62440a5` ("RDMA/hns: Fix different dgids mapping to the same dip_idx") Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://patch.msgid.link/20241220055249.146943-3-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-12-23 09:58:30 -05:00
wenglianfa	8673a6c2d9	RDMA/hns: Fix mapping error of zero-hop WQE buffer Due to HW limitation, the three region of WQE buffer must be mapped and set to HW in a fixed order: SQ buffer, SGE buffer, and RQ buffer. Currently when one region is zero-hop while the other two are not, the zero-hop region will not be mapped. This violate the limitation above and leads to address error. Fixes: `38389eaa4d` ("RDMA/hns: Add mtr support for mixed multihop addressing") Signed-off-by: wenglianfa <wenglianfa@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://patch.msgid.link/20241220055249.146943-2-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-12-23 09:58:30 -05:00
Chun-Kuang Hu	d08555758f	Revert "drm/mediatek: dsi: Correct calculation formula of PHY Timing" This reverts commit `417d8c4727`. With that patch the panel in the Tentacruel ASUS Chromebook CM14 (CM1402F) flickers. There are 1 or 2 times per second a black panel. Stable Kernel 6.11.5 and mainline 6.12-rc4 works only when reverse that patch. Fixes: `417d8c4727` ("drm/mediatek: dsi: Correct calculation formula of PHY Timing") Cc: stable@vger.kernel.org Cc: Shuijing Li <shuijing.li@mediatek.com> Reported-by: Jens Ziller <zillerbaer@gmx.de> Closes: https://patchwork.kernel.org/project/dri-devel/patch/20240412031208.30688-1-shuijing.li@mediatek.com/ Link: https://patchwork.kernel.org/project/dri-devel/patch/20241212001908.6056-1-chunkuang.hu@kernel.org/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>	2024-12-23 14:34:12 +00:00
Dr. David Alan Gilbert	f17224c2a7	cifs: Remove unused is_server_using_iface() The last use of is_server_using_iface() was removed in 2022 by commit `aa45dadd34` ("cifs: change iface_list from array to sorted linked list") Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2024-12-23 08:06:05 -06:00
Bharath SM	b8ea3b1ff5	smb: enable reuse of deferred file handles for write operations Previously, deferred file handles were reused only for read operations, this commit extends to reusing deferred handles for write operations. By reusing these handles we can reduce the need for open/close operations over the wire. Signed-off-by: Bharath SM <bharathsm@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2024-12-23 08:05:39 -06:00
Sebastian Andrzej Siewior	0b7a66a2c8	preempt: Move PREEMPT_RT before PREEMPT in vermagic. Since the dynamic preemption has been enabled for PREEMPT_RT we have now CONFIG_PREEMPT and CONFIG_PREEMPT_RT set simultaneously. This affects the vermagic strings which comes now PREEMPT with PREEMPT_RT enabled. The PREEMPT_RT module usually can not be loaded on a PREEMPT kernel because some symbols are missing. However if the symbols are fine then it continues and it crashes later. The problem is that the struct module has a different layout and the num_exentries or init members are at a different position leading to a crash later on. This is not necessary caught by the size check in elf_validity_cache_index_mod() because the mem member has an alignment requirement of __module_memory_align which is big enough keep the total size unchanged. Therefore we should keep the string accurate instead of removing it. Move the PREEMPT_RT check before the PREEMPT so that it takes precedence if both symbols are enabled. Fixes: `35772d627b` ("sched: Enable PREEMPT_DYNAMIC for PREEMPT_RT") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Petr Pavlu <petr.pavlu@suse.com> Link: https://lore.kernel.org/r/20241205160602.3lIAsJRT@linutronix.de Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>	2024-12-23 10:46:38 +01:00
Linus Torvalds	4bbf9020be	Linux 6.13-rc4	2024-12-22 13:22:21 -08:00
Linus Torvalds	b1fdbe77be	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM x86 fixes from Paolo Bonzini: - Disable AVIC on SNP-enabled systems that don't allow writes to the virtual APIC page, as such hosts will hit unexpected RMP #PFs in the host when running VMs of any flavor. - Fix a WARN in the hypercall completion path due to KVM trying to determine if a guest with protected register state is in 64-bit mode (KVM's ABI is to assume such guests only make hypercalls in 64-bit mode). - Allow the guest to write to supported bits in MSR_AMD64_DE_CFG to fix a regression with Windows guests, and because KVM's read-only behavior appears to be entirely made up. - Treat TDP MMU faults as spurious if the faulting access is allowed given the existing SPTE. This fixes a benign WARN (other than the WARN itself) due to unexpectedly replacing a writable SPTE with a read-only SPTE. - Emit a warning when KVM is configured with ignore_msrs=1 and also to hide the MSRs that the guest is looking for from the kernel logs. ignore_msrs can trick guests into assuming that certain processor features are present, and this in turn leads to bogus bug reports. * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: let it be known that ignore_msrs is a bad idea KVM: VMX: don't include '<linux/find.h>' directly KVM: x86/mmu: Treat TDP MMU faults as spurious if access is already allowed KVM: SVM: Allow guest writes to set MSR_AMD64_DE_CFG bits KVM: x86: Play nice with protected guests in complete_hypercall_exit() KVM: SVM: Disable AVIC on SNP-enabled system without HvInUseWrAllowed feature	2024-12-22 12:16:41 -08:00
Paolo Bonzini	8afa5b10af	Merge tag 'kvm-x86-fixes-6.13-rcN' of https://github.com/kvm-x86/linux into HEAD KVM x86 fixes for 6.13: - Disable AVIC on SNP-enabled systems that don't allow writes to the virtual APIC page, as such hosts will hit unexpected RMP #PFs in the host when running VMs of any flavor. - Fix a WARN in the hypercall completion path due to KVM trying to determine if a guest with protected register state is in 64-bit mode (KVM's ABI is to assume such guests only make hypercalls in 64-bit mode). - Allow the guest to write to supported bits in MSR_AMD64_DE_CFG to fix a regression with Windows guests, and because KVM's read-only behavior appears to be entirely made up. - Treat TDP MMU faults as spurious if the faulting access is allowed given the existing SPTE. This fixes a benign WARN (other than the WARN itself) due to unexpectedly replacing a writable SPTE with a read-only SPTE.	2024-12-22 12:07:16 -05:00
Paolo Bonzini	398b7b6cb9	KVM: x86: let it be known that ignore_msrs is a bad idea When running KVM with ignore_msrs=1 and report_ignored_msrs=0, the user has no clue that that the guest is being lied to. This may cause bug reports such as https://gitlab.com/qemu-project/qemu/-/issues/2571, where enabling a CPUID bit in QEMU caused Linux guests to try reading MSR_CU_DEF_ERR; and being lied about the existence of MSR_CU_DEF_ERR caused the guest to assume other things about the local APIC which were not true: Sep 14 12:02:53 kernel: mce: [Firmware Bug]: Your BIOS is not setting up LVT offset 0x2 for deferred error IRQs correctly. Sep 14 12:02:53 kernel: unchecked MSR access error: RDMSR from 0x852 at rIP: 0xffffffffb548ffa7 (native_read_msr+0x7/0x40) Sep 14 12:02:53 kernel: Call Trace: ... Sep 14 12:02:53 kernel: native_apic_msr_read+0x20/0x30 Sep 14 12:02:53 kernel: setup_APIC_eilvt+0x47/0x110 Sep 14 12:02:53 kernel: mce_amd_feature_init+0x485/0x4e0 ... Sep 14 12:02:53 kernel: [Firmware Bug]: cpu 0, try to use APIC520 (LVT offset 2) for vector 0xf4, but the register is already in use for vector 0x0 on this cpu Without reported_ignored_msrs=0 at least the host kernel log will contain enough information to avoid going on a wild goose chase. But if reports about individual MSR accesses are being silenced too, at least complain loudly the first time a VM is started. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-12-22 12:06:01 -05:00
Wolfram Sang	37d1d99b88	KVM: VMX: don't include '<linux/find.h>' directly The header clearly states that it does not want to be included directly, only via '<linux/bitmap.h>'. Replace the include accordingly. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Message-ID: <20241217070539.2433-2-wsa+renesas@sang-engineering.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-12-22 12:04:57 -05:00
Linus Torvalds	bcde95ce32	Merge tag 'devicetree-fixes-for-6.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull devicetree fixes from Rob Herring: - Disable #address-cells/#size-cells warning on coreboot (Chromebooks) platforms - Add missing root #address-cells/#size-cells in default empty DT - Fix uninitialized variable in of_irq_parse_one() - Fix interrupt-map cell length check in of_irq_parse_imap_parent() - Fix refcount handling in __of_get_dma_parent() - Fix error path in of_parse_phandle_with_args_map() - Fix dma-ranges handling with flags cells - Drop explicit fw_devlink handling of 'interrupt-parent' - Fix "compression" typo in fixed-partitions binding - Unify "fsl,liodn" property type definitions * tag 'devicetree-fixes-for-6.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: of: Add coreboot firmware to excluded default cells list of/irq: Fix using uninitialized variable @addr_len in API of_irq_parse_one() of/irq: Fix interrupt-map cell length check in of_irq_parse_imap_parent() of: Fix refcount leakage for OF node returned by __of_get_dma_parent() of: Fix error path in of_parse_phandle_with_args_map() dt-bindings: mtd: fixed-partitions: Fix "compression" typo of: Add #address-cells/#size-cells in the device-tree root empty node dt-bindings: Unify "fsl,liodn" type definitions of: address: Preserve the flags portion on 1:1 dma-ranges mapping of/unittest: Add empty dma-ranges address translation tests of: property: fw_devlink: Do not use interrupt-parent directly	2024-12-22 08:40:23 -08:00
Linus Torvalds	48f506ad0b	Merge tag 'soc-fixes-6.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull SoC fixes from Arnd Bergmann: "Two more small fixes, correcting the cacheline size on Raspberry Pi 5 and fixing a logic mistake in the microchip mpfs firmware driver" * tag 'soc-fixes-6.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: arm64: dts: broadcom: Fix L2 linesize for Raspberry Pi 5 firmware: microchip: fix UL_IAP lock check in mpfs_auto_update_state()	2024-12-21 15:45:06 -08:00
Linus Torvalds	4aa748dd1a	Merge tag 'mm-hotfixes-stable-2024-12-21-12-09' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "25 hotfixes. 16 are cc:stable. 19 are MM and 6 are non-MM. The usual bunch of singletons and doubletons - please see the relevant changelogs for details" * tag 'mm-hotfixes-stable-2024-12-21-12-09' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (25 commits) mm: huge_memory: handle strsep not finding delimiter alloc_tag: fix set_codetag_empty() when !CONFIG_MEM_ALLOC_PROFILING_DEBUG alloc_tag: fix module allocation tags populated area calculation mm/codetag: clear tags before swap mm/vmstat: fix a W=1 clang compiler warning mm: convert partially_mapped set/clear operations to be atomic nilfs2: fix buffer head leaks in calls to truncate_inode_pages() vmalloc: fix accounting with i915 mm/page_alloc: don't call pfn_to_page() on possibly non-existent PFN in split_large_buddy() fork: avoid inappropriate uprobe access to invalid mm nilfs2: prevent use of deleted inode zram: fix uninitialized ZRAM not releasing backing device zram: refuse to use zero sized block device as backing device mm: use clear_user_(high)page() for arch with special user folio handling mm: introduce cpu_icache_is_aliasing() across all architectures mm: add RCU annotation to pte_offset_map(_lock) mm: correctly reference merged VMA mm: use aligned address in copy_user_gigantic_page() mm: use aligned address in clear_gigantic_page() mm: shmem: fix ShmemHugePages at swapout ...	2024-12-21 15:31:56 -08:00
Steven Rostedt	e84a3bf7f4	staging: gpib: Fix allyesconfig build failures My tests run an allyesconfig build and it failed with the following errors: LD [M] samples/kfifo/dma-example.ko ld.lld: error: undefined symbol: nec7210_board_reset ld.lld: error: undefined symbol: nec7210_read ld.lld: error: undefined symbol: nec7210_write It appears that some modules call the function nec7210_board_reset() that is defined in nec7210.c. In an allyesconfig build, these other modules are built in. But the file that holds nec7210_board_reset() has: obj-m += nec7210.o Where that "-m" means it only gets built as a module. With the other modules built in, they have no access to nec7210_board_reset() and the build fails. This isn't the only function. After fixing that one, I hit another: ld.lld: error: undefined symbol: push_gpib_event ld.lld: error: undefined symbol: gpib_match_device_path Where push_gpib_event() was also used outside of the file it was defined in, and that file too only was built as a module. Since the directory that nec7210.c is only traversed when CONFIG_GPIB_NEC7210 is set, and the directory with gpib_common.c is only traversed when CONFIG_GPIB_COMMON is set, use those configs as the option to build those modules. When it is an allyesconfig, then they will both be built in and their functions will be available to the other modules that are also built in. Fixes: `3ba84ac69b` ("staging: gpib: Add nec7210 GPIB chip driver") Fixes: `9dde4559e9` ("staging: gpib: Add GPIB common core driver") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2024-12-21 11:30:13 -08:00
Linus Torvalds	a016546ba6	Merge tag 'kbuild-fixes-v6.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - Remove stale code in usr/include/headers_check.pl - Fix issues in the user-mode-linux Debian package - Fix false-positive "export twice" errors in modpost * tag 'kbuild-fixes-v6.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: modpost: distinguish same module paths from different dump files kbuild: deb-pkg: Do not install maint scripts for arch 'um' kbuild: deb-pkg: add debarch for ARCH=um kbuild: Drop support for include/asm-<arch> in headers_check.pl	2024-12-21 11:24:32 -08:00
Linus Torvalds	9c707ba99f	Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Pull BPF fixes from Daniel Borkmann: - Fix inlining of bpf_get_smp_processor_id helper for !CONFIG_SMP systems (Andrea Righi) - Fix BPF USDT selftests helper code to use asm constraint "m" for LoongArch (Tiezhu Yang) - Fix BPF selftest compilation error in get_uprobe_offset when PROCMAP_QUERY is not defined (Jerome Marchand) - Fix BPF bpf_skb_change_tail helper when used in context of BPF sockmap to handle negative skb header offsets (Cong Wang) - Several fixes to BPF sockmap code, among others, in the area of socket buffer accounting (Levi Zim, Zijian Zhang, Cong Wang) * tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: selftests/bpf: Test bpf_skb_change_tail() in TC ingress selftests/bpf: Introduce socket_helpers.h for TC tests selftests/bpf: Add a BPF selftest for bpf_skb_change_tail() bpf: Check negative offsets in __bpf_skb_min_len() tcp_bpf: Fix copied value in tcp_bpf_sendmsg skmsg: Return copied bytes in sk_msg_memcopy_from_iter tcp_bpf: Add sk_rmem_alloc related logic for tcp_bpf ingress redirection tcp_bpf: Charge receive socket buffer in bpf_tcp_ingress() selftests/bpf: Fix compilation error in get_uprobe_offset() selftests/bpf: Use asm constraint "m" for LoongArch bpf: Fix bpf_get_smp_processor_id() on !CONFIG_SMP	2024-12-21 11:07:19 -08:00
Linus Torvalds	876685ce5e	Merge tag 'media/v6.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: - fix a clang build issue with mediatec vcodec - add missing variable initialization to dib3000mb write function * tag 'media/v6.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: media: mediatek: vcodec: mark vdec_vp9_slice_map_counts_eob_coef noinline media: dvb-frontends: dib3000mb: fix uninit-value in dib3000_write_reg	2024-12-21 10:56:34 -08:00
Linus Torvalds	a99b4a369a	Merge tag 'pci-v6.13-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull PCI fixes from Krzysztof Wilczyński: "Two small patches that are important for fixing boot time hang on Intel JHL7540 'Titan Ridge' platforms equipped with a Thunderbolt controller. The boot time issue manifests itself when a PCI Express bandwidth control is unnecessarily enabled on the Thunderbolt controller downstream ports, which only supports a link speed of 2.5 GT/s in accordance with USB4 v2 specification (p. 671, sec. 11.2.1, "PCIe Physical Layer Logical Sub-block"). As such, there is no need to enable bandwidth control on such downstream port links, which also works around the issue. Both patches were tested by the original reporter on the hardware on which the failure origin golly manifested itself. Both fixes were proven to resolve the reported boot hang issue, and both patches have been in linux-next this week with no reported problems" * tag 'pci-v6.13-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: PCI/bwctrl: Enable only if more than one speed is supported PCI: Honor Max Link Speed when determining supported speeds	2024-12-21 10:51:04 -08:00
Linus Torvalds	78b1346123	Merge tag 'pm-6.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix some amd-pstate driver issues: - Detect preferred core support in amd-pstate before driver registration to avoid initialization ordering issues (K Prateek Nayak) - Fix issues with with boost numerator handling in amd-pstate leading to inconsistently programmed CPPC max performance values (Mario Limonciello)" * tag 'pm-6.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq/amd-pstate: Use boost numerator for upper bound of frequencies cpufreq/amd-pstate: Store the boost numerator as highest perf again cpufreq/amd-pstate: Detect preferred core support before driver registration	2024-12-21 10:47:47 -08:00
Linus Torvalds	be6bb3619e	Merge tag 'thermal-6.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull thermal control fixes from Rafael Wysocki: "Fix two issues with the user thermal thresholds feature introduced in this development cycle (Daniel Lezcano)" * tag 'thermal-6.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: thermal/thresholds: Fix boundaries and detection routine thermal/thresholds: Fix uapi header macros leading to a compilation error	2024-12-21 10:44:44 -08:00
Linus Torvalds	5100b6f9e7	Merge tag 'acpi-6.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fix from Rafael Wysocki: "Unbreak ACPI EC support on LoongArch that has been broken earlier in this development cycle (Huacai Chen)" * tag 'acpi-6.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: EC: Enable EC support on LoongArch by default	2024-12-21 10:42:35 -08:00
Linus Torvalds	baa172c77a	Merge tag '6.13-rc3-SMB3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6 Pull smb client fixes from Steve French: - fix regression in display of write stats - fix rmmod failure with network namespaces - two minor cleanups * tag '6.13-rc3-SMB3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: smb: fix bytes written value in /proc/fs/cifs/Stats smb: client: fix TCP timers deadlock after rmmod smb: client: Deduplicate "select NETFS_SUPPORT" in Kconfig smb: use macros instead of constants for leasekey size and default cifsattrs value	2024-12-21 09:35:18 -08:00
Linus Torvalds	4a5da3f5d3	Merge tag 'nfs-for-6.13-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client fixes from Trond Myklebust: - NFS/pnfs: Fix a live lock between recalled layouts and layoutget - Fix a build warning about an undeclared symbol 'nfs_idmap_cache_timeout' * tag 'nfs-for-6.13-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: fs/nfs: fix missing declaration of nfs_idmap_cache_timeout NFS/pnfs: Fix a live lock between recalled layouts and layoutget	2024-12-21 09:32:24 -08:00
Linus Torvalds	7684392f17	Merge tag 'ceph-for-6.13-rc4' of https://github.com/ceph/ceph-client Pull ceph fixes from Ilya Dryomov: "A handful of important CephFS fixes from Max, Alex and myself: memory corruption due to a buffer overrun, potential infinite loop and several memory leaks on the error paths. All but one marked for stable" * tag 'ceph-for-6.13-rc4' of https://github.com/ceph/ceph-client: ceph: allocate sparse_ext map only for sparse reads ceph: fix memory leak in ceph_direct_read_write() ceph: improve error handling and short/overflow-read logic in __ceph_sync_read() ceph: validate snapdirname option length when mounting ceph: give up on paths longer than PATH_MAX ceph: fix memory leaks in __ceph_sync_read()	2024-12-21 09:29:46 -08:00
Masahiro Yamada	9435dc77a3	modpost: distinguish same module paths from different dump files Since commit `13b25489b6` ("kbuild: change working directory to external module directory with M="), module paths are always relative to the top of the external module tree. The module paths recorded in Module.symvers are no longer globally unique when they are passed via KBUILD_EXTRA_SYMBOLS for building other external modules, which may result in false-positive "exported twice" errors. Such errors should not occur because external modules should be able to override in-tree modules. To address this, record the dump file path in struct module and check it when searching for a module. Fixes: `13b25489b6` ("kbuild: change working directory to external module directory with M=") Reported-by: Jon Hunter <jonathanh@nvidia.com> Closes: https://lore.kernel.org/all/eb21a546-a19c-40df-b821-bbba80f19a3d@nvidia.com/ Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Tested-by: Jon Hunter <jonathanh@nvidia.com>	2024-12-21 12:42:10 +09:00
Nicolas Schier	54956567a0	kbuild: deb-pkg: Do not install maint scripts for arch 'um' Stop installing Debian maintainer scripts when building a user-mode-linux Debian package. Debian maintainer scripts are used for e.g. requesting rebuilds of initrd, rebuilding DKMS modules and updating of grub configuration. As all of this is not relevant for UML but also may lead to failures while processing the kernel hooks, do no more install maintainer scripts for the UML package. Suggested-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Nicolas Schier <nicolas@fjasle.eu> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>	2024-12-21 12:42:10 +09:00
Masahiro Yamada	a34e92d2e8	kbuild: deb-pkg: add debarch for ARCH=um 'make ARCH=um bindeb-pkg' shows the following warning. $ make ARCH=um bindeb-pkg [snip] GEN debian WARNING Your architecture doesn't have its equivalent Debian userspace architecture defined! Falling back to the current host architecture (amd64). Please add support for um to ./scripts/package/mkdebian ... This commit hard-codes i386/amd64 because UML is only supported for x86. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>	2024-12-21 12:42:04 +09:00
Geert Uytterhoeven	d67393f4d2	kbuild: Drop support for include/asm-<arch> in headers_check.pl "include/asm-<arch>" was replaced by "arch/<arch>/include/asm" a long time ago. All assembler header files are now included using "#include <asm/*>", so there is no longer a need to rewrite paths. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>	2024-12-21 11:43:17 +09:00
Nikolaus Voss	c384481006	clk: clk-imx8mp-audiomix: fix function signature clk_imx8mp_audiomix_reset_controller_register() in the "if !CONFIG_RESET_CONTROLLER" branch had the first argument missing. It is an empty function for this branch so it wasn't immediately apparent. Fixes: `6f0e817175` ("clk: imx: clk-audiomix: Add reset controller") Cc: <stable@vger.kernel.org> # 6.12.x Signed-off-by: Nikolaus Voss <nv@vosn.de> Link: https://lore.kernel.org/r/20241219105447.889CB11FE@mail.steuer-voss.de Reviewed-by: Daniel Baluta <daniel.baluta@nxp.com> Acked-by: Shengjiu Wang <shengjiu.wang@gmail.com> Reviewed-by: Peng Fan <peng.fan@nxp.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org>	2024-12-20 15:43:41 -08:00
Cong Wang	4a58963d10	selftests/bpf: Test bpf_skb_change_tail() in TC ingress Similarly to the previous test, we also need a test case to cover positive offsets as well, TC is an excellent hook for this. Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Zijian Zhang <zijianzhang@bytedance.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20241213034057.246437-5-xiyou.wangcong@gmail.com	2024-12-20 23:13:31 +01:00
Cong Wang	472759c9f5	selftests/bpf: Introduce socket_helpers.h for TC tests Pull socket helpers out of sockmap_helpers.h so that they can be reused for TC tests as well. This prepares for the next patch. Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20241213034057.246437-4-xiyou.wangcong@gmail.com	2024-12-20 23:13:31 +01:00
Cong Wang	9ee0c7b865	selftests/bpf: Add a BPF selftest for bpf_skb_change_tail() As requested by Daniel, we need to add a selftest to cover bpf_skb_change_tail() cases in skb_verdict. Here we test trimming, growing and error cases, and validate its expected return values and the expected sizes of the payload. Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20241213034057.246437-3-xiyou.wangcong@gmail.com	2024-12-20 23:13:31 +01:00
Cong Wang	9ecc4d858b	bpf: Check negative offsets in __bpf_skb_min_len() skb_network_offset() and skb_transport_offset() can be negative when they are called after we pull the transport header, for example, when we use eBPF sockmap at the point of ->sk_data_ready(). __bpf_skb_min_len() uses an unsigned int to get these offsets, this leads to a very large number which then causes bpf_skb_change_tail() failed unexpectedly. Fix this by using a signed int to get these offsets and ensure the minimum is at least zero. Fixes: `5293efe62d` ("bpf: add bpf_skb_change_tail helper") Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20241213034057.246437-2-xiyou.wangcong@gmail.com	2024-12-20 23:13:31 +01:00
Linus Torvalds	499551201b	Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fix from Catalin Marinas: "Fix a sparse warning in the arm64 signal code dealing with the user shadow stack register, GCSPR_EL0" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64/signal: Silence sparse warning storing GCSPR_EL0	2024-12-20 14:10:01 -08:00
Levi Zim	5153a75ef3	tcp_bpf: Fix copied value in tcp_bpf_sendmsg bpf kselftest sockhash::test_txmsg_cork_hangs in test_sockmap.c triggers a kernel NULL pointer dereference: BUG: kernel NULL pointer dereference, address: 0000000000000008 ? __die_body+0x6e/0xb0 ? __die+0x8b/0xa0 ? page_fault_oops+0x358/0x3c0 ? local_clock+0x19/0x30 ? lock_release+0x11b/0x440 ? kernelmode_fixup_or_oops+0x54/0x60 ? __bad_area_nosemaphore+0x4f/0x210 ? mmap_read_unlock+0x13/0x30 ? bad_area_nosemaphore+0x16/0x20 ? do_user_addr_fault+0x6fd/0x740 ? prb_read_valid+0x1d/0x30 ? exc_page_fault+0x55/0xd0 ? asm_exc_page_fault+0x2b/0x30 ? splice_to_socket+0x52e/0x630 ? shmem_file_splice_read+0x2b1/0x310 direct_splice_actor+0x47/0x70 splice_direct_to_actor+0x133/0x300 ? do_splice_direct+0x90/0x90 do_splice_direct+0x64/0x90 ? __ia32_sys_tee+0x30/0x30 do_sendfile+0x214/0x300 __se_sys_sendfile64+0x8e/0xb0 __x64_sys_sendfile64+0x25/0x30 x64_sys_call+0xb82/0x2840 do_syscall_64+0x75/0x110 entry_SYSCALL_64_after_hwframe+0x4b/0x53 This is caused by tcp_bpf_sendmsg() returning a larger value(12289) than size (8192), which causes the while loop in splice_to_socket() to release an uninitialized pipe buf. The underlying cause is that this code assumes sk_msg_memcopy_from_iter() will copy all bytes upon success but it actually might only copy part of it. This commit changes it to use the real copied bytes. Signed-off-by: Levi Zim <rsworktech@outlook.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Björn Töpel <bjorn@kernel.org> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20241130-tcp-bpf-sendmsg-v1-2-bae583d014f3@outlook.com	2024-12-20 22:53:36 +01:00
Levi Zim	fdf478d236	skmsg: Return copied bytes in sk_msg_memcopy_from_iter Previously sk_msg_memcopy_from_iter returns the copied bytes from the last copy_from_iter{,_nocache} call upon success. This commit changes it to return the total number of copied bytes on success. Signed-off-by: Levi Zim <rsworktech@outlook.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Björn Töpel <bjorn@kernel.org> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20241130-tcp-bpf-sendmsg-v1-1-bae583d014f3@outlook.com	2024-12-20 22:53:36 +01:00
Linus Torvalds	d74276290c	Merge tag 'hwmon-for-v6.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon fixes from Guenter Roeck: - Fix reporting of negative temperature, current, and voltage values in the tmp513 driver * tag 'hwmon-for-v6.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: (tmp513) Fix interpretation of values of Temperature Result and Limit Registers hwmon: (tmp513) Fix Current Register value interpretation hwmon: (tmp513) Fix interpretation of values of Shunt Voltage and Limit Registers	2024-12-20 13:48:41 -08:00
Rob Herring (Arm)	8600058ba2	of: Add coreboot firmware to excluded default cells list Google Juniper and other Chromebook platforms have a very old bootloader which populates /firmware node without proper address/size-cells leading to warnings: Missing '#address-cells' in /firmware WARNING: CPU: 0 PID: 1 at drivers/of/base.c:106 of_bus_n_addr_cells+0x90/0xf0 Modules linked in: CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.12.0 #1 933ab9971ff4d5dc58cb378a96f64c7f72e3454d Hardware name: Google juniper sku16 board (DT) ... Missing '#size-cells' in /firmware WARNING: CPU: 0 PID: 1 at drivers/of/base.c:133 of_bus_n_size_cells+0x90/0xf0 Modules linked in: CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Tainted: G W 6.12.0 #1 933ab9971ff4d5dc58cb378a96f64c7f72e3454d Tainted: [W]=WARN Hardware name: Google juniper sku16 board (DT) These platform won't receive updated bootloader/firmware, so add an exclusion for platforms with a "coreboot" compatible node. While this is wider than necessary, that's the easiest fix and it doesn't doesn't matter if we miss checking other platforms using coreboot. We may revisit this later and address with a fixup to the DT itself. Reported-by: Sasha Levin <sashal@kernel.org> Closes: https://lore.kernel.org/all/Z0NUdoG17EwuCigT@sashalap/ Cc: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: Chen-Yu Tsai <wenst@chromium.org> Cc: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Rob Herring (Arm) <robh@kernel.org>	2024-12-20 15:39:22 -06:00
Linus Torvalds	11167b29e5	Merge tag 'block-6.13-20241220' of git://git.kernel.dk/linux Pull block fixes from Jens Axboe: - Minor cleanups for bdev/nvme using the helpers introduced - Revert of a deadlock fix that still needs more work - Fix a UAF of hctx in the cpu hotplug code * tag 'block-6.13-20241220' of git://git.kernel.dk/linux: block: avoid to reuse `hctx` not removed from cpuhp callback list block: Revert "block: Fix potential deadlock while freezing queue and acquiring sysfs_lock" nvme: use blk_validate_block_size() for max LBA check block/bdev: use helper for max block size check	2024-12-20 13:37:58 -08:00
Linus Torvalds	7c05bd9230	Merge tag 'io_uring-6.13-20241220' of git://git.kernel.dk/linux Pull io_uring fixes from Jens Axboe: - Fix for a file ref leak for registered ring fds - Turn the ->timeout_lock into a raw spinlock, as it nests under the io-wq lock which is a raw spinlock as it's called from the scheduler side - Limit ring resizing to DEFER_TASKRUN for now. We will broaden this in the future, but for now, ensure that it's only feasible on rings with a single user - Add sanity check for io-wq enqueuing * tag 'io_uring-6.13-20241220' of git://git.kernel.dk/linux: io_uring: check if iowq is killed before queuing io_uring/register: limit ring resizing to DEFER_TASKRUN io_uring: Fix registered ring file refcount leak io_uring: make ctx->timeout_lock a raw spinlock	2024-12-20 13:32:43 -08:00
Christian Brauner	5fe85a5c51	Merge patch series "netfs, ceph, nfs, cachefiles: Miscellaneous fixes/changes" David Howells <dhowells@redhat.com> says: Here are some miscellaneous fixes and changes for netfslib and the ceph and nfs filesystems: (1) Ignore silly-rename files from afs and nfs when building the header archive in a kernel build. (2) netfs: Fix the way read result collection applies results to folios when each folio is being read by multiple subrequests and the results come out of order. (3) netfs: Fix ENOMEM handling in buffered reads. (4) nfs: Fix an oops in nfs_netfs_init_request() when copying to the cache. (5) cachefiles: Parse the "secctx" command immediately to get the correct error rather than leaving it to the "bind" command. (6) netfs: Remove a redundant smp_rmb(). This isn't a bug per se and could be deferred. (7) netfs: Fix missing barriers by using clear_and_wake_up_bit(). (8) netfs: Work around recursion in read retry by failing and abandoning the retried subrequest if no I/O is performed. [!] NOTE: This only works around the recursion problem if the recursion keeps returning no data. If the server manages, say, to repeatedly return a single byte of data faster than the retry algorithm can complete, it will still recurse and the stack overrun may still occur. Actually fixing this requires quite an intrusive change which will hopefully make the next merge window. (9) netfs: Fix the clearance of a folio_queue when unlocking the page if we're going to want to subsequently send the queue for copying to the cache (if, for example, we're using ceph). (10) netfs: Fix the lack of cancellation of copy-to-cache when the cache for a file is temporarily disabled (for example when a DIO write is done to the file). This patch and (9) fix hangs with ceph. With these patches, I can run xfstest -g quick to completion on ceph with a local cache. The patches can also be found here with a bonus cifs patch: https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=netfs-fixes * patches from https://lore.kernel.org/r/20241213135013.2964079-1-dhowells@redhat.com: netfs: Fix is-caching check in read-retry netfs: Fix the (non-)cancellation of copy when cache is temporarily disabled netfs: Fix ceph copy to cache on write-begin netfs: Work around recursion by abandoning retry if nothing read netfs: Fix missing barriers by using clear_and_wake_up_bit() netfs: Remove redundant use of smp_rmb() cachefiles: Parse the "secctx" immediately nfs: Fix oops in nfs_netfs_init_request() when copying to cache netfs: Fix enomem handling in buffered reads netfs: Fix non-contiguous donation between completed reads kheaders: Ignore silly-rename files Link: https://lore.kernel.org/r/20241213135013.2964079-1-dhowells@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-12-20 22:08:16 +01:00
David Howells	d4e338de17	netfs: Fix is-caching check in read-retry netfs: Fix is-caching check in read-retry The read-retry code checks the NETFS_RREQ_COPY_TO_CACHE flag to determine if there might be failed reads from the cache that need turning into reads from the server, with the intention of skipping the complicated part if it can. The code that set the flag, however, got lost during the read-side rewrite. Fix the check to see if the cache_resources are valid instead. The flag can then be removed. Fixes: `ee4cdf7ba8` ("netfs: Speed up buffered reading") Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/3752048.1734381285@warthog.procyon.org.uk cc: Jeff Layton <jlayton@kernel.org> cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-12-20 22:07:58 +01:00
David Howells	d0327c8243	netfs: Fix the (non-)cancellation of copy when cache is temporarily disabled When the caching for a cookie is temporarily disabled (e.g. due to a DIO write on that file), future copying to the cache for that file is disabled until all fds open on that file are closed. However, if netfslib is using the deprecated PG_private_2 method (such as is currently used by ceph), and decides it wants to copy to the cache, netfs_advance_write() will just bail at the first check seeing that the cache stream is unavailable, and indicate that it dealt with all the content. This means that we have no subrequests to provide notifications to drive the state machine or even to pin the request and the request just gets discarded, leaving the folios with PG_private_2 set. Fix this by jumping directly to cancel the request if the cache is not available. That way, we don't remove mark3 from the folio_queue list and netfs_pgpriv2_cancel() will clean up the folios. This was found by running the generic/013 xfstest against ceph with an active cache and the "-o fsc" option passed to ceph. That would usually hang Fixes: `ee4cdf7ba8` ("netfs: Speed up buffered reading") Reported-by: Max Kellermann <max.kellermann@ionos.com> Closes: https://lore.kernel.org/r/CAKPOu+_4m80thNy5_fvROoxBm689YtA0dZ-=gcmkzwYSY4syqw@mail.gmail.com/ Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/20241213135013.2964079-11-dhowells@redhat.com cc: Jeff Layton <jlayton@kernel.org> cc: Ilya Dryomov <idryomov@gmail.com> cc: Xiubo Li <xiubli@redhat.com> cc: netfs@lists.linux.dev cc: ceph-devel@vger.kernel.org cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-12-20 22:07:58 +01:00
David Howells	38cf8e9457	netfs: Fix ceph copy to cache on write-begin At the end of netfs_unlock_read_folio() in which folios are marked appropriately for copying to the cache (either with by being marked dirty and having their private data set or by having PG_private_2 set) and then unlocked, the folio_queue struct has the entry pointing to the folio cleared. This presents a problem for netfs_pgpriv2_write_to_the_cache(), which is used to write folios marked with PG_private_2 to the cache as it expects to be able to trawl the folio_queue list thereafter to find the relevant folios, leading to a hang. Fix this by not clearing the folio_queue entry if we're going to do the deprecated copy-to-cache. The clearance will be done instead as the folios are written to the cache. This can be reproduced by starting cachefiles, mounting a ceph filesystem with "-o fsc" and writing to it. Fixes: `796a404964` ("netfs: In readahead, put the folio refs as soon extracted") Reported-by: Max Kellermann <max.kellermann@ionos.com> Closes: https://lore.kernel.org/r/CAKPOu+_4m80thNy5_fvROoxBm689YtA0dZ-=gcmkzwYSY4syqw@mail.gmail.com/ Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/20241213135013.2964079-10-dhowells@redhat.com Fixes: `ee4cdf7ba8` ("netfs: Speed up buffered reading") cc: Jeff Layton <jlayton@kernel.org> cc: Ilya Dryomov <idryomov@gmail.com> cc: Xiubo Li <xiubli@redhat.com> cc: netfs@lists.linux.dev cc: ceph-devel@vger.kernel.org cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-12-20 22:07:57 +01:00
David Howells	4acb665cf4	netfs: Work around recursion by abandoning retry if nothing read syzkaller reported recursion with a loop of three calls (netfs_rreq_assess, netfs_retry_reads and netfs_rreq_terminated) hitting the limit of the stack during an unbuffered or direct I/O read. There are a number of issues: (1) There is no limit on the number of retries. (2) A subrequest is supposed to be abandoned if it does not transfer anything (NETFS_SREQ_NO_PROGRESS), but that isn't checked under all circumstances. (3) The actual root cause, which is this: if (atomic_dec_and_test(&rreq->nr_outstanding)) netfs_rreq_terminated(rreq, ...); When we do a retry, we bump the rreq->nr_outstanding counter to prevent the final cleanup phase running before we've finished dispatching the retries. The problem is if we hit 0, we have to do the cleanup phase - but we're in the cleanup phase and end up repeating the retry cycle, hence the recursion. Work around the problem by limiting the number of retries. This is based on Lizhi Xu's patch[1], and makes the following changes: (1) Replace NETFS_SREQ_NO_PROGRESS with NETFS_SREQ_MADE_PROGRESS and make the filesystem set it if it managed to read or write at least one byte of data. Clear this bit before issuing a subrequest. (2) Add a ->retry_count member to the subrequest and increment it any time we do a retry. (3) Remove the NETFS_SREQ_RETRYING flag as it is superfluous with ->retry_count. If the latter is non-zero, we're doing a retry. (4) Abandon a subrequest if retry_count is non-zero and we made no progress. (5) Use ->retry_count in both the write-side and the read-size. [?] Question: Should I set a hard limit on retry_count in both read and write? Say it hits 50, we always abandon it. The problem is that these changes only mitigate the issue. As long as it made at least one byte of progress, the recursion is still an issue. This patch mitigates the problem, but does not fix the underlying cause. I have patches that will do that, but it's an intrusive fix that's currently pending for the next merge window. The oops generated by KASAN looks something like: BUG: TASK stack guard page was hit at ffffc9000482ff48 (stack is ffffc90004830000..ffffc90004838000) Oops: stack guard page: 0000 [#1] PREEMPT SMP KASAN NOPTI ... RIP: 0010:mark_lock+0x25/0xc60 kernel/locking/lockdep.c:4686 ... mark_usage kernel/locking/lockdep.c:4646 [inline] __lock_acquire+0x906/0x3ce0 kernel/locking/lockdep.c:5156 lock_acquire.part.0+0x11b/0x380 kernel/locking/lockdep.c:5825 local_lock_acquire include/linux/local_lock_internal.h:29 [inline] ___slab_alloc+0x123/0x1880 mm/slub.c:3695 __slab_alloc.constprop.0+0x56/0xb0 mm/slub.c:3908 __slab_alloc_node mm/slub.c:3961 [inline] slab_alloc_node mm/slub.c:4122 [inline] kmem_cache_alloc_noprof+0x2a7/0x2f0 mm/slub.c:4141 radix_tree_node_alloc.constprop.0+0x1e8/0x350 lib/radix-tree.c:253 idr_get_free+0x528/0xa40 lib/radix-tree.c:1506 idr_alloc_u32+0x191/0x2f0 lib/idr.c:46 idr_alloc+0xc1/0x130 lib/idr.c:87 p9_tag_alloc+0x394/0x870 net/9p/client.c:321 p9_client_prepare_req+0x19f/0x4d0 net/9p/client.c:644 p9_client_zc_rpc.constprop.0+0x105/0x880 net/9p/client.c:793 p9_client_read_once+0x443/0x820 net/9p/client.c:1570 p9_client_read+0x13f/0x1b0 net/9p/client.c:1534 v9fs_issue_read+0x115/0x310 fs/9p/vfs_addr.c:74 netfs_retry_read_subrequests fs/netfs/read_retry.c:60 [inline] netfs_retry_reads+0x153a/0x1d00 fs/netfs/read_retry.c:232 netfs_rreq_assess+0x5d3/0x870 fs/netfs/read_collect.c:371 netfs_rreq_terminated+0xe5/0x110 fs/netfs/read_collect.c:407 netfs_retry_reads+0x155e/0x1d00 fs/netfs/read_retry.c:235 netfs_rreq_assess+0x5d3/0x870 fs/netfs/read_collect.c:371 netfs_rreq_terminated+0xe5/0x110 fs/netfs/read_collect.c:407 netfs_retry_reads+0x155e/0x1d00 fs/netfs/read_retry.c:235 netfs_rreq_assess+0x5d3/0x870 fs/netfs/read_collect.c:371 ... netfs_rreq_terminated+0xe5/0x110 fs/netfs/read_collect.c:407 netfs_retry_reads+0x155e/0x1d00 fs/netfs/read_retry.c:235 netfs_rreq_assess+0x5d3/0x870 fs/netfs/read_collect.c:371 netfs_rreq_terminated+0xe5/0x110 fs/netfs/read_collect.c:407 netfs_retry_reads+0x155e/0x1d00 fs/netfs/read_retry.c:235 netfs_rreq_assess+0x5d3/0x870 fs/netfs/read_collect.c:371 netfs_rreq_terminated+0xe5/0x110 fs/netfs/read_collect.c:407 netfs_dispatch_unbuffered_reads fs/netfs/direct_read.c:103 [inline] netfs_unbuffered_read fs/netfs/direct_read.c:127 [inline] netfs_unbuffered_read_iter_locked+0x12f6/0x19b0 fs/netfs/direct_read.c:221 netfs_unbuffered_read_iter+0xc5/0x100 fs/netfs/direct_read.c:256 v9fs_file_read_iter+0xbf/0x100 fs/9p/vfs_file.c:361 do_iter_readv_writev+0x614/0x7f0 fs/read_write.c:832 vfs_readv+0x4cf/0x890 fs/read_write.c:1025 do_preadv fs/read_write.c:1142 [inline] __do_sys_preadv fs/read_write.c:1192 [inline] __se_sys_preadv fs/read_write.c:1187 [inline] __x64_sys_preadv+0x22d/0x310 fs/read_write.c:1187 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83 Fixes: `ee4cdf7ba8` ("netfs: Speed up buffered reading") Closes: https://syzkaller.appspot.com/bug?extid=1fc6f64c40a9d143cfb6 Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/20241108034020.3695718-1-lizhi.xu@windriver.com/ [1] Link: https://lore.kernel.org/r/20241213135013.2964079-9-dhowells@redhat.com Tested-by: syzbot+885c03ad650731743489@syzkaller.appspotmail.com Suggested-by: Lizhi Xu <lizhi.xu@windriver.com> cc: Dominique Martinet <asmadeus@codewreck.org> cc: Jeff Layton <jlayton@kernel.org> cc: v9fs@lists.linux.dev cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Reported-by: syzbot+885c03ad650731743489@syzkaller.appspotmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-12-20 22:07:57 +01:00
David Howells	aa39564189	netfs: Fix missing barriers by using clear_and_wake_up_bit() Use clear_and_wake_up_bit() rather than something like: clear_bit_unlock(NETFS_RREQ_IN_PROGRESS, &rreq->flags); wake_up_bit(&rreq->flags, NETFS_RREQ_IN_PROGRESS); as there needs to be a barrier inserted between which is present in clear_and_wake_up_bit(). Fixes: `288ace2f57` ("netfs: New writeback implementation") Fixes: `ee4cdf7ba8` ("netfs: Speed up buffered reading") Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/20241213135013.2964079-8-dhowells@redhat.com Reviewed-by: Akira Yokosawa <akiyks@gmail.com> cc: Zilin Guan <zilin@seu.edu.cn> cc: Akira Yokosawa <akiyks@gmail.com> cc: Jeff Layton <jlayton@kernel.org> cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-12-20 22:07:57 +01:00
Zilin Guan	f4d3cde410	netfs: Remove redundant use of smp_rmb() The function netfs_unbuffered_write_iter_locked() in fs/netfs/direct_write.c contains an unnecessary smp_rmb() call after wait_on_bit(). Since wait_on_bit() already incorporates a memory barrier that ensures the flag update is visible before the function returns, the smp_rmb() provides no additional benefit and incurs unnecessary overhead. This patch removes the redundant barrier to simplify and optimize the code. Signed-off-by: Zilin Guan <zilin@seu.edu.cn> Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/20241207021952.2978530-1-zilin@seu.edu.cn/ Link: https://lore.kernel.org/r/20241213135013.2964079-7-dhowells@redhat.com Reviewed-by: Akira Yokosawa <akiyks@gmail.com> cc: Akira Yokosawa <akiyks@gmail.com> cc: Jeff Layton <jlayton@kernel.org> cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-12-20 22:07:56 +01:00
Max Kellermann	e5a8b6446c	cachefiles: Parse the "secctx" immediately Instead of storing an opaque string, call security_secctx_to_secid() right in the "secctx" command handler and store only the numeric "secid". This eliminates an unnecessary string allocation and allows the daemon to receive errors when writing the "secctx" command instead of postponing the error to the "bind" command handler. For example, if the kernel was built without `CONFIG_SECURITY`, "bind" will return `EOPNOTSUPP`, but the daemon doesn't know why. With this patch, the "secctx" will instead return `EOPNOTSUPP` which is the right context for this error. This patch adds a boolean flag `have_secid` because I'm not sure if we can safely assume that zero is the special secid value for "not set". This appears to be true for SELinux, Smack and AppArmor, but since this attribute is not documented, I'm unable to derive a stable guarantee for that. Signed-off-by: Max Kellermann <max.kellermann@ionos.com> Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/20241209141554.638708-1-max.kellermann@ionos.com/ Link: https://lore.kernel.org/r/20241213135013.2964079-6-dhowells@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-12-20 22:07:56 +01:00
David Howells	86ad1a58f6	nfs: Fix oops in nfs_netfs_init_request() when copying to cache When netfslib wants to copy some data that has just been read on behalf of nfs, it creates a new write request and calls nfs_netfs_init_request() to initialise it, but with a NULL file pointer. This causes nfs_file_open_context() to oops - however, we don't actually need the nfs context as we're only going to write to the cache. Fix this by just returning if we aren't given a file pointer and emit a warning if the request was for something other than copy-to-cache. Further, fix nfs_netfs_free_request() so that it doesn't try to free the context if the pointer is NULL. Fixes: `ee4cdf7ba8` ("netfs: Speed up buffered reading") Reported-by: Max Kellermann <max.kellermann@ionos.com> Closes: https://lore.kernel.org/r/CAKPOu+9DyMbKLhyJb7aMLDTb=Fh0T8Teb9sjuf_pze+XWT1VaQ@mail.gmail.com/ Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/20241213135013.2964079-5-dhowells@redhat.com cc: Trond Myklebust <trondmy@kernel.org> cc: Anna Schumaker <anna@kernel.org> cc: Dave Wysochanski <dwysocha@redhat.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-nfs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-12-20 22:07:56 +01:00
David Howells	105549d09a	netfs: Fix enomem handling in buffered reads If netfs_read_to_pagecache() gets an error from either ->prepare_read() or from netfs_prepare_read_iterator(), it needs to decrement ->nr_outstanding, cancel the subrequest and break out of the issuing loop. Currently, it only does this for two of the cases, but there are two more that aren't handled. Fix this by moving the handling to a common place and jumping to it from all four places. This is in preference to inserting a wrapper around netfs_prepare_read_iterator() as proposed by Dmitry Antipov[1]. Link: https://lore.kernel.org/r/20241202093943.227786-1-dmantipov@yandex.ru/ [1] Fixes: `ee4cdf7ba8` ("netfs: Speed up buffered reading") Reported-by: syzbot+404b4b745080b6210c6c@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=404b4b745080b6210c6c Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/20241213135013.2964079-4-dhowells@redhat.com Tested-by: syzbot+404b4b745080b6210c6c@syzkaller.appspotmail.com cc: Dmitry Antipov <dmantipov@yandex.ru> cc: Jeff Layton <jlayton@kernel.org> cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-12-20 22:07:56 +01:00
David Howells	c8b90d40d5	netfs: Fix non-contiguous donation between completed reads When a read subrequest finishes, if it doesn't have sufficient coverage to complete the folio(s) covering either side of it, it will donate the excess coverage to the adjacent subrequests on either side, offloading responsibility for unlocking the folio(s) covered to them. Now, preference is given to donating down to a lower file offset over donating up because that check is done first - but there's no check that the lower subreq is actually contiguous, and so we can end up donating incorrectly. The scenario seen[1] is that an 8MiB readahead request spanning four 2MiB folios is split into eight 1MiB subreqs (numbered 1 through 8). These terminate in the order 1,6,2,5,3,7,4,8. What happens is: - 1 donates to 2 - 6 donates to 5 - 2 completes, unlocking the first folio (with 1). - 5 completes, unlocking the third folio (with 6). - 3 donates to 4 - 7 donates to 4 incorrectly - 4 completes, unlocking the second folio (with 3), but can't use the excess from 7. - 8 donates to 4, also incorrectly. Fix this by preventing downward donation if the subreqs are not contiguous (in the example above, 7 donates to 4 across the gap left by 5 and 6). Reported-by: Shyam Prasad N <nspmangalore@gmail.com> Closes: https://lore.kernel.org/r/CANT5p=qBwjBm-D8soFVVtswGEfmMtQXVW83=TNfUtvyHeFQZBA@mail.gmail.com/ Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/526707.1733224486@warthog.procyon.org.uk/ [1] Link: https://lore.kernel.org/r/20241213135013.2964079-3-dhowells@redhat.com cc: Steve French <sfrench@samba.org> cc: Paulo Alcantara <pc@manguebit.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-12-20 22:07:55 +01:00
David Howells	973b710b88	kheaders: Ignore silly-rename files Tell tar to ignore silly-rename files (".__afs" and ".nfs") when building the header archive. These occur when a file that is open is unlinked locally, but hasn't yet been closed. Such files are visible to the user via the getdents() syscall and so programs may want to do things with them. During the kernel build, such files may be made during the processing of header files and the cleanup may get deferred by fput() which may result in tar seeing these files when it reads the directory, but they may have disappeared by the time it tries to open them, causing tar to fail with an error. Further, we don't want to include them in the tarball if they still exist. With CONFIG_HEADERS_INSTALL=y, something like the following may be seen: find: './kernel/.tmp_cpio_dir/include/dt-bindings/reset/.__afs2080': No such file or directory tar: ./include/linux/greybus/.__afs3C95: File removed before we read it The find warning doesn't seem to cause a problem. Fix this by telling tar when called from in gen_kheaders.sh to exclude such files. This only affects afs and nfs; cifs uses the Windows Hidden attribute to prevent the file from being seen. Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/20241213135013.2964079-2-dhowells@redhat.com cc: Masahiro Yamada <masahiroy@kernel.org> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org cc: linux-nfs@vger.kernel.org cc: linux-kernel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-12-20 22:07:55 +01:00
Jakub Kicinski	30b981796b	selftests: drv-net: test empty queue and NAPI responses in netlink Make sure kernel doesn't respond to GETs for queues and NAPIs when link is down. Not with valid data, or with empty message, we want a ENOENT. Link: https://patch.msgid.link/20241219032833.1165433-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-20 11:57:29 -08:00
Jakub Kicinski	4a25201aa4	netdev-genl: avoid empty messages in napi get Empty netlink responses from do() are not correct (as opposed to dump() where not dumping anything is perfectly fine). We should return an error if the target object does not exist, in this case if the netdev is down we "hide" the NAPI instances. Fixes: `27f91aaf49` ("netdev-genl: Add netlink framework functions for napi") Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241219032833.1165433-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-20 11:57:29 -08:00
Vladimir Oltean	246068b86b	selftests: net: local_termination: require mausezahn Since the blamed commit, we require mausezahn because send_raw() uses it. Remove the "REQUIRE_MZ=no" line, which overwrites the default of requiring it. Fixes: `2379795042` ("selftests: net: local_termination: add PTP frames to the mix") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20241219155410.1856868-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-20 11:48:21 -08:00
Linus Torvalds	e9b8ffafd2	Merge tag 'usb-6.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB / Thunderbolt fixes from Greg KH: "Here are some important, and small, fixes for USB and Thunderbolt issues that have come up in the -rc releases. And some new device ids for good measure. Included in here are: - Much reported xhci bugfix for usb-storage devices (and other devices as well, tripped me up on a video camera) - thunderbolt fixes for some small reported issues - new usb-serial device ids All of these have been in linux-next this week with no reported issues" * tag 'usb-6.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: usb: xhci: fix ring expansion regression in 6.13-rc1 xhci: Turn NEC specific quirk for handling Stop Endpoint errors generic thunderbolt: Improve redrive mode handling USB: serial: option: add Telit FE910C04 rmnet compositions USB: serial: option: add MediaTek T7XX compositions USB: serial: option: add Netprisma LCUK54 modules for WWAN Ready USB: serial: option: add MeiG Smart SLM770A USB: serial: option: add TCL IK512 MBIM & ECM thunderbolt: Don't display nvm_version unless upgrade supported thunderbolt: Add support for Intel Panther Lake-M/P	2024-12-20 11:09:40 -08:00
Linus Torvalds	5127e1495b	Merge tag 'spi-fix-v6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fix from Mark Brown: "A fix for the remove path of the Rockchip driver, the code was just clearly and obviously wrong" * tag 'spi-fix-v6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: rockchip-sfc: Fix error in remove progress	2024-12-20 11:06:25 -08:00
Linus Torvalds	b648264cd4	Merge tag 'regulator-fix-v6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fix from Mark Brown: "The recently added regulator-uv-survival-time-ms property was renamed during the review of the series that added it, but unfortunately only in the DT binding and not in the code that parses the binding. This brings the code in line with the binding, if someone started using the original name we can add compat support for it but there's nothing upstream yet and it's a very niche feature so hopefully not" * tag 'regulator-fix-v6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: rename regulator-uv-survival-time-ms according to DT binding	2024-12-20 11:04:02 -08:00
Linus Torvalds	af215c980c	Merge tag 'drm-fixes-2024-12-20' of https://gitlab.freedesktop.org/drm/kernel Pull drm fixes from Dave Airlie: "Probably the last pull before Christmas holidays, I'll still be around for most of the time anyways, nothing too major in here, bunch of amdgpu and i915 along with a smattering of fixes across the board. core: - fix FB dependency - avoid div by 0 more in vrefresh - maintainers update display: - fix DP tunnel error path dma-buf: - fix !DEBUG_FS sched: - docs warning fix panel: - collection of misc panel fixes i915: - Reset engine utilization buffer before registration - Ensure busyness counter increases motonically - Accumulate active runtime on gt reset amdgpu: - Disable BOCO when CONFIG_HOTPLUG_PCI_PCIE is not enabled - scheduler job fixes - IP version check fixes - devcoredump fix - GPUVM update fix - NBIO 2.5 fix udmabuf: - fix memory leak on last export - sealing fixes ivpu: - fix NULL pointer - fix memory leak - fix WARN" * tag 'drm-fixes-2024-12-20' of https://gitlab.freedesktop.org/drm/kernel: (33 commits) drm/sched: Fix drm_sched_fini() docu generation accel/ivpu: Fix WARN in ivpu_ipc_send_receive_internal() accel/ivpu: Fix memory leak in ivpu_mmu_reserved_context_init() accel/ivpu: Fix general protection fault in ivpu_bo_list() drm/amdgpu/nbio7.0: fix IP version check drm/amd: Update strapping for NBIO 2.5.0 drm/amdgpu: Handle NULL bo->tbo.resource (again) in amdgpu_vm_bo_update drm/amdgpu: fix amdgpu_coredump drm/amdgpu/smu14.0.2: fix IP version check drm/amdgpu/gfx12: fix IP version check drm/amdgpu/mmhub4.1: fix IP version check drm/amdgpu/nbio7.11: fix IP version check drm/amdgpu/nbio7.7: fix IP version check drm/amdgpu: don't access invalid sched drm/amd: Require CONFIG_HOTPLUG_PCI_PCIE for BOCO drm: rework FB_CORE dependency drm/fbdev: Select FB_CORE dependency for fbdev on DMA and TTM fbdev: Fix recursive dependencies wrt BACKLIGHT_CLASS_DEVICE i915/guc: Accumulate active runtime on gt reset i915/guc: Ensure busyness counter increases motonically ...	2024-12-20 10:17:53 -08:00
Linus Torvalds	5b83bcdea5	Merge tag 'trace-ringbuffer-v6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull ring-buffer fixes from Steven Rostedt: - Fix possible overflow of mmapped ring buffer with bad offset If the mmap() to the ring buffer passes in a start address that is passed the end of the mmapped file, it is not caught and a slab-out-of-bounds is triggered. Add a check to make sure the start address is within the bounds - Do not use TP_printk() to boot mapped ring buffers As a boot mapped ring buffer's data may have pointers that map to the previous boot's memory map, it is unsafe to allow the TP_printk() to be used to read the boot mapped buffer's events. If a TP_printk() points to a static string from within the kernel it will not match the current kernel mapping if KASLR is active, and it can fault. Have it simply print out the raw fields. * tag 'trace-ringbuffer-v6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: trace/ring-buffer: Do not use TP_printk() formatting for boot mapped buffers ring-buffer: Fix overflow in __rb_map_vma	2024-12-20 10:13:26 -08:00
Alexander Lobakin	724c6ce38b	stddef: make __struct_group() UAPI C++-friendly For the most part of the C++ history, it couldn't have type declarations inside anonymous unions for different reasons. At the same time, __struct_group() relies on the latters, so when the @TAG argument is not empty, C++ code doesn't want to build (even under `extern "C"`): ../linux/include/uapi/linux/pkt_cls.h:25:24: error: 'struct tc_u32_sel::<unnamed union>::tc_u32_sel_hdr,' invalid; an anonymous union may only have public non-static data members [-fpermissive] The safest way to fix this without trying to switch standards (which is impossible in UAPI anyway) etc., is to disable tag declaration for that language. This won't break anything since for now it's not buildable at all. Use a separate definition for __struct_group() when __cplusplus is defined to mitigate the error, including the version from tools/. Fixes: `50d7bd38c3` ("stddef: Introduce struct_group() helper macro") Reported-by: Christopher Ferris <cferris@google.com> Closes: https://lore.kernel.org/linux-hardening/Z1HZpe3WE5As8UAz@google.com Suggested-by: Kees Cook <kees@kernel.org> # __struct_group_tag() Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://lore.kernel.org/r/20241219135734.2130002-1-aleksander.lobakin@intel.com Signed-off-by: Kees Cook <kees@kernel.org>	2024-12-20 09:05:53 -08:00
Arnd Bergmann	a31ffd6ed5	Merge tag 'arm-soc/for-6.13/devicetree-arm64-fixes' of https://github.com/Broadcom/stblinux into arm/fixes This pull request contains Broadcom ARM64-based SoCs Device Tree fixes for 6.13, please pull the following: - Willow corrects the L2 cache line size on the Raspberry Pi 5 (2712) to the correct value of 64 bytes * tag 'arm-soc/for-6.13/devicetree-arm64-fixes' of https://github.com/Broadcom/stblinux: arm64: dts: broadcom: Fix L2 linesize for Raspberry Pi 5 Link: https://lore.kernel.org/r/20241217190547.868744-1-florian.fainelli@broadcom.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2024-12-20 18:02:27 +01:00
Arnd Bergmann	a61dae1101	Merge tag 'riscv-soc-fixes-for-v6.13-rc4' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux into arm/fixes RISC-V soc driver fixes for v6.13-rc4 A single fix for the Auto Update driver, where a mistake in array indexing (accessing as a u32 rather than a u8) caused the driver to read the wrong feature disable bits. Signed-off-by: Conor Dooley <conor.dooley@microchip.com> * tag 'riscv-soc-fixes-for-v6.13-rc4' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux: firmware: microchip: fix UL_IAP lock check in mpfs_auto_update_state() Link: https://lore.kernel.org/r/20241218-suffrage-unfazed-fa0113072a42@spud Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2024-12-20 18:00:30 +01:00
Zijian Zhang	d888b7af7c	tcp_bpf: Add sk_rmem_alloc related logic for tcp_bpf ingress redirection When we do sk_psock_verdict_apply->sk_psock_skb_ingress, an sk_msg will be created out of the skb, and the rmem accounting of the sk_msg will be handled by the skb. For skmsgs in __SK_REDIRECT case of tcp_bpf_send_verdict, when redirecting to the ingress of a socket, although we sk_rmem_schedule and add sk_msg to the ingress_msg of sk_redir, we do not update sk_rmem_alloc. As a result, except for the global memory limit, the rmem of sk_redir is nearly unlimited. Thus, add sk_rmem_alloc related logic to limit the recv buffer. Since the function sk_msg_recvmsg and __sk_psock_purge_ingress_msg are used in these two paths. We use "msg->skb" to test whether the sk_msg is skb backed up. If it's not, we shall do the memory accounting explicitly. Fixes: `604326b41a` ("bpf, sockmap: convert to generic sk_msg interface") Signed-off-by: Zijian Zhang <zijianzhang@bytedance.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20241210012039.1669389-3-zijianzhang@bytedance.com	2024-12-20 17:59:47 +01:00
Cong Wang	54f89b3178	tcp_bpf: Charge receive socket buffer in bpf_tcp_ingress() When bpf_tcp_ingress() is called, the skmsg is being redirected to the ingress of the destination socket. Therefore, we should charge its receive socket buffer, instead of sending socket buffer. Because sk_rmem_schedule() tests pfmemalloc of skb, we need to introduce a wrapper and call it for skmsg. Fixes: `604326b41a` ("bpf, sockmap: convert to generic sk_msg interface") Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20241210012039.1669389-2-zijianzhang@bytedance.com	2024-12-20 17:59:47 +01:00
Bingwu Zhang	669bf56cb2	mailmap: update Bingwu Zhang's email address I used to contribute to the kernel as 'Bingwu Zhang <xtexchooser@duck.com>' and 'Zhang Bingwu <xtexchooser@duck.com>'. Signed-off-by: Bingwu Zhang <xtex@aosc.io> Link: https://lore.kernel.org/r/20241208041352.168131-2-xtex@envs.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-20 16:47:23 +01:00
Arnd Bergmann	baf8855c91	staging: gpib: fix address space mixup Throughout the gpib drivers, a 'void *' struct member is used in place of either port numbers or __iomem pointers, which leads to lots of extra type casts, sparse warnings and less portable code. Split the struct member in two separate ones with the correct types, so each driver can pick which one to use. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/all/f10e976e-7a04-4454-b38d-39cd18f142da@roeck-us.net/ Link: https://lore.kernel.org/r/20241213064959.1045243-3-arnd@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-20 16:44:15 +01:00
Arnd Bergmann	fec866a003	staging: gpib: use ioport_map The tnt4882 backend has a rather elabolate way of abstracting the PIO and MMIO based hardware variants, duplicating the functionality of ioport_map() in a less portable way. Change it to use ioport_map() with ioread8()/iowrite8() to do this more easily. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20241213064959.1045243-2-arnd@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-20 16:44:15 +01:00
Arnd Bergmann	edbb7200ca	staging: gpib: fix pcmcia dependencies With CONFIG_PCMCIA=m, the gpib drivers that optionally support PCMCIA cannot be built-in. Add a Kconfig dependency to force these to be loadable modules as well, and change the GPIB_PCMCIA symbol to have the correct state for that. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20241213064959.1045243-1-arnd@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-20 16:44:15 +01:00
Arnd Bergmann	003d2abde1	staging: gpib: add module author and description fields The FMH driver is still missing both, so take them from the comment at the start of the file. Fixes: `8e4841a088` ("staging: gpib: Add Frank Mori Hess FPGA PCI GPIB driver") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Dominik Karol Piątkowski <dominik.karol.piatkowski@protonmail.com> Link: https://lore.kernel.org/r/20241213083119.2607901-1-arnd@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-20 16:44:06 +01:00
Arnd Bergmann	79d2e1919a	staging: gpib: fix Makefiles Having gpib drivers built-in rather than as loadable modules causes link failure because the drivers are never actually built: arm-linux-gnueabi-ld: drivers/staging/gpib/fmh_gpib/fmh_gpib.o: in function `fmh_gpib_t1_delay': fmh_gpib.c:(.text+0x3b0): undefined reference to `nec7210_t1_delay' arm-linux-gnueabi-ld: drivers/staging/gpib/fmh_gpib/fmh_gpib.o: in function `fmh_gpib_serial_poll_status': fmh_gpib.c:(.text+0x418): undefined reference to `nec7210_serial_poll_status' arm-linux-gnueabi-ld: drivers/staging/gpib/fmh_gpib/fmh_gpib.o: in function `fmh_gpib_secondary_address': fmh_gpib.c:(.text+0x57c): undefined reference to `nec7210_secondary_address' arm-linux-gnueabi-ld: drivers/staging/gpib/fmh_gpib/fmh_gpib.o: in function `fmh_gpib_primary_address': fmh_gpib.c:(.text+0x5ac): undefined reference to `nec7210_primary_address' Change this to use the correct Makefile syntax, setting either obj-m or obj-y. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20241212154245.1411411-2-arnd@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-20 16:43:35 +01:00
Arnd Bergmann	d99d65aedd	staging: gpib: make global 'usec_diff' functions static Trying to build both gpib_bitbang and lpvo_usb_gpib into the kernel reveals a function that should have been static and is also duplicated: x86_64-linux-ld: drivers/staging/gpib/lpvo_usb_gpib/lpvo_usb_gpib.o: in function `usec_diff': lpvo_usb_gpib.c:(.text+0x23c0): multiple definition of `usec_diff'; drivers/staging/gpib/gpio/gpib_bitbang.o:gpib_bitbang.c:(.text+0x2470): first defined here Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20241212154245.1411411-1-arnd@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-20 16:43:35 +01:00
Jiapeng Chong	8c41fae530	staging: gpib: Modify mismatched function name No functional modification involved. drivers/staging/gpib/lpvo_usb_gpib/lpvo_usb_gpib.c:676: warning: expecting prototype for interface_clear(). Prototype was for usb_gpib_interface_clear() instead. drivers/staging/gpib/lpvo_usb_gpib/lpvo_usb_gpib.c:654: warning: expecting prototype for go_to_standby(). Prototype was for usb_gpib_go_to_standby() instead. drivers/staging/gpib/lpvo_usb_gpib/lpvo_usb_gpib.c:636: warning: expecting prototype for enable_eos(). Prototype was for usb_gpib_enable_eos() instead. drivers/staging/gpib/lpvo_usb_gpib/lpvo_usb_gpib.c:618: warning: expecting prototype for disable_eos(). Prototype was for usb_gpib_disable_eos() instead. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=12253 Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Link: https://lore.kernel.org/r/20241206022504.69670-1-jiapeng.chong@linux.alibaba.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-20 16:43:12 +01:00
Dave Penkler	fd1885db8e	staging: gpib: Add lower bound check for secondary address Commit `9dde4559e9` ("staging: gpib: Add GPIB common core driver") from Sep 18, 2024 (linux-next), leads to the following Smatch static checker warning: drivers/staging/gpib/common/gpib_os.c:541 dvrsp() warn: no lower bound on 'sad' rl='s32min-30' The value -1 was introduced in user land to signify No secondary address to the driver so that a lower bound check could be added. This patch adds that check. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/linux-staging/4efd91f3-4259-4e95-a4e0-925853b98858@stanley.mountain/ Signed-off-by: Dave Penkler <dpenkler@gmail.com> Link: https://lore.kernel.org/r/20241207123410.28759-1-dpenkler@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-20 16:43:02 +01:00
Dave Penkler	4da38536e2	staging: gpib: Fix erroneous removal of blank before newline The USB_GPIB_SET_LINES command string used to be: "\nIBDC \n" but when we were merging this code into the upstream kernel we deleted the space character before the newline to make checkpatch happy. That turned out to be a mistake. The "\nIBDC" part of the string is a command that we pass to the firmware and the next character is a variable u8 value. It gets set in set_control_line(). msg[leng - 2] = value ? (retval & ~line) : retval \| line; where leng is the length of the command string. Imagine the parameter was supposed to be "8". With the pre-merge code the command string would be "\nIBDC8\n" With the post-merge code the command string became "\nIBD8\n" The firmware doesn't recognize "IBD8" as a valid command and rejects it. Putting a "." where the parameter is supposed to go fixes the driver and makes checkpatch happy. Same thing with the other define and the in-line assignment. Reported-by: Marcello Carla' <marcello.carla@gmx.com> Fixes: `fce79512a9` ("staging: gpib: Add LPVO DIY USB GPIB driver") Co-developed-by: Marcello Carla' <marcello.carla@gmx.com> Signed-off-by: Marcello Carla' <marcello.carla@gmx.com> Signed-off-by: Dave Penkler <dpenkler@gmail.com> Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://lore.kernel.org/r/20241205093442.5796-1-dpenkler@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-20 16:39:13 +01:00
Kan Liang	aa5d2ca7c1	perf/x86/intel: Fix bitmask of OCR and FRONTEND events for LNC The released OCR and FRONTEND events utilized more bits on Lunar Lake p-core. The corresponding mask in the extra_regs has to be extended to unblock the extra bits. Add a dedicated intel_lnc_extra_regs. Fixes: `a932aa0e86` ("perf/x86: Add Lunar Lake and Arrow Lake support") Reported-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20241216160252.430858-1-kan.liang@linux.intel.com	2024-12-20 15:31:14 +01:00
NeilBrown	7917f01a28	nfsd: restore callback functionality for NFSv4.0 A recent patch inadvertently broke callbacks for NFSv4.0. In the 4.0 case we do not expect a session to be found but still need to call setup_callback_client() which will not try to dereference it. This patch moves the check for failure to find a session into the 4.1+ branch of setup_callback_client() Fixes: `1e02c641c3` ("NFSD: Prevent NULL dereference in nfsd4_process_cb_update()") Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-12-20 09:17:12 -05:00
Mark Brown	926e862058	arm64/signal: Silence sparse warning storing GCSPR_EL0 We are seeing a sparse warning in gcs_restore_signal(): arch/arm64/kernel/signal.c:1054:9: sparse: sparse: cast removes address space '__user' of expression when storing the final GCSPR_EL0 value back into the register, caused by the fact that write_sysreg_s() casts the value it writes to a u64 which sparse sees as discarding the __userness of the pointer. Avoid this by treating the address as an integer, casting to a pointer only when using it to write to userspace. While we're at it also inline gcs_signal_cap_valid() into it's one user and make equivalent updates to gcs_signal_entry(). Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202412082005.OBJ0BbWs-lkp@intel.com/ Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20241214-arm64-gcs-signal-sparse-v3-1-5e8d18fffc0c@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2024-12-20 14:12:04 +00:00
Takashi Iwai	8cbd01ba9c	Merge tag 'asoc-fix-v6.13-rc3' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus ASoC: Fixes for v6.13 A mix of quirks and small fixes, nothing too major anywhere.	2024-12-20 14:09:45 +01:00
David S. Miller	cc54ec56d8	Merge branch 'gve-xdp-fixes' Joshua Washington says: ==================== gve: various XDP fixes This patch series contains the following XDP fixes: - clean up XDP tx queue when stopping rings - use RCU synchronization to guard existence of XDP queues - perform XSK TX as part of RX NAPI to fix busy polling - fix XDP allocation issues when non-XDP configurations occur ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2024-12-20 12:49:42 +00:00
Joshua Washington	de63ac44a5	gve: fix XDP allocation path in edge cases This patch fixes a number of consistency issues in the queue allocation path related to XDP. As it stands, the number of allocated XDP queues changes in three different scenarios. 1) Adding an XDP program while the interface is up via gve_add_xdp_queues 2) Removing an XDP program while the interface is up via gve_remove_xdp_queues 3) After queues have been allocated and the old queue memory has been removed in gve_queues_start. However, the requirement for the interface to be up for gve_(add\|remove)_xdp_queues to be called, in conjunction with the fact that the number of queues stored in priv isn't updated until _after_ XDP queues have been allocated in the normal queue allocation path means that if an XDP program is added while the interface is down, XDP queues won't be added until the _second_ if_up, not the first. Given the expectation that the number of XDP queues is equal to the number of RX queues, scenario (3) has another problematic implication. When changing the number of queues while an XDP program is loaded, the number of XDP queues must be updated as well, as there is logic in the driver (gve_xdp_tx_queue_id()) which relies on every RX queue having a corresponding XDP TX queue. However, the number of XDP queues stored in priv would not be updated until _after_ a close/open leading to a mismatch in the number of XDP queues reported vs the number of XDP queues which actually exist after the queue count update completes. This patch remedies these issues by doing the following: 1) The allocation config getter function is set up to retrieve the _expected_ number of XDP queues to allocate instead of relying on the value stored in `priv` which is only updated once the queues have been allocated. 2) When adjusting queues, XDP queues are adjusted to match the number of RX queues when XDP is enabled. This only works in the case when queues are live, so part (1) of the fix must still be available in the case that queues are adjusted when there is an XDP program and the interface is down. Fixes: `5f08cd3d64` ("gve: Alloc before freeing when adjusting queues") Cc: stable@vger.kernel.org Signed-off-by: Joshua Washington <joshwash@google.com> Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Shailend Chand <shailend@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-12-20 12:49:42 +00:00
Joshua Washington	ba0925c34e	gve: process XSK TX descriptors as part of RX NAPI When busy polling is enabled, xsk_sendmsg for AF_XDP zero copy marks the NAPI ID corresponding to the memory pool allocated for the socket. In GVE, this NAPI ID will never correspond to a NAPI ID of one of the dedicated XDP TX queues registered with the umem because XDP TX is not set up to share a NAPI with a corresponding RX queue. This patch moves XSK TX descriptor processing from the TX NAPI to the RX NAPI, and the gve_xsk_wakeup callback is updated to use the RX NAPI instead of the TX NAPI, accordingly. The branch on if the wakeup is for TX is removed, as the NAPI poll should be invoked whether the wakeup is for TX or for RX. Fixes: `fd8e40321a` ("gve: Add AF_XDP zero-copy support for GQI-QPL format") Cc: stable@vger.kernel.org Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Signed-off-by: Joshua Washington <joshwash@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-12-20 12:49:41 +00:00
Joshua Washington	40338d7987	gve: guard XSK operations on the existence of queues This patch predicates the enabling and disabling of XSK pools on the existence of queues. As it stands, if the interface is down, disabling or enabling XSK pools would result in a crash, as the RX queue pointer would be NULL. XSK pool registration will occur as part of the next interface up. Similarly, xsk_wakeup needs be guarded against queues disappearing while the function is executing, so a check against the GVE_PRIV_FLAGS_NAPI_ENABLED flag is added to synchronize with the disabling of the bit and the synchronize_net() in gve_turndown. Fixes: `fd8e40321a` ("gve: Add AF_XDP zero-copy support for GQI-QPL format") Cc: stable@vger.kernel.org Signed-off-by: Joshua Washington <joshwash@google.com> Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Shailend Chand <shailend@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-12-20 12:49:41 +00:00
Joshua Washington	ff7c2dea9d	gve: guard XDP xmit NDO on existence of xdp queues In GVE, dedicated XDP queues only exist when an XDP program is installed and the interface is up. As such, the NDO XDP XMIT callback should return early if either of these conditions are false. In the case of no loaded XDP program, priv->num_xdp_queues=0 which can cause a divide-by-zero error, and in the case of interface down, num_xdp_queues remains untouched to persist XDP queue count for the next interface up, but the TX pointer itself would be NULL. The XDP xmit callback also needs to synchronize with a device transitioning from open to close. This synchronization will happen via the GVE_PRIV_FLAGS_NAPI_ENABLED bit along with a synchronize_net() call, which waits for any RCU critical sections at call-time to complete. Fixes: `39a7f4aa3e` ("gve: Add XDP REDIRECT support for GQI-QPL format") Cc: stable@vger.kernel.org Signed-off-by: Joshua Washington <joshwash@google.com> Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Shailend Chand <shailend@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-12-20 12:49:41 +00:00
Joshua Washington	6321f5fb70	gve: clean XDP queues in gve_tx_stop_ring_gqi When stopping XDP TX rings, the XDP clean function needs to be called to clean out the entire queue, similar to what happens in the normal TX queue case. Otherwise, the FIFO won't be cleared correctly, and xsk_tx_completed won't be reported. Fixes: `75eaae158b` ("gve: Add XDP DROP and TX support for GQI-QPL format") Cc: stable@vger.kernel.org Signed-off-by: Joshua Washington <joshwash@google.com> Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-12-20 12:49:41 +00:00
Takashi Iwai	66a0a2b047	ALSA: sh: Fix wrong argument order for copy_from_iter() Fix a brown paper bag bug I introduced at converting to the standard iter helper; the arguments were wrongly passed and have to be swapped. Fixes: `9b5f8ee43e` ("ALSA: sh: Use standard helper for buffer accesses") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202412140019.jat5Dofr-lkp@intel.com/ Link: https://patch.msgid.link/20241220114417.5898-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-12-20 12:45:38 +01:00
Li Zhijian	55853cb829	selftests/alsa: Fix circular dependency involving global-timer The pattern rule `$(OUTPUT)/%: %.c` inadvertently included a circular dependency on the global-timer target due to its inclusion in $(TEST_GEN_PROGS_EXTENDED). This resulted in a circular dependency warning during the build process. To resolve this, the dependency on $(TEST_GEN_PROGS_EXTENDED) has been replaced with an explicit dependency on $(OUTPUT)/libatest.so. This change ensures that libatest.so is built before any other targets that require it, without creating a circular dependency. This fix addresses the following warning: make[4]: Entering directory 'tools/testing/selftests/alsa' make[4]: Circular default_modconfig/kselftest/alsa/global-timer <- default_modconfig/kselftest/alsa/global-timer dependency dropped. make[4]: Nothing to be done for 'all'. make[4]: Leaving directory 'tools/testing/selftests/alsa' Cc: Mark Brown <broonie@kernel.org> Cc: Jaroslav Kysela <perex@perex.cz> Cc: Takashi Iwai <tiwai@suse.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Link: https://patch.msgid.link/20241218025931.914164-1-lizhijian@fujitsu.com Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-12-20 10:00:41 +01:00
Fedor Pchelkin	fa0308134d	ALSA: memalloc: prefer dma_mapping_error() over explicit address checking With CONFIG_DMA_API_DEBUG enabled, the following warning is observed: DMA-API: snd_hda_intel 0000:03:00.1: device driver failed to check map error[device address=0x00000000ffff0000] [size=20480 bytes] [mapped as single] WARNING: CPU: 28 PID: 2255 at kernel/dma/debug.c:1036 check_unmap+0x1408/0x2430 CPU: 28 UID: 42 PID: 2255 Comm: wireplumber Tainted: G W L 6.12.0-10-133577cad6bf48e5a7848c4338124081393bfe8a+ #759 debug_dma_unmap_page+0xe9/0xf0 snd_dma_wc_free+0x85/0x130 [snd_pcm] snd_pcm_lib_free_pages+0x1e3/0x440 [snd_pcm] snd_pcm_common_ioctl+0x1c9a/0x2960 [snd_pcm] snd_pcm_ioctl+0x6a/0xc0 [snd_pcm] ... Check for returned DMA addresses using specialized dma_mapping_error() helper which is generally recommended for this purpose by Documentation/core-api/dma-api.rst. Fixes: `c880a51466` ("ALSA: memalloc: Use proper DMA mapping API for x86 WC buffer allocations") Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Closes: https://lore.kernel.org/r/CABXGCsNB3RsMGvCucOy3byTEOxoc-Ys+zB_HQ=Opb_GhX1ioDA@mail.gmail.com/ Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru> Link: https://patch.msgid.link/20241219203345.195898-1-pchelkin@ispras.ru Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-12-20 09:54:12 +01:00
Jaroslav Kysela	3d3f43fab4	ALSA: compress_offload: improve file descriptors installation for dma-buf Avoid to use single dma_buf_fd() call for both directions. This code ensures that both file descriptors are allocated before fd_install(). Link: https://lore.kernel.org/linux-sound/6a923647-4495-4cff-a253-b73f48cfd0ea@stanley.mountain/ Fixes: `04177158cf` ("ALSA: compress_offload: introduce accel operation mode") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Cc: Vinod Koul <vkoul@kernel.org> Signed-off-by: Jaroslav Kysela <perex@perex.cz> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/20241217100726.732863-1-perex@perex.cz	2024-12-20 09:52:15 +01:00
Jaroslav Kysela	f25a51b47c	ALSA: compress_offload: use safe list iteration in snd_compr_task_seq() The sequence function can call snd_compr_task_free_one(). Use list_for_each_entry_safe_reverse() to make sure that the used pointers are safe. Link: https://lore.kernel.org/linux-sound/f2769cff-6c7a-4092-a2d1-c33a5411a182@stanley.mountain/ Fixes: `04177158cf` ("ALSA: compress_offload: introduce accel operation mode") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Cc: Vinod Koul <vkoul@kernel.org> Signed-off-by: Jaroslav Kysela <perex@perex.cz> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/20241217100707.732766-1-perex@perex.cz	2024-12-20 09:51:36 +01:00
Arnd Bergmann	6018f2fe10	ALSA: compress_offload: avoid 64-bit get_user() On some architectures, get_user() cannot read a 64-bit user variable: arm-linux-gnueabi-ld: sound/core/compress_offload.o: in function `snd_compr_ioctl': compress_offload.c:(.text.snd_compr_ioctl+0x538): undefined reference to `__get_user_bad' Use an equivalent copy_from_user() instead. Fixes: `04177158cf` ("ALSA: compress_offload: introduce accel operation mode") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Shengjiu Wang <shengjiu.wang@gmail.com> Acked-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/20241216093410.377112-2-arnd@kernel.org	2024-12-20 09:49:15 +01:00
Arnd Bergmann	1ae40d5231	ALSA: compress_offload: import DMA_BUF namespace The compression offload code cannot be in a loadable module unless it imports that namespace: ERROR: modpost: module snd-compress uses symbol dma_buf_get from namespace DMA_BUF, but does not import it. ERROR: modpost: module snd-compress uses symbol dma_buf_put from namespace DMA_BUF, but does not import it. ERROR: modpost: module snd-compress uses symbol dma_buf_fd from namespace DMA_BUF, but does not import it. Fixes: `04177158cf` ("ALSA: compress_offload: introduce accel operation mode") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Shengjiu Wang <shengjiu.wang@gmail.com> Acked-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/20241216093410.377112-1-arnd@kernel.org	2024-12-20 09:49:15 +01:00
Dave Airlie	e639fb046b	Merge tag 'amd-drm-fixes-6.13-2024-12-18' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.13-2024-12-18: amdgpu: - Disable BOCO when CONFIG_HOTPLUG_PCI_PCIE is not enabled - scheduler job fixes - IP version check fixes - devcoredump fix - GPUVM update fix - NBIO 2.5 fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241218204637.2966198-1-alexander.deucher@amd.com	2024-12-20 16:21:44 +10:00
Jakub Kicinski	b6075c8053	Merge tag 'wireless-2024-12-19' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Kalle Valo says: ==================== wireless fixes for v6.13-rc5 Few minor fixes this time, nothing special. * tag 'wireless-2024-12-19' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: wifi: cw1200: Fix potential NULL dereference wifi: iwlwifi: mvm: Fix __counted_by usage in cfg80211_wowlan_nd_* MAINTAINERS: wifi: ath: add Jeff Johnson as maintainer wifi: iwlwifi: fix CRF name for Bz ==================== Link: https://patch.msgid.link/20241219185042.662B6C4CECE@smtp.kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-19 18:53:15 -08:00
Jakub Kicinski	c0cc126882	Merge branch 'net-dsa-microchip-fix-set_ageing_time-function-for-ksz9477-and-lan937x-switches' Tristram Ha says: ==================== net: dsa: microchip: Fix set_ageing_time function for KSZ9477 and LAN937X switches The aging count is not a simple number that is broken into two parts and programmed into 2 registers. These patches correct the programming for KSZ9477 and LAN937X switches. ==================== Link: https://patch.msgid.link/20241218020224.70590-1-Tristram.Ha@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-19 18:04:06 -08:00
Tristram Ha	bb98690434	net: dsa: microchip: Fix LAN937X set_ageing_time function The aging count is not a simple 20-bit value but comprises a 3-bit multiplier and a 20-bit second time. The code tries to use the original multiplier which is 4 as the second count is still 300 seconds by default. As the 20-bit number is now too large for practical use there is an option to interpret it as microseconds instead of seconds. Fixes: `2c119d9982` ("net: dsa: microchip: add the support for set_ageing_time") Signed-off-by: Tristram Ha <tristram.ha@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241218020224.70590-3-Tristram.Ha@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-19 18:04:04 -08:00
Tristram Ha	262bfba8ab	net: dsa: microchip: Fix KSZ9477 set_ageing_time function The aging count is not a simple 11-bit value but comprises a 3-bit multiplier and an 8-bit second count. The code tries to use the original multiplier which is 4 as the second count is still 300 seconds by default. Fixes: `2c119d9982` ("net: dsa: microchip: add the support for set_ageing_time") Signed-off-by: Tristram Ha <tristram.ha@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241218020224.70590-2-Tristram.Ha@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-19 18:04:04 -08:00
Biju Das	79d67c499c	drm: adv7511: Drop dsi single lane support As per [1] and [2], ADV7535/7533 supports only 2-, 3-, or 4-lane. Drop unsupported 1-lane. [1] https://www.analog.com/media/en/technical-documentation/data-sheets/ADV7535.pdf [2] https://www.analog.com/media/en/technical-documentation/data-sheets/ADV7533.pdf Fixes: `1e4d58cd7f` ("drm/bridge: adv7533: Create a MIPI DSI device") Reported-by: Hien Huynh <hien.huynh.px@renesas.com> Cc: stable@vger.kernel.org Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Reviewed-by: Adam Ford <aford173@gmail.com> Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241119192040.152657-4-biju.das.jz@bp.renesas.com Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>	2024-12-20 03:53:25 +02:00
Biju Das	ee8f9ed57a	dt-bindings: display: adi,adv7533: Drop single lane support As per [1] and [2], ADV7535/7533 supports only 2-, 3-, or 4-lane. Drop unsupported 1-lane from bindings. [1] https://www.analog.com/media/en/technical-documentation/data-sheets/ADV7535.pdf [2] https://www.analog.com/media/en/technical-documentation/data-sheets/ADV7533.pdf Fixes: `1e4d58cd7f` ("drm/bridge: adv7533: Create a MIPI DSI device") Cc: stable@vger.kernel.org Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241119192040.152657-3-biju.das.jz@bp.renesas.com Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>	2024-12-20 03:53:25 +02:00
Biju Das	81adbd3ff2	drm: adv7511: Fix use-after-free in adv7533_attach_dsi() The host_node pointer was assigned and freed in adv7533_parse_dt(), and later, adv7533_attach_dsi() uses the same. Fix this use-after-free issue by dropping of_node_put() in adv7533_parse_dt() and calling of_node_put() in error path of probe() and also in the remove(). Fixes: `1e4d58cd7f` ("drm/bridge: adv7533: Create a MIPI DSI device") Cc: stable@vger.kernel.org Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241119192040.152657-2-biju.das.jz@bp.renesas.com Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>	2024-12-20 03:53:25 +02:00
Stefan Ekenberg	902806baf3	drm/bridge: adv7511_audio: Update Audio InfoFrame properly AUDIO_UPDATE bit (Bit 5 of MAIN register 0x4A) needs to be set to 1 while updating Audio InfoFrame information and then set to 0 when done. Otherwise partially updated Audio InfoFrames could be sent out. Two cases where this rule were not followed are fixed: - In adv7511_hdmi_hw_params() make sure AUDIO_UPDATE bit is updated before/after setting ADV7511_REG_AUDIO_INFOFRAME. - In audio_startup() use the correct register for clearing AUDIO_UPDATE bit. The problem with corrupted audio infoframes were discovered by letting a HDMI logic analyser check the output of ADV7535. Note that this patchs replaces writing REG_GC(1) with REG_INFOFRAME_UPDATE. Bit 5 of REG_GC(1) is positioned within field GC_PP[3:0] and that field doesn't control audio infoframe and is read- only. My conclusion therefore was that the author if this code meant to clear bit 5 of REG_INFOFRAME_UPDATE from the very beginning. Tested-by: Biju Das <biju.das.jz@bp.renesas.com> Fixes: `53c515befe` ("drm/bridge: adv7511: Add Audio support") Signed-off-by: Stefan Ekenberg <stefan.ekenberg@axis.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20241119-adv7511-audio-info-frame-v4-1-4ae68e76c89c@axis.com Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>	2024-12-20 03:53:02 +02:00
Sean Christopherson	386d69f9f2	KVM: x86/mmu: Treat TDP MMU faults as spurious if access is already allowed Treat slow-path TDP MMU faults as spurious if the access is allowed given the existing SPTE to fix a benign warning (other than the WARN itself) due to replacing a writable SPTE with a read-only SPTE, and to avoid the unnecessary LOCK CMPXCHG and subsequent TLB flush. If a read fault races with a write fault, fast GUP fails for any reason when trying to "promote" the read fault to a writable mapping, and KVM resolves the write fault first, then KVM will end up trying to install a read-only SPTE (for a !map_writable fault) overtop a writable SPTE. Note, it's not entirely clear why fast GUP fails, or if that's even how KVM ends up with a !map_writable fault with a writable SPTE. If something else is going awry, e.g. due to a bug in mmu_notifiers, then treating read faults as spurious in this scenario could effectively mask the underlying problem. However, retrying the faulting access instead of overwriting an existing SPTE is functionally correct and desirable irrespective of the WARN, and fast GUP _can_ legitimately fail with a writable VMA, e.g. if the Accessed bit in primary MMU's PTE is toggled and causes a PTE value mismatch. The WARN was also recently added, specifically to track down scenarios where KVM is unnecessarily overwrites SPTEs, i.e. treating the fault as spurious doesn't regress KVM's bug-finding capabilities in any way. In short, letting the WARN linger because there's a tiny chance it's due to a bug elsewhere would be excessively paranoid. Fixes: `1a175082b1` ("KVM: x86/mmu: WARN and flush if resolving a TDP MMU fault clears MMU-writable") Reported-by: Lei Yang <leiyang@redhat.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219588 Tested-by: Lei Yang <leiyang@redhat.com> Link: https://lore.kernel.org/r/20241218213611.3181643-1-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-12-19 17:47:52 -08:00
Sean Christopherson	4d5163cba4	KVM: SVM: Allow guest writes to set MSR_AMD64_DE_CFG bits Drop KVM's arbitrary behavior of making DE_CFG.LFENCE_SERIALIZE read-only for the guest, as rejecting writes can lead to guest crashes, e.g. Windows in particular doesn't gracefully handle unexpected #GPs on the WRMSR, and nothing in the AMD manuals suggests that LFENCE_SERIALIZE is read-only _if it exists_. KVM only allows LFENCE_SERIALIZE to be set, by the guest or host, if the underlying CPU has X86_FEATURE_LFENCE_RDTSC, i.e. if LFENCE is guaranteed to be serializing. So if the guest sets LFENCE_SERIALIZE, KVM will provide the desired/correct behavior without any additional action (the guest's value is never stuffed into hardware). And having LFENCE be serializing even when it's not _required_ to be is a-ok from a functional perspective. Fixes: `74a0e79df6` ("KVM: SVM: Disallow guest from changing userspace's MSR_AMD64_DE_CFG value") Fixes: `d1d93fa90f` ("KVM: SVM: Add MSR-based feature support for serializing LFENCE") Reported-by: Simon Pilkington <simonp.git@mailbox.org> Closes: https://lore.kernel.org/all/52914da7-a97b-45ad-86a0-affdf8266c61@mailbox.org Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Link: https://lore.kernel.org/r/20241211172952.1477605-1-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-12-19 17:47:52 -08:00
Sean Christopherson	9b42d1e8e4	KVM: x86: Play nice with protected guests in complete_hypercall_exit() Use is_64_bit_hypercall() instead of is_64_bit_mode() to detect a 64-bit hypercall when completing said hypercall. For guests with protected state, e.g. SEV-ES and SEV-SNP, KVM must assume the hypercall was made in 64-bit mode as the vCPU state needed to detect 64-bit mode is unavailable. Hacking the sev_smoke_test selftest to generate a KVM_HC_MAP_GPA_RANGE hypercall via VMGEXIT trips the WARN: ------------[ cut here ]------------ WARNING: CPU: 273 PID: 326626 at arch/x86/kvm/x86.h:180 complete_hypercall_exit+0x44/0xe0 [kvm] Modules linked in: kvm_amd kvm ... [last unloaded: kvm] CPU: 273 UID: 0 PID: 326626 Comm: sev_smoke_test Not tainted 6.12.0-smp--392e932fa0f3-feat #470 Hardware name: Google Astoria/astoria, BIOS 0.20240617.0-0 06/17/2024 RIP: 0010:complete_hypercall_exit+0x44/0xe0 [kvm] Call Trace: <TASK> kvm_arch_vcpu_ioctl_run+0x2400/0x2720 [kvm] kvm_vcpu_ioctl+0x54f/0x630 [kvm] __se_sys_ioctl+0x6b/0xc0 do_syscall_64+0x83/0x160 entry_SYSCALL_64_after_hwframe+0x76/0x7e </TASK> ---[ end trace 0000000000000000 ]--- Fixes: `b5aead0064` ("KVM: x86: Assume a 64-bit hypercall for guests with protected state") Cc: stable@vger.kernel.org Cc: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Reviewed-by: Nikunj A Dadhania <nikunj@amd.com> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com> Reviewed-by: Kai Huang <kai.huang@intel.com> Link: https://lore.kernel.org/r/20241128004344.4072099-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-12-19 17:47:51 -08:00
Suravee Suthikulpanit	d81cadbe16	KVM: SVM: Disable AVIC on SNP-enabled system without HvInUseWrAllowed feature On SNP-enabled system, VMRUN marks AVIC Backing Page as in-use while the guest is running for both secure and non-secure guest. Any hypervisor write to the in-use vCPU's AVIC backing page (e.g. to inject an interrupt) will generate unexpected #PF in the host. Currently, attempt to run AVIC guest would result in the following error: BUG: unable to handle page fault for address: ff3a442e549cc270 #PF: supervisor write access in kernel mode #PF: error_code(0x80000003) - RMP violation PGD b6ee01067 P4D b6ee02067 PUD 10096d063 PMD 11c540063 PTE 80000001149cc163 SEV-SNP: PFN 0x1149cc unassigned, dumping non-zero entries in 2M PFN region: [0x114800 - 0x114a00] ... Newer AMD system is enhanced to allow hypervisor to modify the backing page for non-secure guest on SNP-enabled system. This enhancement is available when the CPUID Fn8000_001F_EAX bit 30 is set (HvInUseWrAllowed). This table describes AVIC support matrix w.r.t. SNP enablement: \| Non-SNP system \| SNP system ----------------------------------------------------- Non-SNP guest \| AVIC Activate \| AVIC Activate iff \| \| HvInuseWrAllowed=1 ----------------------------------------------------- SNP guest \| N/A \| Secure AVIC Therefore, check and disable AVIC in kvm_amd driver when the feature is not available on SNP-enabled system. See the AMD64 Architecture Programmer’s Manual (APM) Volume 2 for detail. (https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/ programmer-references/40332.pdf) Fixes: `216d106c7f` ("x86/sev: Add SEV-SNP host initialization support") Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Link: https://lore.kernel.org/r/20241104075845.7583-1-suravee.suthikulpanit@amd.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2024-12-19 17:47:51 -08:00
Wei Fang	25c6a5ab15	net: phy: micrel: Dynamically control external clock of KSZ PHY On the i.MX6ULL-14x14-EVK board, enet1_ref and enet2_ref are used as the clock sources for two external KSZ PHYs. However, after closing the two FEC ports, the clk_enable_count of the enet1_ref and enet2_ref clocks is not 0. The root cause is that since the commit `9853294627` ("net: phy: micrel: use devm_clk_get_optional_enabled for the rmii-ref clock"), the external clock of KSZ PHY has been enabled when the PHY driver probes, and it can only be disabled when the PHY driver is removed. This causes the clock to continue working when the system is suspended or the network port is down. Although Heiko explained in the commit message that the patch was because some clock suppliers need to enable the clock to get the valid clock rate , it seems that the simple fix is to disable the clock after getting the clock rate to solve the current problem. This is indeed true, but we need to admit that Heiko's patch has been applied for more than a year, and we cannot guarantee whether there are platforms that only enable rmii-ref in the KSZ PHY driver during this period. If this is the case, disabling rmii-ref will cause RMII on these platforms to not work. Secondly, commit `99ac4cbcc2` ("net: phy: micrel: allow usage of generic ethernet-phy clock") just simply enables the generic clock permanently, which seems like the generic clock may only be enabled in the PHY driver. If we simply disable the generic clock, RMII may not work. If we keep it as it is, the platform using the generic clock will have the same problem as the i.MX6ULL platform. To solve this problem, the clock is enabled when phy_driver::resume() is called, and the clock is disabled when phy_driver::suspend() is called. Since phy_driver::resume() and phy_driver::suspend() are not called in pairs, an additional clk_enable flag is added. When phy_driver::suspend() is called, the clock is disabled only if clk_enable is true. Conversely, when phy_driver::resume() is called, the clock is enabled if clk_enable is false. The changes that introduced the problem were only a few lines, while the current fix is about a hundred lines, which seems out of proportion, but it is necessary because kszphy_probe() is used by multiple KSZ PHYs and we need to fix all of them. Fixes: `9853294627` ("net: phy: micrel: use devm_clk_get_optional_enabled for the rmii-ref clock") Fixes: `99ac4cbcc2` ("net: phy: micrel: allow usage of generic ethernet-phy clock") Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20241217063500.1424011-1-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-19 17:29:20 -08:00
Dave Airlie	87fd883325	Merge tag 'drm-misc-fixes-2024-12-19' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes drm-misc-fixes for v6.13-rc4: - udma-buf fixes related to sealing. - dma-buf build warning fix when debugfs is not enabled. - Assorted drm/panel fixes. - Correct error return in drm_dp_tunnel_mgr_create. - Fix even more divide by zero in drm_mode_vrefresh. - Fix FBDEV dependencies in Kconfig. - Documentation fix for drm_sched_fini. - IVPU NULL pointer, memory leak and WARN fix. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/d0763051-87b7-483e-89e0-a9f993383450@linux.intel.com	2024-12-20 07:13:45 +10:00
Pavel Begunkov	dbd2ca9367	io_uring: check if iowq is killed before queuing task work can be executed after the task has gone through io_uring termination, whether it's the final task_work run or the fallback path. In this case, task work will find ->io_wq being already killed and null'ed, which is a problem if it then tries to forward the request to io_queue_iowq(). Make io_queue_iowq() fail requests in this case. Note that it also checks PF_KTHREAD, because the user can first close a DEFER_TASKRUN ring and shortly after kill the task, in which case ->iowq check would race. Cc: stable@vger.kernel.org Fixes: `50c52250e2` ("block: implement async io_uring discard cmd") Fixes: `773af69121` ("io_uring: always reissue from task_work context") Reported-by: Will <willsroot@protonmail.com> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/63312b4a2c2bb67ad67b857d17a300e1d3b078e8.1734637909.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-19 13:31:53 -07:00
Johan Hovold	7db0ba3e6e	Revert "arm64: dts: qcom: x1e80100: enable OTG on USB-C controllers" This reverts commit `f042bc234c`. A recent change enabling role switching for the x1e80100 USB-C controllers breaks UCSI and DisplayPort Alternate Mode when the controllers are in host mode: ucsi_glink.pmic_glink_ucsi pmic_glink.ucsi.0: PPM init failed, stop trying As enabling OTG mode currently breaks SuperSpeed hotplug and suspend, and with retimer (and orientation detection) support not even merged yet, let's revert at least until we have stable host mode in mainline. Fixes: `f042bc234c` ("arm64: dts: qcom: x1e80100: enable OTG on USB-C controllers") Reported-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/all/hw2pdof4ajadjsjrb44f2q4cz4yh5qcqz5d3l7gjt2koycqs3k@xx5xvd26uyef Link: https://lore.kernel.org/lkml/Z1gbyXk-SktGjL6-@hovoldconsulting.com/ Cc: Jonathan Marek <jonathan@marek.ca> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Reviewed-by: Abel Vesa <abel.vesa@linaro.org> Link: https://lore.kernel.org/r/20241210111444.26240-4-johan+linaro@kernel.org Signed-off-by: Bjorn Andersson <andersson@kernel.org>	2024-12-19 12:44:30 -06:00
Johan Hovold	2e5e1a7ea6	Revert "arm64: dts: qcom: x1e80100-crd: enable otg on usb ports" This reverts commit `2dd3250191`. A recent change enabling OTG mode on the x1e81000 CRD breaks suspend. Specifically, the device hard resets during resume if suspended with all controllers in device mode (i.e. no USB device connected). The corresponding change on the T14s also led to SuperSpeed hotplugs not being detected. With retimer (and orientation detection) support not even merged yet, let's revert at least until we have stable host mode in mainline. Fixes: `2dd3250191` ("arm64: dts: qcom: x1e80100-crd: enable otg on usb ports") Reported-by: Abel Vesa <abel.vesa@linaro.org> Cc: Jonathan Marek <jonathan@marek.ca> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Reviewed-by: Abel Vesa <abel.vesa@linaro.org> Link: https://lore.kernel.org/r/20241210111444.26240-3-johan+linaro@kernel.org Signed-off-by: Bjorn Andersson <andersson@kernel.org>	2024-12-19 12:44:30 -06:00
Bharath SM	92941c7f2c	smb: fix bytes written value in /proc/fs/cifs/Stats With recent netfs apis changes, the bytes written value was not getting updated in /proc/fs/cifs/Stats. Fix this by updating tcon->bytes in write operations. Fixes: `3ee1a1fc39` ("cifs: Cut over to using netfslib") Signed-off-by: Bharath SM <bharathsm@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2024-12-19 12:14:11 -06:00
Linus Torvalds	8faabc041a	Merge tag 'net-6.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from can and netfilter. Current release - regressions: - rtnetlink: try the outer netns attribute in rtnl_get_peer_net() - rust: net::phy fix module autoloading Current release - new code bugs: - phy: avoid undefined behavior in _led_polarity_set() - eth: octeontx2-pf: fix netdev memory leak in rvu_rep_create() Previous releases - regressions: - smc: check sndbuf_space again after NOSPACE flag is set in smc_poll - ipvs: fix clamp() of ip_vs_conn_tab on small memory systems - dsa: restore dsa_software_vlan_untag() ability to operate on VLAN-untagged traffic - eth: - tun: fix tun_napi_alloc_frags() - ionic: no double destroy workqueue - idpf: trigger SW interrupt when exiting wb_on_itr mode - rswitch: rework ts tags management - team: fix feature exposure when no ports are present Previous releases - always broken: - core: fix repeated netlink messages in queue dump - mdiobus: fix an OF node reference leak - smc: check iparea_offset and ipv6_prefixes_cnt when receiving proposal msg - can: fix missed interrupts with m_can_pci - eth: oa_tc6: fix infinite loop error when tx credits becomes 0" tag 'net-6.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (45 commits) net: mctp: handle skb cleanup on sock_queue failures net: mdiobus: fix an OF node reference leak octeontx2-pf: fix error handling of devlink port in rvu_rep_create() octeontx2-pf: fix netdev memory leak in rvu_rep_create() psample: adjust size if rate_as_probability is set netdev-genl: avoid empty messages in queue dump net: dsa: restore dsa_software_vlan_untag() ability to operate on VLAN-untagged traffic selftests: openvswitch: fix tcpdump execution net: usb: qmi_wwan: add Quectel RG255C net: phy: avoid undefined behavior in *_led_polarity_set() netfilter: ipset: Fix for recursive locking warning ipvs: Fix clamp() of ip_vs_conn_tab on small memory systems can: m_can: fix missed interrupts with m_can_pci can: m_can: set init flag earlier in probe rtnetlink: Try the outer netns attribute in rtnl_get_peer_net(). net: netdevsim: fix nsim_pp_hold_write() idpf: trigger SW interrupt when exiting wb_on_itr mode idpf: add support for SW triggered interrupts qed: fix possible uninit pointer read in qed_mcp_nvm_info_populate() net: ethernet: bgmac-platform: fix an OF node reference leak ...	2024-12-19 09:19:11 -08:00
Linus Torvalds	baaa2567a7	Merge tag 'mmc-v6.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC fixes from Ulf Hansson: - mtk-sd: Cleanup the wakeup configuration in error/remove-path - sdhci-tegra: Correct quirk for ADMA2 length * tag 'mmc-v6.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: mtk-sd: disable wakeup in .remove() and in the error path of .probe() mmc: sdhci-tegra: Remove SDHCI_QUIRK_BROKEN_ADMA_ZEROLEN_DESC quirk	2024-12-19 08:53:51 -08:00
Linus Torvalds	a0db71c7fe	Merge tag 'pwm/for-6.13-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux Pull pwm fix from Uwe Kleine-König: "Fix regression in pwm-stm32 driver when converting to new waveform support Fabrice Gasnier found and fixed a regression I introduced with v6.13-rc1 when converting the stm32 pwm driver to support the new waveform stuff. On some hardware variants this completely broke the driver" * tag 'pwm/for-6.13-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux: pwm: stm32: Fix complementary output in round_waveform_tohw()	2024-12-19 08:50:05 -08:00
Linus Torvalds	466b2d40f6	Merge tag 'v6.13-rc3-ksmbd-server-fixes' of git://git.samba.org/ksmbd Pull smb server fixes from Steve French: - Two fixes for better handling maximum outstanding requests - Fix simultaneous negotiate protocol race * tag 'v6.13-rc3-ksmbd-server-fixes' of git://git.samba.org/ksmbd: ksmbd: conn lock to serialize smb2 negotiate ksmbd: fix broken transfers when exceeding max simultaneous operations ksmbd: count all requests in req_running counter	2024-12-19 08:45:37 -08:00
Lukas Wunner	774c71c52a	PCI/bwctrl: Enable only if more than one speed is supported If a PCIe port only supports a single speed, enabling bandwidth control is pointless: There's no need to monitor autonomous speed changes, nor can the speed be changed. Not enabling it saves a small amount of memory and compute resources, but also fixes a boot hang reported by Niklas: It occurs when enabling bandwidth control on Downstream Ports of Intel JHL7540 "Titan Ridge 2018" Thunderbolt controllers. The ports only support 2.5 GT/s in accordance with USB4 v2 sec 11.2.1, so the present commit works around the issue. PCIe r6.2 sec 8.2.1 prescribes that: "A device must support 2.5 GT/s and is not permitted to skip support for any data rates between 2.5 GT/s and the highest supported rate." Consequently, bandwidth control is currently only disabled if a port doesn't support higher speeds than 2.5 GT/s. However the Implementation Note in PCIe r6.2 sec 7.5.3.18 cautions: "It is strongly encouraged that software primarily utilize the Supported Link Speeds Vector instead of the Max Link Speed field, so that software can determine the exact set of supported speeds on current and future hardware. This can avoid software being confused if a future specification defines Links that do not require support for all slower speeds." In other words, future revisions of the PCIe Base Spec may allow gaps in the Supported Link Speeds Vector. To be future-proof, don't just check whether speeds above 2.5 GT/s are supported, but rather check whether more than one speed is supported. Fixes: `665745f274` ("PCI/bwctrl: Re-add BW notification portdrv as PCIe BW controller") Closes: https://lore.kernel.org/r/db8e457fcd155436449b035e8791a8241b0df400.camel@kernel.org Link: https://lore.kernel.org/r/3564908a9c99fc0d2a292473af7a94ebfc8f5820.1734428762.git.lukas@wunner.de Reported-by: Niklas Schnelle <niks@kernel.org> Tested-by: Niklas Schnelle <niks@kernel.org> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Jonathan Cameron <Jonthan.Cameron@huawei.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>	2024-12-19 16:36:36 +00:00
Lukas Wunner	3202ca2215	PCI: Honor Max Link Speed when determining supported speeds The Supported Link Speeds Vector in the Link Capabilities 2 Register indicates the supported link speeds. The Max Link Speed field in the Link Capabilities Register indicates the maximum of those speeds. pcie_get_supported_speeds() neglects to honor the Max Link Speed field and will thus incorrectly deem higher speeds as supported. Fix it. One user-visible issue addressed here is an incorrect value in the sysfs attribute "max_link_speed". But the main motivation is a boot hang reported by Niklas: Intel JHL7540 "Titan Ridge 2018" Thunderbolt controllers supports 2.5-8 GT/s speeds, but indicate 2.5 GT/s as maximum. Ilpo recalls seeing this on more devices. It can be explained by the controller's Downstream Ports supporting 8 GT/s if an Endpoint is attached, but limiting to 2.5 GT/s if the port interfaces to a PCIe Adapter, in accordance with USB4 v2 sec 11.2.1: "This section defines the functionality of an Internal PCIe Port that interfaces to a PCIe Adapter. [...] The Logical sub-block shall update the PCIe configuration registers with the following characteristics: [...] Max Link Speed field in the Link Capabilities Register set to 0001b (data rate of 2.5 GT/s only). Note: These settings do not represent actual throughput. Throughput is implementation specific and based on the USB4 Fabric performance." The present commit is not sufficient on its own to fix Niklas' boot hang, but it is a prerequisite: A subsequent commit will fix the boot hang by enabling bandwidth control only if more than one speed is supported. The GENMASK() macro used herein specifies 0 as lowest bit, even though the Supported Link Speeds Vector ends at bit 1. This is done on purpose to avoid a GENMASK(0, 1) macro if Max Link Speed is zero. That macro would be invalid as the lowest bit is greater than the highest bit. Ilpo has witnessed a zero Max Link Speed on Root Complex Integrated Endpoints in particular, so it does occur in practice. (The Link Capabilities Register is optional on RCiEPs per PCIe r6.2 sec 7.5.3.) Fixes: `d2bd39c045` ("PCI: Store all PCIe Supported Link Speeds") Closes: https://lore.kernel.org/r/70829798889c6d779ca0f6cd3260a765780d1369.camel@kernel.org Link: https://lore.kernel.org/r/fe03941e3e1cc42fb9bf4395e302bff53ee2198b.1734428762.git.lukas@wunner.de Reported-by: Niklas Schnelle <niks@kernel.org> Tested-by: Niklas Schnelle <niks@kernel.org> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>	2024-12-19 16:35:59 +00:00
Jens Axboe	c261e4f1dd	io_uring/register: limit ring resizing to DEFER_TASKRUN With DEFER_TASKRUN, we know the ring can't be both waited upon and resized at the same time. This is important for CQ resizing. Allowing SQ ring resizing is more trivial, but isn't the interesting use case. Hence limit ring resizing in general to DEFER_TASKRUN only for now. This isn't a huge problem as CQ ring resizing is generally the most useful on networking type of workloads where it can be hard to size the ring appropriately upfront, and those should be using DEFER_TASKRUN for better performance. Fixes: `79cfe9e59c` ("io_uring/register: add IORING_REGISTER_RESIZE_RINGS") Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-19 09:32:26 -07:00
Tvrtko Ursulin	de35994ecd	workqueue: Do not warn when cancelling WQ_MEM_RECLAIM work from !WQ_MEM_RECLAIM worker After commit `746ae46c11` ("drm/sched: Mark scheduler work queues with WQ_MEM_RECLAIM") amdgpu started seeing the following warning: [ ] workqueue: WQ_MEM_RECLAIM sdma0:drm_sched_run_job_work [gpu_sched] is flushing !WQ_MEM_RECLAIM events:amdgpu_device_delay_enable_gfx_off [amdgpu] ... [ ] Workqueue: sdma0 drm_sched_run_job_work [gpu_sched] ... [ ] Call Trace: [ ] <TASK> ... [ ] ? check_flush_dependency+0xf5/0x110 ... [ ] cancel_delayed_work_sync+0x6e/0x80 [ ] amdgpu_gfx_off_ctrl+0xab/0x140 [amdgpu] [ ] amdgpu_ring_alloc+0x40/0x50 [amdgpu] [ ] amdgpu_ib_schedule+0xf4/0x810 [amdgpu] [ ] ? drm_sched_run_job_work+0x22c/0x430 [gpu_sched] [ ] amdgpu_job_run+0xaa/0x1f0 [amdgpu] [ ] drm_sched_run_job_work+0x257/0x430 [gpu_sched] [ ] process_one_work+0x217/0x720 ... [ ] </TASK> The intent of the verifcation done in check_flush_depedency is to ensure forward progress during memory reclaim, by flagging cases when either a memory reclaim process, or a memory reclaim work item is flushed from a context not marked as memory reclaim safe. This is correct when flushing, but when called from the cancel(_delayed)_work_sync() paths it is a false positive because work is either already running, or will not be running at all. Therefore cancelling it is safe and we can relax the warning criteria by letting the helper know of the calling context. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Fixes: `fca839c00a` ("workqueue: warn if memory reclaim tries to flush !WQ_MEM_RECLAIM workqueue") References: `746ae46c11` ("drm/sched: Mark scheduler work queues with WQ_MEM_RECLAIM") Cc: Tejun Heo <tj@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v4.5+ Signed-off-by: Tejun Heo <tj@kernel.org>	2024-12-19 06:15:35 -10:00
Enzo Matsumiya	e9f2517a3e	smb: client: fix TCP timers deadlock after rmmod Commit `ef7134c7fc` ("smb: client: Fix use-after-free of network namespace.") fixed a netns UAF by manually enabled socket refcounting (sk->sk_net_refcnt=1 and sock_inuse_add(net, 1)). The reason the patch worked for that bug was because we now hold references to the netns (get_net_track() gets a ref internally) and they're properly released (internally, on __sk_destruct()), but only because sk->sk_net_refcnt was set. Problem: (this happens regardless of CONFIG_NET_NS_REFCNT_TRACKER and regardless if init_net or other) Setting sk->sk_net_refcnt=1 manually and after socket creation is not only out of cifs scope, but also technically wrong -- it's set conditionally based on user (=1) vs kernel (=0) sockets. And net/ implementations seem to base their user vs kernel space operations on it. e.g. upon TCP socket close, the TCP timers are not cleared because sk->sk_net_refcnt=1: (cf. commit `151c9c724d` ("tcp: properly terminate timers for kernel sockets")) net/ipv4/tcp.c: void tcp_close(struct sock *sk, long timeout) { lock_sock(sk); __tcp_close(sk, timeout); release_sock(sk); if (!sk->sk_net_refcnt) inet_csk_clear_xmit_timers_sync(sk); sock_put(sk); } Which will throw a lockdep warning and then, as expected, deadlock on tcp_write_timer(). A way to reproduce this is by running the reproducer from `ef7134c7fc` and then 'rmmod cifs'. A few seconds later, the deadlock/lockdep warning shows up. Fix: We shouldn't mess with socket internals ourselves, so do not set sk_net_refcnt manually. Also change __sock_create() to sock_create_kern() for explicitness. As for non-init_net network namespaces, we deal with it the best way we can -- hold an extra netns reference for server->ssocket and drop it when it's released. This ensures that the netns still exists whenever we need to create/destroy server->ssocket, but is not directly tied to it. Fixes: `ef7134c7fc` ("smb: client: Fix use-after-free of network namespace.") Cc: stable@vger.kernel.org Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>	2024-12-19 09:25:20 -06:00
Dragan Simic	ee1c8e6b29	smb: client: Deduplicate "select NETFS_SUPPORT" in Kconfig Repeating automatically selected options in Kconfig files is redundant, so let's delete repeated "select NETFS_SUPPORT" that was added accidentally. Fixes: `69c3c023af` ("cifs: Implement netfslib hooks") Signed-off-by: Dragan Simic <dsimic@manjaro.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2024-12-19 09:24:35 -06:00
Bharath SM	a769bee5f9	smb: use macros instead of constants for leasekey size and default cifsattrs value Replace default hardcoded value for cifsAttrs with ATTR_ARCHIVE macro Use SMB2_LEASE_KEY_SIZE macro for leasekey size in smb2_lease_break Signed-off-by: Bharath SM <bharathsm@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2024-12-19 09:24:32 -06:00
Bagas Sanjaya	1b684ca15f	drm/sched: Fix drm_sched_fini() docu generation Commit `baf4afc583` ("drm/sched: Improve teardown documentation") added a list of drm_sched_fini()'s problems. The list triggers htmldocs warning (but renders correctly in htmldocs output): Documentation/gpu/drm-mm:571: ./drivers/gpu/drm/scheduler/sched_main.c:1359: ERROR: Unexpected indentation. Separate the list from the preceding paragraph by a blank line to fix the warning. While at it, also end the aforementioned paragraph by a colon. Fixes: `baf4afc583` ("drm/sched: Improve teardown documentation") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Closes: https://lore.kernel.org/r/20241108175655.6d3fcfb7@canb.auug.org.au/ Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> [phasta: Adjust commit message] Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20241217034915.62594-1-bagasdotme@gmail.com	2024-12-19 16:03:56 +01:00
Lucas Stach	f64f610ec6	pmdomain: core: add dummy release function to genpd device The genpd device, which is really only used as a handle to lookup OPP, but not even registered to the device core otherwise and thus lifetime linked to the genpd struct it is contained in, is missing a release function. After `b8f7bbd1f4` ("pmdomain: core: Add missing put_device()") the device will be cleaned up going through the driver core device_release() function, which will warn when no release callback is present for the device. Add a dummy release function to shut up the warning. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Tested-by: Luca Ceresoli <luca.ceresoli@bootlin.com> Fixes: `b8f7bbd1f4` ("pmdomain: core: Add missing put_device()") Cc: stable@vger.kernel.org Message-ID: <20241218184433.1930532-1-l.stach@pengutronix.de> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>	2024-12-19 15:47:02 +01:00
Joe Hattori	469c0682e0	pmdomain: imx: gpcv2: fix an OF node reference leak in imx_gpcv2_probe() imx_gpcv2_probe() leaks an OF node reference obtained by of_get_child_by_name(). Fix it by declaring the device node with the __free(device_node) cleanup construct. This bug was found by an experimental static analysis tool that I am developing. Fixes: `03aa12629f` ("soc: imx: Add GPCv2 power gating driver") Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp> Cc: stable@vger.kernel.org Message-ID: <20241215030159.1526624-1-joe@pf.is.s.u-tokyo.ac.jp> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>	2024-12-19 15:26:37 +01:00
Amir Goldstein	974e3fe0ac	fs: relax assertions on failure to encode file handles Encoding file handles is usually performed by a filesystem >encode_fh() method that may fail for various reasons. The legacy users of exportfs_encode_fh(), namely, nfsd and name_to_handle_at(2) syscall are ready to cope with the possibility of failure to encode a file handle. There are a few other users of exportfs_encode_{fh,fid}() that currently have a WARN_ON() assertion when ->encode_fh() fails. Relax those assertions because they are wrong. The second linked bug report states commit `16aac5ad1f` ("ovl: support encoding non-decodable file handles") in v6.6 as the regressing commit, but this is not accurate. The aforementioned commit only increases the chances of the assertion and allows triggering the assertion with the reproducer using overlayfs, inotify and drop_caches. Triggering this assertion was always possible with other filesystems and other reasons of ->encode_fh() failures and more particularly, it was also possible with the exact same reproducer using overlayfs that is mounted with options index=on,nfs_export=on also on kernels < v6.6. Therefore, I am not listing the aforementioned commit as a Fixes commit. Backport hint: this patch will have a trivial conflict applying to v6.6.y, and other trivial conflicts applying to stable kernels < v6.6. Reported-by: syzbot+ec07f6f5ce62b858579f@syzkaller.appspotmail.com Tested-by: syzbot+ec07f6f5ce62b858579f@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-unionfs/671fd40c.050a0220.4735a.024f.GAE@google.com/ Reported-by: Dmitry Safonov <dima@arista.com> Closes: https://lore.kernel.org/linux-fsdevel/CAGrbwDTLt6drB9eaUagnQVgdPBmhLfqqxAf3F+Juqy_o6oP8uw@mail.gmail.com/ Cc: stable@vger.kernel.org Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/r/20241219115301.465396-1-amir73il@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-12-19 15:18:27 +01:00
Eric Biggers	8d90a86ed0	mmc: sdhci-msm: fix crypto key eviction Commit `c7eed31e23` ("mmc: sdhci-msm: Switch to the new ICE API") introduced an incorrect check of the algorithm ID into the key eviction path, and thus qcom_ice_evict_key() is no longer ever called. Fix it. Fixes: `c7eed31e23` ("mmc: sdhci-msm: Switch to the new ICE API") Cc: stable@vger.kernel.org Cc: Abel Vesa <abel.vesa@linaro.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Message-ID: <20241213041958.202565-6-ebiggers@kernel.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>	2024-12-19 14:42:10 +01:00
Jerome Marchand	716f2bca1c	selftests/bpf: Fix compilation error in get_uprobe_offset() In get_uprobe_offset(), the call to procmap_query() use the constant PROCMAP_QUERY_VMA_EXECUTABLE, even if PROCMAP_QUERY is not defined. Define PROCMAP_QUERY_VMA_EXECUTABLE when PROCMAP_QUERY isn't. Fixes: `4e9e07603e` ("selftests/bpf: make use of PROCMAP_QUERY ioctl if available") Signed-off-by: Jerome Marchand <jmarchan@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/bpf/20241218175724.578884-1-jmarchan@redhat.com	2024-12-19 13:24:39 +01:00
Jacek Lawrynowicz	0f6482caa6	accel/ivpu: Fix WARN in ivpu_ipc_send_receive_internal() Move pm_runtime_set_active() to ivpu_pm_init() so when ivpu_ipc_send_receive_internal() is executed before ivpu_pm_enable() it already has correct runtime state, even if last resume was not successful. Fixes: `8ed520ff46` ("accel/ivpu: Move set autosuspend delay to HW specific code") Cc: stable@vger.kernel.org # v6.7+ Reviewed-by: Karol Wachowski <karol.wachowski@intel.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241210130939.1575610-4-jacek.lawrynowicz@linux.intel.com	2024-12-19 13:16:59 +01:00
Jacek Lawrynowicz	6c9ba75f14	accel/ivpu: Fix memory leak in ivpu_mmu_reserved_context_init() Add appropriate error handling to ensure all allocated resources are released upon encountering an error. Fixes: `a74f4d9913` ("accel/ivpu: Defer MMU root page table allocation") Cc: Karol Wachowski <karol.wachowski@intel.com> Reviewed-by: Karol Wachowski <karol.wachowski@intel.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241210130939.1575610-3-jacek.lawrynowicz@linux.intel.com	2024-12-19 13:16:21 +01:00
Jacek Lawrynowicz	4b2efb9db0	accel/ivpu: Fix general protection fault in ivpu_bo_list() Check if ctx is not NULL before accessing its fields. Fixes: `37dee2a2f4` ("accel/ivpu: Improve buffer object debug logs") Cc: stable@vger.kernel.org # v6.8 Reviewed-by: Karol Wachowski <karol.wachowski@intel.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241210130939.1575610-2-jacek.lawrynowicz@linux.intel.com	2024-12-19 13:16:20 +01:00
Tiezhu Yang	29d44cce32	selftests/bpf: Use asm constraint "m" for LoongArch Currently, LoongArch LLVM does not support the constraint "o" and no plan to support it, it only supports the similar constraint "m", so change the constraints from "nor" in the "else" case to arch-specific "nmr" to avoid the build error such as "unexpected asm memory constraint" for LoongArch. Fixes: `630301b0d5` ("selftests/bpf: Add basic USDT selftests") Suggested-by: Weining Lu <luweining@loongson.cn> Suggested-by: Li Chen <chenli@loongson.cn> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Huacai Chen <chenhuacai@loongson.cn> Cc: stable@vger.kernel.org Link: https://llvm.org/docs/LangRef.html#supported-constraint-code-list Link: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/LoongArch/LoongArchISelDAGToDAG.cpp#L172 Link: https://lore.kernel.org/bpf/20241219111506.20643-1-yangtiezhu@loongson.cn	2024-12-19 13:15:52 +01:00
Selvin Xavier	9272cba0de	RDMA/bnxt_re: Fix the locking while accessing the QP table QP table handling is synchronized with destroy QP and Async event from the HW. The same needs to be synchronized during create_qp also. Use the same lock in create_qp also. Fixes: `76d3ddff71` ("RDMA/bnxt_re: synchronize the qp-handle table array") Fixes: `f218d67ef0` ("RDMA/bnxt_re: Allow posting when QPs are in error") Fixes: `84cf229f40` ("RDMA/bnxt_re: Fix the qp table indexing") Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/20241217102649.1377704-6-kalesh-anakkur.purayil@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-12-19 06:57:19 -05:00
Damodharam Ammepalli	bb839f3ace	RDMA/bnxt_re: Fix MSN table size for variable wqe mode For variable size wqe mode, the MSN table size should be half the size of the SQ depth. Fixing this to avoid wrap around problems in the retransmission path. Fixes: `de1d364c38` ("RDMA/bnxt_re: Add support for Variable WQE in Genp7 adapters") Reviewed-by: Kashyap Desai <kashyap.desai@broadcom.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/20241217102649.1377704-5-kalesh-anakkur.purayil@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-12-19 06:57:19 -05:00
Damodharam Ammepalli	d13be54dc1	RDMA/bnxt_re: Add send queue size check for variable wqe For the fixed WQE case, HW supports 0xFFFF WQEs. For variable Size WQEs, HW treats this number as the 16 bytes slots. The maximum supported WQEs needs to be adjusted based on the number of slots. Set a maximum WQE limit for variable WQE scenario. Fixes: `de1d364c38` ("RDMA/bnxt_re: Add support for Variable WQE in Genp7 adapters") Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/20241217102649.1377704-4-kalesh-anakkur.purayil@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-12-19 06:57:19 -05:00
Kalesh AP	d5a38bf2f3	RDMA/bnxt_re: Disable use of reserved wqes Disabling the reserved wqes logic for Gen P5/P7 devices because this workaround is required only for legacy devices. Fixes: `ecb53febfc` ("RDMA/bnxt_en: Enable RDMA driver support for 57500 chip") Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/20241217102649.1377704-3-kalesh-anakkur.purayil@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-12-19 06:57:19 -05:00
Selvin Xavier	40be32303e	RDMA/bnxt_re: Fix max_qp_wrs reported While creating qps, driver adds one extra entry to the sq size passed by the ULPs in order to avoid queue full condition. When ULPs creates QPs with max_qp_wr reported, driver creates QP with 1 more than the max_wqes supported by HW. Create QP fails in this case. To avoid this error, reduce 1 entry in max_qp_wqes and report it to the stack. Fixes: `1ac5a40479` ("RDMA/bnxt_re: Add bnxt_re RoCE driver") Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/20241217102649.1377704-2-kalesh-anakkur.purayil@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-12-19 06:57:19 -05:00
Greg Kroah-Hartman	1b62f3cb74	Merge tag 'thunderbolt-for-v6.13-rc4' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt into usb-linus Mika writes: thunderbolt: Fixes for v6.13-rc4 This includes following USB4/Thunderbolt fixes for v6.13-rc4: - Add Intel Panther Lake PCI IDs - Do not show nvm_version for retimers that are not supported - Fix redrive mode handling. All these have been in linux-next with no reported issues. * tag 'thunderbolt-for-v6.13-rc4' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt: thunderbolt: Improve redrive mode handling thunderbolt: Don't display nvm_version unless upgrade supported thunderbolt: Add support for Intel Panther Lake-M/P	2024-12-19 12:35:02 +01:00
Ahmad Fatoum	1322149606	regulator: rename regulator-uv-survival-time-ms according to DT binding The regulator bindings don't document regulator-uv-survival-time-ms, but the more descriptive regulator-uv-less-critical-window-ms instead. Looking back at v3[1] and v4[2] of the series adding the support, the property was indeed renamed between these patch series, but unfortunately the rename only made it into the DT bindings with the driver code still using the old name. Let's therefore rename the property in the driver code to follow suit. This will break backwards compatibility, but there are no upstream device trees using the property and we never documented the old name of the property anyway. ¯\_(ツ)_/¯" [1]: https://lore.kernel.org/all/20231025084614.3092295-7-o.rempel@pengutronix.de/ [2]: https://lore.kernel.org/all/20231026144824.4065145-5-o.rempel@pengutronix.de/ Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de> Link: https://patch.msgid.link/20241218-regulator-uv-survival-time-ms-rename-v1-1-6cac9c3c75da@pengutronix.de Signed-off-by: Mark Brown <broonie@kernel.org>	2024-12-19 11:15:24 +00:00
Chen-Yu Tsai	32c9c06adb	ASoC: mediatek: disable buffer pre-allocation On Chromebooks based on Mediatek MT8195 or MT8188, the audio frontend (AFE) is limited to accessing a very small window (1 MiB) of memory, which is described as a reserved memory region in the device tree. On these two platforms, the maximum buffer size is given as 512 KiB. The MediaTek common code uses the same value for preallocations. This means that only the first two PCM substreams get preallocations, and then the whole space is exhausted, barring any other substreams from working. Since the substreams used are not always the first two, this means audio won't work correctly. This is observed on the MT8188 Geralt Chromebooks, on which the "mediatek,dai-link" property was dropped when it was upstreamed. That property causes the driver to only register the PCM substreams listed in the property, and in the order given. Instead of trying to compute an optimal value and figuring out which streams are used, simply disable preallocation. The PCM buffers are managed by the core and are allocated and released on the fly. There should be no impact to any of the other MediaTek platforms. Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://patch.msgid.link/20241219105303.548437-1-wenst@chromium.org Signed-off-by: Mark Brown <broonie@kernel.org>	2024-12-19 11:15:09 +00:00
Jeremy Kerr	ce1219c3f7	net: mctp: handle skb cleanup on sock_queue failures Currently, we don't use the return value from sock_queue_rcv_skb, which means we may leak skbs if a message is not successfully queued to a socket. Instead, ensure that we're freeing the skb where the sock hasn't otherwise taken ownership of the skb by adding checks on the sock_queue_rcv_skb() to invoke a kfree on failure. In doing so, rather than using the 'rc' value to trigger the kfree_skb(), use the skb pointer itself, which is more explicit. Also, add a kunit test for the sock delivery failure cases. Fixes: `4a992bbd36` ("mctp: Implement message fragmentation & reassembly") Cc: stable@vger.kernel.org Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Link: https://patch.msgid.link/20241218-mctp-next-v2-1-1c1729645eaa@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-12-19 11:52:49 +01:00
Joe Hattori	572af9f284	net: mdiobus: fix an OF node reference leak fwnode_find_mii_timestamper() calls of_parse_phandle_with_fixed_args() but does not decrement the refcount of the obtained OF node. Add an of_node_put() call before returning from the function. This bug was detected by an experimental static analysis tool that I am developing. Fixes: `bc1bee3b87` ("net: mdiobus: Introduce fwnode_mdiobus_register_phy()") Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241218035106.1436405-1-joe@pf.is.s.u-tokyo.ac.jp Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-12-19 11:45:42 +01:00
Dave Airlie	e9088ac19e	Merge tag 'drm-intel-fixes-2024-12-18' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes - Reset engine utilization buffer before registration (Umesh Nerlige Ramappa) - Ensure busyness counter increases motonically (Umesh Nerlige Ramappa) - Accumulate active runtime on gt reset (Umesh Nerlige Ramappa) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Tvrtko Ursulin <tursulin@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/Z2LppUZudGKXwWjW@linux	2024-12-19 20:31:36 +10:00
Bernard Metzler	16b87037b4	RDMA/siw: Remove direct link to net_device Do not manage a per device direct link to net_device. Rely on associated ib_devices net_device management, not doubling the effort locally. A badly managed local link to net_device was causing a 'KASAN: slab-use-after-free' exception during siw_query_port() call. Fixes: `bdcf26bf9b` ("rdma/siw: network and RDMA core interface") Reported-by: syzbot+4b87489410b4efd181bf@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=4b87489410b4efd181bf Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com> Link: https://patch.msgid.link/20241212151848.564872-1-bmt@zurich.ibm.com Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-12-19 05:18:37 -05:00
Chiara Meiohas	13a6691910	RDMA/nldev: Set error code in rdma_nl_notify_event In case of error set the error code before the goto. Fixes: `6ff57a2ea7` ("RDMA/nldev: Fix NULL pointer dereferences issue in rdma_nl_notify_event") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/linux-rdma/a84a2fc3-33b6-46da-a1bd-3343fa07eaf9@stanley.mountain/ Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com> Reviewed-by: Maher Sanalla <msanalla@nvidia.com> Link: https://patch.msgid.link/13eb25961923f5de9eb9ecbbc94e26113d6049ef.1733815944.git.leonro@nvidia.com Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-12-19 05:15:20 -05:00
Paolo Abeni	b4adc04954	Merge tag 'nf-24-12-19' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net The following series contains two fixes for Netfilter/IPVS: 1) Possible build failure in IPVS on systems with less than 512MB memory due to incorrect use of clamp(), from David Laight. 2) Fix bogus lockdep nesting splat with ipset list:set type, from Phil Sutter. netfilter pull request 24-12-19 * tag 'nf-24-12-19' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: ipset: Fix for recursive locking warning ipvs: Fix clamp() of ip_vs_conn_tab on small memory systems ==================== Link: https://patch.msgid.link/20241218234137.1687288-1-pablo@netfilter.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-12-19 09:55:21 +01:00
Harshit Mogalapalli	b95c8c33ae	octeontx2-pf: fix error handling of devlink port in rvu_rep_create() Unregister the devlink port when register_netdev() fails. Fixes: `9ed0343f56` ("octeontx2-pf: Add devlink port support") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Link: https://patch.msgid.link/20241217052326.1086191-2-harshit.m.mogalapalli@oracle.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-18 19:23:51 -08:00
Harshit Mogalapalli	51df947678	octeontx2-pf: fix netdev memory leak in rvu_rep_create() When rvu_rep_devlink_port_register() fails, free_netdev(ndev) for this incomplete iteration before going to "exit:" label. Fixes: `9ed0343f56` ("octeontx2-pf: Add devlink port support") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Link: https://patch.msgid.link/20241217052326.1086191-1-harshit.m.mogalapalli@oracle.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-18 19:23:50 -08:00
Adrian Moreno	5eecd85c77	psample: adjust size if rate_as_probability is set If PSAMPLE_ATTR_SAMPLE_PROBABILITY flag is to be sent, the available size for the packet data has to be adjusted accordingly. Also, check the error code returned by nla_put_flag. Fixes: `7b1b2b60c6` ("net: psample: allow using rate as probability") Signed-off-by: Adrian Moreno <amorenoz@redhat.com> Reviewed-by: Aaron Conole <aconole@redhat.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20241217113739.3929300-1-amorenoz@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-18 19:23:04 -08:00
Jakub Kicinski	5eb70dbebf	netdev-genl: avoid empty messages in queue dump Empty netlink responses from do() are not correct (as opposed to dump() where not dumping anything is perfectly fine). We should return an error if the target object does not exist, in this case if the netdev is down it has no queues. Fixes: `6b6171db7f` ("netdev-genl: Add netlink framework functions for queue") Reported-by: syzbot+0a884bc2d304ce4af70f@syzkaller.appspotmail.com Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20241218022508.815344-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-18 19:22:51 -08:00
Vladimir Oltean	16f027cd40	net: dsa: restore dsa_software_vlan_untag() ability to operate on VLAN-untagged traffic Robert Hodaszi reports that locally terminated traffic towards VLAN-unaware bridge ports is broken with ocelot-8021q. He is describing the same symptoms as for commit `1f9fc48fd3` ("net: dsa: sja1105: fix reception from VLAN-unaware bridges"). For context, the set merged as "VLAN fixes for Ocelot driver": https://lore.kernel.org/netdev/20240815000707.2006121-1-vladimir.oltean@nxp.com/ was developed in a slightly different form earlier this year, in January. Initially, the switch was unconditionally configured to set OCELOT_ES0_TAG when using ocelot-8021q, regardless of port operating mode. This led to the situation where VLAN-unaware bridge ports would always push their PVID - see ocelot_vlan_unaware_pvid() - a negligible value anyway - into RX packets. To strip this in software, we would have needed DSA to know what private VID the switch chose for VLAN-unaware bridge ports, and pushed into the packets. This was implemented downstream, and a remnant of it remains in the form of a comment mentioning ds->ops->get_private_vid(), as something which would maybe need to be considered in the future. However, for upstream, it was deemed inappropriate, because it would mean introducing yet another behavior for stripping VLAN tags from VLAN-unaware bridge ports, when one already existed (ds->untag_bridge_pvid). The latter has been marked as obsolete along with an explanation why it is logically broken, but still, it would have been confusing. So, for upstream, felix_update_tag_8021q_rx_rule() was developed, which essentially changed the state of affairs from "Felix with ocelot-8021q delivers all packets as VLAN-tagged towards the CPU" into "Felix with ocelot-8021q delivers all packets from VLAN-aware bridge ports towards the CPU". This was done on the premise that in VLAN-unaware mode, there's nothing useful in the VLAN tags, and we can avoid introducing ds->ops->get_private_vid() in the DSA receive path if we configure the switch to not push those VLAN tags into packets in the first place. Unfortunately, and this is when the trainwreck started, the selftests developed initially and posted with the series were not re-ran. dsa_software_vlan_untag() was initially written given the assumption that users of this feature would send _all_ traffic as VLAN-tagged. It was only partially adapted to the new scheme, by removing ds->ops->get_private_vid(), which also used to be necessary in standalone ports mode. Where the trainwreck became even worse is that I had a second opportunity to think about this, when the dsa_software_vlan_untag() logic change initially broke sja1105, in commit `1f9fc48fd3` ("net: dsa: sja1105: fix reception from VLAN-unaware bridges"). I did not connect the dots that it also breaks ocelot-8021q, for pretty much the same reason that not all received packets will be VLAN-tagged. To be compatible with the optimized Felix control path which runs felix_update_tag_8021q_rx_rule() to only push VLAN tags when useful (in VLAN-aware mode), we need to restore the old dsa_software_vlan_untag() logic. The blamed commit introduced the assumption that dsa_software_vlan_untag() will see only VLAN-tagged packets, assumption which is false. What corrupts RX traffic is the fact that we call skb_vlan_untag() on packets which are not VLAN-tagged in the first place. Fixes: `93e4649efa` ("net: dsa: provide a software untagging function on RX for VLAN-aware bridges") Reported-by: Robert Hodaszi <robert.hodaszi@digi.com> Closes: https://lore.kernel.org/netdev/20241215163334.615427-1-robert.hodaszi@digi.com/ Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20241216135059.1258266-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-18 19:22:36 -08:00
Jakub Kicinski	a713c017ef	Merge branch '200GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== idpf: trigger SW interrupt when exiting wb_on_itr mode Joshua Hay says: This patch series introduces SW triggered interrupt support for idpf, then uses said interrupt to fix a race condition between completion writebacks and re-enabling interrupts. * '200GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: idpf: trigger SW interrupt when exiting wb_on_itr mode idpf: add support for SW triggered interrupts ==================== Link: https://patch.msgid.link/20241217225715.4005644-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-18 19:20:20 -08:00
Adrian Moreno	a17975992c	selftests: openvswitch: fix tcpdump execution Fix the way tcpdump is executed by: - Using the right variable for the namespace. Currently the use of the empty "ns" makes the command fail. - Waiting until it starts to capture to ensure the interesting traffic is caught on slow systems. - Using line-buffered output to ensure logs are available when the test is paused with "-p". Otherwise the last chunk of data might only be written when tcpdump is killed. Fixes: `74cc26f416` ("selftests: openvswitch: add interface support") Signed-off-by: Adrian Moreno <amorenoz@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Link: https://patch.msgid.link/20241217211652.483016-1-amorenoz@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-18 19:18:41 -08:00
Leo Stone	d3ac65d274	mm: huge_memory: handle strsep not finding delimiter split_huge_pages_write() does not handle the case where strsep finds no delimiter in the given string and sets the input buffer to NULL, which allows this reproducer to trigger a protection fault. Link: https://lkml.kernel.org/r/20241216042752.257090-2-leocstone@gmail.com Signed-off-by: Leo Stone <leocstone@gmail.com> Reported-by: syzbot+8a3da2f1bbf59227c289@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=8a3da2f1bbf59227c289 Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-18 19:04:47 -08:00
Suren Baghdasaryan	60da7445a1	alloc_tag: fix set_codetag_empty() when !CONFIG_MEM_ALLOC_PROFILING_DEBUG It was recently noticed that set_codetag_empty() might be used not only to mark NULL alloctag references as empty to avoid warnings but also to reset valid tags (in clear_page_tag_ref()). Since set_codetag_empty() is defined as NOOP for CONFIG_MEM_ALLOC_PROFILING_DEBUG=n, such use of set_codetag_empty() leads to subtle bugs. Fix set_codetag_empty() for CONFIG_MEM_ALLOC_PROFILING_DEBUG=n to reset the tag reference. Link: https://lkml.kernel.org/r/20241130001423.1114965-2-surenb@google.com Fixes: `a8fc28dad6` ("alloc_tag: introduce clear_page_tag_ref() helper function") Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reported-by: David Wang <00107082@163.com> Closes: https://lore.kernel.org/lkml/20241124074318.399027-1-00107082@163.com/ Cc: David Wang <00107082@163.com> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Sourav Panda <souravpanda@google.com> Cc: Yu Zhao <yuzhao@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-18 19:04:46 -08:00
Suren Baghdasaryan	e269b5d291	alloc_tag: fix module allocation tags populated area calculation vm_module_tags_populate() calculation of the populated area assumes that area starts at a page boundary and therefore when new pages are allocation, the end of the area is page-aligned as well. If the start of the area is not page-aligned then allocating a page and incrementing the end of the area by PAGE_SIZE leads to an area at the end but within the area boundary which is not populated. Accessing this are will lead to a kernel panic. Fix the calculation by down-aligning the start of the area and using that as the location allocated pages are mapped to. [gehao@kylinos.cn: fix vm_module_tags_populate's KASAN poisoning logic] Link: https://lkml.kernel.org/r/20241205170528.81000-1-hao.ge@linux.dev [gehao@kylinos.cn: fix panic when CONFIG_KASAN enabled and CONFIG_KASAN_VMALLOC not enabled] Link: https://lkml.kernel.org/r/20241212072126.134572-1-hao.ge@linux.dev Link: https://lkml.kernel.org/r/20241130001423.1114965-1-surenb@google.com Fixes: `0f9b685626` ("alloc_tag: populate memory for module tags as needed") Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202411132111.6a221562-lkp@intel.com Acked-by: Yu Zhao <yuzhao@google.com> Tested-by: Adrian Huang <ahuang12@lenovo.com> Cc: David Wang <00107082@163.com> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Sourav Panda <souravpanda@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-18 19:04:46 -08:00
David Wang	640a603943	mm/codetag: clear tags before swap When CONFIG_MEM_ALLOC_PROFILING_DEBUG is set, kernel WARN would be triggered when calling __alloc_tag_ref_set() during swap: alloc_tag was not cleared (got tag for mm/filemap.c:1951) WARNING: CPU: 0 PID: 816 at ./include/linux/alloc_tag.h... Clear code tags before swap can fix the warning. And this patch also fix a potential invalid address dereference in alloc_tag_add_check() when CONFIG_MEM_ALLOC_PROFILING_DEBUG is set and ref->ct is CODETAG_EMPTY, which is defined as ((void *)1). Link: https://lkml.kernel.org/r/20241213013332.89910-1-00107082@163.com Fixes: `51f43d5d82` ("mm/codetag: swap tags when migrate pages") Signed-off-by: David Wang <00107082@163.com> Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202412112227.df61ebb-lkp@intel.com Acked-by: Suren Baghdasaryan <surenb@google.com> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Yu Zhao <yuzhao@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-18 19:04:46 -08:00
Bart Van Assche	30c2de0a26	mm/vmstat: fix a W=1 clang compiler warning Fix the following clang compiler warning that is reported if the kernel is built with W=1: ./include/linux/vmstat.h:518:36: error: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Werror,-Wenum-enum-conversion] 518 \| return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_" \| ~~~~~~~~~~~ ^ ~~~ Link: https://lkml.kernel.org/r/20241212213126.1269116-1-bvanassche@acm.org Fixes: `9d7ea9a297` ("mm/vmstat: add helpers to get vmstat item names for each enum type") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-18 19:04:46 -08:00
Usama Arif	42b2eb6983	mm: convert partially_mapped set/clear operations to be atomic Other page flags in the 2nd page, like PG_hwpoison and PG_anon_exclusive can get modified concurrently. Changes to other page flags might be lost if they are happening at the same time as non-atomic partially_mapped operations. Hence, make partially_mapped operations atomic. Link: https://lkml.kernel.org/r/20241212183351.1345389-1-usamaarif642@gmail.com Fixes: `8422acdc97` ("mm: introduce a pageflag for partially mapped folios") Reported-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/all/e53b04ad-1827-43a2-a1ab-864c7efecf6e@redhat.com/ Signed-off-by: Usama Arif <usamaarif642@gmail.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Cc: Barry Song <baohua@kernel.org> Cc: Domenico Cerasuolo <cerasuolodomenico@gmail.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Nico Pache <npache@redhat.com> Cc: Rik van Riel <riel@surriel.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Yu Zhao <yuzhao@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-18 19:04:45 -08:00
Ryusuke Konishi	6309b8ce98	nilfs2: fix buffer head leaks in calls to truncate_inode_pages() When block_invalidatepage was converted to block_invalidate_folio, the fallback to block_invalidatepage in folio_invalidate() if the address_space_operations method invalidatepage (currently invalidate_folio) was not set, was removed. Unfortunately, some pseudo-inodes in nilfs2 use empty_aops set by inode_init_always_gfp() as is, or explicitly set it to address_space_operations. Therefore, with this change, block_invalidatepage() is no longer called from folio_invalidate(), and as a result, the buffer_head structures attached to these pages/folios are no longer freed via try_to_free_buffers(). Thus, these buffer heads are now leaked by truncate_inode_pages(), which cleans up the page cache from inode evict(), etc. Three types of caches use empty_aops: gc inode caches and the DAT shadow inode used by GC, and b-tree node caches. Of these, b-tree node caches explicitly call invalidate_mapping_pages() during cleanup, which involves calling try_to_free_buffers(), so the leak was not visible during normal operation but worsened when GC was performed. Fix this issue by using address_space_operations with invalidate_folio set to block_invalidate_folio instead of empty_aops, which will ensure the same behavior as before. Link: https://lkml.kernel.org/r/20241212164556.21338-1-konishi.ryusuke@gmail.com Fixes: `7ba13abbd3` ("fs: Turn block_invalidatepage into block_invalidate_folio") Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: <stable@vger.kernel.org> [5.18+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-18 19:04:45 -08:00
Matthew Wilcox (Oracle)	a2e740e216	vmalloc: fix accounting with i915 If the caller of vmap() specifies VM_MAP_PUT_PAGES (currently only the i915 driver), we will decrement nr_vmalloc_pages and MEMCG_VMALLOC in vfree(). These counters are incremented by vmalloc() but not by vmap() so this will cause an underflow. Check the VM_MAP_PUT_PAGES flag before decrementing either counter. Link: https://lkml.kernel.org/r/20241211202538.168311-1-willy@infradead.org Fixes: `b944afc9d6` ("mm: add a VM_MAP_PUT_PAGES flag for vmap") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev> Reviewed-by: Balbir Singh <balbirs@nvidia.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Muchun Song <muchun.song@linux.dev> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: "Uladzislau Rezki (Sony)" <urezki@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-18 19:04:45 -08:00
David Hildenbrand	faeec8e23c	mm/page_alloc: don't call pfn_to_page() on possibly non-existent PFN in split_large_buddy() In split_large_buddy(), we might call pfn_to_page() on a PFN that might not exist. In corner cases, such as when freeing the highest pageblock in the last memory section, this could result with CONFIG_SPARSEMEM && !CONFIG_SPARSEMEM_EXTREME in __pfn_to_section() returning NULL and and __section_mem_map_addr() dereferencing that NULL pointer. Let's fix it, and avoid doing a pfn_to_page() call for the first iteration, where we already have the page. So far this was found by code inspection, but let's just CC stable as the fix is easy. Link: https://lkml.kernel.org/r/20241210093437.174413-1-david@redhat.com Fixes: `fd919a85cd` ("mm: page_isolation: prepare for hygienic freelists") Signed-off-by: David Hildenbrand <david@redhat.com> Reported-by: Vlastimil Babka <vbabka@suse.cz> Closes: https://lkml.kernel.org/r/e1a898ba-a717-4d20-9144-29df1a6c8813@suse.cz Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Zi Yan <ziy@nvidia.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Yu Zhao <yuzhao@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-18 19:04:45 -08:00
Lorenzo Stoakes	8ac662f5da	fork: avoid inappropriate uprobe access to invalid mm If dup_mmap() encounters an issue, currently uprobe is able to access the relevant mm via the reverse mapping (in build_map_info()), and if we are very unlucky with a race window, observe invalid XA_ZERO_ENTRY state which we establish as part of the fork error path. This occurs because uprobe_write_opcode() invokes anon_vma_prepare() which in turn invokes find_mergeable_anon_vma() that uses a VMA iterator, invoking vma_iter_load() which uses the advanced maple tree API and thus is able to observe XA_ZERO_ENTRY entries added to dup_mmap() in commit `d240629148` ("fork: use __mt_dup() to duplicate maple tree in dup_mmap()"). This change was made on the assumption that only process tear-down code would actually observe (and make use of) these values. However this very unlikely but still possible edge case with uprobes exists and unfortunately does make these observable. The uprobe operation prevents races against the dup_mmap() operation via the dup_mmap_sem semaphore, which is acquired via uprobe_start_dup_mmap() and dropped via uprobe_end_dup_mmap(), and held across register_for_each_vma() prior to invoking build_map_info() which does the reverse mapping lookup. Currently these are acquired and dropped within dup_mmap(), which exposes the race window prior to error handling in the invoking dup_mm() which tears down the mm. We can avoid all this by just moving the invocation of uprobe_start_dup_mmap() and uprobe_end_dup_mmap() up a level to dup_mm() and only release this lock once the dup_mmap() operation succeeds or clean up is done. This means that the uprobe code can never observe an incompletely constructed mm and resolves the issue in this case. Link: https://lkml.kernel.org/r/20241210172412.52995-1-lorenzo.stoakes@oracle.com Fixes: `d240629148` ("fork: use __mt_dup() to duplicate maple tree in dup_mmap()") Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reported-by: syzbot+2d788f4f7cb660dac4b7@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/6756d273.050a0220.2477f.003d.GAE@google.com/ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-18 19:04:44 -08:00
Edward Adam Davis	901ce9705f	nilfs2: prevent use of deleted inode syzbot reported a WARNING in nilfs_rmdir. [1] Because the inode bitmap is corrupted, an inode with an inode number that should exist as a ".nilfs" file was reassigned by nilfs_mkdir for "file0", causing an inode duplication during execution. And this causes an underflow of i_nlink in rmdir operations. The inode is used twice by the same task to unmount and remove directories ".nilfs" and "file0", it trigger warning in nilfs_rmdir. Avoid to this issue, check i_nlink in nilfs_iget(), if it is 0, it means that this inode has been deleted, and iput is executed to reclaim it. [1] WARNING: CPU: 1 PID: 5824 at fs/inode.c:407 drop_nlink+0xc4/0x110 fs/inode.c:407 ... Call Trace: <TASK> nilfs_rmdir+0x1b0/0x250 fs/nilfs2/namei.c:342 vfs_rmdir+0x3a3/0x510 fs/namei.c:4394 do_rmdir+0x3b5/0x580 fs/namei.c:4453 __do_sys_rmdir fs/namei.c:4472 [inline] __se_sys_rmdir fs/namei.c:4470 [inline] __x64_sys_rmdir+0x47/0x50 fs/namei.c:4470 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f Link: https://lkml.kernel.org/r/20241209065759.6781-1-konishi.ryusuke@gmail.com Fixes: `d25006523d` ("nilfs2: pathname operations") Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Reported-by: syzbot+9260555647a5132edd48@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=9260555647a5132edd48 Tested-by: syzbot+9260555647a5132edd48@syzkaller.appspotmail.com Signed-off-by: Edward Adam Davis <eadavis@qq.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-18 19:04:44 -08:00
Kairui Song	74363ec674	zram: fix uninitialized ZRAM not releasing backing device Setting backing device is done before ZRAM initialization. If we set the backing device, then remove the ZRAM module without initializing the device, the backing device reference will be leaked and the device will be hold forever. Fix this by always reset the ZRAM fully on rmmod or reset store. Link: https://lkml.kernel.org/r/20241209165717.94215-3-ryncsn@gmail.com Fixes: `013bf95a83` ("zram: add interface to specif backing device") Signed-off-by: Kairui Song <kasong@tencent.com> Reported-by: Desheng Wu <deshengwu@tencent.com> Suggested-by: Sergey Senozhatsky <senozhatsky@chromium.org> Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-18 19:04:44 -08:00
Kairui Song	be48c412f6	zram: refuse to use zero sized block device as backing device Patch series "zram: fix backing device setup issue", v2. This series fixes two bugs of backing device setting: - ZRAM should reject using a zero sized (or the uninitialized ZRAM device itself) as the backing device. - Fix backing device leaking when removing a uninitialized ZRAM device. This patch (of 2): Setting a zero sized block device as backing device is pointless, and one can easily create a recursive loop by setting the uninitialized ZRAM device itself as its own backing device by (zram0 is uninitialized): echo /dev/zram0 > /sys/block/zram0/backing_dev It's definitely a wrong config, and the module will pin itself, kernel should refuse doing so in the first place. By refusing to use zero sized device we avoided misuse cases including this one above. Link: https://lkml.kernel.org/r/20241209165717.94215-1-ryncsn@gmail.com Link: https://lkml.kernel.org/r/20241209165717.94215-2-ryncsn@gmail.com Fixes: `013bf95a83` ("zram: add interface to specif backing device") Signed-off-by: Kairui Song <kasong@tencent.com> Reported-by: Desheng Wu <deshengwu@tencent.com> Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-18 19:04:44 -08:00
Zi Yan	c51a4f11e6	mm: use clear_user_(high)page() for arch with special user folio handling Some architectures have special handling after clearing user folios: architectures, which set cpu_dcache_is_aliasing() to true, require flushing dcache; arc, which sets cpu_icache_is_aliasing() to true, changes folio->flags to make icache coherent to dcache. So __GFP_ZERO using only clear_page() is not enough to zero user folios and clear_user_(high)page() must be used. Otherwise, user data will be corrupted. Fix it by always clearing user folios with clear_user_(high)page() when cpu_dcache_is_aliasing() is true or cpu_icache_is_aliasing() is true. Rename alloc_zeroed() to user_alloc_needs_zeroing() and invert the logic to clarify its intend. Link: https://lkml.kernel.org/r/20241209182326.2955963-2-ziy@nvidia.com Fixes: `5708d96da2` ("mm: avoid zeroing user movable page twice with init_on_alloc=1") Signed-off-by: Zi Yan <ziy@nvidia.com> Reported-by: Geert Uytterhoeven <geert+renesas@glider.be> Closes: https://lore.kernel.org/linux-mm/CAMuHMdV1hRp_NtR5YnJo=HsfgKQeH91J537Gh4gKk3PFZhSkbA@mail.gmail.com/ Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Alexander Potapenko <glider@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Kees Cook <keescook@chromium.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Vineet Gupta <vgupta@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-18 19:04:43 -08:00
Zi Yan	5c0541e11c	mm: introduce cpu_icache_is_aliasing() across all architectures In commit `eacd0e950d` ("ARC: [mm] Lazy D-cache flush (non aliasing VIPT)"), arc adds the need to flush dcache to make icache see the code page change. This also requires special handling for clear_user_(high)page(). Introduce cpu_icache_is_aliasing() to make MM code query special clear_user_(high)page() easier. This will be used by the following commit. Link: https://lkml.kernel.org/r/20241209182326.2955963-1-ziy@nvidia.com Fixes: `5708d96da2` ("mm: avoid zeroing user movable page twice with init_on_alloc=1") Signed-off-by: Zi Yan <ziy@nvidia.com> Suggested-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Alexander Potapenko <glider@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Geert Uytterhoeven <geert+renesas@glider.be> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Kees Cook <keescook@chromium.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Vineet Gupta <vgupta@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-18 19:04:43 -08:00
Petr Malat	31c5629920	mm: add RCU annotation to pte_offset_map(_lock) RCU lock is taken by ___pte_offset_map() unless it returns NULL. Add this information to its inline callers to avoid sparse warning about context imbalance in pte_unmap(). Link: https://lkml.kernel.org/r/20241210000604.700710-1-oss@malat.biz Signed-off-by: Petr Malat <oss@malat.biz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-18 19:04:43 -08:00
Lorenzo Stoakes	42c4e4b20d	mm: correctly reference merged VMA On second merge attempt on mmap() we incorrectly discard the possibly merged VMA, resulting in a possible use-after-free (and most certainly a reference to the wrong VMA) in this instance in the subsequent __mmap_complete() invocation. Correct this mistake by reassigning vma correctly if a merge succeeds in this case. Link: https://lkml.kernel.org/r/20241206215229.244413-1-lorenzo.stoakes@oracle.com Fixes: `5ac87a885a` ("mm: defer second attempt at merge on mmap()") Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Suggested-by: Jann Horn <jannh@google.com> Reported-by: syzbot+91cf8da9401355f946c3@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/67536a25.050a0220.a30f1.0149.GAE@google.com/ Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-18 19:04:42 -08:00
Kefeng Wang	f5d09de9f1	mm: use aligned address in copy_user_gigantic_page() In current kernel, hugetlb_wp() calls copy_user_large_folio() with the fault address. Where the fault address may be not aligned with the huge page size. Then, copy_user_large_folio() may call copy_user_gigantic_page() with the address, while copy_user_gigantic_page() requires the address to be huge page size aligned. So, this may cause memory corruption or information leak, addtional, use more obvious naming 'addr_hint' instead of 'addr' for copy_user_gigantic_page(). Link: https://lkml.kernel.org/r/20241028145656.932941-2-wangkefeng.wang@huawei.com Fixes: `530dd9926d` ("mm: memory: improve copy_user_large_folio()") Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-18 19:04:42 -08:00
Kefeng Wang	8aca2bc96c	mm: use aligned address in clear_gigantic_page() In current kernel, hugetlb_no_page() calls folio_zero_user() with the fault address. Where the fault address may be not aligned with the huge page size. Then, folio_zero_user() may call clear_gigantic_page() with the address, while clear_gigantic_page() requires the address to be huge page size aligned. So, this may cause memory corruption or information leak, addtional, use more obvious naming 'addr_hint' instead of 'addr' for clear_gigantic_page(). Link: https://lkml.kernel.org/r/20241028145656.932941-1-wangkefeng.wang@huawei.com Fixes: `78fefd04c1` ("mm: memory: convert clear_huge_page() to folio_zero_user()") Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Reviewed-by: "Huang, Ying" <ying.huang@intel.com> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-18 19:04:42 -08:00
Hugh Dickins	dad2dc9c92	mm: shmem: fix ShmemHugePages at swapout /proc/meminfo ShmemHugePages has been showing overlarge amounts (more than Shmem) after swapping out THPs: we forgot to update NR_SHMEM_THPS. Add shmem_update_stats(), to avoid repetition, and risk of making that mistake again: the call from shmem_delete_from_page_cache() is the bugfix; the call from shmem_replace_folio() is reassuring, but not really a bugfix (replace corrects misplaced swapin readahead, but huge swapin readahead would be a mistake). Link: https://lkml.kernel.org/r/5ba477c8-a569-70b5-923e-09ab221af45b@google.com Fixes: `809bc86517` ("mm: shmem: support large folio swap out") Signed-off-by: Hugh Dickins <hughd@google.com> Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev> Reviewed-by: Yosry Ahmed <yosryahmed@google.com> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Tested-by: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-18 19:04:42 -08:00
Heming Zhao	7782e3b3b0	ocfs2: fix the space leak in LA when releasing LA Commit `30dd3478c3` ("ocfs2: correctly use ocfs2_find_next_zero_bit()") introduced an issue, the ocfs2_sync_local_to_main() ignores the last contiguous free bits, which causes an OCFS2 volume to lose the last free clusters of LA window during the release routine. Please note, because commit `dfe6c5692f` ("ocfs2: fix the la space leak when unmounting an ocfs2 volume") was reverted, this commit is a replacement fix for commit `dfe6c5692f`. Link: https://lkml.kernel.org/r/20241205104835.18223-3-heming.zhao@suse.com Fixes: `30dd3478c3` ("ocfs2: correctly use ocfs2_find_next_zero_bit()") Signed-off-by: Heming Zhao <heming.zhao@suse.com> Suggested-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-18 19:04:41 -08:00
Heming Zhao	1a72d2ebee	ocfs2: revert "ocfs2: fix the la space leak when unmounting an ocfs2 volume" Patch series "Revert ocfs2 commit `dfe6c5692f` and provide a new fix". SUSE QA team detected a mistake in my commit `dfe6c5692f` ("ocfs2: fix the la space leak when unmounting an ocfs2 volume"). I am very sorry for my error. (If my eyes are correct) From the mailling list mails, this patch shouldn't be applied to 4.19 5.4 5.10 5.15 6.1 6.6, and these branches should perform a revert operation. Reason for revert: In commit `dfe6c5692f`, I mistakenly wrote: "This bug has existed since the initial OCFS2 code.". The statement is wrong. The correct introduction commit is `30dd3478c3`. IOW, if the branch doesn't include `30dd3478c3`, `dfe6c5692f` should also not be included. This reverts commit `dfe6c5692f` ("ocfs2: fix the la space leak when unmounting an ocfs2 volume"). In commit `dfe6c5692f`, the commit log "This bug has existed since the initial OCFS2 code." is wrong. The correct introduction commit is `30dd3478c3` ("ocfs2: correctly use ocfs2_find_next_zero_bit()"). The influence of commit `dfe6c5692f` is that it provides a correct fix for the latest kernel. however, it shouldn't be pushed to stable branches. Let's use this commit to revert all branches that include `dfe6c5692f` and use a new fix method to fix commit `30dd3478c3`. Link: https://lkml.kernel.org/r/20241205104835.18223-1-heming.zhao@suse.com Link: https://lkml.kernel.org/r/20241205104835.18223-2-heming.zhao@suse.com Fixes: `dfe6c5692f` ("ocfs2: fix the la space leak when unmounting an ocfs2 volume") Signed-off-by: Heming Zhao <heming.zhao@suse.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-18 19:04:41 -08:00
Huang Ying	da5bd7fa78	mailmap: add entry for Ying Huang Map my old company email to my personal email. Link: https://lkml.kernel.org/r/20241205124201.529308-1-huang.ying.caritas@gmail.com Signed-off-by: "Huang, Ying" <huang.ying.caritas@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-18 19:04:41 -08:00
Isaac J. Manjarres	6a75f19af1	selftests/memfd: run sysctl tests when PID namespace support is enabled The sysctl tests for vm.memfd_noexec rely on the kernel to support PID namespaces (i.e. the kernel is built with CONFIG_PID_NS=y). If the kernel the test runs on does not support PID namespaces, the first sysctl test will fail when attempting to spawn a new thread in a new PID namespace, abort the test, preventing the remaining tests from being run. This is not desirable, as not all kernels need PID namespaces, but can still use the other features provided by memfd. Therefore, only run the sysctl tests if the kernel supports PID namespaces. Otherwise, skip those tests and emit an informative message to let the user know why the sysctl tests are not being run. Link: https://lkml.kernel.org/r/20241205192943.3228757-1-isaacmanjarres@google.com Fixes: `11f75a0144` ("selftests/memfd: add tests for MFD_NOEXEC_SEAL MFD_EXEC") Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com> Reviewed-by: Jeff Xu <jeffxu@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: <stable@vger.kernel.org> [6.6+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-18 19:04:41 -08:00
Lorenzo Stoakes	dbf8be8218	docs/mm: add VMA locks documentation Locking around VMAs is complicated and confusing. While we have a number of disparate comments scattered around the place, we seem to be reaching a level of complexity that justifies a serious effort at clearly documenting how locks are expected to be used when it comes to interacting with mm_struct and vm_area_struct objects. This is especially pertinent as regards the efforts to find sensible abstractions for these fundamental objects in kernel rust code whose compiler strictly requires some means of expressing these rules (and through this expression, self-document these requirements as well as enforce them). The document limits scope to mmap and VMA locks and those that are immediately adjacent and relevant to them - so additionally covers page table locking as this is so very closely tied to VMA operations (and relies upon us handling these correctly). The document tries to cover some of the nastier and more confusing edge cases and concerns especially around lock ordering and page table teardown. The document is split between generally useful information for users of mm interfaces, and separately a section intended for mm kernel developers providing a discussion around internal implementation details. [lorenzo.stoakes@oracle.com: v3] Link: https://lkml.kernel.org/r/20241114205402.859737-1-lorenzo.stoakes@oracle.com [lorenzo.stoakes@oracle.com: docs/mm: minor corrections] Link: https://lkml.kernel.org/r/d3de735a-25ae-4eb2-866c-a9624fe6f795@lucifer.local [jannh@google.com: docs/mm: add more warnings around page table access] Link: https://lkml.kernel.org/r/20241118-vma-docs-addition1-onv3-v2-1-c9d5395b72ee@google.com Link: https://lkml.kernel.org/r/20241108135708.48567-1-lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Qi Zheng <zhengqi.arch@bytedance.com> Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com> Reviewed-by: Jann Horn <jannh@google.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Hillf Danton <hdanton@sina.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-18 19:04:41 -08:00
Jakub Kicinski	dbfca1641e	Merge tag 'linux-can-fixes-for-6.13-20241218' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2024-12-18 There are 2 patches by Matthias Schiffer for the m_can_pci driver that handles the m_can cores found on the Intel Elkhart Lake processor. They fix the initialization and the interrupt handling under high CAN bus load. * tag 'linux-can-fixes-for-6.13-20241218' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can: can: m_can: fix missed interrupts with m_can_pci can: m_can: set init flag earlier in probe ==================== Link: https://patch.msgid.link/20241218121722.2311963-1-mkl@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-18 17:51:39 -08:00
Martin Hou	5c964c8a97	net: usb: qmi_wwan: add Quectel RG255C Add support for Quectel RG255C which is based on Qualcomm SDX35 chip. The composition is DM / NMEA / AT / QMI. T: Bus=01 Lev=01 Prnt=01 Port=04 Cnt=01 Dev#= 2 Spd=480 MxCh= 0 D: Ver= 2.01 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=2c7c ProdID=0316 Rev= 5.15 S: Manufacturer=Quectel S: Product=RG255C-CN S: SerialNumber=c68192c1 C:* #Ifs= 4 Cfg#= 1 Atr=a0 MxPwr=500mA I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=84(I) Atr=03(Int.) MxPS= 10 Ivl=32ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan E: Ad=86(I) Atr=03(Int.) MxPS= 8 Ivl=32ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms Signed-off-by: Martin Hou <martin.hou@foxmail.com> Link: https://patch.msgid.link/tencent_17DDD787B48E8A5AB8379ED69E23A0CD9309@qq.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-18 17:24:03 -08:00
Jann Horn	12d908116f	io_uring: Fix registered ring file refcount leak Currently, io_uring_unreg_ringfd() (which cleans up registered rings) is only called on exit, but __io_uring_free (which frees the tctx in which the registered ring pointers are stored) is also called on execve (via begin_new_exec -> io_uring_task_cancel -> __io_uring_cancel -> io_uring_cancel_generic -> __io_uring_free). This means: A process going through execve while having registered rings will leak references to the rings' `struct file`. Fix it by zapping registered rings on execve(). This is implemented by moving the io_uring_unreg_ringfd() from io_uring_files_cancel() into its callee __io_uring_cancel(), which is called from io_uring_task_cancel() on execve. This could probably be exploited on 32-bit kernels by leaking 2^32 references to the same ring, because the file refcount is stored in a pointer-sized field and get_file() doesn't have protection against refcount overflow, just a WARN_ONCE(); but on 64-bit it should have no impact beyond a memory leak. Cc: stable@vger.kernel.org Fixes: `e7a6c00dc7` ("io_uring: add support for registering ring file descriptors") Signed-off-by: Jann Horn <jannh@google.com> Link: https://lore.kernel.org/r/20241218-uring-reg-ring-cleanup-v1-1-8f63e999045b@google.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-18 18:19:33 -07:00
Arnd Bergmann	cff865c700	net: phy: avoid undefined behavior in *_led_polarity_set() gcc runs into undefined behavior at the end of the three led_polarity_set() callback functions if it were called with a zero 'modes' argument and it just ends the function there without returning from it. This gets flagged by 'objtool' as a function that continues on to the next one: drivers/net/phy/aquantia/aquantia_leds.o: warning: objtool: aqr_phy_led_polarity_set+0xf: can't find jump dest instruction at .text+0x5d9 drivers/net/phy/intel-xway.o: warning: objtool: xway_gphy_led_polarity_set() falls through to next function xway_gphy_config_init() drivers/net/phy/mxl-gpy.o: warning: objtool: gpy_led_polarity_set() falls through to next function gpy_led_hw_control_get() There is no point to micro-optimize the behavior here to save a single-digit number of bytes in the kernel, so just change this to a "return -EINVAL" as we do when any unexpected bits are set. Fixes: `1758af47b9` ("net: phy: intel-xway: add support for PHY LEDs") Fixes: `9d55e68b19` ("net: phy: aquantia: correctly describe LED polarity override") Fixes: `eb89c79c1b` ("net: phy: mxl-gpy: correctly describe LED polarity") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241217081056.238792-1-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-18 16:50:23 -08:00
Hans de Goede	b3ded6072c	power: supply: bq24190: Fix BQ24296 Vbus regulator support There are 2 issues with bq24296_set_otg_vbus(): 1. When writing the OTG_CONFIG bit it uses POC_CHG_CONFIG_SHIFT which should be POC_OTG_CONFIG_SHIFT. 2. When turning the regulator off it never turns charging back on. Note this must be done through bq24190_charger_set_charge_type(), to ensure that the charge_type property value of none/trickle/fast is honored. Resolve both issues to fix BQ24296 Vbus regulator support not working. Fixes: `b150a703b5` ("power: supply: bq24190_charger: Add support for BQ24296") Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20241116203648.169100-2-hdegoede@redhat.com Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>	2024-12-19 00:35:30 +01:00
Phil Sutter	70b6f46a4e	netfilter: ipset: Fix for recursive locking warning With CONFIG_PROVE_LOCKING, when creating a set of type bitmap:ip, adding it to a set of type list:set and populating it from iptables SET target triggers a kernel warning: \| WARNING: possible recursive locking detected \| 6.12.0-rc7-01692-g5e9a28f41134-dirty #594 Not tainted \| -------------------------------------------- \| ping/4018 is trying to acquire lock: \| ffff8881094a6848 (&set->lock){+.-.}-{2:2}, at: ip_set_add+0x28c/0x360 [ip_set] \| \| but task is already holding lock: \| ffff88811034c048 (&set->lock){+.-.}-{2:2}, at: ip_set_add+0x28c/0x360 [ip_set] This is a false alarm: ipset does not allow nested list:set type, so the loop in list_set_kadd() can never encounter the outer set itself. No other set type supports embedded sets, so this is the only case to consider. To avoid the false report, create a distinct lock class for list:set type ipset locks. Fixes: `f830837f0e` ("netfilter: ipset: list:set set type support") Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2024-12-19 00:28:47 +01:00
David Laight	cf2c97423a	ipvs: Fix clamp() of ip_vs_conn_tab on small memory systems The 'max_avail' value is calculated from the system memory size using order_base_2(). order_base_2(x) is defined as '(x) ? fn(x) : 0'. The compiler generates two copies of the code that follows and then expands clamp(max, min, PAGE_SHIFT - 12) (11 on 32bit). This triggers a compile-time assert since min is 5. In reality a system would have to have less than 512MB memory for the bounds passed to clamp to be reversed. Swap the order of the arguments to clamp() to avoid the warning. Replace the clamp_val() on the line below with clamp(). clamp_val() is just 'an accident waiting to happen' and not needed here. Detected by compile time checks added to clamp(), specifically: minmax.h: use BUILD_BUG_ON_MSG() for the lo < hi test in clamp() Reported-by: Linux Kernel Functional Testing <lkft@linaro.org> Closes: https://lore.kernel.org/all/CA+G9fYsT34UkGFKxus63H6UVpYi5GRZkezT9MRLfAbM3f6ke0g@mail.gmail.com/ Fixes: `4f325e2627` ("ipvs: dynamically limit the connection hash table") Tested-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: David Laight <david.laight@aculab.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2024-12-18 23:37:27 +01:00
Linus Torvalds	eabcdba3ad	Merge tag 'for-6.13-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - tree-checker catches invalid number of inline extent references - zoned mode fixes: - enhance zone append IO command so it also detects emulated writes - handle bio splitting at sectorsize boundary - when deleting a snapshot, fix a condition for visiting nodes in reloc trees * tag 'for-6.13-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: tree-checker: reject inline extent items with 0 ref count btrfs: split bios to the fs sector size boundary btrfs: use bio_is_zone_append() in the completion handler btrfs: fix improper generation check in snapshot delete	2024-12-18 14:17:21 -08:00
Oliver Upton	e96d8b80af	KVM: arm64: Only apply PMCR_EL0.P to the guest range of counters An important distinction from other registers affected by HPMN is that PMCR_EL0 only affects the guest range of counters, regardless of the EL from which it is accessed. Ensure that PMCR_EL0.P is always applied to 'guest' counters by manually computing the mask rather than deriving it from the current context. Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20241217175611.3658290-1-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-12-18 13:22:25 -08:00
Oliver Upton	d3ba35b69e	KVM: arm64: nv: Reload PMU events upon MDCR_EL2.HPME change MDCR_EL2.HPME is the 'global' enable bit for event counters reserved for EL2. Give the PMU a kick when it's changed to ensure events are reprogrammed before returning to the guest. Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20241217175550.3658212-1-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-12-18 13:22:25 -08:00
Oliver Upton	adf8623b3f	KVM: arm64: Use KVM_REQ_RELOAD_PMU to handle PMCR_EL0.E change Nested virt introduces yet another set of 'global' knobs for controlling event counters that are reserved for EL2 (i.e. >= HPMN). Get ready to share some plumbing with the NV controls by offloading counter reprogramming to KVM_REQ_RELOAD_PMU. Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20241217175532.3658134-1-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-12-18 13:22:25 -08:00
Oliver Upton	e22c369520	KVM: arm64: Add unified helper for reprogramming counters by mask Having separate helpers for enabling/disabling counters provides the wrong abstraction, as the state of each counter needs to be evaluated independently and, in some cases, use a different global enable bit. Collapse the enable/disable accessors into a single, common helper that reconfigures every counter set in @mask, leaving the complexity of determining if an event is actually enabled in kvm_pmu_counter_is_enabled(). Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20241217175513.3658056-1-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-12-18 13:22:25 -08:00
Quentin Perret	985bb51f17	KVM: arm64: Always check the state from hyp_ack_unshare() There are multiple pKVM memory transitions where the state of a page is not cross-checked from the completer's PoV for performance reasons. For example, if a page is PKVM_PAGE_OWNED from the initiator's PoV, we should be guaranteed by construction that it is PKVM_NOPAGE for everybody else, hence allowing us to save a page-table lookup. When it was introduced, hyp_ack_unshare() followed that logic and bailed out without checking the PKVM_PAGE_SHARED_BORROWED state in the hypervisor's stage-1. This was correct as we could safely assume that all host-initiated shares were directed at the hypervisor at the time. But with the introduction of other types of shares (e.g. for FF-A or non-protected guests), it is now very much required to cross check this state to prevent the host from running __pkvm_host_unshare_hyp() on a page shared with TZ or a non-protected guest. Thankfully, if an attacker were to try this, the hyp_unmap() call from hyp_complete_unshare() would fail, hence causing to WARN() from __do_unshare() with the host lock held, which is fatal. But this is fragile at best, and can hardly be considered a security measure. Let's just do the right thing and always check the state from hyp_ack_unshare(). Signed-off-by: Quentin Perret <qperret@google.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20241128154406.602875-1-qperret@google.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-12-18 13:21:37 -08:00
Linus Torvalds	b69810f38c	Merge tag 'cxl-fixes-6.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull cxl fixes from Ira Weiny: - prevent probe failure when non-critical RAS unmasking fails - fix CXL 1.1 link status sysfs attribute - fix 4 way (and greater) switch interleave region creation * tag 'cxl-fixes-6.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: cxl/region: Fix region creation for greater than x2 switches cxl/pci: Check dport->regs.rcd_pcie_cap availability before accessing cxl/pci: Fix potential bogus return value upon successful probing	2024-12-18 12:52:57 -08:00
Alex Deucher	3abb660f9e	drm/amdgpu/nbio7.0: fix IP version check Use the helper function rather than reading it directly. Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `0ec43fbece`) Cc: stable@vger.kernel.org	2024-12-18 15:20:57 -05:00
Mario Limonciello	a7f9d98eb1	drm/amd: Update strapping for NBIO 2.5.0 This helps to avoid a spurious PME event on hotplug to Azalia. Cc: Vijendar Mukunda <Vijendar.Mukunda@amd.com> Reported-and-tested-by: ionut_n2001@yahoo.com Closes: https://bugzilla.kernel.org/show_bug.cgi?id=215884 Tested-by: Gabriel Marcano <gabemarcano@yahoo.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20241211024414.7840-1-mario.limonciello@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `3f6f237b9d`) Cc: stable@vger.kernel.org	2024-12-18 15:20:06 -05:00
Linus Torvalds	397d1d88af	Merge tag 'selinux-pr-20241217' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux fix from Paul Moore: "One small SELinux patch to get rid improve our handling of unknown extended permissions by safely ignoring them. Not only does this make it easier to support newer SELinux policy on older kernels in the future, it removes to BUG() calls from the SELinux code." * tag 'selinux-pr-20241217' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: ignore unknown extended permissions	2024-12-18 12:10:15 -08:00
Huacai Chen	0674188f2f	ACPI: EC: Enable EC support on LoongArch by default Commit `a6021aa24f` ("ACPI: EC: make EC support compile-time conditional") only enable ACPI_EC on X86 by default, but the embedded controller is also widely used on LoongArch laptops so we also enable ACPI_EC for LoongArch. The laptop driver cannot work without EC, so also update the dependency of LOONGSON_LAPTOP to let it depend on APCI_EC. Fixes: `a6021aa24f` ("ACPI: EC: make EC support compile-time conditional") Reported-by: Xiaotian Wu <wuxiaotian@loongson.cn> Tested-by: Binbin Zhou <zhoubinbin@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Link: https://patch.msgid.link/20241217073704.3339587-1-chenhuacai@loongson.cn [ rjw: Added Fixes: ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2024-12-18 20:23:59 +01:00
Steven Rostedt	8cd63406d0	trace/ring-buffer: Do not use TP_printk() formatting for boot mapped buffers The TP_printk() of a TRACE_EVENT() is a generic printf format that any developer can create for their event. It may include pointers to strings and such. A boot mapped buffer may contain data from a previous kernel where the strings addresses are different. One solution is to copy the event content and update the pointers by the recorded delta, but a simpler solution (for now) is to just use the print_fields() function to print these events. The print_fields() function just iterates the fields and prints them according to what type they are, and ignores the TP_printk() format from the event itself. To understand the difference, when printing via TP_printk() the output looks like this: 4582.696626: kmem_cache_alloc: call_site=getname_flags+0x47/0x1f0 ptr=00000000e70e10e0 bytes_req=4096 bytes_alloc=4096 gfp_flags=GFP_KERNEL node=-1 accounted=false 4582.696629: kmem_cache_alloc: call_site=alloc_empty_file+0x6b/0x110 ptr=0000000095808002 bytes_req=360 bytes_alloc=384 gfp_flags=GFP_KERNEL node=-1 accounted=false 4582.696630: kmem_cache_alloc: call_site=security_file_alloc+0x24/0x100 ptr=00000000576339c3 bytes_req=16 bytes_alloc=16 gfp_flags=GFP_KERNEL\|__GFP_ZERO node=-1 accounted=false 4582.696653: kmem_cache_free: call_site=do_sys_openat2+0xa7/0xd0 ptr=00000000e70e10e0 name=names_cache But when printing via print_fields() (echo 1 > /sys/kernel/tracing/options/fields) the same event output looks like this: 4582.696626: kmem_cache_alloc: call_site=0xffffffff92d10d97 (-1831793257) ptr=0xffff9e0e8571e000 (-107689771147264) bytes_req=0x1000 (4096) bytes_alloc=0x1000 (4096) gfp_flags=0xcc0 (3264) node=0xffffffff (-1) accounted=(0) 4582.696629: kmem_cache_alloc: call_site=0xffffffff92d0250b (-1831852789) ptr=0xffff9e0e8577f800 (-107689770747904) bytes_req=0x168 (360) bytes_alloc=0x180 (384) gfp_flags=0xcc0 (3264) node=0xffffffff (-1) accounted=(0) 4582.696630: kmem_cache_alloc: call_site=0xffffffff92efca74 (-1829778828) ptr=0xffff9e0e8d35d3b0 (-107689640864848) bytes_req=0x10 (16) bytes_alloc=0x10 (16) gfp_flags=0xdc0 (3520) node=0xffffffff (-1) accounted=(0) 4582.696653: kmem_cache_free: call_site=0xffffffff92cfbea7 (-1831879001) ptr=0xffff9e0e8571e000 (-107689771147264) name=names_cache Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/20241218141507.28389a1d@gandalf.local.home Fixes: `07714b4bb3` ("tracing: Handle old buffer mappings for event strings and functions") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2024-12-18 14:20:38 -05:00
Edward Adam Davis	c58a812c8e	ring-buffer: Fix overflow in __rb_map_vma An overflow occurred when performing the following calculation: nr_pages = ((nr_subbufs + 1) << subbuf_order) - pgoff; Add a check before the calculation to avoid this problem. syzbot reported this as a slab-out-of-bounds in __rb_map_vma: BUG: KASAN: slab-out-of-bounds in __rb_map_vma+0x9ab/0xae0 kernel/trace/ring_buffer.c:7058 Read of size 8 at addr ffff8880767dd2b8 by task syz-executor187/5836 CPU: 0 UID: 0 PID: 5836 Comm: syz-executor187 Not tainted 6.13.0-rc2-syzkaller-00159-gf932fb9b4074 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/25/2024 Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:378 [inline] print_report+0xc3/0x620 mm/kasan/report.c:489 kasan_report+0xd9/0x110 mm/kasan/report.c:602 __rb_map_vma+0x9ab/0xae0 kernel/trace/ring_buffer.c:7058 ring_buffer_map+0x56e/0x9b0 kernel/trace/ring_buffer.c:7138 tracing_buffers_mmap+0xa6/0x120 kernel/trace/trace.c:8482 call_mmap include/linux/fs.h:2183 [inline] mmap_file mm/internal.h:124 [inline] __mmap_new_file_vma mm/vma.c:2291 [inline] __mmap_new_vma mm/vma.c:2355 [inline] __mmap_region+0x1786/0x2670 mm/vma.c:2456 mmap_region+0x127/0x320 mm/mmap.c:1348 do_mmap+0xc00/0xfc0 mm/mmap.c:496 vm_mmap_pgoff+0x1ba/0x360 mm/util.c:580 ksys_mmap_pgoff+0x32c/0x5c0 mm/mmap.c:542 __do_sys_mmap arch/x86/kernel/sys_x86_64.c:89 [inline] __se_sys_mmap arch/x86/kernel/sys_x86_64.c:82 [inline] __x64_sys_mmap+0x125/0x190 arch/x86/kernel/sys_x86_64.c:82 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f The reproducer for this bug is: ------------------------8<------------------------- #include <fcntl.h> #include <stdlib.h> #include <unistd.h> #include <asm/types.h> #include <sys/mman.h> int main(int argc, char *argv) { int page_size = getpagesize(); int fd; void meta; system("echo 1 > /sys/kernel/tracing/buffer_size_kb"); fd = open("/sys/kernel/tracing/per_cpu/cpu0/trace_pipe_raw", O_RDONLY); meta = mmap(NULL, page_size, PROT_READ, MAP_SHARED, fd, page_size * 5); } ------------------------>8------------------------- Cc: stable@vger.kernel.org Fixes: `117c39200d` ("ring-buffer: Introducing ring-buffer mapping functions") Link: https://lore.kernel.org/tencent_06924B6674ED771167C23CC336C097223609@qq.com Reported-by: syzbot+345e4443a21200874b18@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=345e4443a21200874b18 Signed-off-by: Edward Adam Davis <eadavis@qq.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2024-12-18 14:15:10 -05:00
Linus Torvalds	c061cf420d	Merge tag 'trace-v6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: "Replace trace_check_vprintf() with test_event_printk() and ignore_event() The function test_event_printk() checks on boot up if the trace event printf() formats dereference any pointers, and if they do, it then looks at the arguments to make sure that the pointers they dereference will exist in the event on the ring buffer. If they do not, it issues a WARN_ON() as it is a likely bug. But this isn't the case for the strings that can be dereferenced with "%s", as some trace events (notably RCU and some IPI events) save a pointer to a static string in the ring buffer. As the string it points to lives as long as the kernel is running, it is not a bug to reference it, as it is guaranteed to be there when the event is read. But it is also possible (and a common bug) to point to some allocated string that could be freed before the trace event is read and the dereference is to bad memory. This case requires a run time check. The previous way to handle this was with trace_check_vprintf() that would process the printf format piece by piece and send what it didn't care about to vsnprintf() to handle arguments that were not strings. This kept it from having to reimplement vsnprintf(). But it relied on va_list implementation and for architectures that copied the va_list and did not pass it by reference, it wasn't even possible to do this check and it would be skipped. As 64bit x86 passed va_list by reference, most events were tested and this kept out bugs where strings would have been dereferenced after being freed. Instead of relying on the implementation of va_list, extend the boot up test_event_printk() function to validate all the "%s" strings that can be validated at boot, and for the few events that point to strings outside the ring buffer, flag both the event and the field that is dereferenced as "needs_test". Then before the event is printed, a call to ignore_event() is made, and if the event has the flag set, it iterates all its fields and for every field that is to be tested, it will read the pointer directly from the event in the ring buffer and make sure that it is valid. If the pointer is not valid, it will print a WARN_ON(), print out to the trace that the event has unsafe memory and ignore the print format. With this new update, the trace_check_vprintf() can be safely removed and now all events can be verified regardless of architecture" * tag 'trace-v6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Check "%s" dereference via the field and not the TP_printk format tracing: Add "%s" check in test_event_printk() tracing: Add missing helper functions in event pointer dereference check tracing: Fix test_event_printk() to process entire print argument	2024-12-18 10:03:33 -08:00
Michel Dänzer	85230ee36d	drm/amdgpu: Handle NULL bo->tbo.resource (again) in amdgpu_vm_bo_update Third time's the charm, I hope? Fixes: `d3116756a7` ("drm/ttm: rename bo->mem and make it a pointer") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3837 Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `695c2c745e`) Cc: stable@vger.kernel.org	2024-12-18 13:02:03 -05:00
Christian König	8d1a13816e	drm/amdgpu: fix amdgpu_coredump The VM pointer might already be outdated when that function is called. Use the PASID instead to gather the information instead. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `57f812d171`) Cc: stable@vger.kernel.org	2024-12-18 13:01:54 -05:00
Alex Deucher	9e752ee26c	drm/amdgpu/smu14.0.2: fix IP version check Use the helper function rather than reading it directly. Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `8f2cd1067a`) Cc: stable@vger.kernel.org	2024-12-18 13:01:48 -05:00
Alex Deucher	41be00f839	drm/amdgpu/gfx12: fix IP version check Use the helper function rather than reading it directly. Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `f1fd1d0f40`) Cc: stable@vger.kernel.org	2024-12-18 13:01:43 -05:00
Alex Deucher	6ebc5b9219	drm/amdgpu/mmhub4.1: fix IP version check Use the helper function rather than reading it directly. Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `63bfd24088`) Cc: stable@vger.kernel.org	2024-12-18 13:01:37 -05:00
Alex Deucher	8c1ecc7197	drm/amdgpu/nbio7.11: fix IP version check Use the helper function rather than reading it directly. Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `2c8eeaaa0f`) Cc: stable@vger.kernel.org	2024-12-18 13:01:31 -05:00
Alex Deucher	458600da79	drm/amdgpu/nbio7.7: fix IP version check Use the helper function rather than reading it directly. Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `22b9555bc9`) Cc: stable@vger.kernel.org	2024-12-18 13:01:06 -05:00
Linus Walleij	146b6057e1	wifi: cw1200: Fix potential NULL dereference A recent refactoring was identified by smatch to cause another potential NULL dereference: drivers/net/wireless/st/cw1200/cw1200_spi.c:440 cw1200_spi_disconnect() error: we previously assumed 'self' could be null (see line 433) Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/r/202411271742.Xa7CNVh1-lkp@intel.com/ Fixes: `2719a9e715` ("wifi: cw1200: Convert to GPIO descriptors") Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://patch.msgid.link/20241217-cw1200-fix-v1-1-911e6b5823ec@linaro.org	2024-12-18 19:58:27 +02:00
Pierre-Eric Pelloux-Prayer	a93b1020eb	drm/amdgpu: don't access invalid sched Since `2320c9e6a7` ("drm/sched: memset() 'job' in drm_sched_job_init()") accessing job->base.sched can produce unexpected results as the initialisation of (*job)->base.sched done in amdgpu_job_alloc is overwritten by the memset. This commit fixes an issue when a CS would fail validation and would be rejected after job->num_ibs is incremented. In this case, amdgpu_ib_free(ring->adev, ...) will be called, which would crash the machine because the ring value is bogus. To fix this, pass a NULL pointer to amdgpu_ib_free(): we can do this because the device is actually not used in this function. The next commit will remove the ring argument completely. Fixes: `2320c9e6a7` ("drm/sched: memset() 'job' in drm_sched_job_init()") Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `2ae520cb12`)	2024-12-18 12:57:38 -05:00
Mario Limonciello	536ae08d7b	drm/amd: Require CONFIG_HOTPLUG_PCI_PCIE for BOCO If the kernel hasn't been compiled with PCIe hotplug support this can lead to problems with dGPUs that use BOCO because they effectively drop off the bus. To prevent issues, disable BOCO support when compiled without PCIe hotplug. Reported-by: Gabriel Marcano <gabemarcano@yahoo.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1707#note_2696862 Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20241211155601.3585256-1-superm1@kernel.org Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `1ad5bdc28b`)	2024-12-18 12:56:49 -05:00
Linus Torvalds	37cb0c76ac	Merge tag 'hyperv-fixes-signed-20241217' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv fixes from Wei Liu: - Various fixes to Hyper-V tools in the kernel tree (Dexuan Cui, Olaf Hering, Vitaly Kuznetsov) - Fix a bug in the Hyper-V TSC page based sched_clock() (Naman Jain) - Two bug fixes in the Hyper-V utility functions (Michael Kelley) - Convert open-coded timeouts to secs_to_jiffies() in Hyper-V drivers (Easwar Hariharan) * tag 'hyperv-fixes-signed-20241217' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: tools/hv: reduce resource usage in hv_kvp_daemon tools/hv: add a .gitignore file tools/hv: reduce resouce usage in hv_get_dns_info helper hv/hv_kvp_daemon: Pass NIC name to hv_get_dns_info as well Drivers: hv: util: Avoid accessing a ringbuffer not initialized yet Drivers: hv: util: Don't force error code to ENODEV in util_probe() tools/hv: terminate fcopy daemon if read from uio fails drivers: hv: Convert open-coded timeouts to secs_to_jiffies() tools: hv: change permissions of NetworkManager configuration file x86/hyperv: Fix hv tsc page based sched_clock for hibernation tools: hv: Fix a complier warning in the fcopy uio daemon	2024-12-18 09:55:55 -08:00
Juergen Gross	349f0086ba	x86/static-call: fix 32-bit build In 32-bit x86 builds CONFIG_STATIC_CALL_INLINE isn't set, leading to static_call_initialized not being available. Define it as "0" in that case. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Fixes: `0ef8047b73` ("x86/static-call: provide a way to do very early static-call updates") Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2024-12-18 09:47:43 -08:00
Kees Cook	cc0c53f4fa	wifi: iwlwifi: mvm: Fix __counted_by usage in cfg80211_wowlan_nd_* Both struct cfg80211_wowlan_nd_match and struct cfg80211_wowlan_nd_info pre-allocate space for channels and matches, but then may end up using fewer that the full allocation. Shrink the associated counter (n_channels and n_matches) after counting the results. This avoids compile-time (and run-time) warnings from __counted_by. (The counter member needs to be updated _before_ accessing the array index.) Seen with coming GCC 15: drivers/net/wireless/intel/iwlwifi/mvm/d3.c: In function 'iwl_mvm_query_set_freqs': drivers/net/wireless/intel/iwlwifi/mvm/d3.c:2877:66: warning: operation on 'match->n_channels' may be undefined [-Wsequence-point] 2877 \| match->channels[match->n_channels++] = \| ~~~~~~~~~~~~~~~~~^~ drivers/net/wireless/intel/iwlwifi/mvm/d3.c:2885:66: warning: operation on 'match->n_channels' may be undefined [-Wsequence-point] 2885 \| match->channels[match->n_channels++] = \| ~~~~~~~~~~~~~~~~~^~ drivers/net/wireless/intel/iwlwifi/mvm/d3.c: In function 'iwl_mvm_query_netdetect_reasons': drivers/net/wireless/intel/iwlwifi/mvm/d3.c:2982:58: warning: operation on 'net_detect->n_matches' may be undefined [-Wsequence-point] 2982 \| net_detect->matches[net_detect->n_matches++] = match; \| ~~~~~~~~~~~~~~~~~~~~~^~ Cc: stable@vger.kernel.org Fixes: `aa4ec06c45` ("wifi: cfg80211: use __counted_by where appropriate") Signed-off-by: Kees Cook <kees@kernel.org> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://patch.msgid.link/20240619211233.work.355-kees@kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-12-18 18:34:57 +01:00
Jon Lin	7f9a1eed1a	spi: rockchip-sfc: Fix error in remove progress Fix error in remove progress: [ 43.026148] Call trace: [ 43.026370] klist_next+0x1c/0x1d4 [ 43.026671] device_for_each_child+0x48/0xac [ 43.027049] spi_unregister_controller+0x30/0x130 [ 43.027469] rockchip_sfc_remove+0x48/0x80 [spi_rockchip_sfc] Signed-off-by: Jon Lin <jon.lin@rock-chips.com> Link: https://patch.msgid.link/20241218154741.901591-1-jon.lin@rock-chips.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-12-18 16:02:08 +00:00
Rafael J. Wysocki	05648c2f58	Merge tag 'amd-pstate-v6.13-2024-12-11' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux Merge amd-pstate driver fixes for 6.13-rc4 from Mario Liminciello: "Fix a problem where systems without preferred cores were misdetecting preferred cores. Fix issues with with boost numerator handling leading to inconsistently programmed CPPC max performance values." * tag 'amd-pstate-v6.13-2024-12-11' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux: cpufreq/amd-pstate: Use boost numerator for upper bound of frequencies cpufreq/amd-pstate: Store the boost numerator as highest perf again cpufreq/amd-pstate: Detect preferred core support before driver registration	2024-12-18 15:38:22 +01:00
Ming Lei	85672ca9ce	block: avoid to reuse `hctx` not removed from cpuhp callback list If the 'hctx' isn't removed from cpuhp callback list, we can't reuse it, otherwise use-after-free may be triggered. Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202412172217.b906db7c-lkp@intel.com Tested-by: kernel test robot <oliver.sang@intel.com> Fixes: `22465bbac5` ("blk-mq: move cpuhp callback registering out of q->sysfs_lock") Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20241218101617.3275704-3-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-18 07:25:37 -07:00
Ming Lei	224749be6c	block: Revert "block: Fix potential deadlock while freezing queue and acquiring sysfs_lock" This reverts commit `be26ba9642`. Commit `be26ba9642` ("block: Fix potential deadlock while freezing queue and acquiring sysfs_loc") actually reverts commit `22465bbac5` ("blk-mq: move cpuhp callback registering out of q->sysfs_lock"), and causes the original resctrl lockdep warning. So revert it and we need to fix the issue in another way. Cc: Nilay Shroff <nilay@linux.ibm.com> Fixes: `be26ba9642` ("block: Fix potential deadlock while freezing queue and acquiring sysfs_loc") Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20241218101617.3275704-2-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-18 07:25:37 -07:00
Luis Chamberlain	51588b1b77	nvme: use blk_validate_block_size() for max LBA check The block layer already has support to validates proper block sizes with blk_validate_block_size(), we can leverage that as well. No functional changes. Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20241218020212.3657139-3-mcgrof@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-18 07:22:30 -07:00
Luis Chamberlain	26fff8a443	block/bdev: use helper for max block size check We already have a helper for checking the limits on the block size both low and high, just use that. No functional changes. Reviewed-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20241218020212.3657139-2-mcgrof@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-18 07:22:30 -07:00
Shuming Fan	c9e3ebdc52	ASoC: rt722: add delay time to wait for the calibration procedure The calibration procedure needs some time to finish. This patch adds the delay time to ensure the calibration procedure is completed correctly. Signed-off-by: Shuming Fan <shumingf@realtek.com> Link: https://patch.msgid.link/20241218091307.96656-1-shumingf@realtek.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-12-18 14:17:32 +00:00
Daniel Lezcano	4feaedf7d2	thermal/thresholds: Fix boundaries and detection routine The current implementation does not work if the thermal zone is interrupt driven only. The boundaries are not correctly checked and computed as it happens only when the temperature is increasing or decreasing. The problem arises because the routine to detect when we cross a threshold is correlated with the computation of the boundaries. We assume we have to recompute the boundaries when a threshold is crossed but actually we should do that even if the it is not the case. Mixing the boundaries computation and the threshold detection for the sake of optimizing the routine is much more complex as it appears intuitively and prone to errors. This fix separates the boundaries computation and the threshold crossing detection into different routines. The result is a code much more simple to understand, thus easier to maintain. The drawback is we browse the thresholds list several time but we can consider that as neglictible because that happens when the temperature is updated. There are certainly some aeras to improve in the temperature update routine but it would be not adequate as this change aims to fix the thresholds for v6.13. Fixes: `445936f9e2` ("thermal: core: Add user thresholds support") Tested-by: Daniel Lezcano <daniel.lezcano@linaro.org> # rock5b, Lenovo x13s Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://patch.msgid.link/20241216212644.1145122-1-daniel.lezcano@linaro.org Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2024-12-18 14:51:31 +01:00
Fabrice Gasnier	edc19bd0e5	pwm: stm32: Fix complementary output in round_waveform_tohw() When the timer supports complementary output, the CCxNE bit must be set additionally to the CCxE bit. So to not overwrite the latter use \|= instead of = to set the former. Fixes: `deaba9cff8` ("pwm: stm32: Implementation of the waveform callbacks") Signed-off-by: Fabrice Gasnier <fabrice.gasnier@foss.st.com> Link: https://lore.kernel.org/r/20241217150021.2030213-1-fabrice.gasnier@foss.st.com [ukleinek: Slightly improve commit log] Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>	2024-12-18 11:08:36 +01:00
Haren Myneni	05aa156e15	powerpc/pseries/vas: Add close() callback in vas_vm_ops struct The mapping VMA address is saved in VAS window struct when the paste address is mapped. This VMA address is used during migration to unmap the paste address if the window is active. The paste address mapping will be removed when the window is closed or with the munmap(). But the VMA address in the VAS window is not updated with munmap() which is causing invalid access during migration. The KASAN report shows: [16386.254991] BUG: KASAN: slab-use-after-free in reconfig_close_windows+0x1a0/0x4e8 [16386.255043] Read of size 8 at addr c00000014a819670 by task drmgr/696928 [16386.255096] CPU: 29 UID: 0 PID: 696928 Comm: drmgr Kdump: loaded Tainted: G B 6.11.0-rc5-nxgzip #2 [16386.255128] Tainted: [B]=BAD_PAGE [16386.255148] Hardware name: IBM,9080-HEX Power11 (architected) 0x820200 0xf000007 of:IBM,FW1110.00 (NH1110_016) hv:phyp pSeries [16386.255181] Call Trace: [16386.255202] [c00000016b297660] [c0000000018ad0ac] dump_stack_lvl+0x84/0xe8 (unreliable) [16386.255246] [c00000016b297690] [c0000000006e8a90] print_report+0x19c/0x764 [16386.255285] [c00000016b297760] [c0000000006e9490] kasan_report+0x128/0x1f8 [16386.255309] [c00000016b297880] [c0000000006eb5c8] __asan_load8+0xac/0xe0 [16386.255326] [c00000016b2978a0] [c00000000013f898] reconfig_close_windows+0x1a0/0x4e8 [16386.255343] [c00000016b297990] [c000000000140e58] vas_migration_handler+0x3a4/0x3fc [16386.255368] [c00000016b297a90] [c000000000128848] pseries_migrate_partition+0x4c/0x4c4 ... [16386.256136] Allocated by task 696554 on cpu 31 at 16377.277618s: [16386.256149] kasan_save_stack+0x34/0x68 [16386.256163] kasan_save_track+0x34/0x80 [16386.256175] kasan_save_alloc_info+0x58/0x74 [16386.256196] __kasan_slab_alloc+0xb8/0xdc [16386.256209] kmem_cache_alloc_noprof+0x200/0x3d0 [16386.256225] vm_area_alloc+0x44/0x150 [16386.256245] mmap_region+0x214/0x10c4 [16386.256265] do_mmap+0x5fc/0x750 [16386.256277] vm_mmap_pgoff+0x14c/0x24c [16386.256292] ksys_mmap_pgoff+0x20c/0x348 [16386.256303] sys_mmap+0xd0/0x160 ... [16386.256350] Freed by task 0 on cpu 31 at 16386.204848s: [16386.256363] kasan_save_stack+0x34/0x68 [16386.256374] kasan_save_track+0x34/0x80 [16386.256384] kasan_save_free_info+0x64/0x10c [16386.256396] __kasan_slab_free+0x120/0x204 [16386.256415] kmem_cache_free+0x128/0x450 [16386.256428] vm_area_free_rcu_cb+0xa8/0xd8 [16386.256441] rcu_do_batch+0x2c8/0xcf0 [16386.256458] rcu_core+0x378/0x3c4 [16386.256473] handle_softirqs+0x20c/0x60c [16386.256495] do_softirq_own_stack+0x6c/0x88 [16386.256509] do_softirq_own_stack+0x58/0x88 [16386.256521] __irq_exit_rcu+0x1a4/0x20c [16386.256533] irq_exit+0x20/0x38 [16386.256544] interrupt_async_exit_prepare.constprop.0+0x18/0x2c ... [16386.256717] Last potentially related work creation: [16386.256729] kasan_save_stack+0x34/0x68 [16386.256741] __kasan_record_aux_stack+0xcc/0x12c [16386.256753] __call_rcu_common.constprop.0+0x94/0xd04 [16386.256766] vm_area_free+0x28/0x3c [16386.256778] remove_vma+0xf4/0x114 [16386.256797] do_vmi_align_munmap.constprop.0+0x684/0x870 [16386.256811] __vm_munmap+0xe0/0x1f8 [16386.256821] sys_munmap+0x54/0x6c [16386.256830] system_call_exception+0x1a0/0x4a0 [16386.256841] system_call_vectored_common+0x15c/0x2ec [16386.256868] The buggy address belongs to the object at c00000014a819670 which belongs to the cache vm_area_struct of size 168 [16386.256887] The buggy address is located 0 bytes inside of freed 168-byte region [c00000014a819670, c00000014a819718) [16386.256915] The buggy address belongs to the physical page: [16386.256928] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x14a81 [16386.256950] memcg:c0000000ba430001 [16386.256961] anon flags: 0x43ffff800000000(node=4\|zone=0\|lastcpupid=0x7ffff) [16386.256975] page_type: 0xfdffffff(slab) [16386.256990] raw: 043ffff800000000 c00000000501c080 0000000000000000 5deadbee00000001 [16386.257003] raw: 0000000000000000 00000000011a011a 00000001fdffffff c0000000ba430001 [16386.257018] page dumped because: kasan: bad access detected This patch adds close() callback in vas_vm_ops vm_operations_struct which will be executed during munmap() before freeing VMA. The VMA address in the VAS window is set to NULL after holding the window mmap_mutex. Fixes: `37e6764895` ("powerpc/pseries/vas: Add VAS migration handler") Signed-off-by: Haren Myneni <haren@linux.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20241214051758.997759-1-haren@linux.ibm.com	2024-12-18 14:03:41 +05:30
Marc Kleine-Budde	87f54c1219	Merge patch series "can: m_can: set init flag earlier in probe" This series fixes problems in the m_can_pci driver found on the Intel Elkhart Lake processor. Link: https://patch.msgid.link/e247f331cb72829fcbdfda74f31a59cbad1a6006.1728288535.git.matthias.schiffer@ew.tq-group.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2024-12-18 09:32:14 +01:00
Matthias Schiffer	743375f8de	can: m_can: fix missed interrupts with m_can_pci The interrupt line of PCI devices is interpreted as edge-triggered, however the interrupt signal of the m_can controller integrated in Intel Elkhart Lake CPUs appears to be generated level-triggered. Consider the following sequence of events: - IR register is read, interrupt X is set - A new interrupt Y is triggered in the m_can controller - IR register is written to acknowledge interrupt X. Y remains set in IR As at no point in this sequence no interrupt flag is set in IR, the m_can interrupt line will never become deasserted, and no edge will ever be observed to trigger another run of the ISR. This was observed to result in the TX queue of the EHL m_can to get stuck under high load, because frames were queued to the hardware in m_can_start_xmit(), but m_can_finish_tx() was never run to account for their successful transmission. On an Elkhart Lake based board with the two CAN interfaces connected to each other, the following script can reproduce the issue: ip link set can0 up type can bitrate 1000000 ip link set can1 up type can bitrate 1000000 cangen can0 -g 2 -I 000 -L 8 & cangen can0 -g 2 -I 001 -L 8 & cangen can0 -g 2 -I 002 -L 8 & cangen can0 -g 2 -I 003 -L 8 & cangen can0 -g 2 -I 004 -L 8 & cangen can0 -g 2 -I 005 -L 8 & cangen can0 -g 2 -I 006 -L 8 & cangen can0 -g 2 -I 007 -L 8 & cangen can1 -g 2 -I 100 -L 8 & cangen can1 -g 2 -I 101 -L 8 & cangen can1 -g 2 -I 102 -L 8 & cangen can1 -g 2 -I 103 -L 8 & cangen can1 -g 2 -I 104 -L 8 & cangen can1 -g 2 -I 105 -L 8 & cangen can1 -g 2 -I 106 -L 8 & cangen can1 -g 2 -I 107 -L 8 & stress-ng --matrix 0 & To fix the issue, repeatedly read and acknowledge interrupts at the start of the ISR until no interrupt flags are set, so the next incoming interrupt will also result in an edge on the interrupt line. While we have received a report that even with this patch, the TX queue can become stuck under certain (currently unknown) circumstances on the Elkhart Lake, this patch completely fixes the issue with the above reproducer, and it is unclear whether the remaining issue has a similar cause at all. Fixes: `cab7ffc032` ("can: m_can: add PCI glue driver for Intel Elkhart Lake") Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com> Reviewed-by: Markus Schneider-Pargmann <msp@baylibre.com> Link: https://patch.msgid.link/fdf0439c51bcb3a46c21e9fb21c7f1d06363be84.1728288535.git.matthias.schiffer@ew.tq-group.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2024-12-18 09:30:52 +01:00
Matthias Schiffer	fca2977629	can: m_can: set init flag earlier in probe While an m_can controller usually already has the init flag from a hardware reset, no such reset happens on the integrated m_can_pci of the Intel Elkhart Lake. If the CAN controller is found in an active state, m_can_dev_setup() would fail because m_can_niso_supported() calls m_can_cccr_update_bits(), which refuses to modify any other configuration bits when CCCR_INIT is not set. To avoid this issue, set CCCR_INIT before attempting to modify any other configuration flags. Fixes: `cd5a46ce6f` ("can: m_can: don't enable transceiver when probing") Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com> Reviewed-by: Markus Schneider-Pargmann <msp@baylibre.com> Link: https://patch.msgid.link/e247f331cb72829fcbdfda74f31a59cbad1a6006.1728288535.git.matthias.schiffer@ew.tq-group.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2024-12-18 09:30:52 +01:00
Kuniyuki Iwashima	954a2b4071	rtnetlink: Try the outer netns attribute in rtnl_get_peer_net(). Xiao Liang reported that the cited commit changed netns handling in newlink() of netkit, veth, and vxcan. Before the patch, if we don't find a netns attribute in the peer device attributes, we tried to find another netns attribute in the outer netlink attributes by passing it to rtnl_link_get_net(). Let's restore the original behaviour. Fixes: `4832756676` ("rtnetlink: fix double call of rtnl_link_get_net_ifla()") Reported-by: Xiao Liang <shaw.leon@gmail.com> Closes: https://lore.kernel.org/netdev/CABAhCORBVVU8P6AHcEkENMj+gD2d3ce9t=A_o48E0yOQp8_wUQ@mail.gmail.com/#t Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Tested-by: Xiao Liang <shaw.leon@gmail.com> Link: https://patch.msgid.link/20241216110432.51488-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-17 17:54:18 -08:00
Eric Dumazet	b9b8301d36	net: netdevsim: fix nsim_pp_hold_write() nsim_pp_hold_write() has two problems: 1) It may return with rtnl held, as found by syzbot. 2) Its return value does not propagate an error if any. Fixes: `1580cbcbfe` ("net: netdevsim: add some fake page pool use") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241216083703.1859921-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-17 17:46:17 -08:00
Andrea Righi	23579010cf	bpf: Fix bpf_get_smp_processor_id() on !CONFIG_SMP On x86-64 calling bpf_get_smp_processor_id() in a kernel with CONFIG_SMP disabled can trigger the following bug, as pcpu_hot is unavailable: [ 8.471774] BUG: unable to handle page fault for address: 00000000936a290c [ 8.471849] #PF: supervisor read access in kernel mode [ 8.471881] #PF: error_code(0x0000) - not-present page Fix by inlining a return 0 in the !CONFIG_SMP case. Fixes: `1ae6921009` ("bpf: inline bpf_get_smp_processor_id() helper") Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241217195813.622568-1-arighi@nvidia.com	2024-12-17 16:09:24 -08:00
Charlie Jenkins	498d5b14db	riscv: selftests: Fix warnings pointer masking test When compiling the pointer masking tests with -Wall this warning is present: pointer_masking.c: In function ‘test_tagged_addr_abi_sysctl’: pointer_masking.c:203:9: warning: ignoring return value of ‘pwrite’ declared with attribute ‘warn_unused_result’ [-Wunused-result] 203 \| pwrite(fd, &value, 1, 0); \| ^~~~~~~~~~~~~~~~~~~~~~~~ pointer_masking.c:208:9: warning: ignoring return value of ‘pwrite’ declared with attribute ‘warn_unused_result’ [-Wunused-result] 208 \| pwrite(fd, &value, 1, 0); I came across this on riscv64-linux-gnu-gcc (Ubuntu 11.4.0-1ubuntu1~22.04). Fix this by checking that the number of bytes written equal the expected number of bytes written. Fixes: `7470b5afd1` ("riscv: selftests: Add a pointer masking test") Signed-off-by: Charlie Jenkins <charlie@rivosinc.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20241211-fix_warnings_pointer_masking_tests-v6-1-c7ae708fbd2f@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-12-17 14:23:18 -08:00
Nathan Chancellor	aef25be35d	hexagon: Disable constant extender optimization for LLVM prior to 19.1.0 The Hexagon-specific constant extender optimization in LLVM may crash on Linux kernel code [1], such as fs/bcache/btree_io.c after commit `32ed4a620c` ("bcachefs: Btree path tracepoints") in 6.12: clang: llvm/lib/Target/Hexagon/HexagonConstExtenders.cpp:745: bool (anonymous namespace)::HexagonConstExtenders::ExtRoot::operator<(const HCE::ExtRoot &) const: Assertion `ThisB->getParent() == OtherB->getParent()' failed. Stack dump: 0. Program arguments: clang --target=hexagon-linux-musl ... fs/bcachefs/btree_io.c 1. <eof> parser at end of file 2. Code generation 3. Running pass 'Function Pass Manager' on module 'fs/bcachefs/btree_io.c'. 4. Running pass 'Hexagon constant-extender optimization' on function '@__btree_node_lock_nopath' Without assertions enabled, there is just a hang during compilation. This has been resolved in LLVM main (20.0.0) [2] and backported to LLVM 19.1.0 but the kernel supports LLVM 13.0.1 and newer, so disable the constant expander optimization using the '-mllvm' option when using a toolchain that is not fixed. Cc: stable@vger.kernel.org Link: https://github.com/llvm/llvm-project/issues/99714 [1] Link: `68df06a0b2` [2] Link: `2ab8d93061` [3] Reviewed-by: Brian Cain <bcain@quicinc.com> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2024-12-17 14:07:53 -08:00
Joshua Hay	0c1683c681	idpf: trigger SW interrupt when exiting wb_on_itr mode There is a race condition between exiting wb_on_itr and completion write backs. For example, we are in wb_on_itr mode and a Tx completion is generated by HW, ready to be written back, as we are re-enabling interrupts: HW SW \| \| \| \| idpf_tx_splitq_clean_all \| \| napi_complete_done \| \| \| tx_completion_wb \| idpf_vport_intr_update_itr_ena_irq That tx_completion_wb happens before the vector is fully re-enabled. Continuing with this example, it is a UDP stream and the tx_completion_wb is the last one in the flow (there are no rx packets). Because the HW generated the completion before the interrupt is fully enabled, the HW will not fire the interrupt once the timer expires and the write back will not happen. NAPI poll won't be called. We have indicated we're back in interrupt mode but nothing else will trigger the interrupt. Therefore, the completion goes unprocessed, triggering a Tx timeout. To mitigate this, fire a SW triggered interrupt upon exiting wb_on_itr. This interrupt will catch the rogue completion and avoid the timeout. Add logic to set the appropriate bits in the vector's dyn_ctl register. Fixes: `9c4a27da0e` ("idpf: enable WB_ON_ITR") Reviewed-by: Madhu Chittim <madhu.chittim@intel.com> Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2024-12-17 14:00:26 -08:00
Olga Kornievskaia	9048cf05a1	NFSD: fix management of pending async copies Currently the pending_async_copies count is decremented just before a struct nfsd4_copy is destroyed. After commit `aa0ebd21df` ("NFSD: Add nfsd4_copy time-to-live") nfsd4_copy structures sticks around for 10 lease periods after the COPY itself has completed, the pending_async_copies count stays high for a long time. This causes NFSD to avoid the use of background copy even though the actual background copy workload might no longer be running. In this patch, decrement pending_async_copies once async copy thread is done processing the copy work. Fixes: `aa0ebd21df` ("NFSD: Add nfsd4_copy time-to-live") Signed-off-by: Olga Kornievskaia <okorniev@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-12-17 16:35:53 -05:00
Joshua Hay	93433c1d91	idpf: add support for SW triggered interrupts SW triggered interrupts are guaranteed to fire after their timer expires, unlike Tx and Rx interrupts which will only fire after the timer expires _and_ a descriptor write back is available to be processed by the driver. Add the necessary fields, defines, and initializations to enable a SW triggered interrupt in the vector's dyn_ctl register. Reviewed-by: Madhu Chittim <madhu.chittim@intel.com> Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2024-12-17 13:28:55 -08:00
Maksim Kiselev	f4bf0b909a	clk: thead: Fix TH1520 emmc and shdci clock rate In accordance with LicheePi 4A BSP the clock that comes to emmc/sdhci is 198Mhz which is got through frequency division of source clock VIDEO PLL by 4 [1]. But now the AP_SUBSYS driver sets the CLK EMMC SDIO to the same frequency as the VIDEO PLL, equal to 792 MHz. This causes emmc/sdhci to work 4 times slower. Let's fix this issue by adding fixed factor clock that divides VIDEO PLL by 4 for emmc/sdhci. Link: `7563179071/drivers/clk/thead/clk-light-fm.c (L454)` Fixes: `ae81b69fd2` ("clk: thead: Add support for T-Head TH1520 AP_SUBSYS clocks") Signed-off-by: Maksim Kiselev <bigunclemax@gmail.com> Link: https://lore.kernel.org/r/20241210083029.92620-1-bigunclemax@gmail.com Tested-by: Xi Ruoyao <xry111@xry111.site> Reviewed-by: Drew Fustini <dfustini@tenstorrent.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org>	2024-12-17 12:17:50 -08:00
Willow Cunningham	058387d9c6	arm64: dts: broadcom: Fix L2 linesize for Raspberry Pi 5 Set the cache-line-size parameter of the L2 cache for each core to the correct value of 64 bytes. Previously, the L2 cache line size was incorrectly set to 128 bytes for the Broadcom BCM2712. This causes validation tests for the Performance Application Programming Interface (PAPI) tool to fail as they depend on sysfs accurately reporting cache line sizes. The correct value of 64 bytes is stated in the official documentation of the ARM Cortex A-72, which is linked in the comments of arm64/boot/dts/broadcom/bcm2712.dtsi as the source for cache-line-size. Fixes: `faa3381267` ("arm64: dts: broadcom: Add minimal support for Raspberry Pi 5") Signed-off-by: Willow Cunningham <willow.e.cunningham@maine.edu> Link: https://lore.kernel.org/r/20241007212954.214724-1-willow.e.cunningham@maine.edu Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>	2024-12-17 11:03:22 -08:00
Qu Wenruo	dfb92681a1	btrfs: tree-checker: reject inline extent items with 0 ref count [BUG] There is a bug report in the mailing list where btrfs_run_delayed_refs() failed to drop the ref count for logical 25870311358464 num_bytes 2113536. The involved leaf dump looks like this: item 166 key (25870311358464 168 2113536) itemoff 10091 itemsize 50 extent refs 1 gen 84178 flags 1 ref#0: shared data backref parent 32399126528000 count 0 <<< ref#1: shared data backref parent 31808973717504 count 1 Notice the count number is 0. [CAUSE] There is no concrete evidence yet, but considering 0 -> 1 is also a single bit flipped, it's possible that hardware memory bitflip is involved, causing the on-disk extent tree to be corrupted. [FIX] To prevent us reading such corrupted extent item, or writing such damaged extent item back to disk, enhance the handling of BTRFS_EXTENT_DATA_REF_KEY and BTRFS_SHARED_DATA_REF_KEY keys for both inlined and key items, to detect such 0 ref count and reject them. CC: stable@vger.kernel.org # 5.4+ Link: https://lore.kernel.org/linux-btrfs/7c69dd49-c346-4806-86e7-e6f863a66f48@app.fastmail.com/ Reported-by: Frankie Fisher <frankie@terrorise.me.uk> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2024-12-17 19:54:32 +01:00
Christoph Hellwig	be691b5e59	btrfs: split bios to the fs sector size boundary Btrfs like other file systems can't really deal with I/O not aligned to it's internal block size (which strangely is called sector size in btrfs, for historical reasons), but the block layer split helper doesn't even know about that. Round down the split boundary so that all I/Os are aligned. Fixes: `d5e4377d50` ("btrfs: split zone append bios in btrfs_submit_bio") CC: stable@vger.kernel.org # 6.12 Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2024-12-17 19:54:32 +01:00
Christoph Hellwig	6c3864e055	btrfs: use bio_is_zone_append() in the completion handler Otherwise it won't catch bios turned into regular writes by the block level zone write plugging. The additional test it adds is for emulated zone append. Fixes: `9b1ce7f0c6` ("block: Implement zone append emulation") CC: stable@vger.kernel.org # 6.12 Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2024-12-17 19:54:32 +01:00
Josef Bacik	d75d72a858	btrfs: fix improper generation check in snapshot delete We have been using the following check if (generation <= root->root_key.offset) to make decisions about whether or not to visit a node during snapshot delete. This is because for normal subvolumes this is set to 0, and for snapshots it's set to the creation generation. The idea being that if the generation of the node is less than or equal to our creation generation then we don't need to visit that node, because it doesn't belong to us, we can simply drop our reference and move on. However reloc roots don't have their generation stored in root->root_key.offset, instead that is the objectid of their corresponding fs root. This means we can incorrectly not walk into nodes that need to be dropped when deleting a reloc root. There are a variety of consequences to making the wrong choice in two distinct areas. visit_node_for_delete() 1. False positive. We think we are newer than the block when we really aren't. We don't visit the node and drop our reference to the node and carry on. This would result in leaked space. 2. False negative. We do decide to walk down into a block that we should have just dropped our reference to. However this means that the child node will have refs > 1, so we will switch to UPDATE_BACKREF, and then the subsequent walk_down_proc() will notice that btrfs_header_owner(node) != root->root_key.objectid and it'll break out of the loop, and then walk_up_proc() will drop our reference, so this appears to be ok. do_walk_down() 1. False positive. We are in UPDATE_BACKREF and incorrectly decide that we are done and don't need to update the backref for our lower nodes. This is another case that simply won't happen with relocation, as we only have to do UPDATE_BACKREF if the node below us was shared and didn't have FULL_BACKREF set, and since we don't own that node because we're a reloc root we actually won't end up in this case. 2. False negative. Again this is tricky because as described above, we simply wouldn't be here from relocation, because we don't own any of the nodes because we never set btrfs_header_owner() to the reloc root objectid, and we always use FULL_BACKREF, we never actually need to set FULL_BACKREF on any children. Having spent a lot of time stressing relocation/snapshot delete recently I've not seen this pop in practice. But this is objectively incorrect, so fix this to get the correct starting generation based on the root we're dropping to keep me from thinking there's a problem here. CC: stable@vger.kernel.org Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>	2024-12-17 19:54:32 +01:00
Arnd Bergmann	2182e0f200	drm: rework FB_CORE dependency The 'select FB_CORE' statement moved from CONFIG_DRM to DRM_CLIENT_LIB, but there are now configurations that have code calling into fb_core as built-in even though the client_lib itself is a loadable module: x86_64-linux-ld: drivers/gpu/drm/drm_fb_helper.o: in function `drm_fb_helper_set_suspend': drm_fb_helper.c:(.text+0x2c6): undefined reference to `fb_set_suspend' x86_64-linux-ld: drivers/gpu/drm/drm_fb_helper.o: in function `drm_fb_helper_resume_worker': drm_fb_helper.c:(.text+0x2e1): undefined reference to `fb_set_suspend' In addition to DRM_CLIENT_LIB, the 'select' needs to be at least in DRM_KMS_HELPER and DRM_GEM_SHMEM_HELPER, so add it here. This patch is the KMS_HELPER part of [1]. Fixes: `dadd28d414` ("drm/client: Add client-lib module") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/series/141411/ # 1 Link: https://patchwork.freedesktop.org/patch/msgid/20241216074450.8590-4-tzimmermann@suse.de Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>	2024-12-17 18:28:43 +01:00
Linus Torvalds	5529876063	Merge tag 'ftrace-v6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull ftrace fixes from Steven Rostedt: - Always try to initialize the idle functions when graph tracer starts A bug was found that when a CPU is offline when graph tracing starts and then comes online, that CPU is not traced. The fix to that was to move the initialization of the idle shadow stack over to the hot plug online logic, which also handle onlined CPUs. The issue was that it removed the initialization of the shadow stack when graph tracing starts, but the callbacks to the hot plug logic do nothing if graph tracing isn't currently running. Although that fix fixed the onlining of a CPU during tracing, it broke the CPUs that were already online. - Have microblaze not try to get the "true parent" in function tracing If function tracing and graph tracing are both enabled at the same time the parent of the functions traced by the function tracer may sometimes be the graph tracing trampoline. The graph tracing hijacks the return pointer of the function to trace it, but that can interfere with the function tracing parent output. This was fixed by using the ftrace_graph_ret_addr() function passing in the kernel stack pointer using the ftrace_regs_get_stack_pointer() function. But Al Viro reported that Microblaze does not implement the kernel_stack_pointer(regs) helper function that ftrace_regs_get_stack_pointer() uses and fails to compile when function graph tracing is enabled. It was first thought that this was a microblaze issue, but the real cause is that this only works when an architecture implements HAVE_DYNAMIC_FTRACE_WITH_ARGS, as a requirement for that config is to have ftrace always pass a valid ftrace_regs to the callbacks. That also means that the architecture supports ftrace_regs_get_stack_pointer() Microblaze does not set HAVE_DYNAMIC_FTRACE_WITH_ARGS nor does it implement ftrace_regs_get_stack_pointer() which caused it to fail to build. Only implement the "true parent" logic if an architecture has that config set" * tag 'ftrace-v6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: ftrace: Do not find "true_parent" if HAVE_DYNAMIC_FTRACE_WITH_ARGS is not set fgraph: Still initialize idle shadow stacks when starting	2024-12-17 09:14:31 -08:00
Linus Torvalds	a241d7f0d3	Merge tag 's390-6.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Alexander Gordeev: - Fix DirectMap accounting in /proc/meminfo file - Fix strscpy() return code handling that led to "unsigned 'len' is never less than zero" warning - Fix the calculation determining whether to use three- or four-level paging: account KMSAN modules metadata * tag 's390-6.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/mm: Consider KMSAN modules metadata for paging levels s390/ipl: Fix never less than zero warning s390/mm: Fix DirectMap accounting	2024-12-17 09:09:32 -08:00
Thomas Zimmermann	8ce35bf0ef	drm/fbdev: Select FB_CORE dependency for fbdev on DMA and TTM Select FB_CORE if GEM's DMA and TTM implementations support fbdev emulation. Fixes linker errors about missing symbols from the fbdev subsystem. Also see [1] for a related SHMEM fix. Fixes: `dadd28d414` ("drm/client: Add client-lib module") Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/series/141411/ # 1 Reviewed-by: Arnd Bergmann <arnd@arndb.de> Link: https://patchwork.freedesktop.org/patch/msgid/20241216074450.8590-3-tzimmermann@suse.de	2024-12-17 18:06:23 +01:00
Thomas Zimmermann	8fc38062be	fbdev: Fix recursive dependencies wrt BACKLIGHT_CLASS_DEVICE Do not select BACKLIGHT_CLASS_DEVICE from FB_BACKLIGHT. The latter only controls backlight support within fbdev core code and data structures. Make fbdev drivers depend on BACKLIGHT_CLASS_DEVICE and let users select it explicitly. Fixes warnings about recursive dependencies, such as error: recursive dependency detected! symbol BACKLIGHT_CLASS_DEVICE is selected by FB_BACKLIGHT symbol FB_BACKLIGHT is selected by FB_SH_MOBILE_LCDC symbol FB_SH_MOBILE_LCDC depends on FB_DEVICE symbol FB_DEVICE depends on FB_CORE symbol FB_CORE is selected by DRM_GEM_DMA_HELPER symbol DRM_GEM_DMA_HELPER is selected by DRM_PANEL_ILITEK_ILI9341 symbol DRM_PANEL_ILITEK_ILI9341 depends on BACKLIGHT_CLASS_DEVICE BACKLIGHT_CLASS_DEVICE is user-selectable, so making drivers adapt to it is the correct approach in any case. For most drivers, backlight support is also configurable separately. v3: - Select BACKLIGHT_CLASS_DEVICE in PowerMac defconfigs (Christophe) - Fix PMAC_BACKLIGHT module dependency corner cases (Christophe) v2: - s/BACKLIGHT_DEVICE_CLASS/BACKLIGHT_CLASS_DEVICE (Helge) - Fix fbdev driver-dependency corner case (Arnd) Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Link: https://patchwork.freedesktop.org/patch/msgid/20241216074450.8590-2-tzimmermann@suse.de	2024-12-17 18:06:10 +01:00
Linus Torvalds	ed90ed56e4	Merge tag 'erofs-for-6.13-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs Pull erofs fixes from Gao Xiang: "The first one fixes a syzbot UAF report caused by a commit introduced in this cycle, but it also addresses a longstanding memory leak. The second one resolves a PSI memstall mis-accounting issue. The remaining patches switch file-backed mounts to use buffered I/Os by default instead of direct I/Os, since the page cache of underlay files is typically valid and maybe even dirty. This change also aligns with the default policy of loopback devices. A mount option has been added to try to use direct I/Os explicitly. Summary: - Fix (pcluster) memory leak and (sbi) UAF after umounting - Fix a case of PSI memstall mis-accounting - Use buffered I/Os by default for file-backed mounts" * tag 'erofs-for-6.13-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: erofs: use buffered I/O for file-backed mounts by default erofs: reference `struct erofs_device_info` for erofs_map_dev erofs: use `struct erofs_device_info` for the primary device erofs: add erofs_sb_free() helper MAINTAINERS: erofs: update Yue Hu's email address erofs: fix PSI memstall accounting erofs: fix rare pcluster memory leak after unmounting	2024-12-17 09:04:42 -08:00
John Stultz	4a07791457	locking/rtmutex: Make sure we wake anything on the wake_q when we release the lock->wait_lock Bert reported seeing occasional boot hangs when running with PREEPT_RT and bisected it down to commit `894d1b3db4` ("locking/mutex: Remove wakeups from under mutex::wait_lock"). It looks like I missed a few spots where we drop the wait_lock and potentially call into schedule without waking up the tasks on the wake_q structure. Since the tasks being woken are ww_mutex tasks they need to be able to run to release the mutex and unblock the task that currently is planning to wake them. Thus we can deadlock. So make sure we wake the wake_q tasks when we unlock the wait_lock. Closes: https://lore.kernel.org/lkml/20241211182502.2915-1-spasswolf@web.de Fixes: `894d1b3db4` ("locking/mutex: Remove wakeups from under mutex::wait_lock") Reported-by: Bert Karwatzki <spasswolf@web.de> Signed-off-by: John Stultz <jstultz@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20241212222138.2400498-1-jstultz@google.com	2024-12-17 17:47:24 +01:00
Kan Liang	b8c3a2502a	perf/x86/intel/ds: Add PEBS format 6 The only difference between 5 and 6 is the new counters snapshotting group, without the following counters snapshotting enabling patches, it's impossible to utilize the feature in a PEBS record. It's safe to share the same code path with format 5. Add format 6, so the end user can at least utilize the legacy PEBS features. Fixes: `a932aa0e86` ("perf/x86: Add Lunar Lake and Arrow Lake support") Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20241216204505.748363-1-kan.liang@linux.intel.com	2024-12-17 17:47:23 +01:00
Kan Liang	b6ccddd6fe	perf/x86/intel/uncore: Add Clearwater Forest support From the perspective of the uncore PMU, the Clearwater Forest is the same as the previous Sierra Forest. The only difference is the event list, which will be supported in the perf tool later. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20241211161146.235253-1-kan.liang@linux.intel.com	2024-12-17 17:47:23 +01:00
Linus Torvalds	1f13c38a85	Merge tag 'hardening-v6.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening fix from Kees Cook: "Silence a GCC value-range warning that is being ironically triggered by bounds checking" * tag 'hardening-v6.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: fortify: Hide run-time copy size from value range tracking	2024-12-17 08:45:40 -08:00
Steven Rostedt	afd2627f72	tracing: Check "%s" dereference via the field and not the TP_printk format The TP_printk() portion of a trace event is executed at the time a event is read from the trace. This can happen seconds, minutes, hours, days, months, years possibly later since the event was recorded. If the print format contains a dereference to a string via "%s", and that string was allocated, there's a chance that string could be freed before it is read by the trace file. To protect against such bugs, there are two functions that verify the event. The first one is test_event_printk(), which is called when the event is created. It reads the TP_printk() format as well as its arguments to make sure nothing may be dereferencing a pointer that was not copied into the ring buffer along with the event. If it is, it will trigger a WARN_ON(). For strings that use "%s", it is not so easy. The string may not reside in the ring buffer but may still be valid. Strings that are static and part of the kernel proper which will not be freed for the life of the running system, are safe to dereference. But to know if it is a pointer to a static string or to something on the heap can not be determined until the event is triggered. This brings us to the second function that tests for the bad dereferencing of strings, trace_check_vprintf(). It would walk through the printf format looking for "%s", and when it finds it, it would validate that the pointer is safe to read. If not, it would produces a WARN_ON() as well and write into the ring buffer "[UNSAFE-MEMORY]". The problem with this is how it used va_list to have vsnprintf() handle all the cases that it didn't need to check. Instead of re-implementing vsnprintf(), it would make a copy of the format up to the %s part, and call vsnprintf() with the current va_list ap variable, where the ap would then be ready to point at the string in question. For architectures that passed va_list by reference this was possible. For architectures that passed it by copy it was not. A test_can_verify() function was used to differentiate between the two, and if it wasn't possible, it would disable it. Even for architectures where this was feasible, it was a stretch to rely on such a method that is undocumented, and could cause issues later on with new optimizations of the compiler. Instead, the first function test_event_printk() was updated to look at "%s" as well. If the "%s" argument is a pointer outside the event in the ring buffer, it would find the field type of the event that is the problem and mark the structure with a new flag called "needs_test". The event itself will be marked by TRACE_EVENT_FL_TEST_STR to let it be known that this event has a field that needs to be verified before the event can be printed using the printf format. When the event fields are created from the field type structure, the fields would copy the field type's "needs_test" value. Finally, before being printed, a new function ignore_event() is called which will check if the event has the TEST_STR flag set (if not, it returns false). If the flag is set, it then iterates through the events fields looking for the ones that have the "needs_test" flag set. Then it uses the offset field from the field structure to find the pointer in the ring buffer event. It runs the tests to make sure that pointer is safe to print and if not, it triggers the WARN_ON() and also adds to the trace output that the event in question has an unsafe memory access. The ignore_event() makes the trace_check_vprintf() obsolete so it is removed. Link: https://lore.kernel.org/all/CAHk-=wh3uOnqnZPpR0PeLZZtyWbZLboZ7cHLCKRWsocvs9Y7hQ@mail.gmail.com/ Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/20241217024720.848621576@goodmis.org Fixes: `5013f454a3` ("tracing: Add check of trace event print fmts for dereferencing pointers") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2024-12-17 11:40:11 -05:00
Steven Rostedt	65a25d9f7a	tracing: Add "%s" check in test_event_printk() The test_event_printk() code makes sure that when a trace event is registered, any dereferenced pointers in from the event's TP_printk() are pointing to content in the ring buffer. But currently it does not handle "%s", as there's cases where the string pointer saved in the ring buffer points to a static string in the kernel that will never be freed. As that is a valid case, the pointer needs to be checked at runtime. Currently the runtime check is done via trace_check_vprintf(), but to not have to replicate everything in vsnprintf() it does some logic with the va_list that may not be reliable across architectures. In order to get rid of that logic, more work in the test_event_printk() needs to be done. Some of the strings can be validated at this time when it is obvious the string is valid because the string will be saved in the ring buffer content. Do all the validation of strings in the ring buffer at boot in test_event_printk(), and make sure that the field of the strings that point into the kernel are accessible. This will allow adding checks at runtime that will validate the fields themselves and not rely on paring the TP_printk() format at runtime. Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/20241217024720.685917008@goodmis.org Fixes: `5013f454a3` ("tracing: Add check of trace event print fmts for dereferencing pointers") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2024-12-17 11:40:11 -05:00
Steven Rostedt	917110481f	tracing: Add missing helper functions in event pointer dereference check The process_pointer() helper function looks to see if various trace event macros are used. These macros are for storing data in the event. This makes it safe to dereference as the dereference will then point into the event on the ring buffer where the content of the data stays with the event itself. A few helper functions were missing. Those were: __get_rel_dynamic_array() __get_dynamic_array_len() __get_rel_dynamic_array_len() __get_rel_sockaddr() Also add a helper function find_print_string() to not need to use a middle man variable to test if the string exists. Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/20241217024720.521836792@goodmis.org Fixes: `5013f454a3` ("tracing: Add check of trace event print fmts for dereferencing pointers") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2024-12-17 11:40:11 -05:00
Steven Rostedt	a6629626c5	tracing: Fix test_event_printk() to process entire print argument The test_event_printk() analyzes print formats of trace events looking for cases where it may dereference a pointer that is not in the ring buffer which can possibly be a bug when the trace event is read from the ring buffer and the content of that pointer no longer exists. The function needs to accurately go from one print format argument to the next. It handles quotes and parenthesis that may be included in an argument. When it finds the start of the next argument, it uses a simple "c = strstr(fmt + i, ',')" to find the end of that argument! In order to include "%s" dereferencing, it needs to process the entire content of the print format argument and not just the content of the first ',' it finds. As there may be content like: ({ const char saved_ptr = trace_seq_buffer_ptr(p); static const char access_str[] = { "---", "--x", "w--", "w-x", "-u-", "-ux", "wu-", "wux" }; union kvm_mmu_page_role role; role.word = REC->role; trace_seq_printf(p, "sp gen %u gfn %llx l%u %u-byte q%u%s %s%s" " %snxe %sad root %u %s%c", REC->mmu_valid_gen, REC->gfn, role.level, role.has_4_byte_gpte ? 4 : 8, role.quadrant, role.direct ? " direct" : "", access_str[role.access], role.invalid ? " invalid" : "", role.efer_nx ? "" : "!", role.ad_disabled ? "!" : "", REC->root_count, REC->unsync ? "unsync" : "sync", 0); saved_ptr; }) Which is an example of a full argument of an existing event. As the code already handles finding the next print format argument, process the argument at the end of it and not the start of it. This way it has both the start of the argument as well as the end of it. Add a helper function "process_pointer()" that will do the processing during the loop as well as at the end. It also makes the code cleaner and easier to read. Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/20241217024720.362271189@goodmis.org Fixes: `5013f454a3` ("tracing: Add check of trace event print fmts for dereferencing pointers") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2024-12-17 11:40:11 -05:00
Linus Torvalds	59dbb9d81a	Merge tag 'xsa465+xsa466-6.13-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: "Fix xen netfront crash (XSA-465) and avoid using the hypercall page that doesn't do speculation mitigations (XSA-466)" * tag 'xsa465+xsa466-6.13-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: x86/xen: remove hypercall page x86/xen: use new hypercall functions instead of hypercall page x86/xen: add central hypercall functions x86/xen: don't do PV iret hypercall through hypercall page x86/static-call: provide a way to do very early static-call updates objtool/x86: allow syscall instruction x86: make get_cpu_vendor() accessible from Xen code xen/netfront: fix crash when removing device	2024-12-17 08:29:58 -08:00
Joe Hattori	185e1b1d91	platform/x86: mlx-platform: call pci_dev_put() to balance the refcount mlxplat_pci_fpga_device_init() calls pci_get_device() but does not release the refcount on error path. Call pci_dev_put() on the error path and in mlxplat_pci_fpga_device_exit() to fix this. This bug was found by an experimental static analysis tool that I am developing. Fixes: `02daa222fb` ("platform: mellanox: Add initial support for PCIe based programming logic device") Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp> Reviewed-by: Vadim Pasternak <vadimp@nvidia.com> Link: https://lore.kernel.org/r/20241216022538.381209-1-joe@pf.is.s.u-tokyo.ac.jp Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>	2024-12-17 18:21:45 +02:00
Zhang Kunbo	bedb4e6088	fs/nfs: fix missing declaration of nfs_idmap_cache_timeout fs/nfs/super.c should include fs/nfs/nfs4idmap.h for declaration of nfs_idmap_cache_timeout. This fixes the sparse warning: fs/nfs/super.c:1397:14: warning: symbol 'nfs_idmap_cache_timeout' was not declared. Should it be static? Signed-off-by: Zhang Kunbo <zhangkunbo@huawei.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2024-12-17 11:14:20 -05:00
Trond Myklebust	62e2a47cea	NFS/pnfs: Fix a live lock between recalled layouts and layoutget When the server is recalling a layout, we should ignore the count of outstanding layoutget calls, since the server is expected to return either NFS4ERR_RECALLCONFLICT or NFS4ERR_RETURNCONFLICT for as long as the recall is outstanding. Currently, we may end up livelocking, causing the layout to eventually be forcibly revoked. Fixes: `bf0291dd22` ("pNFS: Ensure LAYOUTGET and LAYOUTRETURN are properly serialised") Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2024-12-17 11:10:55 -05:00
Jens Axboe	020b40f356	io_uring: make ctx->timeout_lock a raw spinlock Chase reports that their tester complaints about a locking context mismatch: ============================= [ BUG: Invalid wait context ] 6.13.0-rc1-gf137f14b7ccb-dirty #9 Not tainted ----------------------------- syz.1.25198/182604 is trying to lock: ffff88805e66a358 (&ctx->timeout_lock){-.-.}-{3:3}, at: spin_lock_irq include/linux/spinlock.h:376 [inline] ffff88805e66a358 (&ctx->timeout_lock){-.-.}-{3:3}, at: io_match_task_safe io_uring/io_uring.c:218 [inline] ffff88805e66a358 (&ctx->timeout_lock){-.-.}-{3:3}, at: io_match_task_safe+0x187/0x250 io_uring/io_uring.c:204 other info that might help us debug this: context-{5:5} 1 lock held by syz.1.25198/182604: #0: ffff88802b7d48c0 (&acct->lock){+.+.}-{2:2}, at: io_acct_cancel_pending_work+0x2d/0x6b0 io_uring/io-wq.c:1049 stack backtrace: CPU: 0 UID: 0 PID: 182604 Comm: syz.1.25198 Not tainted 6.13.0-rc1-gf137f14b7ccb-dirty #9 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x82/0xd0 lib/dump_stack.c:120 print_lock_invalid_wait_context kernel/locking/lockdep.c:4826 [inline] check_wait_context kernel/locking/lockdep.c:4898 [inline] __lock_acquire+0x883/0x3c80 kernel/locking/lockdep.c:5176 lock_acquire.part.0+0x11b/0x370 kernel/locking/lockdep.c:5849 __raw_spin_lock_irq include/linux/spinlock_api_smp.h:119 [inline] _raw_spin_lock_irq+0x36/0x50 kernel/locking/spinlock.c:170 spin_lock_irq include/linux/spinlock.h:376 [inline] io_match_task_safe io_uring/io_uring.c:218 [inline] io_match_task_safe+0x187/0x250 io_uring/io_uring.c:204 io_acct_cancel_pending_work+0xb8/0x6b0 io_uring/io-wq.c:1052 io_wq_cancel_pending_work io_uring/io-wq.c:1074 [inline] io_wq_cancel_cb+0xb0/0x390 io_uring/io-wq.c:1112 io_uring_try_cancel_requests+0x15e/0xd70 io_uring/io_uring.c:3062 io_uring_cancel_generic+0x6ec/0x8c0 io_uring/io_uring.c:3140 io_uring_files_cancel include/linux/io_uring.h:20 [inline] do_exit+0x494/0x27a0 kernel/exit.c:894 do_group_exit+0xb3/0x250 kernel/exit.c:1087 get_signal+0x1d77/0x1ef0 kernel/signal.c:3017 arch_do_signal_or_restart+0x79/0x5b0 arch/x86/kernel/signal.c:337 exit_to_user_mode_loop kernel/entry/common.c:111 [inline] exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] syscall_exit_to_user_mode+0x150/0x2a0 kernel/entry/common.c:218 do_syscall_64+0xd8/0x250 arch/x86/entry/common.c:89 entry_SYSCALL_64_after_hwframe+0x77/0x7f which is because io_uring has ctx->timeout_lock nesting inside the io-wq acct lock, the latter of which is used from inside the scheduler and hence is a raw spinlock, while the former is a "normal" spinlock and can hence be sleeping on PREEMPT_RT. Change ctx->timeout_lock to be a raw spinlock to solve this nesting dependency on PREEMPT_RT=y. Reported-by: chase xd <sl1589472800@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-17 08:21:46 -07:00
Yang Erkun	69d803c40e	nfsd: Revert "nfsd: release svc_expkey/svc_export with rcu_work" This reverts commit `f8c989a0c8`. Before this commit, svc_export_put or expkey_put will call path_put with sync mode. After this commit, path_put will be called with async mode. And this can lead the unexpected results show as follow. mkfs.xfs -f /dev/sda echo "/ (rw,no_root_squash,fsid=0)" > /etc/exports echo "/mnt (rw,no_root_squash,fsid=1)" >> /etc/exports exportfs -ra service nfs-server start mount -t nfs -o vers=4.0 127.0.0.1:/mnt /mnt1 mount /dev/sda /mnt/sda touch /mnt1/sda/file exportfs -r umount /mnt/sda # failed unexcepted The touch will finally call nfsd_cross_mnt, add refcount to mount, and then add cache_head. Before this commit, exportfs -r will call cache_flush to cleanup all cache_head, and path_put in svc_export_put/expkey_put will be finished with sync mode. So, the latter umount will always success. However, after this commit, path_put will be called with async mode, the latter umount may failed, and if we add some delay, umount will success too. Personally I think this bug and should be fixed. We first revert before bugfix patch, and then fix the original bug with a different way. Fixes: `f8c989a0c8` ("nfsd: release svc_expkey/svc_export with rcu_work") Signed-off-by: Yang Erkun <yangerkun@huawei.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-12-17 09:45:23 -05:00
Evgenii Shatokhin	a37eecb705	pinctrl: mcp23s08: Fix sleeping in atomic context due to regmap locking If a device uses MCP23xxx IO expander to receive IRQs, the following bug can happen: BUG: sleeping function called from invalid context at kernel/locking/mutex.c:283 in_atomic(): 1, irqs_disabled(): 1, non_block: 0, ... preempt_count: 1, expected: 0 ... Call Trace: ... __might_resched+0x104/0x10e __might_sleep+0x3e/0x62 mutex_lock+0x20/0x4c regmap_lock_mutex+0x10/0x18 regmap_update_bits_base+0x2c/0x66 mcp23s08_irq_set_type+0x1ae/0x1d6 __irq_set_trigger+0x56/0x172 __setup_irq+0x1e6/0x646 request_threaded_irq+0xb6/0x160 ... We observed the problem while experimenting with a touchscreen driver which used MCP23017 IO expander (I2C). The regmap in the pinctrl-mcp23s08 driver uses a mutex for protection from concurrent accesses, which is the default for regmaps without .fast_io, .disable_locking, etc. mcp23s08_irq_set_type() calls regmap_update_bits_base(), and the latter locks the mutex. However, __setup_irq() locks desc->lock spinlock before calling these functions. As a result, the system tries to lock the mutex whole holding the spinlock. It seems, the internal regmap locks are not needed in this driver at all. mcp->lock seems to protect the regmap from concurrent accesses already, except, probably, in mcp_pinconf_get/set. mcp23s08_irq_set_type() and mcp23s08_irq_mask/unmask() are called under chip_bus_lock(), which calls mcp23s08_irq_bus_lock(). The latter takes mcp->lock and enables regmap caching, so that the potentially slow I2C accesses are deferred until chip_bus_unlock(). The accesses to the regmap from mcp23s08_probe_one() do not need additional locking. In all remaining places where the regmap is accessed, except mcp_pinconf_get/set(), the driver already takes mcp->lock. This patch adds locking in mcp_pinconf_get/set() and disables internal locking in the regmap config. Among other things, it fixes the sleeping in atomic context described above. Fixes: `8f38910ba4` ("pinctrl: mcp23s08: switch to regmap caching") Cc: stable@vger.kernel.org Signed-off-by: Evgenii Shatokhin <e.shatokhin@yadro.com> Link: https://lore.kernel.org/20241209074659.1442898-1-e.shatokhin@yadro.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>	2024-12-17 15:10:32 +01:00
Gianfranco Trad	7ed2d91588	qed: fix possible uninit pointer read in qed_mcp_nvm_info_populate() Coverity reports an uninit pointer read in qed_mcp_nvm_info_populate(). If EOPNOTSUPP is returned from qed_mcp_bist_nvm_get_num_images() ensure nvm_info.num_images is set to 0 to avoid possible uninit assignment to p_hwfn->nvm_info.image_att later on in out label. Closes: https://scan5.scan.coverity.com/#/project-view/63204/10063?selectedIssue=1636666 Suggested-by: Simon Horman <horms@kernel.org> Signed-off-by: Gianfranco Trad <gianf.trad@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241215011733.351325-2-gianf.trad@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-12-17 15:08:52 +01:00
Peter Ujfalusi	e8d0ba147d	ASoC: SOF: Intel: hda-dai: Do not release the link DMA on STOP The linkDMA should not be released on stop trigger since a stream re-start might happen without closing of the stream. This leaves a short time for other streams to 'steal' the linkDMA since it has been released. This issue is not easy to reproduce under normal conditions as usually after stop the stream is closed, or the same stream is restarted, but if another stream got in between the stop and start, like this: aplay -Dhw:0,3 -c2 -r48000 -fS32_LE /dev/zero -d 120 CTRL+z aplay -Dhw:0,0 -c2 -r48000 -fS32_LE /dev/zero -d 120 then the link DMA channels will be mixed up, resulting firmware error or crash. Fixes: `ab5593793e` ("ASoC: SOF: Intel: hda: Always clean up link DMA during stop") Cc: stable@vger.kernel.org Closes: https://github.com/thesofproject/sof/issues/9695 Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Reviewed-by: Liam Girdwood <liam.r.girdwood@intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://patch.msgid.link/20241217091019.31798-1-peter.ujfalusi@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-12-17 13:21:10 +00:00
Zhang Kunbo	2b2fc0be98	fs: fix missing declaration of init_files fs/file.c should include include/linux/init_task.h for declaration of init_files. This fixes the sparse warning: fs/file.c:501:21: warning: symbol 'init_files' was not declared. Should it be static? Signed-off-by: Zhang Kunbo <zhangkunbo@huawei.com> Link: https://lore.kernel.org/r/20241217071836.2634868-1-zhangkunbo@huawei.com Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-12-17 13:38:46 +01:00
Joe Hattori	0cb2c504d7	net: ethernet: bgmac-platform: fix an OF node reference leak The OF node obtained by of_parse_phandle() is not freed. Call of_node_put() to balance the refcount. This bug was found by an experimental static analysis tool that I am developing. Fixes: `1676aba5ef` ("net: ethernet: bgmac: device tree phy enablement") Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241214014912.2810315-1-joe@pf.is.s.u-tokyo.ac.jp Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-12-17 13:22:05 +01:00
Paolo Abeni	90d130aadc	Merge branch 'fixes-on-the-open-alliance-tc6-10base-t1x-mac-phy-support-generic-lib' Parthiban Veerasooran says: ==================== Fixes on the OPEN Alliance TC6 10BASE-T1x MAC-PHY support generic lib This patch series contain the below fixes. - Infinite loop error when tx credits becomes 0. - Race condition between tx skb reference pointers. v2: - Added mutex lock to protect tx skb reference handling. v3: - Added mutex protection in assigning new tx skb to waiting_tx_skb pointer. - Explained the possible scenario for the race condition with the time diagram in the commit message. v4: - Replaced mutex with spin_lock_bh() variants as the start_xmit runs in BH/softirq context which can't take sleeping locks. ==================== Link: https://patch.msgid.link/20241213123159.439739-1-parthiban.veerasooran@microchip.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-12-17 13:11:42 +01:00
Parthiban Veerasooran	e592b5110b	net: ethernet: oa_tc6: fix tx skb race condition between reference pointers There are two skb pointers to manage tx skb's enqueued from n/w stack. waiting_tx_skb pointer points to the tx skb which needs to be processed and ongoing_tx_skb pointer points to the tx skb which is being processed. SPI thread prepares the tx data chunks from the tx skb pointed by the ongoing_tx_skb pointer. When the tx skb pointed by the ongoing_tx_skb is processed, the tx skb pointed by the waiting_tx_skb is assigned to ongoing_tx_skb and the waiting_tx_skb pointer is assigned with NULL. Whenever there is a new tx skb from n/w stack, it will be assigned to waiting_tx_skb pointer if it is NULL. Enqueuing and processing of a tx skb handled in two different threads. Consider a scenario where the SPI thread processed an ongoing_tx_skb and it moves next tx skb from waiting_tx_skb pointer to ongoing_tx_skb pointer without doing any NULL check. At this time, if the waiting_tx_skb pointer is NULL then ongoing_tx_skb pointer is also assigned with NULL. After that, if a new tx skb is assigned to waiting_tx_skb pointer by the n/w stack and there is a chance to overwrite the tx skb pointer with NULL in the SPI thread. Finally one of the tx skb will be left as unhandled, resulting packet missing and memory leak. - Consider the below scenario where the TXC reported from the previous transfer is 10 and ongoing_tx_skb holds an tx ethernet frame which can be transported in 20 TXCs and waiting_tx_skb is still NULL. tx_credits = 10; /* 21 are filled in the previous transfer / ongoing_tx_skb = 20; waiting_tx_skb = NULL; / Still NULL / - So, (tc6->ongoing_tx_skb \|\| tc6->waiting_tx_skb) becomes true. - After oa_tc6_prepare_spi_tx_buf_for_tx_skbs() ongoing_tx_skb = 10; waiting_tx_skb = NULL; / Still NULL */ - Perform SPI transfer. - Process SPI rx buffer to get the TXC from footers. - Now let's assume previously filled 21 TXCs are freed so we are good to transport the next remaining 10 tx chunks from ongoing_tx_skb. tx_credits = 21; ongoing_tx_skb = 10; waiting_tx_skb = NULL; - So, (tc6->ongoing_tx_skb \|\| tc6->waiting_tx_skb) becomes true again. - In the oa_tc6_prepare_spi_tx_buf_for_tx_skbs() ongoing_tx_skb = NULL; waiting_tx_skb = NULL; - Now the below bad case might happen, Thread1 (oa_tc6_start_xmit) Thread2 (oa_tc6_spi_thread_handler) --------------------------- ----------------------------------- - if waiting_tx_skb is NULL - if ongoing_tx_skb is NULL - ongoing_tx_skb = waiting_tx_skb - waiting_tx_skb = skb - waiting_tx_skb = NULL ... - ongoing_tx_skb = NULL - if waiting_tx_skb is NULL - waiting_tx_skb = skb To overcome the above issue, protect the moving of tx skb reference from waiting_tx_skb pointer to ongoing_tx_skb pointer and assigning new tx skb to waiting_tx_skb pointer, so that the other thread can't access the waiting_tx_skb pointer until the current thread completes moving the tx skb reference safely. Fixes: `53fbde8ab2` ("net: ethernet: oa_tc6: implement transmit path to transfer tx ethernet frames") Signed-off-by: Parthiban Veerasooran <parthiban.veerasooran@microchip.com> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-12-17 13:11:22 +01:00
Parthiban Veerasooran	7d2f320e12	net: ethernet: oa_tc6: fix infinite loop error when tx credits becomes 0 SPI thread wakes up to perform SPI transfer whenever there is an TX skb from n/w stack or interrupt from MAC-PHY. Ethernet frame from TX skb is transferred based on the availability tx credits in the MAC-PHY which is reported from the previous SPI transfer. Sometimes there is a possibility that TX skb is available to transmit but there is no tx credits from MAC-PHY. In this case, there will not be any SPI transfer but the thread will be running in an endless loop until tx credits available again. So checking the availability of tx credits along with TX skb will prevent the above infinite loop. When the tx credits available again that will be notified through interrupt which will trigger the SPI transfer to get the available tx credits. Fixes: `53fbde8ab2` ("net: ethernet: oa_tc6: implement transmit path to transfer tx ethernet frames") Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Parthiban Veerasooran <parthiban.veerasooran@microchip.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-12-17 13:11:22 +01:00
Bartosz Golaszewski	44c5aa73cc	interconnect: icc-clk: check return values of devm_kasprintf() devm_kasprintf() can fail and return NULL, add missing return value checks. Fixes: `0ac2a08f42` ("interconnect: add clk-based icc provider support") Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20241202165723.17292-1-brgl@bgdev.pl Signed-off-by: Georgi Djakov <djakov@kernel.org>	2024-12-17 14:03:34 +02:00
Georgi Djakov	00a973e093	interconnect: qcom: icc-rpm: Set the count member before accessing the flex array The following UBSAN error is reported during boot on the db410c board on a clang-19 build: Internal error: UBSAN: array index out of bounds: 00000000f2005512 [#1] PREEMPT SMP ... pc : qnoc_probe+0x5f8/0x5fc ... The cause of the error is that the counter member was not set before accessing the annotated flexible array member, but after that. Fix this by initializing it earlier. Reported-by: Linux Kernel Functional Testing <lkft@linaro.org> Closes: https://lore.kernel.org/r/CA+G9fYs+2mBz1y2dAzxkj9-oiBJ2Acm1Sf1h2YQ3VmBqj_VX2g@mail.gmail.com Fixes: `dd4904f3b9` ("interconnect: qcom: Annotate struct icc_onecell_data with __counted_by") Reviewed-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20241203223334.233404-1-djakov@kernel.org Signed-off-by: Georgi Djakov <djakov@kernel.org>	2024-12-17 14:03:02 +02:00
Yuezhang Mo	70465acbb0	exfat: fix exfat_find_empty_entry() not returning error on failure On failure, "dentry" is the error code. If the error code indicates that there is no space, a new cluster may need to be allocated; for other errors, it should be returned directly. Only on success, "dentry" is the index of the directory entry, and it needs to be converted into the directory entry index within the cluster where it is located. Fixes: `8a3f5711ad` ("exfat: reduce FAT chain traversal") Reported-by: syzbot+6f6c9397e0078ef60bce@syzkaller.appspotmail.com Tested-by: syzbot+6f6c9397e0078ef60bce@syzkaller.appspotmail.com Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>	2024-12-17 20:21:59 +09:00
Niklas Neronin	b9252f80b8	usb: xhci: fix ring expansion regression in 6.13-rc1 The source and destination rings were incorrectly assigned during the ring linking process. The "source" ring, which contains the new segments, was not spliced into the "destination" ring, leading to incorrect ring expansion. Fixes: `fe688e5006` ("usb: xhci: refactor xhci_link_rings() to use source and destination rings") Reported-by: Jeff Chua <jeff.chua.linux@gmail.com> Closes: https://lore.kernel.org/lkml/CAAJw_ZtppNqC9XA=-WVQDr+vaAS=di7jo15CzSqONeX48H75MA@mail.gmail.com/ Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241217102122.2316814-3-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-17 11:59:09 +01:00
Mathias Nyman	e21ebe51af	xhci: Turn NEC specific quirk for handling Stop Endpoint errors generic xHC hosts from several vendors have the same issue where endpoints start so slowly that a later queued 'Stop Endpoint' command may complete before endpoint is up and running. The 'Stop Endpoint' command fails with context state error as the endpoint still appears as stopped. See commit `42b7581376` ("usb: xhci: Limit Stop Endpoint retries") for details CC: stable@vger.kernel.org Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241217102122.2316814-2-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-17 11:59:09 +01:00
Umesh Nerlige Ramappa	1622ed27d2	i915/guc: Accumulate active runtime on gt reset On gt reset, if a context is running, then accumulate it's active time into the busyness counter since there will be no chance for the context to switch out and update it's run time. v2: Move comment right above the if (John) Fixes: `77cdd054dd` ("drm/i915/pmu: Connect engine busyness stats from GuC to pmu") Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241127174006.190128-4-umesh.nerlige.ramappa@intel.com (cherry picked from commit `7ed047da59`) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>	2024-12-17 10:15:15 +00:00
Umesh Nerlige Ramappa	59a0b46788	i915/guc: Ensure busyness counter increases motonically Active busyness of an engine is calculated using gt timestamp and the context switch in time. While capturing the gt timestamp, it's possible that the context switches out. This race could result in an active busyness value that is greater than the actual context runtime value by a small amount. This leads to a negative delta and throws off busyness calculations for the user. If a subsequent count is smaller than the previous one, just return the previous one, since we expect the busyness to catch up. Fixes: `77cdd054dd` ("drm/i915/pmu: Connect engine busyness stats from GuC to pmu") Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241127174006.190128-3-umesh.nerlige.ramappa@intel.com (cherry picked from commit `cf907f6d29`) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>	2024-12-17 10:15:10 +00:00
Umesh Nerlige Ramappa	abcc2ddae5	i915/guc: Reset engine utilization buffer before registration On GT reset, we store total busyness counts for all engines and re-register the utilization buffer with GuC. At that time we should reset the buffer, so that we don't get spurious busyness counts on subsequent queries. To repro this issue, run igt@perf_pmu@busy-hang followed by igt@perf_pmu@most-busy-idle-check-all for a couple iterations. Fixes: `77cdd054dd` ("drm/i915/pmu: Connect engine busyness stats from GuC to pmu") Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241127174006.190128-2-umesh.nerlige.ramappa@intel.com (cherry picked from commit `abd318237f`) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>	2024-12-17 10:15:03 +00:00
Mark Brown	212fbabe1d	KVM: arm64: Fix set_id_regs selftest for ASIDBITS becoming unwritable In commit `03c7527e97` ("KVM: arm64: Do not allow ID_AA64MMFR0_EL1.ASIDbits to be overridden") we made that bitfield in the ID registers unwritable however the change neglected to make the corresponding update to set_id_regs resulting in it failing: ok 56 ID_AA64MMFR0_EL1_BIGEND ==== Test Assertion Failure ==== aarch64/set_id_regs.c:434: masks[idx] & ftr_bits[j].mask == ftr_bits[j].mask pid=5566 tid=5566 errno=22 - Invalid argument 1 0x00000000004034a7: test_vm_ftr_id_regs at set_id_regs.c:434 2 0x0000000000401b53: main at set_id_regs.c:684 3 0x0000ffff8e6b7543: ?? ??:0 4 0x0000ffff8e6b7617: ?? ??:0 5 0x0000000000401e6f: _start at ??:? not ok 8 selftests: kvm: set_id_regs # exit=254 Remove ID_AA64MMFR1_EL1.ASIDBITS from the set of bitfields we test for writeability. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20241216-kvm-arm64-fix-set-id-asidbits-v1-1-8b105b888fc3@kernel.org Acked-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-12-17 01:14:56 -08:00
FUJITA Tomonori	94901b7a74	rust: net::phy fix module autoloading The alias symbol name was renamed. Adjust module_phy_driver macro to create the proper symbol name to fix module autoloading. Fixes: `054a9cd395` ("modpost: rename alias symbol for MODULE_DEVICE_TABLE()") Signed-off-by: FUJITA Tomonori <fujita.tomonori@gmail.com> Link: https://patch.msgid.link/20241212130015.238863-1-fujita.tomonori@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-12-17 09:20:07 +01:00
Juergen Gross	7fa0da5373	x86/xen: remove hypercall page The hypercall page is no longer needed. It can be removed, as from the Xen perspective it is optional. But, from Linux's perspective, it removes naked RET instructions that escape the speculative protections that Call Depth Tracking and/or Untrain Ret are trying to achieve. This is part of XSA-466 / CVE-2024-53241. Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>	2024-12-17 08:23:42 +01:00
Juergen Gross	b1c2cb86f4	x86/xen: use new hypercall functions instead of hypercall page Call the Xen hypervisor via the new xen_hypercall_func static-call instead of the hypercall page. This is part of XSA-466 / CVE-2024-53241. Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com> Co-developed-by: Peter Zijlstra <peterz@infradead.org> Co-developed-by: Josh Poimboeuf <jpoimboe@redhat.com>	2024-12-17 08:23:41 +01:00
Juergen Gross	b4845bb638	x86/xen: add central hypercall functions Add generic hypercall functions usable for all normal (i.e. not iret) hypercalls. Depending on the guest type and the processor vendor different functions need to be used due to the to be used instruction for entering the hypervisor: - PV guests need to use syscall - HVM/PVH guests on Intel need to use vmcall - HVM/PVH guests on AMD and Hygon need to use vmmcall As PVH guests need to issue hypercalls very early during boot, there is a 4th hypercall function needed for HVM/PVH which can be used on Intel and AMD processors. It will check the vendor type and then set the Intel or AMD specific function to use via static_call(). This is part of XSA-466 / CVE-2024-53241. Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com> Co-developed-by: Peter Zijlstra <peterz@infradead.org>	2024-12-17 08:23:29 +01:00
Dan Carpenter	7203d10e93	net: hinic: Fix cleanup in create_rxqs/txqs() There is a check for NULL at the start of create_txqs() and create_rxqs() which tess if "nic_dev->txqs" is non-NULL. The intention is that if the device is already open and the queues are already created then we don't create them a second time. However, the bug is that if we have an error in the create_txqs() then the pointer doesn't get set back to NULL. The NULL check at the start of the function will say that it's already open when it's not and the device can't be used. Set ->txqs back to NULL on cleanup on error. Fixes: `c3e79baf1b` ("net-next/hinic: Add logical Txq and Rxq") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/0cc98faf-a0ed-4565-a55b-0fa2734bc205@stanley.mountain Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-16 18:26:17 -08:00
Daniel Borkmann	e78c20f327	team: Fix feature exposure when no ports are present Small follow-up to align this to an equivalent behavior as the bond driver. The change in `3625920b62` ("teaming: fix vlan_features computing") removed the netdevice vlan_features when there is no team port attached, yet it leaves the full set of enc_features intact. Instead, leave the default features as pre `3625920b62`, and recompute once we do have ports attached. Also, similarly as in bonding case, call the netdev_base_features() helper on the enc_features. Fixes: `3625920b62` ("teaming: fix vlan_features computing") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20241213123657.401868-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-16 18:23:12 -08:00
Dan Carpenter	fbbd84af6b	chelsio/chtls: prevent potential integer overflow on 32bit The "gl->tot_len" variable is controlled by the user. It comes from process_responses(). On 32bit systems, the "gl->tot_len + sizeof(struct cpl_pass_accept_req) + sizeof(struct rss_header)" addition could have an integer wrapping bug. Use size_add() to prevent this. Fixes: `a089439478` ("crypto: chtls - Register chtls with net tls") Cc: stable@vger.kernel.org Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/c6bfb23c-2db2-4e1b-b8ab-ba3925c82ef5@stanley.mountain Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-16 18:08:11 -08:00
Jakub Kicinski	c8eb0c3ffd	Merge branch 'netdev-fix-repeated-netlink-messages-in-queue-dumps' Jakub Kicinski says: ==================== netdev: fix repeated netlink messages in queue dumps Fix dump continuation for queues and queue stats in the netdev family. Because we used post-increment when saving id of dumped queue next skb would re-dump the already dumped queue. ==================== Link: https://patch.msgid.link/20241213152244.3080955-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-16 17:30:14 -08:00
Jakub Kicinski	5712e323d4	selftests: net-drv: stats: sanity check netlink dumps Sanity check netlink dumps, to make sure dumps don't have repeated entries or gaps in IDs. Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/20241213152244.3080955-6-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-16 17:30:13 -08:00
Jakub Kicinski	1234810b16	selftests: net-drv: queues: sanity check netlink dumps This test already catches a netlink bug fixed by this series, but only when running on HW with many queues. Make sure the netdevsim instance created has a lot of queues, and constrain the size of the recv_buffer used by netlink. While at it test both rx and tx queues. Reviewed-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/20241213152244.3080955-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-16 17:30:13 -08:00
Jakub Kicinski	0518863407	selftests: net: support setting recv_size in YNL recv_size parameter allows constraining the buffer size for dumps. It's useful in testing kernel handling of dump continuation, IOW testing dumps which span multiple skbs. Let the tests set this parameter when initializing the YNL family. Keep the normal default, we don't want tests to unintentionally behave very differently than normal code. Reviewed-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/20241213152244.3080955-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-16 17:30:13 -08:00
Jakub Kicinski	ecc391a541	netdev: fix repeated netlink messages in queue stats The context is supposed to record the next queue to dump, not last dumped. If the dump doesn't fit we will restart from the already-dumped queue, duplicating the message. Before this fix and with the selftest improvements later in this series we see: # ./run_kselftest.sh -t drivers/net:stats.py timeout set to 45 selftests: drivers/net: stats.py KTAP version 1 1..5 ok 1 stats.check_pause ok 2 stats.check_fec ok 3 stats.pkt_byte_sum # Check\| At /root/ksft-net-drv/drivers/net/./stats.py, line 125, in qstat_by_ifindex: # Check\| ksft_eq(len(queues[qtype]), len(set(queues[qtype])), # Check failed 45 != 44 repeated queue keys # Check\| At /root/ksft-net-drv/drivers/net/./stats.py, line 127, in qstat_by_ifindex: # Check\| ksft_eq(len(queues[qtype]), max(queues[qtype]) + 1, # Check failed 45 != 44 missing queue keys # Check\| At /root/ksft-net-drv/drivers/net/./stats.py, line 125, in qstat_by_ifindex: # Check\| ksft_eq(len(queues[qtype]), len(set(queues[qtype])), # Check failed 45 != 44 repeated queue keys # Check\| At /root/ksft-net-drv/drivers/net/./stats.py, line 127, in qstat_by_ifindex: # Check\| ksft_eq(len(queues[qtype]), max(queues[qtype]) + 1, # Check failed 45 != 44 missing queue keys # Check\| At /root/ksft-net-drv/drivers/net/./stats.py, line 125, in qstat_by_ifindex: # Check\| ksft_eq(len(queues[qtype]), len(set(queues[qtype])), # Check failed 103 != 100 repeated queue keys # Check\| At /root/ksft-net-drv/drivers/net/./stats.py, line 127, in qstat_by_ifindex: # Check\| ksft_eq(len(queues[qtype]), max(queues[qtype]) + 1, # Check failed 103 != 100 missing queue keys # Check\| At /root/ksft-net-drv/drivers/net/./stats.py, line 125, in qstat_by_ifindex: # Check\| ksft_eq(len(queues[qtype]), len(set(queues[qtype])), # Check failed 102 != 100 repeated queue keys # Check\| At /root/ksft-net-drv/drivers/net/./stats.py, line 127, in qstat_by_ifindex: # Check\| ksft_eq(len(queues[qtype]), max(queues[qtype]) + 1, # Check failed 102 != 100 missing queue keys not ok 4 stats.qstat_by_ifindex ok 5 stats.check_down # Totals: pass:4 fail:1 xfail:0 xpass:0 skip:0 error:0 With the fix: # ./ksft-net-drv/run_kselftest.sh -t drivers/net:stats.py timeout set to 45 selftests: drivers/net: stats.py KTAP version 1 1..5 ok 1 stats.check_pause ok 2 stats.check_fec ok 3 stats.pkt_byte_sum ok 4 stats.qstat_by_ifindex ok 5 stats.check_down # Totals: pass:5 fail:0 xfail:0 xpass:0 skip:0 error:0 Fixes: `ab63a2387c` ("netdev: add per-queue statistics") Reviewed-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20241213152244.3080955-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-16 17:30:11 -08:00
Jakub Kicinski	b1f3a2f5a7	netdev: fix repeated netlink messages in queue dump The context is supposed to record the next queue to dump, not last dumped. If the dump doesn't fit we will restart from the already-dumped queue, duplicating the message. Before this fix and with the selftest improvements later in this series we see: # ./run_kselftest.sh -t drivers/net:queues.py timeout set to 45 selftests: drivers/net: queues.py KTAP version 1 1..2 # Check\| At /root/ksft-net-drv/drivers/net/./queues.py, line 32, in get_queues: # Check\| ksft_eq(queues, expected) # Check failed 102 != 100 # Check\| At /root/ksft-net-drv/drivers/net/./queues.py, line 32, in get_queues: # Check\| ksft_eq(queues, expected) # Check failed 101 != 100 not ok 1 queues.get_queues ok 2 queues.addremove_queues # Totals: pass:1 fail:1 xfail:0 xpass:0 skip:0 error:0 not ok 1 selftests: drivers/net: queues.py # exit=1 With the fix: # ./ksft-net-drv/run_kselftest.sh -t drivers/net:queues.py timeout set to 45 selftests: drivers/net: queues.py KTAP version 1 1..2 ok 1 queues.get_queues ok 2 queues.addremove_queues # Totals: pass:2 fail:0 xfail:0 xpass:0 skip:0 error:0 Fixes: `6b6171db7f` ("netdev-genl: Add netlink framework functions for queue") Reviewed-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20241213152244.3080955-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-16 17:30:07 -08:00
Thorsten Blum	0efbca0fec	nios2: Use str_yes_no() helper in show_cpuinfo() Remove hard-coded strings by using the str_yes_no() helper function. Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>	2024-12-16 18:41:58 -06:00
Kees Cook	239d87327d	fortify: Hide run-time copy size from value range tracking GCC performs value range tracking for variables as a way to provide better diagnostics. One place this is regularly seen is with warnings associated with bounds-checking, e.g. -Wstringop-overflow, -Wstringop-overread, -Warray-bounds, etc. In order to keep the signal-to-noise ratio high, warnings aren't emitted when a value range spans the entire value range representable by a given variable. For example: unsigned int len; char dst[8]; ... memcpy(dst, src, len); If len's value is unknown, it has the full "unsigned int" range of [0, UINT_MAX], and GCC's compile-time bounds checks against memcpy() will be ignored. However, when a code path has been able to narrow the range: if (len > 16) return; memcpy(dst, src, len); Then the range will be updated for the execution path. Above, len is now [0, 16] when reading memcpy(), so depending on other optimizations, we might see a -Wstringop-overflow warning like: error: '__builtin_memcpy' writing between 9 and 16 bytes into region of size 8 [-Werror=stringop-overflow] When building with CONFIG_FORTIFY_SOURCE, the fortified run-time bounds checking can appear to narrow value ranges of lengths for memcpy(), depending on how the compiler constructs the execution paths during optimization passes, due to the checks against the field sizes. For example: if (p_size_field != SIZE_MAX && p_size != p_size_field && p_size_field < size) As intentionally designed, these checks only affect the kernel warnings emitted at run-time and do not block the potentially overflowing memcpy(), so GCC thinks it needs to produce a warning about the resulting value range that might be reaching the memcpy(). We have seen this manifest a few times now, with the most recent being with cpumasks: In function ‘bitmap_copy’, inlined from ‘cpumask_copy’ at ./include/linux/cpumask.h:839:2, inlined from ‘__padata_set_cpumasks’ at kernel/padata.c:730:2: ./include/linux/fortify-string.h:114:33: error: ‘__builtin_memcpy’ reading between 257 and 536870904 bytes from a region of size 256 [-Werror=stringop-overread] 114 \| #define __underlying_memcpy __builtin_memcpy \| ^ ./include/linux/fortify-string.h:633:9: note: in expansion of macro ‘__underlying_memcpy’ 633 \| __underlying_##op(p, q, __fortify_size); \ \| ^~~~~~~~~~~~~ ./include/linux/fortify-string.h:678:26: note: in expansion of macro ‘__fortify_memcpy_chk’ 678 \| #define memcpy(p, q, s) __fortify_memcpy_chk(p, q, s, \ \| ^~~~~~~~~~~~~~~~~~~~ ./include/linux/bitmap.h:259:17: note: in expansion of macro ‘memcpy’ 259 \| memcpy(dst, src, len); \| ^~~~~~ kernel/padata.c: In function ‘__padata_set_cpumasks’: kernel/padata.c:713:48: note: source object ‘pcpumask’ of size [0, 256] 713 \| cpumask_var_t pcpumask, \| ~~~~~~~~~~~~~~^~~~~~~~ This warning is _not_ emitted when CONFIG_FORTIFY_SOURCE is disabled, and with the recent -fdiagnostics-details we can confirm the origin of the warning is due to FORTIFY's bounds checking: ../include/linux/bitmap.h:259:17: note: in expansion of macro 'memcpy' 259 \| memcpy(dst, src, len); \| ^~~~~~ '__padata_set_cpumasks': events 1-2 ../include/linux/fortify-string.h:613:36: 612 \| if (p_size_field != SIZE_MAX && \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~ 613 \| p_size != p_size_field && p_size_field < size) \| ~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~ \| \| \| (1) when the condition is evaluated to false \| (2) when the condition is evaluated to true '__padata_set_cpumasks': event 3 114 \| #define __underlying_memcpy __builtin_memcpy \| ^ \| \| \| (3) out of array bounds here Note that the cpumask warning started appearing since bitmap functions were recently marked __always_inline in commit `ed8cd2b3bd` ("bitmap: Switch from inline to __always_inline"), which allowed GCC to gain visibility into the variables as they passed through the FORTIFY implementation. In order to silence these false positives but keep otherwise deterministic compile-time warnings intact, hide the length variable from GCC with OPTIMIZE_HIDE_VAR() before calling the builtin memcpy. Additionally add a comment about why all the macro args have copies with const storage. Reported-by: "Thomas Weißschuh" <linux@weissschuh.net> Closes: https://lore.kernel.org/all/db7190c8-d17f-4a0d-bc2f-5903c79f36c2@t-8ch.de/ Reported-by: Nilay Shroff <nilay@linux.ibm.com> Closes: https://lore.kernel.org/all/20241112124127.1666300-1-nilay@linux.ibm.com/ Tested-by: Nilay Shroff <nilay@linux.ibm.com> Acked-by: Yury Norov <yury.norov@gmail.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Kees Cook <kees@kernel.org>	2024-12-16 16:23:07 -08:00
Murad Masimov	dd471e2577	hwmon: (tmp513) Fix interpretation of values of Temperature Result and Limit Registers The values returned by the driver after processing the contents of the Temperature Result and the Temperature Limit Registers do not correspond to the TMP512/TMP513 specifications. A raw register value is converted to a signed integer value by a sign extension in accordance with the algorithm provided in the specification, but due to the off-by-one error in the sign bit index, the result is incorrect. According to the TMP512 and TMP513 datasheets, the Temperature Result (08h to 0Bh) and Limit (11h to 14h) Registers are 13-bit two's complement integer values, shifted left by 3 bits. The value is scaled by 0.0625 degrees Celsius per bit. E.g., if regval = 1 1110 0111 0000 000, the output should be -25 degrees, but the driver will return +487 degrees. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: `59dfa75e5d` ("hwmon: Add driver for Texas Instruments TMP512/513 sensor chips.") Signed-off-by: Murad Masimov <m.masimov@maxima.ru> Link: https://lore.kernel.org/r/20241216173648.526-4-m.masimov@maxima.ru [groeck: fixed description line length] Signed-off-by: Guenter Roeck <linux@roeck-us.net>	2024-12-16 15:58:25 -08:00
Murad Masimov	da1d0e6ba2	hwmon: (tmp513) Fix Current Register value interpretation The value returned by the driver after processing the contents of the Current Register does not correspond to the TMP512/TMP513 specifications. A raw register value is converted to a signed integer value by a sign extension in accordance with the algorithm provided in the specification, but due to the off-by-one error in the sign bit index, the result is incorrect. Moreover, negative values will be reported as large positive due to missing sign extension from u32 to long. According to the TMP512 and TMP513 datasheets, the Current Register (07h) is a 16-bit two's complement integer value. E.g., if regval = 1000 0011 0000 0000, then the value must be (-32000 * lsb), but the driver will return (33536 * lsb). Fix off-by-one bug, and also cast data->curr_lsb_ua (which is of type u32) to long to prevent incorrect cast for negative values. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: `59dfa75e5d` ("hwmon: Add driver for Texas Instruments TMP512/513 sensor chips.") Signed-off-by: Murad Masimov <m.masimov@maxima.ru> Link: https://lore.kernel.org/r/20241216173648.526-3-m.masimov@maxima.ru [groeck: Fixed description line length] Signed-off-by: Guenter Roeck <linux@roeck-us.net>	2024-12-16 15:58:25 -08:00
Murad Masimov	74d7e038fd	hwmon: (tmp513) Fix interpretation of values of Shunt Voltage and Limit Registers The values returned by the driver after processing the contents of the Shunt Voltage Register and the Shunt Limit Registers do not correspond to the TMP512/TMP513 specifications. A raw register value is converted to a signed integer value by a sign extension in accordance with the algorithm provided in the specification, but due to the off-by-one error in the sign bit index, the result is incorrect. Moreover, the PGA shift calculated with the tmp51x_get_pga_shift function is relevant only to the Shunt Voltage Register, but is also applied to the Shunt Limit Registers. According to the TMP512 and TMP513 datasheets, the Shunt Voltage Register (04h) is 13 to 16 bit two's complement integer value, depending on the PGA setting. The Shunt Positive (0Ch) and Negative (0Dh) Limit Registers are 16-bit two's complement integer values. Below are some examples: * Shunt Voltage Register If PGA = 8, and regval = 1000 0011 0000 0000, then the decimal value must be -32000, but the value calculated by the driver will be 33536. * Shunt Limit Register If regval = 1000 0011 0000 0000, then the decimal value must be -32000, but the value calculated by the driver will be 768, if PGA = 1. Fix sign bit index, and also correct misleading comment describing the tmp51x_get_pga_shift function. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: `59dfa75e5d` ("hwmon: Add driver for Texas Instruments TMP512/513 sensor chips.") Signed-off-by: Murad Masimov <m.masimov@maxima.ru> Link: https://lore.kernel.org/r/20241216173648.526-2-m.masimov@maxima.ru [groeck: Fixed description and multi-line alignments] Signed-off-by: Guenter Roeck <linux@roeck-us.net>	2024-12-16 15:58:24 -08:00
Ilya Dryomov	18d44c5d06	ceph: allocate sparse_ext map only for sparse reads If mounted with sparseread option, ceph_direct_read_write() ends up making an unnecessarily allocation for O_DIRECT writes. Fixes: `03bc06c7b0` ("ceph: add new mount option to enable sparse reads") Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Markuze <amarkuze@redhat.com>	2024-12-16 23:25:44 +01:00
Ilya Dryomov	66e0c4f914	ceph: fix memory leak in ceph_direct_read_write() The bvecs array which is allocated in iter_get_bvecs_alloc() is leaked and pages remain pinned if ceph_alloc_sparse_ext_map() fails. There is no need to delay the allocation of sparse_ext map until after the bvecs array is set up, so fix this by moving sparse_ext allocation a bit earlier. Also, make a similar adjustment in __ceph_sync_read() for consistency (a leak of the same kind in __ceph_sync_read() has been addressed differently). Cc: stable@vger.kernel.org Fixes: `03bc06c7b0` ("ceph: add new mount option to enable sparse reads") Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Markuze <amarkuze@redhat.com>	2024-12-16 23:25:44 +01:00
Alex Markuze	9abee47580	ceph: improve error handling and short/overflow-read logic in __ceph_sync_read() This patch refines the read logic in __ceph_sync_read() to ensure more predictable and efficient behavior in various edge cases. - Return early if the requested read length is zero or if the file size (`i_size`) is zero. - Initialize the index variable (`idx`) where needed and reorder some code to ensure it is always set before use. - Improve error handling by checking for negative return values earlier. - Remove redundant encrypted file checks after failures. Only attempt filesystem-level decryption if the read succeeded. - Simplify leftover calculations to correctly handle cases where the read extends beyond the end of the file or stops short. This can be hit by continuously reading a file while, on another client, we keep truncating and writing new data into it. - This resolves multiple issues caused by integer and consequent buffer overflow (`pages` array being accessed beyond `num_pages`): - https://tracker.ceph.com/issues/67524 - https://tracker.ceph.com/issues/68980 - https://tracker.ceph.com/issues/68981 Cc: stable@vger.kernel.org Fixes: `1065da21e5` ("ceph: stop copying to iter at EOF on sync reads") Reported-by: Luis Henriques (SUSE) <luis.henriques@linux.dev> Signed-off-by: Alex Markuze <amarkuze@redhat.com> Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2024-12-16 23:25:43 +01:00
Ilya Dryomov	12eb22a5a6	ceph: validate snapdirname option length when mounting It becomes a path component, so it shouldn't exceed NAME_MAX characters. This was hardened in commit `c152737be2` ("ceph: Use strscpy() instead of strcpy() in __get_snap_name()"), but no actual check was put in place. Cc: stable@vger.kernel.org Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Markuze <amarkuze@redhat.com>	2024-12-16 23:25:43 +01:00
Max Kellermann	550f7ca98e	ceph: give up on paths longer than PATH_MAX If the full path to be built by ceph_mdsc_build_path() happens to be longer than PATH_MAX, then this function will enter an endless (retry) loop, effectively blocking the whole task. Most of the machine becomes unusable, making this a very simple and effective DoS vulnerability. I cannot imagine why this retry was ever implemented, but it seems rather useless and harmful to me. Let's remove it and fail with ENAMETOOLONG instead. Cc: stable@vger.kernel.org Reported-by: Dario Weißer <dario@cure53.de> Signed-off-by: Max Kellermann <max.kellermann@ionos.com> Reviewed-by: Alex Markuze <amarkuze@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2024-12-16 23:25:43 +01:00
Max Kellermann	d6fd6f8280	ceph: fix memory leaks in __ceph_sync_read() In two `break` statements, the call to ceph_release_page_vector() was missing, leaking the allocation from ceph_alloc_page_vector(). Instead of adding the missing ceph_release_page_vector() calls, the Ceph maintainers preferred to transfer page ownership to the `ceph_osd_request` by passing `own_pages=true` to osd_req_op_extent_osd_data_pages(). This requires postponing the ceph_osdc_put_request() call until after the block that accesses the `pages`. Cc: stable@vger.kernel.org Fixes: `03bc06c7b0` ("ceph: add new mount option to enable sparse reads") Fixes: `f0fe1e54cf` ("ceph: plumb in decryption during reads") Signed-off-by: Max Kellermann <max.kellermann@ionos.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2024-12-16 23:25:43 +01:00
Steven Rostedt	166438a432	ftrace: Do not find "true_parent" if HAVE_DYNAMIC_FTRACE_WITH_ARGS is not set When function tracing and function graph tracing are both enabled (in different instances) the "parent" of some of the function tracing events is "return_to_handler" which is the trampoline used by function graph tracing. To fix this, ftrace_get_true_parent_ip() was introduced that returns the "true" parent ip instead of the trampoline. To do this, the ftrace_regs_get_stack_pointer() is used, which uses kernel_stack_pointer(). The problem is that microblaze does not implement kerenl_stack_pointer() so when function graph tracing is enabled, the build fails. But microblaze also does not enabled HAVE_DYNAMIC_FTRACE_WITH_ARGS. That option has to be enabled by the architecture to reliably get the values from the fregs parameter passed in. When that config is not set, the architecture can also pass in NULL, which is not tested for in that function and could cause the kernel to crash. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Michal Simek <monstr@monstr.eu> Cc: Jeff Xie <jeff.xie@linux.dev> Link: https://lore.kernel.org/20241216164633.6df18e87@gandalf.local.home Fixes: `60b1f578b5` ("ftrace: Get the true parent ip for function tracer") Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2024-12-16 17:22:26 -05:00
Steven Rostedt	cc252bb592	fgraph: Still initialize idle shadow stacks when starting A bug was discovered where the idle shadow stacks were not initialized for offline CPUs when starting function graph tracer, and when they came online they were not traced due to the missing shadow stack. To fix this, the idle task shadow stack initialization was moved to using the CPU hotplug callbacks. But it removed the initialization when the function graph was enabled. The problem here is that the hotplug callbacks are called when the CPUs come online, but the idle shadow stack initialization only happens if function graph is currently active. This caused the online CPUs to not get their shadow stack initialized. The idle shadow stack initialization still needs to be done when the function graph is registered, as they will not be allocated if function graph is not registered. Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20241211135335.094ba282@batman.local.home Fixes: `2c02f7375e` ("fgraph: Use CPU hotplug mechanism to initialize idle shadow stacks") Reported-by: Linus Walleij <linus.walleij@linaro.org> Tested-by: Linus Walleij <linus.walleij@linaro.org> Closes: https://lore.kernel.org/all/CACRpkdaTBrHwRbbrphVy-=SeDz6MSsXhTKypOtLrTQ+DgGAOcQ@mail.gmail.com/ Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2024-12-16 16:03:33 -05:00
Qiang Yu	fb8e7b33c2	arm64: dts: qcom: x1e80100: Fix up BAR space size for PCIe6a As per memory map table, the region for PCIe6a is 64MByte. Hence, set the size of 32 bit non-prefetchable memory region beginning on address 0x70300000 as 0x3d00000 so that BAR space assigned to BAR registers can be allocated from 0x70300000 to 0x74000000. Fixes: `7af1418500` ("arm64: dts: qcom: x1e80100: Fix up BAR spaces") Cc: stable@vger.kernel.org Signed-off-by: Qiang Yu <quic_qianyu@quicinc.com> Reviewed-by: Johan Hovold <johan+linaro@kernel.org> Link: https://lore.kernel.org/r/20241113080508.3458849-1-quic_qianyu@quicinc.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>	2024-12-16 14:32:00 -06:00
Johan Hovold	1fb5cf0d16	Revert "arm64: dts: qcom: x1e78100-t14s: enable otg on usb-c ports" This reverts commit `1a48dd7b9a`. A recent change enabling OTG mode on the Lenovo ThinkPad T14s USB-C ports can break SuperSpeed device hotplugging. The host controller is enumerated, but the device is not: xhci-hcd xhci-hcd.5.auto: xHCI Host Controller xhci-hcd xhci-hcd.5.auto: new USB bus registered, assigned bus number 3 xhci-hcd xhci-hcd.5.auto: hcc params 0x0110ffc5 hci version 0x110 quirks 0x000080a000000810 xhci-hcd xhci-hcd.5.auto: irq 247, io mem 0x0a800000 xhci-hcd xhci-hcd.5.auto: xHCI Host Controller xhci-hcd xhci-hcd.5.auto: new USB bus registered, assigned bus number 4 xhci-hcd xhci-hcd.5.auto: Host supports USB 3.1 Enhanced SuperSpeed hub 3-0:1.0: USB hub found hub 3-0:1.0: 1 port detected hub 4-0:1.0: USB hub found hub 4-0:1.0: 1 port detected Once this happens on either of the two ports, no amount of disconnecting and reconnecting makes the SuperSpeed device be enumerated, while FullSpeed device enumeration still works. With retimer (and orientation detection) support not even merged yet, let's revert at least until we have stable host mode in mainline. Fixes: `1a48dd7b9a` ("arm64: dts: qcom: x1e78100-t14s: enable otg on usb-c ports") Cc: Jonathan Marek <jonathan@marek.ca> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Reviewed-by: Abel Vesa <abel.vesa@linaro.org> Link: https://lore.kernel.org/r/20241206172402.20724-1-johan+linaro@kernel.org Signed-off-by: Bjorn Andersson <andersson@kernel.org>	2024-12-16 14:31:32 -06:00
Daniel Lezcano	65c8c78cc7	thermal/thresholds: Fix uapi header macros leading to a compilation error The macros giving the direction of the crossing thresholds use the BIT macro which is not exported to the userspace. Consequently when an userspace program includes the header, it fails to compile. Replace the macros by their litteral to allow the compilation of userspace program using this header. Fixes: `445936f9e2` ("thermal: core: Add user thresholds support") Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://patch.msgid.link/20241212201311.4143196-1-daniel.lezcano@linaro.org [ rjw: Add Fixes: ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2024-12-16 21:30:20 +01:00
Linus Torvalds	f44d154d6e	Merge tag 'soc-fixes-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull SoC fixes from Arnd Bergmann: "Three small fixes for the soc tree: - devicetee fix for the Arm Juno reference machine, to allow more interesting PCI configurations - build fix for SCMI firmware on the NXP i.MX platform - fix for a race condition in Arm FF-A firmware" * tag 'soc-fixes-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: arm64: dts: fvp: Update PCIe bus-range property firmware: arm_ffa: Fix the race around setting ffa_dev->properties firmware: arm_scmi: Fix i.MX build dependency	2024-12-16 10:10:53 -08:00
Linus Torvalds	dc690bc256	Merge tag 'platform-drivers-x86-v6.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fixes from Ilpo Järvinen: - alienware-wmi: - Add support for Alienware m16 R1 AMD - Do not setup legacy LED control with X and G Series - intel/ifs: Clearwater Forest support - intel/vsec: Panther Lake support - p2sb: Do not hide the device if BIOS left it unhidden - touchscreen_dmi: Add SARY Tab 3 tablet information * tag 'platform-drivers-x86-v6.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86/intel/vsec: Add support for Panther Lake platform/x86/intel/ifs: Add Clearwater Forest to CPU support list platform/x86: touchscreen_dmi: Add info for SARY Tab 3 tablet p2sb: Do not scan and remove the P2SB device when it is unhidden p2sb: Move P2SB hide and unhide code to p2sb_scan_and_cache() p2sb: Introduce the global flag p2sb_hidden_by_bios p2sb: Factor out p2sb_read_from_cache() alienware-wmi: Adds support to Alienware m16 R1 AMD alienware-wmi: Fix X Series and G Series quirks	2024-12-16 10:01:57 -08:00
Mark Brown	001a3d5e8b	ASoC: Intel: sof_sdw: Update DMI matches for Lenovo Merge series from Bard Liao <yung-chuan.liao@linux.intel.com>: The DMI match information for these models has changed so the match entries need updates.	2024-12-16 17:10:37 +00:00
Greg Kroah-Hartman	59275b7633	Merge tag 'usb-serial-6.13-rc3' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus Johan writes: USB-serial device ids for 6.13-rc3 Here are some new modem device ids. All have been in linux-next with no reported issues. * tag 'usb-serial-6.13-rc3' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial: USB: serial: option: add Telit FE910C04 rmnet compositions USB: serial: option: add MediaTek T7XX compositions USB: serial: option: add Netprisma LCUK54 modules for WWAN Ready USB: serial: option: add MeiG Smart SLM770A USB: serial: option: add TCL IK512 MBIM & ECM	2024-12-16 16:31:47 +01:00
Greg Kroah-Hartman	e16ebd9d83	Merge tag 'mhi-fixes-for-v6.13' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mani/mhi into char-misc-linus Manivannan writes: MHI Host ======== - Fix the MHI BAR mapping by passing BAR number to pcim_iomap_region() API instead of BAR mask. This fixes a regression for Qualcomm MHI modems. * tag 'mhi-fixes-for-v6.13' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mani/mhi: bus: mhi: host: pci_generic: fix MHI BAR mapping	2024-12-16 16:30:58 +01:00
Greg Kroah-Hartman	6ffc565c24	Merge tag 'iio-fixes-for-6.13a' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jic23/iio into char-misc-linus Jonathan writes: IIO: 1st set of fixes for 6.13 cycle. The usual mixed back of fixes for ancient and recent bugs A number of these were from an audit by Javier Carrasco of places we may leak stack data via holes in structures passed to userspace that were not explicitly zeroed. Very helpful effort after we found a similar bug in review of new code. This affected: - dummy driver - kionix,kmx61 - murrata,zpa2326 - rockchip,saradc - rohm,bh1745 - veml,vcnl4035 - ti,ads1119 - ti,ads8688 - ti,tmp0006 Other fixes: core,inkern - Underflow fo reference count issue in error path of iio_channel_get_all() for devices we haven't yet gotten. tests - Kconfig typo fix so it's possible to tell what test is being enabled. - Check for allocation failures. adi,ad4695 - Ensure timing requirement wrt to final clock edge vs next conversion signal are met by adding an additional SPI transfer. adi,ad7124 - Ensure channels are disabled at probe to avoid wrong data if channel 0 is not the first one read. adi,ad7173 - Fix handing of multiple devices by not editing a single static structure and instead making a per instance copy. adi,ad9467 - Fix handing of multiple devices by not editing a single static structure and instead making a per instance copy. adi,ad9832 - Off by one error on input verification for phase control adi,ad9834 - Off by one error on input verification for phase control. atmel,at91 - In an error path don't use a variable that hasn't been initialized yet. invensense,icm42600 - SPI burst write does not work for some icm426000 chips, disable it. - Ensure correct handling of timestamps after resume. st,sensors - Add back accidentally dropped IIS2MDC compatible in binding doc. st,stm32-dfsdm: - label property was accidentally made a required property. Make it optional again to fix use of older DT. ti,ads1119 - Use a fixed size for the channel data rather than an integer, bringing the code inline with the advertised 16 bit channel size and avoiding userspace misinterpreting the data. ti,ads124s08 - Use _cansleep gpio calls as no reason to prevent sleeping and it was stopping a new board design working (trivial fix). ti,ads1298 - Add a check on failure of devm_kasprintf() * tag 'iio-fixes-for-6.13a' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jic23/iio: (27 commits) iio: adc: ti-ads1119: fix sample size in scan struct for triggered buffer iio: temperature: tmp006: fix information leak in triggered buffer iio: inkern: call iio_device_put() only on mapped devices iio: adc: ad9467: Fix the "don't allow reading vref if not available" case iio: adc: at91: call input_free_device() on allocated iio_dev iio: adc: ad7173: fix using shared static info struct iio: adc: ti-ads124s08: Use gpiod_set_value_cansleep() iio: adc: ti-ads1119: fix information leak in triggered buffer iio: pressure: zpa2326: fix information leak in triggered buffer iio: adc: rockchip_saradc: fix information leak in triggered buffer iio: imu: kmx61: fix information leak in triggered buffer iio: light: vcnl4035: fix information leak in triggered buffer iio: light: bh1745: fix information leak in triggered buffer iio: adc: ti-ads8688: fix information leak in triggered buffer iio: dummy: iio_simply_dummy_buffer: fix information leak in triggered buffer iio: test: Fix GTS test config dt-bindings: iio: st-sensors: Re-add IIS2MDC magnetometer iio: adc: ti-ads1298: Add NULL check in ads1298_init iio: adc: stm32-dfsdm: handle label as an optional property iio: adc: ad4695: fix buffered read, single sample timings ...	2024-12-16 16:29:55 +01:00
Chen-Yu Tsai	6f4a0fd03c	ASoC: dt-bindings: realtek,rt5645: Fix CPVDD voltage comment Both the ALC5645 and ALC5650 datasheets specify a recommended voltage of 1.8V for CPVDD, not 3.5V. Fix the comment. Cc: Matthias Brugger <matthias.bgg@gmail.com> Fixes: `26aa19174f` ("ASoC: dt-bindings: rt5645: add suppliers") Fixes: `83d43ab0a1` ("ASoC: dt-bindings: realtek,rt5645: Convert to dtschema") Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://patch.msgid.link/20241211035403.4157760-1-wenst@chromium.org Signed-off-by: Mark Brown <broonie@kernel.org>	2024-12-16 15:12:40 +00:00
Richard Fitzgerald	ba7d47a54b	ASoC: Intel: sof_sdw: Fix DMI match for Lenovo 21QA and 21QB Update the DMI match for a Lenovo laptop to the new DMI identifier. This laptop ships with a different DMI identifier to what was expected, and now has two identifiers. Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com> Fixes: `ea657f6b24` ("ASoC: Intel: sof_sdw: Add quirk for cs42l43 system using host DMICs") Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://patch.msgid.link/20241216140821.153670-3-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-12-16 14:28:58 +00:00
Richard Fitzgerald	7c449ef0fd	ASoC: Intel: sof_sdw: Fix DMI match for Lenovo 21Q6 and 21Q7 Update the DMI match for a Lenovo laptop to the new DMI identifier. This laptop ships with a different DMI identifier to what was expected, and now has two identifiers. Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com> Fixes: `83c062ae81` ("ASoC: Intel: sof_sdw: Add quirks for some new Lenovo laptops") Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://patch.msgid.link/20241216140821.153670-2-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-12-16 14:28:57 +00:00
Kalesh AP	7179fe0074	RDMA/bnxt_re: Fix reporting hw_ver in query_device Driver currently populates subsystem_device id in the "hw_ver" field of ib_attr structure in query_device. Updated to populate PCI revision ID. Fixes: `1ac5a40479` ("RDMA/bnxt_re: Add bnxt_re RoCE driver") Reviewed-by: Preethi G <preethi.gurusiddalingeswaraswamy@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/20241211083931.968831-6-kalesh-anakkur.purayil@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-12-16 08:33:31 -05:00
Hongguang Gao	34db8ec931	RDMA/bnxt_re: Fix to export port num to ib_query_qp Current driver implementation doesn't populate the port_num field in query_qp. Adding the code to convert internal firmware port id to ibv defined port number and export it. Reviewed-by: Saravanan Vajravel <saravanan.vajravel@broadcom.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Hongguang Gao <hongguang.gao@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/20241211083931.968831-5-kalesh-anakkur.purayil@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-12-16 08:33:31 -05:00
Damodharam Ammepalli	da2132e683	RDMA/bnxt_re: Fix setting mandatory attributes for modify_qp Firmware expects "min_rnr_timer" as a mandatory attribute in MODIFY_QP command during the RTR-RTS transition. This needs to be enforced by the driver which is missing while setting bnxt_set_mandatory_attributes that sends these flags as part of modify_qp optimization. Fixes: `82c32d2192` ("RDMA/bnxt_re: Add support for optimized modify QP") Reviewed-by: Rukhsana Ansari <rukhsana.ansari@broadcom.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/20241211083931.968831-4-kalesh-anakkur.purayil@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-12-16 08:33:31 -05:00
Saravanan Vajravel	798653a0ee	RDMA/bnxt_re: Add check for path mtu in modify_qp When RDMA app configures path MTU, add a check in modify_qp verb to make sure that it doesn't go beyond interface MTU. If this check fails, driver will fail the modify_qp verb. Fixes: `1ac5a40479` ("RDMA/bnxt_re: Add bnxt_re RoCE driver") Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Saravanan Vajravel <saravanan.vajravel@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/20241211083931.968831-3-kalesh-anakkur.purayil@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-12-16 08:33:31 -05:00
Kalesh AP	38651476e4	RDMA/bnxt_re: Fix the check for 9060 condition The check for 9060 condition should only be made for legacy chips. Fixes: `9152e0b722` ("RDMA/bnxt_re: HW workarounds for handling specific conditions") Reviewed-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/20241211083931.968831-2-kalesh-anakkur.purayil@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-12-16 08:33:31 -05:00
Gao Xiang	6422cde1b0	erofs: use buffered I/O for file-backed mounts by default For many use cases (e.g. container images are just fetched from remote), performance will be impacted if underlay page cache is up-to-date but direct i/o flushes dirty pages first. Instead, let's use buffered I/O by default to keep in sync with loop devices and add a (re)mount option to explicitly give a try to use direct I/O if supported by the underlying files. The container startup time is improved as below: [workload] docker.io/library/workpress:latest unpack 1st run non-1st runs EROFS snapshotter buffered I/O file 4.586404265s 0.308s 0.198s EROFS snapshotter direct I/O file 4.581742849s 2.238s 0.222s EROFS snapshotter loop 4.596023152s 0.346s 0.201s Overlayfs snapshotter 5.382851037s 0.206s 0.214s Fixes: `fb17675026` ("erofs: add file-backed mount support") Cc: Derek McGowan <derek@mcg.dev> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20241212134336.2059899-1-hsiangkao@linux.alibaba.com	2024-12-16 21:02:07 +08:00
Gao Xiang	f8d920a402	erofs: reference `struct erofs_device_info` for erofs_map_dev Record `m_sb` and `m_dif` to replace `m_fscache`, `m_daxdev`, `m_fp` and `m_dax_part_off` in order to simplify the codebase. Note that `m_bdev` is still left since it can be assigned from `sb->s_bdev` directly. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20241212235401.2857246-1-hsiangkao@linux.alibaba.com	2024-12-16 21:02:06 +08:00
Gao Xiang	7b00af2c54	erofs: use `struct erofs_device_info` for the primary device Instead of just listing each one directly in `struct erofs_sb_info` except that we still use `sb->s_bdev` for the primary block device. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20241216125310.930933-2-hsiangkao@linux.alibaba.com	2024-12-16 21:01:59 +08:00
Venkata Prasad Potturu	88438444fd	ASoC: amd: ps: Fix for enabling DMIC on acp63 platform via _DSD entry Add condition check to register ACP PDM sound card by reading _WOV acpi entry. Fixes: `0386d765f2` ("ASoC: amd: ps: refactor acp device configuration read logic") Signed-off-by: Venkata Prasad Potturu <venkataprasad.potturu@amd.com> Link: https://patch.msgid.link/20241213061147.1060451-1-venkataprasad.potturu@amd.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-12-16 12:31:03 +00:00
Thomas Gleixner	a60b990798	PCI/MSI: Handle lack of irqdomain gracefully Alexandre observed a warning emitted from pci_msi_setup_msi_irqs() on a RISCV platform which does not provide PCI/MSI support: WARNING: CPU: 1 PID: 1 at drivers/pci/msi/msi.h:121 pci_msi_setup_msi_irqs+0x2c/0x32 __pci_enable_msix_range+0x30c/0x596 pci_msi_setup_msi_irqs+0x2c/0x32 pci_alloc_irq_vectors_affinity+0xb8/0xe2 RISCV uses hierarchical interrupt domains and correctly does not implement the legacy fallback. The warning triggers from the legacy fallback stub. That warning is bogus as the PCI/MSI layer knows whether a PCI/MSI parent domain is associated with the device or not. There is a check for MSI-X, which has a legacy assumption. But that legacy fallback assumption is only valid when legacy support is enabled, but otherwise the check should simply return -ENOTSUPP. Loongarch tripped over the same problem and blindly enabled legacy support without implementing the legacy fallbacks. There are weak implementations which return an error, so the problem was papered over. Correct pci_msi_domain_supports() to evaluate the legacy mode and add the missing supported check into the MSI enable path to complete it. Fixes: `d2a463b297` ("PCI/MSI: Reject multi-MSI early") Reported-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/87ed2a8ow5.ffs@tglx	2024-12-16 10:59:47 +01:00
Mika Westerberg	24740385cb	thunderbolt: Improve redrive mode handling When USB-C monitor is connected directly to Intel Barlow Ridge host, it goes into "redrive" mode that basically routes the DisplayPort signals directly from the GPU to the USB-C monitor without any tunneling needed. However, the host router must be powered on for this to work. Aaron reported that there are a couple of cases where this will not work with the current code: - Booting with USB-C monitor plugged in. - Plugging in USB-C monitor when the host router is in sleep state (runtime suspended). - Plugging in USB-C device while the system is in system sleep state. In all these cases once the host router is runtime suspended the picture on the connected USB-C display disappears too. This is certainly not what the user expected. For this reason improve the redrive mode handling to keep the host router from runtime suspending when detect that any of the above cases is happening. Fixes: `a75e0684ef` ("thunderbolt: Keep the domain powered when USB4 port is in redrive mode") Reported-by: Aaron Rainbolt <arainbolt@kfocus.org> Closes: https://lore.kernel.org/linux-usb/20241009220118.70bfedd0@kf-ir16/ Cc: stable@vger.kernel.org Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>	2024-12-16 09:59:38 +02:00
Namjae Jeon	fe4ed2f09b	ksmbd: conn lock to serialize smb2 negotiate If client send parallel smb2 negotiate request on same connection, ksmbd_conn can be racy. smb2 negotiate handling that are not performance-related can be serialized with conn lock. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2024-12-15 22:20:03 -06:00
Marios Makassikis	43fb7bce88	ksmbd: fix broken transfers when exceeding max simultaneous operations Since commit `0a77d947f5` ("ksmbd: check outstanding simultaneous SMB operations"), ksmbd enforces a maximum number of simultaneous operations for a connection. The problem is that reaching the limit causes ksmbd to close the socket, and the client has no indication that it should have slowed down. This behaviour can be reproduced by setting "smb2 max credits = 128" (or lower), and transferring a large file (25GB). smbclient fails as below: $ smbclient //192.168.1.254/testshare -U user%pass smb: \> put file.bin cli_push returned NT_STATUS_USER_SESSION_DELETED putting file file.bin as \file.bin smb2cli_req_compound_submit: Insufficient credits. 0 available, 1 needed NT_STATUS_INTERNAL_ERROR closing remote file \file.bin smb: \> smb2cli_req_compound_submit: Insufficient credits. 0 available, 1 needed Windows clients fail with 0x8007003b (with smaller files even). Fix this by delaying reading from the socket until there's room to allocate a request. This effectively applies backpressure on the client, so the transfer completes, albeit at a slower rate. Fixes: `0a77d947f5` ("ksmbd: check outstanding simultaneous SMB operations") Signed-off-by: Marios Makassikis <mmakassikis@freebox.fr> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2024-12-15 22:20:03 -06:00
Marios Makassikis	83c47d9e0c	ksmbd: count all requests in req_running counter This changes the semantics of req_running to count all in-flight requests on a given connection, rather than the number of elements in the conn->request list. The latter is used only in smb2_cancel, and the counter is not used Signed-off-by: Marios Makassikis <mmakassikis@freebox.fr> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2024-12-15 22:20:03 -06:00
Thiébaud Weksteen	900f83cf37	selinux: ignore unknown extended permissions When evaluating extended permissions, ignore unknown permissions instead of calling BUG(). This commit ensures that future permissions can be added without interfering with older kernels. Cc: stable@vger.kernel.org Fixes: `fa1aa143ac` ("selinux: extended permissions for ioctls") Signed-off-by: Thiébaud Weksteen <tweek@google.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2024-12-15 21:59:03 -05:00
Linus Torvalds	78d4f34e21	Linux 6.13-rc3	2024-12-15 15:58:23 -08:00
Linus Torvalds	42a19aa170	Merge tag 'arc-6.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc Pull ARC fixes from Vineet Gupta: - Sundry build and misc fixes * tag 'arc-6.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: ARC: build: Try to guess GCC variant of cross compiler ARC: bpf: Correct conditional check in 'check_jmp_32' ARC: dts: Replace deprecated snps,nr-gpios property for snps,dw-apb-gpio-port devices ARC: build: Use __force to suppress per-CPU cmpxchg warnings ARC: fix reference of dependency for PAE40 config ARC: build: disallow invalid PAE40 + 4K page config arc: rename aux.h to arc_aux.h	2024-12-15 15:38:12 -08:00
Linus Torvalds	7031a38ab7	Merge tag 'efi-fixes-for-v6.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI fixes from Ard Biesheuvel: - Limit EFI zboot to GZIP and ZSTD before it comes in wider use - Fix inconsistent error when looking up a non-existent file in efivarfs with a name that does not adhere to the NAME-GUID format - Drop some unused code * tag 'efi-fixes-for-v6.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: efi/esrt: remove esre_attribute::store() efivarfs: Fix error on non-existent file efi/zboot: Limit compression options to GZIP and ZSTD	2024-12-15 15:33:41 -08:00
Linus Torvalds	151167d85a	Merge tag 'i2c-for-6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "i2c host fixes: PNX used the wrong unit for timeouts, Nomadik was missing a sentinel, and RIIC was missing rounding up" * tag 'i2c-for-6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: riic: Always round-up when calculating bus period i2c: nomadik: Add missing sentinel to match table i2c: pnx: Fix timeout in wait functions	2024-12-15 15:29:07 -08:00
Nikita Yushchenko	922b4b955a	net: renesas: rswitch: rework ts tags management The existing linked list based implementation of how ts tags are assigned and managed is unsafe against concurrency and corner cases: - element addition in tx processing can race against element removal in ts queue completion, - element removal in ts queue completion can race against element removal in device close, - if a large number of frames gets added to tx queue without ts queue completions in between, elements with duplicate tag values can get added. Use a different implementation, based on per-port used tags bitmaps and saved skb arrays. Safety for addition in tx processing vs removal in ts completion is provided by: tag = find_first_zero_bit(...); smp_mb(); <write rdev->ts_skb[tag]> set_bit(...); vs <read rdev->ts_skb[tag]> smp_mb(); clear_bit(...); Safety for removal in ts completion vs removal in device close is provided by using atomic read-and-clear for rdev->ts_skb[tag]: ts_skb = xchg(&rdev->ts_skb[tag], NULL); if (ts_skb) <handle it> Fixes: `33f5d733b5` ("net: renesas: rswitch: Improve TX timestamp accuracy") Signed-off-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com> Link: https://patch.msgid.link/20241212062558.436455-1-nikita.yoush@cogentembedded.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-15 14:39:10 -08:00
Vasily Gorbik	282da38b46	s390/mm: Consider KMSAN modules metadata for paging levels The calculation determining whether to use three- or four-level paging didn't account for KMSAN modules metadata. Include this metadata in the virtual memory size calculation to ensure correct paging mode selection and avoiding potentially unnecessary physical memory size limitations. Fixes: `65ca73f9fb` ("s390/mm: define KMSAN metadata for vmalloc and modules") Acked-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>	2024-12-15 23:35:09 +01:00
Jakub Kicinski	cb85f2b897	Merge branch 'ionic-minor-code-fixes' Shannon Nelson says: ==================== ionic: minor code fixes These are a couple of code fixes for the ionic driver. ==================== Link: https://patch.msgid.link/20241212213157.12212-1-shannon.nelson@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-15 14:33:33 -08:00
Shannon Nelson	b096d62ba1	ionic: use ee->offset when returning sprom data Some calls into ionic_get_module_eeprom() don't use a single full buffer size, but instead multiple calls with an offset. Teach our driver to use the offset correctly so we can respond appropriately to the caller. Fixes: `4d03e00a21` ("ionic: Add initial ethtool support") Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20241212213157.12212-4-shannon.nelson@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-15 14:33:31 -08:00
Shannon Nelson	746e6ae2e2	ionic: no double destroy workqueue There are some FW error handling paths that can cause us to try to destroy the workqueue more than once, so let's be sure we're checking for that. The case where this popped up was in an AER event where the handlers got called in such a way that ionic_reset_prepare() and thus ionic_dev_teardown() got called twice in a row. The second time through the workqueue was already destroyed, and destroy_workqueue() choked on the bad wq pointer. We didn't hit this in AER handler testing before because at that time we weren't using a private workqueue. Later we replaced the use of the system workqueue with our own private workqueue but hadn't rerun the AER handler testing since then. Fixes: `9e25450da7` ("ionic: add private workqueue per-device") Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20241212213157.12212-3-shannon.nelson@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-15 14:33:31 -08:00
Brett Creeley	9590d32e09	ionic: Fix netdev notifier unregister on failure If register_netdev() fails, then the driver leaks the netdev notifier. Fix this by calling ionic_lif_unregister() on register_netdev() failure. This will also call ionic_lif_unregister_phc() if it has already been registered. Fixes: `30b87ab4c0` ("ionic: remove lif list concept") Signed-off-by: Brett Creeley <brett.creeley@amd.com> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20241212213157.12212-2-shannon.nelson@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-15 14:33:31 -08:00
Donald Hunter	663ad7481f	tools/net/ynl: fix sub-message key lookup for nested attributes Use the correct attribute space for sub-message key lookup in nested attributes when adding attributes. This fixes rt_link where the "kind" key and "data" sub-message are nested attributes in "linkinfo". For example: ./tools/net/ynl/cli.py \ --create \ --spec Documentation/netlink/specs/rt_link.yaml \ --do newlink \ --json '{"link": 99, "linkinfo": { "kind": "vlan", "data": {"id": 4 } } }' Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Fixes: `ab463c4342` ("tools/net/ynl: Add support for encoding sub-messages") Link: https://patch.msgid.link/20241213130711.40267-1-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-15 13:30:43 -08:00
Eric Dumazet	ee76746387	netdevsim: prevent bad user input in nsim_dev_health_break_write() If either a zero count or a large one is provided, kernel can crash. Fixes: `82c93a87bf` ("netdevsim: implement couple of testing devlink health reporters") Reported-by: syzbot+ea40e4294e58b0292f74@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/675c6862.050a0220.37aaf.00b1.GAE@google.com/T/#u Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20241213172518.2415666-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-15 13:26:47 -08:00
Vladimir Oltean	2d5df3a680	net: mscc: ocelot: fix incorrect IFH SRC_PORT field in ocelot_ifh_set_basic() Packets injected by the CPU should have a SRC_PORT field equal to the CPU port module index in the Analyzer block (ocelot->num_phys_ports). The blamed commit copied the ocelot_ifh_set_basic() call incorrectly from ocelot_xmit_common() in net/dsa/tag_ocelot.c. Instead of calling with "x", it calls with BIT_ULL(x), but the field is not a port mask, but rather a single port index. [ side note: this is the technical debt of code duplication :( ] The error used to be silent and doesn't appear to have other user-visible manifestations, but with new changes in the packing library, it now fails loudly as follows: ------------[ cut here ]------------ Cannot store 0x40 inside bits 46-43 - will truncate sja1105 spi2.0: xmit timed out WARNING: CPU: 1 PID: 102 at lib/packing.c:98 __pack+0x90/0x198 sja1105 spi2.0: timed out polling for tstamp CPU: 1 UID: 0 PID: 102 Comm: felix_xmit Tainted: G W N 6.13.0-rc1-00372-gf706b85d972d-dirty #2605 Call trace: __pack+0x90/0x198 (P) __pack+0x90/0x198 (L) packing+0x78/0x98 ocelot_ifh_set_basic+0x260/0x368 ocelot_port_inject_frame+0xa8/0x250 felix_port_deferred_xmit+0x14c/0x258 kthread_worker_fn+0x134/0x350 kthread+0x114/0x138 The code path pertains to the ocelot switchdev driver and to the felix secondary DSA tag protocol, ocelot-8021q. Here seen with ocelot-8021q. The messenger (packing) is not really to blame, so fix the original commit instead. Fixes: `e1b9e80236` ("net: mscc: ocelot: fix QoS class for injected packets with "ocelot-8021q"") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241212165546.879567-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-15 13:12:58 -08:00
Linus Torvalds	dccbe2047a	Merge tag 'edac_urgent_for_v6.13_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull EDAC fix from Borislav Petkov: - Make sure amd64_edac loads successfully on certain Zen4 memory configurations * tag 'edac_urgent_for_v6.13_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: EDAC/amd64: Simplify ECC check on unified memory controllers	2024-12-15 10:01:10 -08:00
Linus Torvalds	f7c7a1ba22	Merge tag 'irq_urgent_for_v6.13_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Borislav Petkov: - Disable the secure programming interface of the GIC500 chip in the RK3399 SoC to fix interrupt priority assignment and even make a dead machine boot again when the gic-v3 driver enables pseudo NMIs - Correct the declaration of a percpu variable to fix several sparse warnings * tag 'irq_urgent_for_v6.13_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/gic-v3: Work around insecure GIC integrations irqchip/gic: Correct declaration of *percpu_base pointer in union gic_base	2024-12-15 09:58:27 -08:00
Linus Torvalds	acd855a949	Merge tag 'sched_urgent_for_v6.13_rc3-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Borislav Petkov: - Prevent incorrect dequeueing of the deadline dlserver helper task and fix its time accounting - Properly track the CFS runqueue runnable stats - Check the total number of all queued tasks in a sched fair's runqueue hierarchy before deciding to stop the tick - Fix the scheduling of the task that got woken last (NEXT_BUDDY) by preventing those from being delayed * tag 'sched_urgent_for_v6.13_rc3-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/dlserver: Fix dlserver time accounting sched/dlserver: Fix dlserver double enqueue sched/eevdf: More PELT vs DELAYED_DEQUEUE sched/fair: Fix sched_can_stop_tick() for fair tasks sched/fair: Fix NEXT_BUDDY	2024-12-15 09:38:03 -08:00
Linus Torvalds	81576a9a27	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm fixes from Paolo Bonzini: "ARM64: - Fix confusion with implicitly-shifted MDCR_EL2 masks breaking SPE/TRBE initialization - Align nested page table walker with the intended memory attribute combining rules of the architecture - Prevent userspace from constraining the advertised ASID width, avoiding horrors of guest TLBIs not matching the intended context in hardware - Don't leak references on LPIs when insertion into the translation cache fails RISC-V: - Replace csr_write() with csr_set() for HVIEN PMU overflow bit x86: - Cache CPUID.0xD XSTATE offsets+sizes during module init On Intel's Emerald Rapids CPUID costs hundreds of cycles and there are a lot of leaves under 0xD. Getting rid of the CPUIDs during nested VM-Enter and VM-Exit is planned for the next release, for now just cache them: even on Skylake that is 40% faster" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: Cache CPUID.0xD XSTATE offsets+sizes during module init RISC-V: KVM: Fix csr_write -> csr_set for HVIEN PMU overflow bit KVM: arm64: vgic-its: Add error handling in vgic_its_cache_translation KVM: arm64: Do not allow ID_AA64MMFR0_EL1.ASIDbits to be overridden KVM: arm64: Fix S1/S2 combination when FWB==1 and S2 has Device memory type arm64: Fix usage of new shifted MDCR_EL2 values	2024-12-15 09:26:13 -08:00
David S. Miller	c296c0bf45	Merge branch 'smc-fixes' Guangguan Wang says: ==================== net: several fixes for smc v1 -> v2: rewrite patch #2 suggested by Paolo. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2024-12-15 12:35:00 +00:00
Guangguan Wang	c5b8ee5022	net/smc: check return value of sock_recvmsg when draining clc data When receiving clc msg, the field length in smc_clc_msg_hdr indicates the length of msg should be received from network and the value should not be fully trusted as it is from the network. Once the value of length exceeds the value of buflen in function smc_clc_wait_msg it may run into deadloop when trying to drain the remaining data exceeding buflen. This patch checks the return value of sock_recvmsg when draining data in case of deadloop in draining. Fixes: `fb4f79264c` ("net/smc: tolerate future SMCD versions") Signed-off-by: Guangguan Wang <guangguan.wang@linux.alibaba.com> Reviewed-by: Wen Gu <guwen@linux.alibaba.com> Reviewed-by: D. Wythe <alibuda@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-12-15 12:34:59 +00:00
Guangguan Wang	9ab332deb6	net/smc: check smcd_v2_ext_offset when receiving proposal msg When receiving proposal msg in server, the field smcd_v2_ext_offset in proposal msg is from the remote client and can not be fully trusted. Once the value of smcd_v2_ext_offset exceed the max value, there has the chance to access wrong address, and crash may happen. This patch checks the value of smcd_v2_ext_offset before using it. Fixes: `5c21c4ccaf` ("net/smc: determine accepted ISM devices") Signed-off-by: Guangguan Wang <guangguan.wang@linux.alibaba.com> Reviewed-by: Wen Gu <guwen@linux.alibaba.com> Reviewed-by: D. Wythe <alibuda@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-12-15 12:34:59 +00:00
Guangguan Wang	7863c9f3d2	net/smc: check v2_ext_offset/eid_cnt/ism_gid_cnt when receiving proposal msg When receiving proposal msg in server, the fields v2_ext_offset/ eid_cnt/ism_gid_cnt in proposal msg are from the remote client and can not be fully trusted. Especially the field v2_ext_offset, once exceed the max value, there has the chance to access wrong address, and crash may happen. This patch checks the fields v2_ext_offset/eid_cnt/ism_gid_cnt before using them. Fixes: `8c3dca341a` ("net/smc: build and send V2 CLC proposal") Signed-off-by: Guangguan Wang <guangguan.wang@linux.alibaba.com> Reviewed-by: Wen Gu <guwen@linux.alibaba.com> Reviewed-by: D. Wythe <alibuda@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-12-15 12:34:59 +00:00
Guangguan Wang	a29e220d3c	net/smc: check iparea_offset and ipv6_prefixes_cnt when receiving proposal msg When receiving proposal msg in server, the field iparea_offset and the field ipv6_prefixes_cnt in proposal msg are from the remote client and can not be fully trusted. Especially the field iparea_offset, once exceed the max value, there has the chance to access wrong address, and crash may happen. This patch checks iparea_offset and ipv6_prefixes_cnt before using them. Fixes: `e7b7a64a84` ("smc: support variable CLC proposal messages") Signed-off-by: Guangguan Wang <guangguan.wang@linux.alibaba.com> Reviewed-by: Wen Gu <guwen@linux.alibaba.com> Reviewed-by: D. Wythe <alibuda@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-12-15 12:34:59 +00:00
Guangguan Wang	679e9ddcf9	net/smc: check sndbuf_space again after NOSPACE flag is set in smc_poll When application sending data more than sndbuf_space, there have chances application will sleep in epoll_wait, and will never be wakeup again. This is caused by a race between smc_poll and smc_cdc_tx_handler. application tasklet smc_tx_sendmsg(len > sndbuf_space) \| epoll_wait for EPOLL_OUT,timeout=0 \| smc_poll \| if (!smc->conn.sndbuf_space) \| \| smc_cdc_tx_handler \| atomic_add sndbuf_space \| smc_tx_sndbuf_nonfull \| if (!test_bit SOCK_NOSPACE) \| do not sk_write_space; set_bit SOCK_NOSPACE; \| return mask=0; \| Application will sleep in epoll_wait as smc_poll returns 0. And smc_cdc_tx_handler will not call sk_write_space because the SOCK_NOSPACE has not be set. If there is no inflight cdc msg, sk_write_space will not be called any more, and application will sleep in epoll_wait forever. So check sndbuf_space again after NOSPACE flag is set to break the race. Fixes: `8dce2786a2` ("net/smc: smc_poll improvements") Signed-off-by: Guangguan Wang <guangguan.wang@linux.alibaba.com> Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-12-15 12:34:59 +00:00
Guangguan Wang	2b33eb8f1b	net/smc: protect link down work from execute after lgr freed link down work may be scheduled before lgr freed but execute after lgr freed, which may result in crash. So it is need to hold a reference before shedule link down work, and put the reference after work executed or canceled. The relevant crash call stack as follows: list_del corruption. prev->next should be ffffb638c9c0fe20, but was 0000000000000000 ------------[ cut here ]------------ kernel BUG at lib/list_debug.c:51! invalid opcode: 0000 [#1] SMP NOPTI CPU: 6 PID: 978112 Comm: kworker/6:119 Kdump: loaded Tainted: G #1 Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 2221b89 04/01/2014 Workqueue: events smc_link_down_work [smc] RIP: 0010:__list_del_entry_valid.cold+0x31/0x47 RSP: 0018:ffffb638c9c0fdd8 EFLAGS: 00010086 RAX: 0000000000000054 RBX: ffff942fb75e5128 RCX: 0000000000000000 RDX: ffff943520930aa0 RSI: ffff94352091fc80 RDI: ffff94352091fc80 RBP: 0000000000000000 R08: 0000000000000000 R09: ffffb638c9c0fc38 R10: ffffb638c9c0fc30 R11: ffffffffa015eb28 R12: 0000000000000002 R13: ffffb638c9c0fe20 R14: 0000000000000001 R15: ffff942f9cd051c0 FS: 0000000000000000(0000) GS:ffff943520900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f4f25214000 CR3: 000000025fbae004 CR4: 00000000007706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: rwsem_down_write_slowpath+0x17e/0x470 smc_link_down_work+0x3c/0x60 [smc] process_one_work+0x1ac/0x350 worker_thread+0x49/0x2f0 ? rescuer_thread+0x360/0x360 kthread+0x118/0x140 ? __kthread_bind_mask+0x60/0x60 ret_from_fork+0x1f/0x30 Fixes: `541afa10c1` ("net/smc: add smcr_port_err() and smcr_link_down() processing") Signed-off-by: Guangguan Wang <guangguan.wang@linux.alibaba.com> Reviewed-by: Tony Lu <tonylu@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-12-15 12:34:59 +00:00
Linus Torvalds	2d8308bf5b	Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fix from James Bottomley: "Single one-line fix in the ufs driver" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ufs: core: Update compl_time_stamp_local_clock after completing a cqe	2024-12-14 15:53:02 -08:00
Linus Torvalds	35f301dd45	Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Pull bpf fixes from Daniel Borkmann: - Fix a bug in the BPF verifier to track changes to packet data property for global functions (Eduard Zingerman) - Fix a theoretical BPF prog_array use-after-free in RCU handling of __uprobe_perf_func (Jann Horn) - Fix BPF tracing to have an explicit list of tracepoints and their arguments which need to be annotated as PTR_MAYBE_NULL (Kumar Kartikeya Dwivedi) - Fix a logic bug in the bpf_remove_insns code where a potential error would have been wrongly propagated (Anton Protopopov) - Avoid deadlock scenarios caused by nested kprobe and fentry BPF programs (Priya Bala Govindasamy) - Fix a bug in BPF verifier which was missing a size check for BTF-based context access (Kumar Kartikeya Dwivedi) - Fix a crash found by syzbot through an invalid BPF prog_array access in perf_event_detach_bpf_prog (Jiri Olsa) - Fix several BPF sockmap bugs including a race causing a refcount imbalance upon element replace (Michal Luczaj) - Fix a use-after-free from mismatching BPF program/attachment RCU flavors (Jann Horn) * tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: (23 commits) bpf: Avoid deadlock caused by nested kprobe and fentry bpf programs selftests/bpf: Add tests for raw_tp NULL args bpf: Augment raw_tp arguments with PTR_MAYBE_NULL bpf: Revert "bpf: Mark raw_tp arguments with PTR_MAYBE_NULL" selftests/bpf: Add test for narrow ctx load for pointer args bpf: Check size for BTF-based ctx access of pointer members selftests/bpf: extend changes_pkt_data with cases w/o subprograms bpf: fix null dereference when computing changes_pkt_data of prog w/o subprogs bpf: Fix theoretical prog_array UAF in __uprobe_perf_func() bpf: fix potential error return selftests/bpf: validate that tail call invalidates packet pointers bpf: consider that tail calls invalidate packet pointers selftests/bpf: freplace tests for tracking of changes_packet_data bpf: check changes_pkt_data property for extension programs selftests/bpf: test for changing packet data from global functions bpf: track changes_pkt_data property for global functions bpf: refactor bpf_helper_changes_pkt_data to use helper number bpf: add find_containing_subprog() utility function bpf,perf: Fix invalid prog_array access in perf_event_detach_bpf_prog bpf: Fix UAF via mismatching bpf_prog/attachment RCU flavors ...	2024-12-14 12:58:14 -08:00
Priya Bala Govindasamy	c83508da56	bpf: Avoid deadlock caused by nested kprobe and fentry bpf programs BPF program types like kprobe and fentry can cause deadlocks in certain situations. If a function takes a lock and one of these bpf programs is hooked to some point in the function's critical section, and if the bpf program tries to call the same function and take the same lock it will lead to deadlock. These situations have been reported in the following bug reports. In percpu_freelist - Link: https://lore.kernel.org/bpf/CAADnVQLAHwsa+2C6j9+UC6ScrDaN9Fjqv1WjB1pP9AzJLhKuLQ@mail.gmail.com/T/ Link: https://lore.kernel.org/bpf/CAPPBnEYm+9zduStsZaDnq93q1jPLqO-PiKX9jy0MuL8LCXmCrQ@mail.gmail.com/T/ In bpf_lru_list - Link: https://lore.kernel.org/bpf/CAPPBnEajj+DMfiR_WRWU5=6A7KKULdB5Rob_NJopFLWF+i9gCA@mail.gmail.com/T/ Link: https://lore.kernel.org/bpf/CAPPBnEZQDVN6VqnQXvVqGoB+ukOtHGZ9b9U0OLJJYvRoSsMY_g@mail.gmail.com/T/ Link: https://lore.kernel.org/bpf/CAPPBnEaCB1rFAYU7Wf8UxqcqOWKmRPU1Nuzk3_oLk6qXR7LBOA@mail.gmail.com/T/ Similar bugs have been reported by syzbot. In queue_stack_maps - Link: https://lore.kernel.org/lkml/0000000000004c3fc90615f37756@google.com/ Link: https://lore.kernel.org/all/20240418230932.2689-1-hdanton@sina.com/T/ In lpm_trie - Link: https://lore.kernel.org/linux-kernel/00000000000035168a061a47fa38@google.com/T/ In ringbuf - Link: https://lore.kernel.org/bpf/20240313121345.2292-1-hdanton@sina.com/T/ Prevent kprobe and fentry bpf programs from attaching to these critical sections by removing CC_FLAGS_FTRACE for percpu_freelist.o, bpf_lru_list.o, queue_stack_maps.o, lpm_trie.o, ringbuf.o files. The bugs reported by syzbot are due to tracepoint bpf programs being called in the critical sections. This patch does not aim to fix deadlocks caused by tracepoint programs. However, it does prevent deadlocks from occurring in similar situations due to kprobe and fentry programs. Signed-off-by: Priya Bala Govindasamy <pgovind2@uci.edu> Link: https://lore.kernel.org/r/CAPPBnEZpjGnsuA26Mf9kYibSaGLm=oF6=12L21X1GEQdqjLnzQ@mail.gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-12-14 09:49:27 -08:00
Linus Torvalds	a0e3919a2d	Merge tag 'usb-6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB driver fixes from Greg KH: "Here are some small USB driver fixes for some reported issues. Included in here are: - typec driver bugfixes - u_serial gadget driver bugfix for much reported and discussed issue - dwc2 bugfixes - midi gadget driver bugfix - ehci-hcd driver bugfix - other small bugfixes All of these have been in linux-next for over a week with no reported issues" * tag 'usb-6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: usb: typec: ucsi: Fix connector status writing past buffer size usb: typec: ucsi: Fix completion notifications usb: dwc2: Fix HCD port connection race usb: dwc2: hcd: Fix GetPortStatus & SetPortFeature usb: dwc2: Fix HCD resume usb: gadget: u_serial: Fix the issue that gs_start_io crashed due to accessing null pointer usb: misc: onboard_usb_dev: skip suspend/resume sequence for USB5744 SMBus support usb: dwc3: xilinx: make sure pipe clock is deselected in usb2 only mode usb: core: hcd: only check primary hcd skip_phy_initialization usb: gadget: midi2: Fix interpretation of is_midi1 bits usb: dwc3: imx8mp: fix software node kernel dump usb: typec: anx7411: fix OF node reference leaks in anx7411_typec_switch_probe() usb: typec: anx7411: fix fwnode_handle reference leak usb: host: max3421-hcd: Correctly abort a USB request. dt-bindings: phy: imx8mq-usb: correct reference to usb-switch.yaml usb: ehci-hcd: fix call balance of clocks handling routines	2024-12-14 09:35:22 -08:00
Linus Torvalds	636110be62	Merge tag 'tty-6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull serial driver fixes from Greg KH: "Here are two small serial driver fixes for 6.13-rc3. They are: - ioport build fallout fix for the 8250 port driver that should resolve Guenter's runtime problems - sh-sci driver bugfix for a reported problem Both of these have been in linux-next for a while with no reported issues" * tag 'tty-6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: tty: serial: Work around warning backtrace in serial8250_set_defaults serial: sh-sci: Check if TX data was written to device in .tx_empty()	2024-12-14 09:31:19 -08:00
Linus Torvalds	3de4f6d919	Merge tag 'staging-6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging driver fixes from Greg KH: "Here are some small staging gpib driver build and bugfixes for issues that have been much-reported (should finally fix Guenter's build issues). There are more of these coming in later -rc releases, but for now this should fix the majority of the reported problems. All of these have been in linux-next for a while with no reported issues" * tag 'staging-6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: staging: gpib: Fix i386 build issue staging: gpib: Fix faulty workaround for assignment in if staging: gpib: Workaround for ppc build failure staging: gpib: Make GPIB_NI_PCI_ISA depend on HAS_IOPORT	2024-12-14 09:27:07 -08:00
Linus Torvalds	ec2092915d	Merge tag 'v6.13-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: "Fix a regression in rsassa-pkcs1 as well as a buffer overrun in hisilicon/debugfs" * tag 'v6.13-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: hisilicon/debugfs - fix the struct pointer incorrectly offset problem crypto: rsassa-pkcs1 - Copy source data for SG list	2024-12-14 09:15:49 -08:00
Linus Torvalds	7824850768	Merge tag 'rust-fixes-6.13' of https://github.com/Rust-for-Linux/linux Pull rust fixes from Miguel Ojeda: "Toolchain and infrastructure: - Set bindgen's Rust target version to prevent issues when pairing older rustc releases with newer bindgen releases, such as bindgen >= 0.71.0 and rustc < 1.82 due to unsafe_extern_blocks. drm/panic: - Remove spurious empty line detected by a new Clippy warning" * tag 'rust-fixes-6.13' of https://github.com/Rust-for-Linux/linux: rust: kbuild: set `bindgen`'s Rust target version drm/panic: remove spurious empty line to clean warning	2024-12-14 08:51:43 -08:00
Linus Torvalds	115c0cc251	Merge tag 'iommu-fixes-v6.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux Pull iommu fixes from Joerg Roedel: - Per-domain device-list locking fixes for the AMD IOMMU driver - Fix incorrect use of smp_processor_id() in the NVidia-specific part of the ARM-SMMU-v3 driver - Intel IOMMU driver fixes: - Remove cache tags before disabling ATS - Avoid draining PRQ in sva mm release path - Fix qi_batch NULL pointer with nested parent domain * tag 'iommu-fixes-v6.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux: iommu/vt-d: Avoid draining PRQ in sva mm release path iommu/vt-d: Fix qi_batch NULL pointer with nested parent domain iommu/vt-d: Remove cache tags before disabling ATS iommu/amd: Add lockdep asserts for domain->dev_list iommu/amd: Put list_add/del(dev_data) back under the domain->lock iommu/tegra241-cmdqv: do not use smp_processor_id in preemptible context	2024-12-14 08:41:40 -08:00
Linus Torvalds	5d97859e17	Merge tag 'ata-6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux Pull ata fix from Damien Le Moal: - Fix an OF node reference leak in the sata_highbank driver * tag 'ata-6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux: ata: sata_highbank: fix OF node reference leak in highbank_initialize_phys()	2024-12-14 08:40:05 -08:00
Wolfram Sang	5b6b08af1f	Merge tag 'i2c-host-fixes-6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-current i2c-host-fixes for v6.13-rc3 - Replaced jiffies with msec for timeout calculations. - Added a sentinel to the 'of_device_id' array in Nomadik. - Rounded up bus period calculation in RIIC.	2024-12-14 10:01:46 +01:00
Eric Dumazet	429fde2d81	net: tun: fix tun_napi_alloc_frags() syzbot reported the following crash [1] Issue came with the blamed commit. Instead of going through all the iov components, we keep using the first one and end up with a malformed skb. [1] kernel BUG at net/core/skbuff.c:2849 ! Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI CPU: 0 UID: 0 PID: 6230 Comm: syz-executor132 Not tainted 6.13.0-rc1-syzkaller-00407-g96b6fcc0ee41 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/25/2024 RIP: 0010:__pskb_pull_tail+0x1568/0x1570 net/core/skbuff.c:2848 Code: 38 c1 0f 8c 32 f1 ff ff 4c 89 f7 e8 92 96 74 f8 e9 25 f1 ff ff e8 e8 ae 09 f8 48 8b 5c 24 08 e9 eb fb ff ff e8 d9 ae 09 f8 90 <0f> 0b 66 0f 1f 44 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 RSP: 0018:ffffc90004cbef30 EFLAGS: 00010293 RAX: ffffffff8995c347 RBX: 00000000fffffff2 RCX: ffff88802cf45a00 RDX: 0000000000000000 RSI: 00000000fffffff2 RDI: 0000000000000000 RBP: ffff88807df0c06a R08: ffffffff8995b084 R09: 1ffff1100fbe185c R10: dffffc0000000000 R11: ffffed100fbe185d R12: ffff888076e85d50 R13: ffff888076e85c80 R14: ffff888076e85cf4 R15: ffff888076e85c80 FS: 00007f0dca6ea6c0(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f0dca6ead58 CR3: 00000000119da000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> skb_cow_data+0x2da/0xcb0 net/core/skbuff.c:5284 tipc_aead_decrypt net/tipc/crypto.c:894 [inline] tipc_crypto_rcv+0x402/0x24e0 net/tipc/crypto.c:1844 tipc_rcv+0x57e/0x12a0 net/tipc/node.c:2109 tipc_l2_rcv_msg+0x2bd/0x450 net/tipc/bearer.c:668 __netif_receive_skb_list_ptype net/core/dev.c:5720 [inline] __netif_receive_skb_list_core+0x8b7/0x980 net/core/dev.c:5762 __netif_receive_skb_list net/core/dev.c:5814 [inline] netif_receive_skb_list_internal+0xa51/0xe30 net/core/dev.c:5905 gro_normal_list include/net/gro.h:515 [inline] napi_complete_done+0x2b5/0x870 net/core/dev.c:6256 napi_complete include/linux/netdevice.h:567 [inline] tun_get_user+0x2ea0/0x4890 drivers/net/tun.c:1982 tun_chr_write_iter+0x10d/0x1f0 drivers/net/tun.c:2057 do_iter_readv_writev+0x600/0x880 vfs_writev+0x376/0xba0 fs/read_write.c:1050 do_writev+0x1b6/0x360 fs/read_write.c:1096 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f Fixes: `de4f5fed3f` ("iov_iter: add iter_iovec() helper") Reported-by: syzbot+4f66250f6663c0c1d67e@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/675b61aa.050a0220.599f4.00bb.GAE@google.com/T/#u Cc: stable@vger.kernel.org Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Jens Axboe <axboe@kernel.dk> Acked-by: Willem de Bruijn <willemb@google.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://patch.msgid.link/20241212222247.724674-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-13 19:33:45 -08:00
Linus Torvalds	a446e965a1	Merge tag '6.13-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6 Pull smb client fixes from Steve French: - fix rmmod leak - two minor cleanups - fix for unlink/rename with pending i/o * tag '6.13-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: smb: client: destroy cfid_put_wq on module exit cifs: Use str_yes_no() helper in cifs_ses_add_channel() cifs: Fix rmdir failure due to ongoing I/O on deleted file smb3: fix compiler warning in reparse code	2024-12-13 17:36:02 -08:00
Linus Torvalds	f86135613a	Merge tag 'spi-fix-v6.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "A few fairly small fixes for v6.13, the most substatial one being disabling STIG mode for Cadence QSPI controllers on Altera SoCFPGA platforms since it doesn't work" * tag 'spi-fix-v6.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: spi-cadence-qspi: Disable STIG mode for Altera SoCFPGA. spi: rockchip: Fix PM runtime count on no-op cs spi: aspeed: Fix an error handling path in aspeed_spi_[read\|write]_user()	2024-12-13 17:29:19 -08:00
Linus Torvalds	4e1b4861a2	Merge tag 'regulator-fix-v6.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fixes from Mark Brown: "A couple of additional changes, one ensuring we give AXP717 enough time to stabilise after changing voltages which fixes serious stability issues on some platforms and another documenting the DT support required for the Qualcomm WCN6750" * tag 'regulator-fix-v6.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: axp20x: AXP717: set ramp_delay regulator: dt-bindings: qcom,qca6390-pmu: document wcn6750-pmu	2024-12-13 17:18:28 -08:00
Linus Torvalds	e72da82d5a	Merge tag 'drm-fixes-2024-12-14' of https://gitlab.freedesktop.org/drm/kernel Pull drm fixes from Dave Airlie: "This is the weekly fixes pull for drm. Just has i915, xe and amdgpu changes in it. Nothing too major in here: i915: - Don't use indexed register writes needlessly [dsb] - Stop using non-posted DSB writes for legacy LUT [color] - Fix NULL pointer dereference in capture_engine - Fix memory leak by correcting cache object name in error handler xe: - Fix a KUNIT test error message (Mirsad Todorovac) - Fix an invalidation fence PM ref leak (Daniele) - Fix a register pool UAF (Lucas) amdgpu: - ISP hw init fix - SR-IOV fixes - Fix contiguous VRAM mapping for UVD on older GPUs - Fix some regressions due to drm scheduler changes - Workload profile fixes - Cleaner shader fix amdkfd: - Fix DMA map direction for migration - Fix a potential null pointer dereference - Cacheline size fixes - Runtime PM fix" * tag 'drm-fixes-2024-12-14' of https://gitlab.freedesktop.org/drm/kernel: drm/xe/reg_sr: Remove register pool drm/xe: Call invalidation_fence_fini for PT inval fences in error state drm/xe: fix the ERR_PTR() returned on failure to allocate tiny pt drm/amdkfd: pause autosuspend when creating pdd drm/amdgpu: fix when the cleaner shader is emitted drm/amdgpu: Fix ISP HW init issue drm/amdkfd: hard-code MALL cacheline size for gfx11, gfx12 drm/amdkfd: hard-code cacheline size for gfx11 drm/amdkfd: Dereference null return value drm/i915: Fix memory leak by correcting cache object name in error handler drm/i915: Fix NULL pointer dereference in capture_engine drm/i915/color: Stop using non-posted DSB writes for legacy LUT drm/i915/dsb: Don't use indexed register writes needlessly drm/amdkfd: Correct the migration DMA map direction drm/amd/pm: Set SMU v13.0.7 default workload type drm/amd/pm: Initialize power profile mode amdgpu/uvd: get ring reference from rq scheduler drm/amdgpu: fix UVD contiguous CS mapping problem drm/amdgpu: use sjt mec fw on gfx943 for sriov Revert "drm/amdgpu: Fix ISP hw init issue"	2024-12-13 16:58:39 -08:00
Linus Torvalds	974acf9974	Merge tag 'pm-6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management documentation fix from Rafael Wysocki: "Fix a runtime PM documentation mistake that may mislead someone into making a coding mistake (Paul Barker)" * tag 'pm-6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: Documentation: PM: Clarify pm_runtime_resume_and_get() return value	2024-12-13 16:55:43 -08:00
Linus Torvalds	c810e8df96	Merge tag 'acpi-6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "These fix two coding mistakes, one in the ACPI resources handling code and one in ACPICA: - Relocate the addr->info.mem.caching check in acpi_decode_space() to only execute it if the resource is of the correct type (Ilpo Järvinen) - Don't release a context_mutex that was never acquired in acpi_remove_address_space_handler() (Daniil Tatianin)" * tag 'acpi-6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPICA: events/evxfregn: don't release the ContextMutex that was never acquired ACPI: resource: Fix memory resource type union access	2024-12-13 16:51:56 -08:00
Alexei Starovoitov	a8e1a3ddf7	Merge branch 'explicit-raw_tp-null-arguments' Kumar Kartikeya Dwivedi says: ==================== Explicit raw_tp NULL arguments This set reverts the raw_tp masking changes introduced in commit `cb4158ce8e` ("bpf: Mark raw_tp arguments with PTR_MAYBE_NULL") and replaces it wwith an explicit list of tracepoints and their arguments which need to be annotated as PTR_MAYBE_NULL. More context on the fallout caused by the masking fix and subsequent discussions can be found in [0]. To remedy this, we implement a solution of explicitly defined tracepoint and define which args need to be marked NULL or scalar (for IS_ERR case). The commit logs describes the details of this approach in detail. We will follow up this solution an approach Eduard is working on to perform automated analysis of NULL-ness of tracepoint arguments. The current PoC is available here: - LLVM branch with the analysis: https://github.com/eddyz87/llvm-project/tree/nullness-for-tracepoint-params - Python script for merging of analysis results: https://gist.github.com/eddyz87/e47c164466a60e8d49e6911cff146f47 The idea is to infer a tri-state verdict for each tracepoint parameter: definitely not null, can be null, unknown (in which case no assumptions should be made). Using this information, the verifier in most cases will be able to precisely determine the state of the tracepoint parameter without any human effort. At that point, the table maintained manually in this set can be dropped and replace with this automated analysis tool's result. This will be kept up to date with each kernel release. [0]: https://lore.kernel.org/bpf/20241206161053.809580-1-memxor@gmail.com Changelog: ---------- v2 -> v3: v2: https://lore.kernel.org/bpf/20241213175127.2084759-1-memxor@gmail.com * Address Eduard's nits, add Reviewed-by v1 -> v2: v1: https://lore.kernel.org/bpf/20241211020156.18966-1-memxor@gmail.com * Address comments from Jiri * Mark module tracepoints args NULL by default * Add more sunrpc tracepoints * Unify scalar or null handling * Address comments from Alexei * Use bitmask approach suggested in review * Unify scalar or null handling * Drop most tests that rely on CONFIG options * Drop scripts to generate tests ==================== Link: https://patch.msgid.link/20241213221929.3495062-1-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-12-13 16:24:54 -08:00
Kumar Kartikeya Dwivedi	0da1955b5b	selftests/bpf: Add tests for raw_tp NULL args Add tests to ensure that arguments are correctly marked based on their specified positions, and whether they get marked correctly as maybe null. For modules, all tracepoint parameters should be marked PTR_MAYBE_NULL by default. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20241213221929.3495062-4-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-12-13 16:24:53 -08:00
Kumar Kartikeya Dwivedi	838a10bd2e	bpf: Augment raw_tp arguments with PTR_MAYBE_NULL Arguments to a raw tracepoint are tagged as trusted, which carries the semantics that the pointer will be non-NULL. However, in certain cases, a raw tracepoint argument may end up being NULL. More context about this issue is available in [0]. Thus, there is a discrepancy between the reality, that raw_tp arguments can actually be NULL, and the verifier's knowledge, that they are never NULL, causing explicit NULL check branch to be dead code eliminated. A previous attempt [1], i.e. the second fixed commit, was made to simulate symbolic execution as if in most accesses, the argument is a non-NULL raw_tp, except for conditional jumps. This tried to suppress branch prediction while preserving compatibility, but surfaced issues with production programs that were difficult to solve without increasing verifier complexity. A more complete discussion of issues and fixes is available at [2]. Fix this by maintaining an explicit list of tracepoints where the arguments are known to be NULL, and mark the positional arguments as PTR_MAYBE_NULL. Additionally, capture the tracepoints where arguments are known to be ERR_PTR, and mark these arguments as scalar values to prevent potential dereference. Each hex digit is used to encode NULL-ness (0x1) or ERR_PTR-ness (0x2), shifted by the zero-indexed argument number x 4. This can be represented as follows: 1st arg: 0x1 2nd arg: 0x10 3rd arg: 0x100 ... and so on (likewise for ERR_PTR case). In the future, an automated pass will be used to produce such a list, or insert __nullable annotations automatically for tracepoints. Each compilation unit will be analyzed and results will be collated to find whether a tracepoint pointer is definitely not null, maybe null, or an unknown state where verifier conservatively marks it PTR_MAYBE_NULL. A proof of concept of this tool from Eduard is available at [3]. Note that in case we don't find a specification in the raw_tp_null_args array and the tracepoint belongs to a kernel module, we will conservatively mark the arguments as PTR_MAYBE_NULL. This is because unlike for in-tree modules, out-of-tree module tracepoints may pass NULL freely to the tracepoint. We don't protect against such tracepoints passing ERR_PTR (which is uncommon anyway), lest we mark all such arguments as SCALAR_VALUE. While we are it, let's adjust the test raw_tp_null to not perform dereference of the skb->mark, as that won't be allowed anymore, and make it more robust by using inline assembly to test the dead code elimination behavior, which should still stay the same. [0]: https://lore.kernel.org/bpf/ZrCZS6nisraEqehw@jlelli-thinkpadt14gen4.remote.csb [1]: https://lore.kernel.org/all/20241104171959.2938862-1-memxor@gmail.com [2]: https://lore.kernel.org/bpf/20241206161053.809580-1-memxor@gmail.com [3]: https://github.com/eddyz87/llvm-project/tree/nullness-for-tracepoint-params Reported-by: Juri Lelli <juri.lelli@redhat.com> # original bug Reported-by: Manu Bretelle <chantra@meta.com> # bugs in masking fix Fixes: `3f00c52393` ("bpf: Allow trusted pointers to be passed to KF_TRUSTED_ARGS kfuncs") Fixes: `cb4158ce8e` ("bpf: Mark raw_tp arguments with PTR_MAYBE_NULL") Reviewed-by: Eduard Zingerman <eddyz87@gmail.com> Co-developed-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20241213221929.3495062-3-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-12-13 16:24:53 -08:00
Kumar Kartikeya Dwivedi	c00d738e16	bpf: Revert "bpf: Mark raw_tp arguments with PTR_MAYBE_NULL" This patch reverts commit `cb4158ce8e` ("bpf: Mark raw_tp arguments with PTR_MAYBE_NULL"). The patch was well-intended and meant to be as a stop-gap fixing branch prediction when the pointer may actually be NULL at runtime. Eventually, it was supposed to be replaced by an automated script or compiler pass detecting possibly NULL arguments and marking them accordingly. However, it caused two main issues observed for production programs and failed to preserve backwards compatibility. First, programs relied on the verifier not exploring == NULL branch when pointer is not NULL, thus they started failing with a 'dereference of scalar' error. Next, allowing raw_tp arguments to be modified surfaced the warning in the verifier that warns against reg->off when PTR_MAYBE_NULL is set. More information, context, and discusson on both problems is available in [0]. Overall, this approach had several shortcomings, and the fixes would further complicate the verifier's logic, and the entire masking scheme would have to be removed eventually anyway. Hence, revert the patch in preparation of a better fix avoiding these issues to replace this commit. [0]: https://lore.kernel.org/bpf/20241206161053.809580-1-memxor@gmail.com Reported-by: Manu Bretelle <chantra@meta.com> Fixes: `cb4158ce8e` ("bpf: Mark raw_tp arguments with PTR_MAYBE_NULL") Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20241213221929.3495062-2-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-12-13 16:24:53 -08:00
Linus Torvalds	c30c65f3fe	Merge tag 'block-6.13-20241213' of git://git.kernel.dk/linux Pull block fixes from Jens Axboe: - Series from Damien fixing issues with the zoned write plugging - Fix for a potential UAF in block cgroups - Fix deadlock around queue freezing and the sysfs lock - Various little cleanups and fixes * tag 'block-6.13-20241213' of git://git.kernel.dk/linux: block: Fix potential deadlock while freezing queue and acquiring sysfs_lock block: Fix queue_iostats_passthrough_show() blk-mq: Clean up blk_mq_requeue_work() mq-deadline: Remove a local variable blk-iocost: Avoid using clamp() on inuse in __propagate_weights() block: Make bio_iov_bvec_set() accept pointer to const iov_iter block: get wp_offset by bdev_offset_from_zone_start blk-cgroup: Fix UAF in blkcg_unpin_online() MAINTAINERS: update Coly Li's email address block: Prevent potential deadlocks in zone write plug error recovery dm: Fix dm-zoned-reclaim zone write pointer alignment block: Ignore REQ_NOWAIT for zone reset and zone finish operations block: Use a zone write plug BIO work for REQ_NOWAIT BIOs	2024-12-13 15:10:59 -08:00
Linus Torvalds	2a816e4f6a	Merge tag 'io_uring-6.13-20241213' of git://git.kernel.dk/linux Pull io_uring fix from Jens Axboe: "A single fix for a regression introduced in the 6.13 merge window" * tag 'io_uring-6.13-20241213' of git://git.kernel.dk/linux: io_uring/rsrc: don't put/free empty buffers	2024-12-13 15:08:55 -08:00
Linus Torvalds	1c021e7908	Merge tag 'libnvdimm-fixes-6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm fix from Ira Weiny: - sysbot fix for out of bounds access * tag 'libnvdimm-fixes-6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: acpi: nfit: vmalloc-out-of-bounds Read in acpi_nfit_ctl	2024-12-13 15:07:22 -08:00
Leon Romanovsky	824927e884	ARC: build: Try to guess GCC variant of cross compiler ARC GCC compiler is packaged starting from Fedora 39i and the GCC variant of cross compile tools has arc-linux-gnu- prefix and not arc-linux-. This is causing that CROSS_COMPILE variable is left unset. This change allows builds without need to supply CROSS_COMPILE argument if distro package is used. Before this change: $ make -j 128 ARCH=arc W=1 drivers/infiniband/hw/mlx4/ gcc: warning: ‘-mcpu=’ is deprecated; use ‘-mtune=’ or ‘-march=’ instead gcc: error: unrecognized command-line option ‘-mmedium-calls’ gcc: error: unrecognized command-line option ‘-mlock’ gcc: error: unrecognized command-line option ‘-munaligned-access’ [1] https://packages.fedoraproject.org/pkgs/cross-gcc/gcc-arc-linux-gnu/index.html Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Vineet Gupta <vgupta@kernel.org>	2024-12-13 14:39:24 -08:00
Linus Torvalds	4800575d8c	Merge tag 'xfs-fixes-6.13-rc3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux Pull xfs fixes from Carlos Maiolino: - Fixes for scrub subsystem - Fix quota crashes - Fix sb_spino_align checons on large fsblock sizes - Fix discarded superblock updates - Fix stuck unmount due to a locked inode * tag 'xfs-fixes-6.13-rc3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (28 commits) xfs: port xfs_ioc_start_commit to multigrain timestamps xfs: return from xfs_symlink_verify early on V4 filesystems xfs: fix zero byte checking in the superblock scrubber xfs: check pre-metadir fields correctly xfs: don't crash on corrupt /quotas dirent xfs: don't move nondir/nonreg temporary repair files to the metadir namespace xfs: fix sb_spino_align checks for large fsblock sizes xfs: convert quotacheck to attach dquot buffers xfs: attach dquot buffer to dquot log item buffer xfs: clean up log item accesses in xfs_qm_dqflush{,_done} xfs: separate dquot buffer reads from xfs_dqflush xfs: don't lose solo dquot update transactions xfs: don't lose solo superblock counter update transactions xfs: avoid nested calls to __xfs_trans_commit xfs: only run precommits once per transaction object xfs: unlock inodes when erroring out of xfs_trans_alloc_dir xfs: fix scrub tracepoints when inode-rooted btrees are involved xfs: update btree keys correctly when _insrec splits an inode root block xfs: fix error bailout in xfs_rtginode_create xfs: fix null bno_hint handling in xfs_rtallocate_rtg ...	2024-12-13 14:27:40 -08:00
Linus Torvalds	01af50af76	Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - arm64 stacktrace: address some fallout from the recent changes to unwinding across exception boundaries - Ensure the arm64 signal delivery failure is recoverable - only override the return registers after all the user accesses took place - Fix the arm64 kselftest access to SVCR - only when SME is detected * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: kselftest/arm64: abi: fix SVCR detection arm64: signal: Ensure signal delivery failure is recoverable arm64: stacktrace: Don't WARN when unwinding other tasks arm64: stacktrace: Skip reporting LR at exception boundaries	2024-12-13 14:23:09 -08:00
Linus Torvalds	a5b3d5dfce	Merge tag 'riscv-for-linus-6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - avoid taking a mutex while resolving jump_labels in the mutex implementation - avoid trying to resolve the early boot DT pointer via the linear map - avoid trying to IPI kfence TLB flushes, as kfence might flush with IRQs disabled - avoid calling PMD destructors on PMDs that were never constructed * tag 'riscv-for-linus-6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: mm: Do not call pmd dtor on vmemmap page table teardown riscv: Fix IPIs usage in kfence_protect_page() riscv: Fix wrong usage of __pa() on a fixmap address riscv: Fixup boot failure when CONFIG_DEBUG_RT_MUTEXES=y	2024-12-13 14:18:18 -08:00
Ville Syrjälä	9398332f23	drm/modes: Avoid divide by zero harder in drm_mode_vrefresh() drm_mode_vrefresh() is trying to avoid divide by zero by checking whether htotal or vtotal are zero. But we may still end up with a div-by-zero of vtotalhtotal... Cc: stable@vger.kernel.org Reported-by: syzbot+622bba18029bcde672e1@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=622bba18029bcde672e1 Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241129042629.18280-2-ville.syrjala@linux.intel.com Reviewed-by: Jani Nikula <jani.nikula@intel.com>	2024-12-14 00:05:32 +02:00
Rafael J. Wysocki	e14d5ae28e	Merge branch 'acpica' Merge an ACPICA fix for 6.13-rc3: - Don't release a context_mutex that was never acquired in acpi_remove_address_space_handler() (Daniil Tatianin). * acpica: ACPICA: events/evxfregn: don't release the ContextMutex that was never acquired	2024-12-13 21:32:06 +01:00
Paolo Bonzini	3522c41975	Merge tag 'kvm-riscv-fixes-6.13-1' of https://github.com/kvm-riscv/linux into HEAD KVM/riscv fixes for 6.13, take #1 - Replace csr_write() with csr_set() for HVIEN PMU overflow bit	2024-12-13 13:59:20 -05:00
Sean Christopherson	1201f226c8	KVM: x86: Cache CPUID.0xD XSTATE offsets+sizes during module init Snapshot the output of CPUID.0xD.[1..n] during kvm.ko initiliaization to avoid the overead of CPUID during runtime. The offset, size, and metadata for CPUID.0xD.[1..n] sub-leaves does not depend on XCR0 or XSS values, i.e. is constant for a given CPU, and thus can be cached during module load. On Intel's Emerald Rapids, CPUID is wildly expensive, to the point where recomputing XSAVE offsets and sizes results in a 4x increase in latency of nested VM-Enter and VM-Exit (nested transitions can trigger xstate_required_size() multiple times per transition), relative to using cached values. The issue is easily visible by running `perf top` while triggering nested transitions: kvm_update_cpuid_runtime() shows up at a whopping 50%. As measured via RDTSC from L2 (using KVM-Unit-Test's CPUID VM-Exit test and a slightly modified L1 KVM to handle CPUID in the fastpath), a nested roundtrip to emulate CPUID on Skylake (SKX), Icelake (ICX), and Emerald Rapids (EMR) takes: SKX 11650 ICX 22350 EMR 28850 Using cached values, the latency drops to: SKX 6850 ICX 9000 EMR 7900 The underlying issue is that CPUID itself is slow on ICX, and comically slow on EMR. The problem is exacerbated on CPUs which support XSAVES and/or XSAVEC, as KVM invokes xstate_required_size() twice on each runtime CPUID update, and because there are more supported XSAVE features (CPUID for supported XSAVE feature sub-leafs is significantly slower). SKX: CPUID.0xD.2 = 348 cycles CPUID.0xD.3 = 400 cycles CPUID.0xD.4 = 276 cycles CPUID.0xD.5 = 236 cycles <other sub-leaves are similar> EMR: CPUID.0xD.2 = 1138 cycles CPUID.0xD.3 = 1362 cycles CPUID.0xD.4 = 1068 cycles CPUID.0xD.5 = 910 cycles CPUID.0xD.6 = 914 cycles CPUID.0xD.7 = 1350 cycles CPUID.0xD.8 = 734 cycles CPUID.0xD.9 = 766 cycles CPUID.0xD.10 = 732 cycles CPUID.0xD.11 = 718 cycles CPUID.0xD.12 = 734 cycles CPUID.0xD.13 = 1700 cycles CPUID.0xD.14 = 1126 cycles CPUID.0xD.15 = 898 cycles CPUID.0xD.16 = 716 cycles CPUID.0xD.17 = 748 cycles CPUID.0xD.18 = 776 cycles Note, updating runtime CPUID information multiple times per nested transition is itself a flaw, especially since CPUID is a mandotory intercept on both Intel and AMD. E.g. KVM doesn't need to ensure emulated CPUID state is up-to-date while running L2. That flaw will be fixed in a future patch, as deferring runtime CPUID updates is more subtle than it appears at first glance, the benefits aren't super critical to have once the XSAVE issue is resolved, and caching CPUID output is desirable even if KVM's updates are deferred. Cc: Jim Mattson <jmattson@google.com> Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20241211013302.1347853-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-12-13 13:58:10 -05:00
Linus Torvalds	243f750a2d	Merge tag 'gpio-fixes-for-v6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux Pull gpio fixes from Bartosz Golaszewski: - fix several low-level issues in gpio-graniterapids - fix an initialization order issue that manifests itself with __counted_by() checks in gpio-ljca - don't default to y for CONFIG_GPIO_MVEBU with COMPILE_TEST enabled - move the DEFAULT_SYMBOL_NAMESPACE define before the export.h include in gpio-idio-16 * tag 'gpio-fixes-for-v6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: gpio: idio-16: Actually make use of the GPIO_IDIO_16 symbol namespace gpio: graniterapids: Fix GPIO Ack functionality gpio: graniterapids: Check if GPIO line can be used for IRQs gpio: graniterapids: Determine if GPIO pad can be used by driver gpio: graniterapids: Fix invalid RXEVCFG register bitmask gpio: graniterapids: Fix invalid GPI_IS register offset gpio: graniterapids: Fix incorrect BAR assignment gpio: graniterapids: Fix vGPIO driver crash gpio: ljca: Initialize num before accessing item in ljca_gpio_config gpio: GPIO_MVEBU should not default to y when compile-testing	2024-12-13 09:57:15 -08:00
Nilay Shroff	be26ba9642	block: Fix potential deadlock while freezing queue and acquiring sysfs_lock For storing a value to a queue attribute, the queue_attr_store function first freezes the queue (->q_usage_counter(io)) and then acquire ->sysfs_lock. This seems not correct as the usual ordering should be to acquire ->sysfs_lock before freezing the queue. This incorrect ordering causes the following lockdep splat which we are able to reproduce always simply by accessing /sys/kernel/debug file using ls command: [ 57.597146] WARNING: possible circular locking dependency detected [ 57.597154] 6.12.0-10553-gb86545e02e8c #20 Tainted: G W [ 57.597162] ------------------------------------------------------ [ 57.597168] ls/4605 is trying to acquire lock: [ 57.597176] c00000003eb56710 (&mm->mmap_lock){++++}-{4:4}, at: __might_fault+0x58/0xc0 [ 57.597200] but task is already holding lock: [ 57.597207] c0000018e27c6810 (&sb->s_type->i_mutex_key#3){++++}-{4:4}, at: iterate_dir+0x94/0x1d4 [ 57.597226] which lock already depends on the new lock. [ 57.597233] the existing dependency chain (in reverse order) is: [ 57.597241] -> #5 (&sb->s_type->i_mutex_key#3){++++}-{4:4}: [ 57.597255] down_write+0x6c/0x18c [ 57.597264] start_creating+0xb4/0x24c [ 57.597274] debugfs_create_dir+0x2c/0x1e8 [ 57.597283] blk_register_queue+0xec/0x294 [ 57.597292] add_disk_fwnode+0x2e4/0x548 [ 57.597302] brd_alloc+0x2c8/0x338 [ 57.597309] brd_init+0x100/0x178 [ 57.597317] do_one_initcall+0x88/0x3e4 [ 57.597326] kernel_init_freeable+0x3cc/0x6e0 [ 57.597334] kernel_init+0x34/0x1cc [ 57.597342] ret_from_kernel_user_thread+0x14/0x1c [ 57.597350] -> #4 (&q->debugfs_mutex){+.+.}-{4:4}: [ 57.597362] __mutex_lock+0xfc/0x12a0 [ 57.597370] blk_register_queue+0xd4/0x294 [ 57.597379] add_disk_fwnode+0x2e4/0x548 [ 57.597388] brd_alloc+0x2c8/0x338 [ 57.597395] brd_init+0x100/0x178 [ 57.597402] do_one_initcall+0x88/0x3e4 [ 57.597410] kernel_init_freeable+0x3cc/0x6e0 [ 57.597418] kernel_init+0x34/0x1cc [ 57.597426] ret_from_kernel_user_thread+0x14/0x1c [ 57.597434] -> #3 (&q->sysfs_lock){+.+.}-{4:4}: [ 57.597446] __mutex_lock+0xfc/0x12a0 [ 57.597454] queue_attr_store+0x9c/0x110 [ 57.597462] sysfs_kf_write+0x70/0xb0 [ 57.597471] kernfs_fop_write_iter+0x1b0/0x2ac [ 57.597480] vfs_write+0x3dc/0x6e8 [ 57.597488] ksys_write+0x84/0x140 [ 57.597495] system_call_exception+0x130/0x360 [ 57.597504] system_call_common+0x160/0x2c4 [ 57.597516] -> #2 (&q->q_usage_counter(io)#21){++++}-{0:0}: [ 57.597530] __submit_bio+0x5ec/0x828 [ 57.597538] submit_bio_noacct_nocheck+0x1e4/0x4f0 [ 57.597547] iomap_readahead+0x2a0/0x448 [ 57.597556] xfs_vm_readahead+0x28/0x3c [ 57.597564] read_pages+0x88/0x41c [ 57.597571] page_cache_ra_unbounded+0x1ac/0x2d8 [ 57.597580] filemap_get_pages+0x188/0x984 [ 57.597588] filemap_read+0x13c/0x4bc [ 57.597596] xfs_file_buffered_read+0x88/0x17c [ 57.597605] xfs_file_read_iter+0xac/0x158 [ 57.597614] vfs_read+0x2d4/0x3b4 [ 57.597622] ksys_read+0x84/0x144 [ 57.597629] system_call_exception+0x130/0x360 [ 57.597637] system_call_common+0x160/0x2c4 [ 57.597647] -> #1 (mapping.invalidate_lock#2){++++}-{4:4}: [ 57.597661] down_read+0x6c/0x220 [ 57.597669] filemap_fault+0x870/0x100c [ 57.597677] xfs_filemap_fault+0xc4/0x18c [ 57.597684] __do_fault+0x64/0x164 [ 57.597693] __handle_mm_fault+0x1274/0x1dac [ 57.597702] handle_mm_fault+0x248/0x484 [ 57.597711] ___do_page_fault+0x428/0xc0c [ 57.597719] hash__do_page_fault+0x30/0x68 [ 57.597727] do_hash_fault+0x90/0x35c [ 57.597736] data_access_common_virt+0x210/0x220 [ 57.597745] _copy_from_user+0xf8/0x19c [ 57.597754] sel_write_load+0x178/0xd54 [ 57.597762] vfs_write+0x108/0x6e8 [ 57.597769] ksys_write+0x84/0x140 [ 57.597777] system_call_exception+0x130/0x360 [ 57.597785] system_call_common+0x160/0x2c4 [ 57.597794] -> #0 (&mm->mmap_lock){++++}-{4:4}: [ 57.597806] __lock_acquire+0x17cc/0x2330 [ 57.597814] lock_acquire+0x138/0x400 [ 57.597822] __might_fault+0x7c/0xc0 [ 57.597830] filldir64+0xe8/0x390 [ 57.597839] dcache_readdir+0x80/0x2d4 [ 57.597846] iterate_dir+0xd8/0x1d4 [ 57.597855] sys_getdents64+0x88/0x2d4 [ 57.597864] system_call_exception+0x130/0x360 [ 57.597872] system_call_common+0x160/0x2c4 [ 57.597881] other info that might help us debug this: [ 57.597888] Chain exists of: &mm->mmap_lock --> &q->debugfs_mutex --> &sb->s_type->i_mutex_key#3 [ 57.597905] Possible unsafe locking scenario: [ 57.597911] CPU0 CPU1 [ 57.597917] ---- ---- [ 57.597922] rlock(&sb->s_type->i_mutex_key#3); [ 57.597932] lock(&q->debugfs_mutex); [ 57.597940] lock(&sb->s_type->i_mutex_key#3); [ 57.597950] rlock(&mm->mmap_lock); [ 57.597958] * DEADLOCK * [ 57.597965] 2 locks held by ls/4605: [ 57.597971] #0: c0000000137c12f8 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0xcc/0x154 [ 57.597989] #1: c0000018e27c6810 (&sb->s_type->i_mutex_key#3){++++}-{4:4}, at: iterate_dir+0x94/0x1d4 Prevent the above lockdep warning by acquiring ->sysfs_lock before freezing the queue while storing a queue attribute in queue_attr_store function. Later, we also found[1] another function __blk_mq_update_nr_ hw_queues where we first freeze queue and then acquire the ->sysfs_lock. So we've also updated lock ordering in __blk_mq_update_nr_hw_queues function and ensured that in all code paths we follow the correct lock ordering i.e. acquire ->sysfs_lock before freezing the queue. [1] https://lore.kernel.org/all/CAFj5m9Ke8+EHKQBs_Nk6hqd=LGXtk4mUxZUN5==ZcCjnZSBwHw@mail.gmail.com/ Reported-by: kjain@linux.ibm.com Fixes: `af28141498` ("block: freeze the queue in queue_attr_store") Tested-by: kjain@linux.ibm.com Cc: hch@lst.de Cc: axboe@kernel.dk Cc: ritesh.list@gmail.com Cc: ming.lei@redhat.com Cc: gjoyce@linux.ibm.com Signed-off-by: Nilay Shroff <nilay@linux.ibm.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20241210144222.1066229-1-nilay@linux.ibm.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-13 10:51:58 -07:00
Linus Torvalds	de20dc2b96	Merge tag 'sound-6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "A collection of fixes; all look small, device-specific and boring" * tag 'sound-6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ASoC: Intel: sof_sdw: Add space for a terminator into DAIs array ASoC: fsl_spdif: change IFACE_PCM to IFACE_MIXER ASoC: fsl_xcvr: change IFACE_PCM to IFACE_MIXER ASoC: tas2781: Fix calibration issue in stress test ASoC: audio-graph-card: Call of_node_put() on correct node ASoC: amd: yc: Fix the wrong return value ALSA: control: Avoid WARN() for symlink errors sound: usb: format: don't warn that raw DSD is unsupported sound: usb: enable DSD output for ddHiFi TC44C ALSA: hda/realtek: Add new alc2xx-fixup-headset-mic model ALSA: hda/ca0132: Use standard HD-audio quirk matching helpers ALSA: usb-audio: Add implicit feedback quirk for Yamaha THR5 ALSA: hda/realtek - Add support for ASUS Zen AIO 27 Z272SD_A272SD audio ALSA: hda/realtek: Fix headset mic on Acer Nitro 5 ALSA: hda: cs35l56: Remove calls to cs35l56_force_sync_asp1_registers_from_cache()	2024-12-13 09:51:40 -08:00
Linus Torvalds	a3170b7d93	Merge tag 'docs-6.13-fix' of git://git.lwn.net/linux Pull documentation fix from Jonathan Corbet: "A single fix for a docs-build regression caused by the EXPORT_SYMBOL_NS() mass change" * tag 'docs-6.13-fix' of git://git.lwn.net/linux: scripts/kernel-doc: Get -export option working again	2024-12-13 09:46:02 -08:00
Linus Torvalds	266facde83	Merge tag 'slab-for-6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab Pull slab fix from Vlastimil Babka: - Fix for memcg unreclaimable slab stats drift when post-charging large kmalloc allocations (Shakeel Butt) * tag 'slab-for-6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: memcg: slub: fix SUnreclaim for post charged objects	2024-12-13 09:43:50 -08:00
Marc Zyngier	773c05f417	irqchip/gic-v3: Work around insecure GIC integrations It appears that the relatively popular RK3399 SoC has been put together using a large amount of illicit substances, as experiments reveal that its integration of GIC500 exposes the secure programming interface to non-secure. This has some pretty bad effects on the way priorities are handled, and results in a dead machine if booting with pseudo-NMI enabled (irqchip.gicv3_pseudo_nmi=1) if the kernel contains `18fdb6348c` ("arm64: irqchip/gic-v3: Select priorities at boot time"), which relies on the priorities being programmed using the NS view. Let's restore some sanity by going one step further and disable security altogether in this case. This is not any worse, and puts us in a mode where priorities actually make some sense. Huge thanks to Mark Kettenis who initially identified this issue on OpenBSD, and to Chen-Yu Tsai who reported the problem in Linux. Fixes: `18fdb6348c` ("arm64: irqchip/gic-v3: Select priorities at boot time") Reported-by: Mark Kettenis <mark.kettenis@xs4all.nl> Reported-by: Chen-Yu Tsai <wens@csie.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Chen-Yu Tsai <wens@csie.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20241213141037.3995049-1-maz@kernel.org	2024-12-13 18:15:29 +01:00
Uros Bizjak	a1855f1b7c	irqchip/gic: Correct declaration of percpu_base pointer in union gic_base percpu_base is used in various percpu functions that expect variable in __percpu address space. Correct the declaration of percpu_base to void __iomem __percpu percpu_base; to declare the variable as __percpu pointer. The patch fixes several sparse warnings: irq-gic.c:1172:44: warning: incorrect type in assignment (different address spaces) irq-gic.c:1172:44: expected void [noderef] __percpu [noderef] __iomem percpu_base irq-gic.c:1172:44: got void [noderef] __iomem [noderef] __percpu * ... irq-gic.c:1231:43: warning: incorrect type in argument 1 (different address spaces) irq-gic.c:1231:43: expected void [noderef] __percpu __pdata irq-gic.c:1231:43: got void [noderef] __percpu [noderef] __iomem *percpu_base There were no changes in the resulting object files. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/all/20241213145809.2918-2-ubizjak@gmail.com	2024-12-13 18:11:52 +01:00
Krzysztof Karas	080b2e7b5e	drm/display: use ERR_PTR on DP tunnel manager creation fail Instead of returning a generic NULL on error from drm_dp_tunnel_mgr_create(), use error pointers with informative codes to align the function with stub that is executed when CONFIG_DRM_DISPLAY_DP_TUNNEL is unset. This will also trigger IS_ERR() in current caller (intel_dp_tunnerl_mgr_init()) instead of bypassing it via NULL pointer. v2: use error codes inside drm_dp_tunnel_mgr_create() instead of handling on caller's side (Michal, Imre) v3: fixup commit message and add "CC"/"Fixes" lines (Andi), mention aligning function code with stub Fixes: `91888b5b1a` ("drm/i915/dp: Add support for DP tunnel BW allocation") Cc: Imre Deak <imre.deak@intel.com> Cc: <stable@vger.kernel.org> # v6.9+ Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/7q4fpnmmztmchczjewgm6igy55qt6jsm7tfd4fl4ucfq6yg2oy@q4lxtsu6445c	2024-12-13 18:57:34 +02:00
Arnd Bergmann	8b55f88189	media: mediatek: vcodec: mark vdec_vp9_slice_map_counts_eob_coef noinline With KASAN enabled, clang fails to optimize the inline version of vdec_vp9_slice_map_counts_eob_coef() properly, leading to kilobytes of temporary values spilled to the stack: drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_vp9_req_lat_if.c:1526:12: error: stack frame size (2160) exceeds limit (2048) in 'vdec_vp9_slice_update_prob' [-Werror,-Wframe-larger-than] This seems to affect all versions of clang including the latest (clang-20), but the degree of stack overhead is different per release. Marking the function as noinline_for_stack is harmless here and avoids the problem completely. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Sebastian Fricke <sebastian.fricke@collabora.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>	2024-12-13 17:51:35 +01:00
Bernd Schubert	78f2560fc9	fuse: Set nbytesp=0 in fuse_get_user_pages on allocation failure In fuse_get_user_pages(), set nbytesp to 0 when struct page *pages allocation fails. This prevents the caller (fuse_direct_io) from making incorrect assumptions that could lead to NULL pointer dereferences when processing the request reply. Previously, nbytesp was left unmodified on allocation failure, which could cause issues if the caller assumed pages had been added to ap->descs[] when they hadn't. Reported-by: syzbot+87b8e6ed25dbc41759f7@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=87b8e6ed25dbc41759f7 Fixes: `3b97c3652d` ("fuse: convert direct io to use folios") Signed-off-by: Bernd Schubert <bschubert@ddn.com> Reviewed-by: Joanne Koong <joannelkoong@gmail.com> Tested-by: Dmitry Antipov <dmantipov@yandex.ru> Tested-by: David Howells <dhowells@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2024-12-13 16:43:36 +01:00
Bart Van Assche	a6fe7b7051	block: Fix queue_iostats_passthrough_show() Make queue_iostats_passthrough_show() report 0/1 in sysfs instead of 0/4. This patch fixes the following sparse warning: block/blk-sysfs.c:266:31: warning: incorrect type in argument 1 (different base types) block/blk-sysfs.c:266:31: expected unsigned long var block/blk-sysfs.c:266:31: got restricted blk_flags_t Cc: Keith Busch <kbusch@kernel.org> Cc: Christoph Hellwig <hch@lst.de> Fixes: `110234da18` ("block: enable passthrough command statistics") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20241212212941.1268662-4-bvanassche@acm.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-13 08:09:43 -07:00
Bart Van Assche	312ccd4b75	blk-mq: Clean up blk_mq_requeue_work() Move a statement that occurs in both branches of an if-statement in front of the if-statement. Fix a typo in a source code comment. No functionality has been changed. Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Nitesh Shetty <nj.shetty@samsung.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20241212212941.1268662-3-bvanassche@acm.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-13 08:09:43 -07:00
Bart Van Assche	e01424fab3	mq-deadline: Remove a local variable Since commit `fde02699c2` ("block: mq-deadline: Remove support for zone write locking"), the local variable 'insert_before' is assigned once and is used once. Hence remove this local variable. Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Nitesh Shetty <nj.shetty@samsung.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20241212212941.1268662-2-bvanassche@acm.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-13 08:09:43 -07:00
Weizhao Ouyang	ce03573a19	kselftest/arm64: abi: fix SVCR detection When using svcr_in to check ZA and Streaming Mode, we should make sure that the value in x2 is correct, otherwise it may trigger an Illegal instruction if FEAT_SVE and !FEAT_SME. Fixes: `43e3f85523` ("kselftest/arm64: Add SME support to syscall ABI test") Signed-off-by: Weizhao Ouyang <o451686892@gmail.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20241211111639.12344-1-o451686892@gmail.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2024-12-13 14:59:37 +00:00
Lu Baolu	dda2b8c3c6	iommu/vt-d: Avoid draining PRQ in sva mm release path When a PASID is used for SVA by a device, it's possible that the PASID entry is cleared before the device flushes all ongoing DMA requests and removes the SVA domain. This can occur when an exception happens and the process terminates before the device driver stops DMA and calls the iommu driver to unbind the PASID. There's no need to drain the PRQ in the mm release path. Instead, the PRQ will be drained in the SVA unbind path. Unfortunately, commit `c43e1ccdeb` ("iommu/vt-d: Drain PRQs when domain removed from RID") changed this behavior by unconditionally draining the PRQ in intel_pasid_tear_down_entry(). This can lead to a potential sleeping-in-atomic-context issue. Smatch static checker warning: drivers/iommu/intel/prq.c:95 intel_iommu_drain_pasid_prq() warn: sleeping in atomic context To avoid this issue, prevent draining the PRQ in the SVA mm release path and restore the previous behavior. Fixes: `c43e1ccdeb` ("iommu/vt-d: Drain PRQs when domain removed from RID") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/linux-iommu/c5187676-2fa2-4e29-94e0-4a279dc88b49@stanley.mountain/ Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20241212021529.1104745-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2024-12-13 15:54:27 +01:00
Yi Liu	74536f9196	iommu/vt-d: Fix qi_batch NULL pointer with nested parent domain The qi_batch is allocated when assigning cache tag for a domain. While for nested parent domain, it is missed. Hence, when trying to map pages to the nested parent, NULL dereference occurred. Also, there is potential memleak since there is no lock around domain->qi_batch allocation. To solve it, add a helper for qi_batch allocation, and call it in both the __cache_tag_assign_domain() and __cache_tag_assign_parent_domain(). BUG: kernel NULL pointer dereference, address: 0000000000000200 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 8104795067 P4D 0 Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 223 UID: 0 PID: 4357 Comm: qemu-system-x86 Not tainted 6.13.0-rc1-00028-g4b50c3c3b998-dirty #2632 Call Trace: ? __die+0x24/0x70 ? page_fault_oops+0x80/0x150 ? do_user_addr_fault+0x63/0x7b0 ? exc_page_fault+0x7c/0x220 ? asm_exc_page_fault+0x26/0x30 ? cache_tag_flush_range_np+0x13c/0x260 intel_iommu_iotlb_sync_map+0x1a/0x30 iommu_map+0x61/0xf0 batch_to_domain+0x188/0x250 iopt_area_fill_domains+0x125/0x320 ? rcu_is_watching+0x11/0x50 iopt_map_pages+0x63/0x100 iopt_map_common.isra.0+0xa7/0x190 iopt_map_user_pages+0x6a/0x80 iommufd_ioas_map+0xcd/0x1d0 iommufd_fops_ioctl+0x118/0x1c0 __x64_sys_ioctl+0x93/0xc0 do_syscall_64+0x71/0x140 entry_SYSCALL_64_after_hwframe+0x76/0x7e Fixes: `705c1cdf1e` ("iommu/vt-d: Introduce batched cache invalidation") Cc: stable@vger.kernel.org Co-developed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20241210130322.17175-1-yi.l.liu@intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2024-12-13 15:54:25 +01:00
Lu Baolu	1f2557e08a	iommu/vt-d: Remove cache tags before disabling ATS The current implementation removes cache tags after disabling ATS, leading to potential memory leaks and kernel crashes. Specifically, CACHE_TAG_DEVTLB type cache tags may still remain in the list even after the domain is freed, causing a use-after-free condition. This issue really shows up when multiple VFs from different PFs passed through to a single user-space process via vfio-pci. In such cases, the kernel may crash with kernel messages like: BUG: kernel NULL pointer dereference, address: 0000000000000014 PGD 19036a067 P4D 1940a3067 PUD 136c9b067 PMD 0 Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 74 UID: 0 PID: 3183 Comm: testCli Not tainted 6.11.9 #2 RIP: 0010:cache_tag_flush_range+0x9b/0x250 Call Trace: <TASK> ? __die+0x1f/0x60 ? page_fault_oops+0x163/0x590 ? exc_page_fault+0x72/0x190 ? asm_exc_page_fault+0x22/0x30 ? cache_tag_flush_range+0x9b/0x250 ? cache_tag_flush_range+0x5d/0x250 intel_iommu_tlb_sync+0x29/0x40 intel_iommu_unmap_pages+0xfe/0x160 __iommu_unmap+0xd8/0x1a0 vfio_unmap_unpin+0x182/0x340 [vfio_iommu_type1] vfio_remove_dma+0x2a/0xb0 [vfio_iommu_type1] vfio_iommu_type1_ioctl+0xafa/0x18e0 [vfio_iommu_type1] Move cache_tag_unassign_domain() before iommu_disable_pci_caps() to fix it. Fixes: `3b1d9e2b2d` ("iommu/vt-d: Add cache tag assignment interface") Cc: stable@vger.kernel.org Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20241129020506.576413-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2024-12-13 15:54:24 +01:00
Kevin Brodsky	a3b4647e2f	arm64: signal: Ensure signal delivery failure is recoverable Commit `eaf62ce156` ("arm64/signal: Set up and restore the GCS context for signal handlers") introduced a potential failure point at the end of setup_return(). This is unfortunate as it is too late to deliver a SIGSEGV: if that SIGSEGV is handled, the subsequent sigreturn will end up returning to the original handler, which is not the intention (since we failed to deliver that signal). Make sure this does not happen by calling gcs_signal_entry() at the very beginning of setup_return(), and add a comment just after to discourage error cases being introduced from that point onwards. While at it, also take care of copy_siginfo_to_user(): since it may fail, we shouldn't be calling it after setup_return() either. Call it before setup_return() instead, and move the setting of X1/X2 inside setup_return() where it belongs (after the "point of no failure"). Background: the first part of setup_rt_frame(), including setup_sigframe(), has no impact on the execution of the interrupted thread. The signal frame is written to the stack, but the stack pointer remains unchanged. Failure at this stage can be recovered by a SIGSEGV handler, and sigreturn will restore the original context, at the point where the original signal occurred. On the other hand, once setup_return() has updated registers including SP, the thread's control flow has been modified and we must deliver the original signal. Fixes: `eaf62ce156` ("arm64/signal: Set up and restore the GCS context for signal handlers") Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Reviewed-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20241210160940.2031997-1-kevin.brodsky@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2024-12-13 14:13:27 +00:00
Ming-Hung Tsai	0bb1968da2	dm array: fix cursor index when skipping across block boundaries dm_array_cursor_skip() seeks to the target position by loading array blocks iteratively until the specified number of entries to skip is reached. When seeking across block boundaries, it uses dm_array_cursor_next() to step into the next block. dm_array_cursor_skip() must first move the cursor index to the end of the current block; otherwise, the cursor position could incorrectly remain in the same block, causing the actual number of skipped entries to be much smaller than expected. This bug affects cache resizing in v2 metadata and could lead to data loss if the fast device is shrunk during the first-time resume. For example: 1. create a cache metadata consists of 32768 blocks, with a dirty block assigned to the second bitmap block. cache_restore v1.0 is required. cat <<EOF >> cmeta.xml <superblock uuid="" block_size="64" nr_cache_blocks="32768" \ policy="smq" hint_width="4"> <mappings> <mapping cache_block="32767" origin_block="0" dirty="true"/> </mappings> </superblock> EOF dmsetup create cmeta --table "0 8192 linear /dev/sdc 0" cache_restore -i cmeta.xml -o /dev/mapper/cmeta --metadata-version=2 2. bring up the cache while attempt to discard all the blocks belonging to the second bitmap block (block# 32576 to 32767). The last command is expected to fail, but it actually succeeds. dmsetup create cdata --table "0 2084864 linear /dev/sdc 8192" dmsetup create corig --table "0 65536 linear /dev/sdc 2105344" dmsetup create cache --table "0 65536 cache /dev/mapper/cmeta \ /dev/mapper/cdata /dev/mapper/corig 64 2 metadata2 writeback smq \ 2 migration_threshold 0" In addition to the reproducer described above, this fix can be verified using the "array_cursor/skip" tests in dm-unit: dm-unit run /pdata/array_cursor/skip/ --kernel-dir <KERNEL_DIR> Signed-off-by: Ming-Hung Tsai <mtsai@redhat.com> Fixes: `9b696229aa` ("dm persistent data: add cursor skip functions to the cursor APIs") Reviewed-by: Joe Thornber <thornber@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-12-13 08:39:18 -05:00
Ming-Hung Tsai	626f128ee9	dm array: fix unreleased btree blocks on closing a faulty array cursor The cached block pointer in dm_array_cursor might be NULL if it reaches an unreadable array block, or the array is empty. Therefore, dm_array_cursor_end() should call dm_btree_cursor_end() unconditionally, to prevent leaving unreleased btree blocks. This fix can be verified using the "array_cursor/iterate/empty" test in dm-unit: dm-unit run /pdata/array_cursor/iterate/empty --kernel-dir <KERNEL_DIR> Signed-off-by: Ming-Hung Tsai <mtsai@redhat.com> Fixes: `fdd1315aa5` ("dm array: introduce cursor api") Reviewed-by: Joe Thornber <thornber@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-12-13 08:37:39 -05:00
Ming-Hung Tsai	f2893c0804	dm array: fix releasing a faulty array block twice in dm_array_cursor_end When dm_bm_read_lock() fails due to locking or checksum errors, it releases the faulty block implicitly while leaving an invalid output pointer behind. The caller of dm_bm_read_lock() should not operate on this invalid dm_block pointer, or it will lead to undefined result. For example, the dm_array_cursor incorrectly caches the invalid pointer on reading a faulty array block, causing a double release in dm_array_cursor_end(), then hitting the BUG_ON in dm-bufio cache_put(). Reproduce steps: 1. initialize a cache device dmsetup create cmeta --table "0 8192 linear /dev/sdc 0" dmsetup create cdata --table "0 65536 linear /dev/sdc 8192" dmsetup create corig --table "0 524288 linear /dev/sdc $262144" dd if=/dev/zero of=/dev/mapper/cmeta bs=4k count=1 dmsetup create cache --table "0 524288 cache /dev/mapper/cmeta \ /dev/mapper/cdata /dev/mapper/corig 128 2 metadata2 writethrough smq 0" 2. wipe the second array block offline dmsteup remove cache cmeta cdata corig mapping_root=$(dd if=/dev/sdc bs=1c count=8 skip=192 \ 2>/dev/null \| hexdump -e '1/8 "%u\n"') ablock=$(dd if=/dev/sdc bs=1c count=8 skip=$((4096*mapping_root+2056)) \ 2>/dev/null \| hexdump -e '1/8 "%u\n"') dd if=/dev/zero of=/dev/sdc bs=4k count=1 seek=$ablock 3. try reopen the cache device dmsetup create cmeta --table "0 8192 linear /dev/sdc 0" dmsetup create cdata --table "0 65536 linear /dev/sdc 8192" dmsetup create corig --table "0 524288 linear /dev/sdc $262144" dmsetup create cache --table "0 524288 cache /dev/mapper/cmeta \ /dev/mapper/cdata /dev/mapper/corig 128 2 metadata2 writethrough smq 0" Kernel logs: (snip) device-mapper: array: array_block_check failed: blocknr 0 != wanted 10 device-mapper: block manager: array validator check failed for block 10 device-mapper: array: get_ablock failed device-mapper: cache metadata: dm_array_cursor_next for mapping failed ------------[ cut here ]------------ kernel BUG at drivers/md/dm-bufio.c:638! Fix by setting the cached block pointer to NULL on errors. In addition to the reproducer described above, this fix can be verified using the "array_cursor/damaged" test in dm-unit: dm-unit run /pdata/array_cursor/damaged --kernel-dir <KERNEL_DIR> Signed-off-by: Ming-Hung Tsai <mtsai@redhat.com> Fixes: `fdd1315aa5` ("dm array: introduce cursor api") Reviewed-by: Joe Thornber <thornber@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-12-13 08:33:38 -05:00
Arnd Bergmann	f578281000	Merge tag 'ffa-fix-6.13' of https://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into arm/fixes Arm FF-A fix for v6.13 A single fix to address a possible race around setting ffa_dev->properties in ffa_device_register() by updating ffa_device_register() to take all the partition information received from the firmware and updating the struct ffa_device accordingly before registering the device to the bus/driver model in the kernel. * tag 'ffa-fix-6.13' of https://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux: firmware: arm_ffa: Fix the race around setting ffa_dev->properties Link: https://lore.kernel.org/r/20241210101113.3232602-1-sudeep.holla@arm.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2024-12-13 14:26:32 +01:00
Jeff Johnson	b83accfec0	MAINTAINERS: wifi: ath: add Jeff Johnson as maintainer The "ATHEROS ATH GENERIC UTILITIES" entry shares the same git tree as the ATH10K, ATH11K, and ATH12K entries which I already maintain, so add me to that entry as well. Signed-off-by: Jeff Johnson <jjohnson@kernel.org> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://patch.msgid.link/20241212-ath-maintainer-v1-1-7ea5e86780a8@kernel.org	2024-12-13 14:48:16 +02:00
Miklos Szeredi	aa21f333c8	fs: fix is_mnt_ns_file() Commit `1fa08aece4` ("nsfs: convert to path_from_stashed() helper") reused nsfs dentry's d_fsdata, which no longer contains a pointer to proc_ns_operations. Fix the remaining use in is_mnt_ns_file(). Fixes: `1fa08aece4` ("nsfs: convert to path_from_stashed() helper") Cc: stable@vger.kernel.org # v6.9 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Link: https://lore.kernel.org/r/20241211121118.85268-1-mszeredi@redhat.com Acked-by: Al Viro <viro@zeniv.linux.org.uk> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-12-13 13:47:04 +01:00
Emmanuel Grumbach	0e8c520916	wifi: iwlwifi: fix CRF name for Bz We had BE201 hard coded. Look at the RF_ID and decide based on its value. Fixes: `6795a37161` ("wifi: iwlwifi: Print a specific device name.") Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20241212132940.b9eebda1ca60.I36791a134ed5e538e059418eb6520761da97b44c@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-12-13 13:39:17 +01:00
Vineeth Pillai (Google)	c7f7e9c731	sched/dlserver: Fix dlserver time accounting dlserver time is accounted when: - dlserver is active and the dlserver proxies the cfs task. - dlserver is active but deferred and cfs task runs after being picked through the normal fair class pick. dl_server_update is called in two places to make sure that both the above times are accounted for. But it doesn't check if dlserver is active or not. Now that we have this dl_server_active flag, we can consolidate dl_server_update into one place and all we need to check is whether dlserver is active or not. When dlserver is active there is only two possible conditions: - dlserver is deferred. - cfs task is running on behalf of dlserver. Fixes: `a110a81c52` ("sched/deadline: Deferrable dl server") Signed-off-by: "Vineeth Pillai (Google)" <vineeth@bitbyteword.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Marcel Ziswiler <marcel.ziswiler@codethink.co.uk> # ROCK 5B Link: https://lore.kernel.org/r/20241213032244.877029-2-vineeth@bitbyteword.org	2024-12-13 12:57:35 +01:00
Vineeth Pillai (Google)	b53127db1d	sched/dlserver: Fix dlserver double enqueue dlserver can get dequeued during a dlserver pick_task due to the delayed deueue feature and this can lead to issues with dlserver logic as it still thinks that dlserver is on the runqueue. The dlserver throttling and replenish logic gets confused and can lead to double enqueue of dlserver. Double enqueue of dlserver could happend due to couple of reasons: Case 1 ------ Delayed dequeue feature[1] can cause dlserver being stopped during a pick initiated by dlserver: __pick_next_task pick_task_dl -> server_pick_task pick_task_fair pick_next_entity (if (sched_delayed)) dequeue_entities dl_server_stop server_pick_task goes ahead with update_curr_dl_se without knowing that dlserver is dequeued and this confuses the logic and may lead to unintended enqueue while the server is stopped. Case 2 ------ A race condition between a task dequeue on one cpu and same task's enqueue on this cpu by a remote cpu while the lock is released causing dlserver double enqueue. One cpu would be in the schedule() and releasing RQ-lock: current->state = TASK_INTERRUPTIBLE(); schedule(); deactivate_task() dl_stop_server(); pick_next_task() pick_next_task_fair() sched_balance_newidle() rq_unlock(this_rq) at which point another CPU can take our RQ-lock and do: try_to_wake_up() ttwu_queue() rq_lock() ... activate_task() dl_server_start() --> first enqueue wakeup_preempt() := check_preempt_wakeup_fair() update_curr() update_curr_task() if (current->dl_server) dl_server_update() enqueue_dl_entity() --> second enqueue This bug was not apparent as the enqueue in dl_server_start doesn't usually happen because of the defer logic. But as a side effect of the first case(dequeue during dlserver pick), dl_throttled and dl_yield will be set and this causes the time accounting of dlserver to messup and then leading to a enqueue in dl_server_start. Have an explicit flag representing the status of dlserver to avoid the confusion. This is set in dl_server_start and reset in dlserver_stop. Fixes: `63ba8422f8` ("sched/deadline: Introduce deadline servers") Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: "Vineeth Pillai (Google)" <vineeth@bitbyteword.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Marcel Ziswiler <marcel.ziswiler@codethink.co.uk> # ROCK 5B Link: https://lkml.kernel.org/r/20241213032244.877029-1-vineeth@bitbyteword.org	2024-12-13 12:57:34 +01:00
Michael Trimarchi	d2bd3fcb82	drm/panel: synaptics-r63353: Fix regulator unbalance The shutdown function can be called when the display is already unprepared. For example during reboot this trigger a kernel backlog. Calling the drm_panel_unprepare, allow us to avoid to trigger the kernel warning. Fixes: `2e87bad7cd` ("drm/panel: Add Synaptics R63353 panel driver") Tested-by: Dario Binacchi <dario.binacchi@amarulasolutions.com> Signed-off-by: Michael Trimarchi <michael@amarulasolutions.com> Signed-off-by: Dario Binacchi <dario.binacchi@amarulasolutions.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Reviewed-by: Jessica Zhang <quic_jesszhan@quicinc.com> Link: https://lore.kernel.org/r/20241205163002.1804784-1-dario.binacchi@amarulasolutions.com Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20241205163002.1804784-1-dario.binacchi@amarulasolutions.com	2024-12-13 10:52:07 +01:00
Marek Vasut	406dd4c798	drm/panel: st7701: Add prepare_prev_first flag to drm_panel The DSI host must be enabled for the panel to be initialized in prepare(). Set the prepare_prev_first flag to guarantee this. This fixes the panel operation on NXP i.MX8MP SoC / Samsung DSIM DSI host. Fixes: `849b2e3ff9` ("drm/panel: Add Sitronix ST7701 panel driver") Signed-off-by: Marek Vasut <marex@denx.de> Reviewed-by: Jessica Zhang <quic_jesszhan@quicinc.com> Link: https://lore.kernel.org/r/20241124224812.150263-1-marex@denx.de Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20241124224812.150263-1-marex@denx.de	2024-12-13 10:51:56 +01:00
Yang Yingliang	f8fd0968ef	drm/panel: novatek-nt35950: fix return value check in nt35950_probe() mipi_dsi_device_register_full() never returns NULL pointer, it will return ERR_PTR() when it fails, so replace the check with IS_ERR(). Fixes: `623a3531e9` ("drm/panel: Add driver for Novatek NT35950 DSI DriverIC panels") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20241029123957.1588-1-yangyingliang@huaweicloud.com Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20241029123957.1588-1-yangyingliang@huaweicloud.com	2024-12-13 10:51:39 +01:00
Zhang Zekun	e1e1af9148	drm/panel: himax-hx83102: Add a check to prevent NULL pointer dereference drm_mode_duplicate() could return NULL due to lack of memory, which will then call NULL pointer dereference. Add a check to prevent it. Fixes: `0ef94554dc` ("drm/panel: himax-hx83102: Break out as separate driver") Signed-off-by: Zhang Zekun <zhangzekun11@huawei.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20241025073408.27481-3-zhangzekun11@huawei.com Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20241025073408.27481-3-zhangzekun11@huawei.com	2024-12-13 10:51:24 +01:00
Juergen Gross	a2796dff62	x86/xen: don't do PV iret hypercall through hypercall page Instead of jumping to the Xen hypercall page for doing the iret hypercall, directly code the required sequence in xen-asm.S. This is done in preparation of no longer using hypercall page at all, as it has shown to cause problems with speculation mitigations. This is part of XSA-466 / CVE-2024-53241. Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>	2024-12-13 09:28:43 +01:00
Juergen Gross	0ef8047b73	x86/static-call: provide a way to do very early static-call updates Add static_call_update_early() for updating static-call targets in very early boot. This will be needed for support of Xen guest type specific hypercall functions. This is part of XSA-466 / CVE-2024-53241. Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com> Co-developed-by: Peter Zijlstra <peterz@infradead.org> Co-developed-by: Josh Poimboeuf <jpoimboe@redhat.com>	2024-12-13 09:28:32 +01:00
Juergen Gross	dda014ba59	objtool/x86: allow syscall instruction The syscall instruction is used in Xen PV mode for doing hypercalls. Allow syscall to be used in the kernel in case it is tagged with an unwind hint for objtool. This is part of XSA-466 / CVE-2024-53241. Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com> Co-developed-by: Peter Zijlstra <peterz@infradead.org>	2024-12-13 09:28:21 +01:00
Juergen Gross	efbcd61d9b	x86: make get_cpu_vendor() accessible from Xen code In order to be able to differentiate between AMD and Intel based systems for very early hypercalls without having to rely on the Xen hypercall page, make get_cpu_vendor() non-static. Refactor early_cpu_init() for the same reason by splitting out the loop initializing cpu_devs() into an externally callable function. This is part of XSA-466 / CVE-2024-53241. Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com>	2024-12-13 09:28:10 +01:00
Juergen Gross	f9244fb55f	xen/netfront: fix crash when removing device When removing a netfront device directly after a suspend/resume cycle it might happen that the queues have not been setup again, causing a crash during the attempt to stop the queues another time. Fix that by checking the queues are existing before trying to stop them. This is XSA-465 / CVE-2024-53240. Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Fixes: `d50b7914fa` ("xen-netfront: Fix NULL sring after live migration") Signed-off-by: Juergen Gross <jgross@suse.com>	2024-12-13 09:12:24 +01:00
Jiri Slaby (SUSE)	145ac100b6	efi/esrt: remove esre_attribute::store() esre_attribute::store() is not needed since commit `af97a77bc0` (efi: Move some sysfs files to be read-only by root). Drop it. Found by https://github.com/jirislaby/clang-struct. Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: linux-efi@vger.kernel.org Signed-off-by: Ard Biesheuvel <ardb@kernel.org>	2024-12-13 08:43:58 +01:00
Carlos Maiolino	bf354410af	Merge tag 'xfs-6.13-fixes_2024-12-12' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into next-rc xfs: bug fixes for 6.13 [01/12] Bug fixes for 6.13. This has been running on the djcloud for months with no problems. Enjoy! Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2024-12-13 07:47:12 +01:00
Darrick J. Wong	12f2930f5f	xfs: port xfs_ioc_start_commit to multigrain timestamps Take advantage of the multigrain timestamp APIs to ensure that nobody can sneak in and write things to a file between starting a file update operation and committing the results. This should have been part of the multigrain timestamp merge, but I forgot to fling it at jlayton when he resubmitted the patchset due to developer bandwidth problems. Cc: <stable@vger.kernel.org> # v6.13-rc1 Fixes: `4e40eff0b5` ("fs: add infrastructure for multigrain timestamps") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jeff Layton <jlayton@kernel.org>	2024-12-12 17:45:13 -08:00
Darrick J. Wong	7f8b718c58	xfs: return from xfs_symlink_verify early on V4 filesystems V4 symlink blocks didn't have headers, so return early if this is a V4 filesystem. Cc: <stable@vger.kernel.org> # v5.1 Fixes: `39708c20ab` ("xfs: miscellaneous verifier magic value fixups") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>	2024-12-12 17:45:13 -08:00
Darrick J. Wong	c004a793e0	xfs: fix zero byte checking in the superblock scrubber The logic to check that the region past the end of the superblock is all zeroes is wrong -- we don't want to check only the bytes past the end of the maximally sized ondisk superblock structure as currently defined in xfs_format.h; we want to check the bytes beyond the end of the ondisk as defined by the feature bits. Port the superblock size logic from xfs_repair and then put it to use in xfs_scrub. Cc: <stable@vger.kernel.org> # v4.15 Fixes: `21fb4cb198` ("xfs: scrub the secondary superblocks") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>	2024-12-12 17:45:13 -08:00
Darrick J. Wong	06b20ef09b	xfs: check pre-metadir fields correctly The checks that were added to the superblock scrubber for metadata directories aren't quite right -- the old inode pointers are now defined to be zeroes until someone else reuses them. Also consolidate the new metadir field checks to one place; they were inexplicably scattered around. Cc: <stable@vger.kernel.org> # v6.13-rc1 Fixes: `28d756d4d5` ("xfs: update sb field checks when metadir is turned on") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>	2024-12-12 17:45:12 -08:00
Darrick J. Wong	e57e083be9	xfs: don't crash on corrupt /quotas dirent If the /quotas dirent points to an inode but the inode isn't loadable (and hence mkdir returns -EEXIST), don't crash, just bail out. Cc: <stable@vger.kernel.org> # v6.13-rc1 Fixes: `e80fbe1ad8` ("xfs: use metadir for quota inodes") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>	2024-12-12 17:45:12 -08:00
Darrick J. Wong	3853b5e1d7	xfs: don't move nondir/nonreg temporary repair files to the metadir namespace Only directories or regular files are allowed in the metadata directory tree. Don't move the repair tempfile to the metadir namespace if this is not true; this will cause the inode verifiers to trip. xrep_tempfile_adjust_directory_tree opportunistically moves sc->tempip from the regular directory tree to the metadata directory tree if sc->ip is part of the metadata directory tree. However, the scrub setup functions grab sc->ip and create sc->tempip before we actually get around to checking if the file mode is the right type for the scrubber. IOWs, you can invoke the symlink scrubber with the file handle of a subdirectory in the metadir. xrep_setup_symlink will create a temporary symlink file, xrep_tempfile_adjust_directory_tree will foolishly try to set the METADATA flag on the temp symlink, which trips the inode verifier in the inode item precommit, which shuts down the filesystem when expensive checks are turned on. If they're /not/ turned on, then xchk_symlink will return ENOENT when it sees that it's been passed a symlink, but the invalid inode could still get flushed to disk. We don't want that. Cc: <stable@vger.kernel.org> # v6.13-rc1 Fixes: `9dc31acb01` ("xfs: move repair temporary files to the metadata directory tree") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>	2024-12-12 17:45:12 -08:00
Darrick J. Wong	7f8a44f372	xfs: fix sb_spino_align checks for large fsblock sizes For a sparse inodes filesystem, mkfs.xfs computes the values of sb_spino_align and sb_inoalignmt with the following code: int cluster_size = XFS_INODE_BIG_CLUSTER_SIZE; if (cfg->sb_feat.crcs_enabled) cluster_size = cfg->inodesize / XFS_DINODE_MIN_SIZE; sbp->sb_spino_align = cluster_size >> cfg->blocklog; sbp->sb_inoalignmt = XFS_INODES_PER_CHUNK cfg->inodesize >> cfg->blocklog; On a V5 filesystem with 64k fsblocks and 512 byte inodes, this results in cluster_size = 8192 * (512 / 256) = 16384. As a result, sb_spino_align and sb_inoalignmt are both set to zero. Unfortunately, this trips the new sb_spino_align check that was just added to xfs_validate_sb_common, and the mkfs fails: # mkfs.xfs -f -b size=64k, /dev/sda meta-data=/dev/sda isize=512 agcount=4, agsize=81136 blks = sectsz=512 attr=2, projid32bit=1 = crc=1 finobt=1, sparse=1, rmapbt=1 = reflink=1 bigtime=1 inobtcount=1 nrext64=1 = exchange=0 metadir=0 data = bsize=65536 blocks=324544, imaxpct=25 = sunit=0 swidth=0 blks naming =version 2 bsize=65536 ascii-ci=0, ftype=1, parent=0 log =internal log bsize=65536 blocks=5006, version=2 = sectsz=512 sunit=0 blks, lazy-count=1 realtime =none extsz=65536 blocks=0, rtextents=0 = rgcount=0 rgsize=0 extents Discarding blocks...Sparse inode alignment (0) is invalid. Metadata corruption detected at 0x560ac5a80bbe, xfs_sb block 0x0/0x200 libxfs_bwrite: write verifier failed on xfs_sb bno 0x0/0x1 mkfs.xfs: Releasing dirty buffer to free list! found dirty buffer (bulk) on free list! Sparse inode alignment (0) is invalid. Metadata corruption detected at 0x560ac5a80bbe, xfs_sb block 0x0/0x200 libxfs_bwrite: write verifier failed on xfs_sb bno 0x0/0x1 mkfs.xfs: writing AG headers failed, err=22 Prior to commit `59e43f5479` this all worked fine, even if "sparse" inodes are somewhat meaningless when everything fits in a single fsblock. Adjust the checks to handle existing filesystems. Cc: <stable@vger.kernel.org> # v6.13-rc1 Fixes: `59e43f5479` ("xfs: sb_spino_align is not verified") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>	2024-12-12 17:45:12 -08:00
Darrick J. Wong	ca378189fd	xfs: convert quotacheck to attach dquot buffers Now that we've converted the dquot logging machinery to attach the dquot buffer to the li_buf pointer so that the AIL dqflush doesn't have to allocate or read buffers in a reclaim path, do the same for the quotacheck code so that the reclaim shrinker dqflush call doesn't have to do that either. Cc: <stable@vger.kernel.org> # v6.12 Fixes: `903edea6c5` ("mm: warn about illegal __GFP_NOFAIL usage in a more appropriate location and manner") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>	2024-12-12 17:45:12 -08:00
Darrick J. Wong	acc8f8628c	xfs: attach dquot buffer to dquot log item buffer Ever since 6.12-rc1, I've observed a pile of warnings from the kernel when running fstests with quotas enabled: WARNING: CPU: 1 PID: 458580 at mm/page_alloc.c:4221 __alloc_pages_noprof+0xc9c/0xf18 CPU: 1 UID: 0 PID: 458580 Comm: xfsaild/sda3 Tainted: G W 6.12.0-rc6-djwa #rc6 6ee3e0e531f6457e2d26aa008a3b65ff184b377c <snip> Call trace: __alloc_pages_noprof+0xc9c/0xf18 alloc_pages_mpol_noprof+0x94/0x240 alloc_pages_noprof+0x68/0xf8 new_slab+0x3e0/0x568 ___slab_alloc+0x5a0/0xb88 __slab_alloc.constprop.0+0x7c/0xf8 __kmalloc_noprof+0x404/0x4d0 xfs_buf_get_map+0x594/0xde0 [xfs 384cb02810558b4c490343c164e9407332118f88] xfs_buf_read_map+0x64/0x2e0 [xfs 384cb02810558b4c490343c164e9407332118f88] xfs_trans_read_buf_map+0x1dc/0x518 [xfs 384cb02810558b4c490343c164e9407332118f88] xfs_qm_dqflush+0xac/0x468 [xfs 384cb02810558b4c490343c164e9407332118f88] xfs_qm_dquot_logitem_push+0xe4/0x148 [xfs 384cb02810558b4c490343c164e9407332118f88] xfsaild+0x3f4/0xde8 [xfs 384cb02810558b4c490343c164e9407332118f88] kthread+0x110/0x128 ret_from_fork+0x10/0x20 ---[ end trace 0000000000000000 ]--- This corresponds to the line: WARN_ON_ONCE(current->flags & PF_MEMALLOC); within the NOFAIL checks. What's happening here is that the XFS AIL is trying to write a disk quota update back into the filesystem, but for that it needs to read the ondisk buffer for the dquot. The buffer is not in memory anymore, probably because it was evicted. Regardless, the buffer cache tries to allocate a new buffer, but those allocations are NOFAIL. The AIL thread has marked itself PF_MEMALLOC (aka noreclaim) since commit `43ff2122e6` ("xfs: on-stack delayed write buffer lists") presumably because reclaim can push on XFS to push on the AIL. An easy way to fix this probably would have been to drop the NOFAIL flag from the xfs_buf allocation and open code a retry loop, but then there's still the problem that for bs>ps filesystems, the buffer itself could require up to 64k worth of pages. Inode items had similar behavior (multi-page cluster buffers that we don't want to allocate in the AIL) which we solved by making transaction precommit attach the inode cluster buffers to the dirty log item. Let's solve the dquot problem in the same way. So: Make a real precommit handler to read the dquot buffer and attach it to the log item; pass it to dqflush in the push method; and have the iodone function detach the buffer once we've flushed everything. Add a state flag to the log item to track when a thread has entered the precommit -> push mechanism to skip the detaching if it turns out that the dquot is very busy, as we don't hold the dquot lock between log item commit and AIL push). Reading and attaching the dquot buffer in the precommit hook is inspired by the work done for inode cluster buffers some time ago. Cc: <stable@vger.kernel.org> # v6.12 Fixes: `903edea6c5` ("mm: warn about illegal __GFP_NOFAIL usage in a more appropriate location and manner") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>	2024-12-12 17:45:11 -08:00
Darrick J. Wong	ec88b41b93	xfs: clean up log item accesses in xfs_qm_dqflush{,_done} Clean up these functions a little bit before we move on to the real modifications, and make the variable naming consistent for dquot log items. Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>	2024-12-12 17:45:11 -08:00
Darrick J. Wong	a40fe30868	xfs: separate dquot buffer reads from xfs_dqflush The first step towards holding the dquot buffer in the li_buf instead of reading it in the AIL is to separate the part that reads the buffer from the actual flush code. There should be no functional changes. Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>	2024-12-12 17:45:11 -08:00
Darrick J. Wong	07137e925f	xfs: don't lose solo dquot update transactions Quota counter updates are tracked via incore objects which hang off the xfs_trans object. These changes are then turned into dirty log items in xfs_trans_apply_dquot_deltas just prior to commiting the log items to the CIL. However, updating the incore deltas do not cause XFS_TRANS_DIRTY to be set on the transaction. In other words, a pure quota counter update will be silently discarded if there are no other dirty log items attached to the transaction. This is currently not the case anywhere in the filesystem because quota updates always dirty at least one other metadata item, but a subsequent bug fix will add dquot log item precommits, so we actually need a dirty dquot log item prior to xfs_trans_run_precommits. Also let's not leave a logic bomb. Cc: <stable@vger.kernel.org> # v2.6.35 Fixes: `0924378a68` ("xfs: split out iclog writing from xfs_trans_commit()") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>	2024-12-12 17:45:11 -08:00
Darrick J. Wong	3762113b59	xfs: don't lose solo superblock counter update transactions Superblock counter updates are tracked via per-transaction counters in the xfs_trans object. These changes are then turned into dirty log items in xfs_trans_apply_sb_deltas just prior to commiting the log items to the CIL. However, updating the per-transaction counter deltas do not cause XFS_TRANS_DIRTY to be set on the transaction. In other words, a pure sb counter update will be silently discarded if there are no other dirty log items attached to the transaction. This is currently not the case anywhere in the filesystem because sb counter updates always dirty at least one other metadata item, but let's not leave a logic bomb. Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>	2024-12-12 17:45:11 -08:00
Darrick J. Wong	a004afdc62	xfs: avoid nested calls to __xfs_trans_commit Currently, __xfs_trans_commit calls xfs_defer_finish_noroll, which calls __xfs_trans_commit again on the same transaction. In other words, there's a nested function call (albeit with slightly different arguments) that has caused minor amounts of confusion in the past. There's no reason to keep this around, since there's only one place where we actually want the xfs_defer_finish_noroll, and that is in the top level xfs_trans_commit call. This also reduces stack usage a little bit. Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>	2024-12-12 17:45:11 -08:00
Darrick J. Wong	44d9b07e52	xfs: only run precommits once per transaction object Committing a transaction tx0 with a defer ops chain of (A, B, C) creates a chain of transactions that looks like this: tx0 -> txA -> txB -> txC Prior to commit `cb04211748`, __xfs_trans_commit would run precommits on tx0, then call xfs_defer_finish_noroll to convert A-C to tx[A-C]. Unfortunately, after the finish_noroll loop we forgot to run precommits on txC. That was fixed by adding the second precommit call. Unfortunately, none of us remembered that xfs_defer_finish_noroll calls __xfs_trans_commit a second time to commit tx0 before finishing work A in txA and committing that. In other words, we run precommits twice on tx0: xfs_trans_commit(tx0) __xfs_trans_commit(tx0, false) xfs_trans_run_precommits(tx0) xfs_defer_finish_noroll(tx0) xfs_trans_roll(tx0) txA = xfs_trans_dup(tx0) __xfs_trans_commit(tx0, true) xfs_trans_run_precommits(tx0) This currently isn't an issue because the inode item precommit is idempotent; the iunlink item precommit deletes itself so it can't be called again; and the buffer/dquot item precommits only check the incore objects for corruption. However, it doesn't make sense to run precommits twice. Fix this situation by only running precommits after finish_noroll. Cc: <stable@vger.kernel.org> # v6.4 Fixes: `cb04211748` ("xfs: defered work could create precommits") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>	2024-12-12 17:45:10 -08:00
Darrick J. Wong	53b001a21c	xfs: unlock inodes when erroring out of xfs_trans_alloc_dir Debugging a filesystem patch with generic/475 caused the system to hang after observing the following sequences in dmesg: XFS (dm-0): metadata I/O error in "xfs_imap_to_bp+0x61/0xe0 [xfs]" at daddr 0x491520 len 32 error 5 XFS (dm-0): metadata I/O error in "xfs_btree_read_buf_block+0xba/0x160 [xfs]" at daddr 0x3445608 len 8 error 5 XFS (dm-0): metadata I/O error in "xfs_imap_to_bp+0x61/0xe0 [xfs]" at daddr 0x138e1c0 len 32 error 5 XFS (dm-0): log I/O error -5 XFS (dm-0): Metadata I/O Error (0x1) detected at xfs_trans_read_buf_map+0x1ea/0x4b0 [xfs] (fs/xfs/xfs_trans_buf.c:311). Shutting down filesystem. XFS (dm-0): Please unmount the filesystem and rectify the problem(s) XFS (dm-0): Internal error dqp->q_ino.reserved < dqp->q_ino.count at line 869 of file fs/xfs/xfs_trans_dquot.c. Caller xfs_trans_dqresv+0x236/0x440 [xfs] XFS (dm-0): Corruption detected. Unmount and run xfs_repair XFS (dm-0): Unmounting Filesystem be6bcbcc-9921-4deb-8d16-7cc94e335fa7 The system is stuck in unmount trying to lock a couple of inodes so that they can be purged. The dquot corruption notice above is a clue to what happened -- a link() call tried to set up a transaction to link a child into a directory. Quota reservation for the transaction failed after IO errors shut down the filesystem, but then we forgot to unlock the inodes on our way out. Fix that. Cc: <stable@vger.kernel.org> # v6.10 Fixes: `bd5562111d` ("xfs: Hold inode locks in xfs_trans_alloc_dir") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>	2024-12-12 17:45:10 -08:00
Darrick J. Wong	ffc3ea4f3c	xfs: fix scrub tracepoints when inode-rooted btrees are involved Fix a minor mistakes in the scrub tracepoints that can manifest when inode-rooted btrees are enabled. The existing code worked fine for bmap btrees, but we should tighten the code up to be less sloppy. Cc: <stable@vger.kernel.org> # v5.7 Fixes: `92219c292a` ("xfs: convert btree cursor inode-private member names") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>	2024-12-12 17:45:10 -08:00
Darrick J. Wong	6d7b4bc1c3	xfs: update btree keys correctly when _insrec splits an inode root block In commit `2c813ad66a`, I partially fixed a bug wherein xfs_btree_insrec would erroneously try to update the parent's key for a block that had been split if we decided to insert the new record into the new block. The solution was to detect this situation and update the in-core key value that we pass up to the caller so that the caller will (eventually) add the new block to the parent level of the tree with the correct key. However, I missed a subtlety about the way inode-rooted btrees work. If the full block was a maximally sized inode root block, we'll solve that fullness by moving the root block's records to a new block, resizing the root block, and updating the root to point to the new block. We don't pass a pointer to the new block to the caller because that work has already been done. The new record will /always/ land in the new block, so in this case we need to use xfs_btree_update_keys to update the keys. This bug can theoretically manifest itself in the very rare case that we split a bmbt root block and the new record lands in the very first slot of the new block, though I've never managed to trigger it in practice. However, it is very easy to reproduce by running generic/522 with the realtime rmapbt patchset if rtinherit=1. Cc: <stable@vger.kernel.org> # v4.8 Fixes: `2c813ad66a` ("xfs: support btrees with overlapping intervals for keys") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>	2024-12-12 17:45:10 -08:00
Darrick J. Wong	23bee6f390	xfs: fix error bailout in xfs_rtginode_create smatch reported that we screwed up the error cleanup in this function. Fix it. Cc: <stable@vger.kernel.org> # v6.13-rc1 Fixes: `ae897e0bed` ("xfs: support creating per-RTG files in growfs") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>	2024-12-12 17:45:10 -08:00
Darrick J. Wong	af9f02457f	xfs: fix null bno_hint handling in xfs_rtallocate_rtg xfs_bmap_rtalloc initializes the bno_hint variable to NULLRTBLOCK (aka NULLFSBLOCK). If the allocation request is for a file range that's adjacent to an existing mapping, it will then change bno_hint to the blkno hint in the bmalloca structure. In other words, bno_hint is either a rt block number, or it's all 1s. Unfortunately, commit `ec12f97f1b` didn't take the NULLRTBLOCK state into account, which means that it tries to translate that into a realtime extent number. We then end up with an obnoxiously high rtx number and pointlessly feed that to the near allocator. This often fails and falls back to the by-size allocator. Seeing as we had no locality hint anyway, this is a waste of time. Fix the code to detect a lack of bno_hint correctly. This was detected by running xfs/009 with metadir enabled and a 28k rt extent size. Cc: <stable@vger.kernel.org> # v6.12 Fixes: `ec12f97f1b` ("xfs: make the rtalloc start hint a xfs_rtblock_t") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>	2024-12-12 17:45:09 -08:00
Darrick J. Wong	dc5a052739	xfs: mark metadir repair tempfiles with IRECOVERY Once in a long while, xfs/566 and xfs/801 report directory corruption in one of the metadata subdirectories while it's forcibly rebuilding all filesystem metadata. I observed the following sequence of events: 1. Initiate a repair of the parent pointers for the /quota/user file. This is the secret file containing user quota data. 2. The pptr repair thread creates a temporary file and begins staging parent pointers in the ondisk metadata in preparation for an exchange-range to commit the new pptr data. 3. At the same time, initiate a repair of the /quota directory itself. 4. The dir repair thread finds the temporary file from (2), scans it for parent pointers, and stages a dirent in its own temporary dir in preparation to commit the fixed directory. 5. The parent pointer repair completes and frees the temporary file. 6. The dir repair commits the new directory and scans it again. It finds the dirent that points to the old temporary file in (2) and marks the directory corrupt. Oops! Repair code must never scan the temporary files that other repair functions create to stage new metadata. They're not supposed to do that, but the predicate function xrep_is_tempfile is incorrect because it assumes that any XFS_DIFLAG2_METADATA file cannot ever be a temporary file, but xrep_tempfile_adjust_directory_tree creates exactly that. Fix this by setting the IRECOVERY flag on temporary metadata directory inodes and using that to correct the predicate. Repair code is supposed to erase all the data in temporary files before releasing them, so it's ok if a thread scans the temporary file after we drop IRECOVERY. Cc: <stable@vger.kernel.org> # v6.13-rc1 Fixes: `bb6cdd5529` ("xfs: hide metadata inodes from everyone because they are special") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>	2024-12-12 17:45:09 -08:00
Darrick J. Wong	6f4669708a	xfs: set XFS_SICK_INO_SYMLINK_ZAPPED explicitly when zapping a symlink If we need to reset a symlink target to the "durr it's busted" string, then we clear the zapped flag as well. However, this should be using the provided helper so that we don't set the zapped state on an otherwise ok symlink. Cc: <stable@vger.kernel.org> # v6.10 Fixes: `2651923d8d` ("xfs: online repair of symbolic links") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>	2024-12-12 17:45:09 -08:00
Darrick J. Wong	aa7bfb537e	xfs: separate healthy clearing mask during repair In commit `d9041681dd` we introduced some XFS_SICK_*ZAPPED flags so that the inode record repair code could clean up a damaged inode record enough to iget the inode but still be able to remember that the higher level repair code needs to be called. As part of that, we introduced a xchk_mark_healthy_if_clean helper that is supposed to cause the ZAPPED state to be removed if that higher level metadata actually checks out. This was done by setting additional bits in sick_mask hoping that xchk_update_health will clear all those bits after a healthy scrub. Unfortunately, that's not quite what sick_mask means -- bits in that mask are indeed cleared if the metadata is healthy, but they're set if the metadata is NOT healthy. fsck is only intended to set the ZAPPED bits explicitly. If something else sets the CORRUPT/XCORRUPT state after the xchk_mark_healthy_if_clean call, we end up marking the metadata zapped. This can happen if the following sequence happens: 1. Scrub runs, discovers that the metadata is fine but could be optimized and calls xchk_mark_healthy_if_clean on a ZAPPED flag. That causes the ZAPPED flag to be set in sick_mask because the metadata is not CORRUPT or XCORRUPT. 2. Repair runs to optimize the metadata. 3. Some other metadata used for cross-referencing in (1) becomes corrupt. 4. Post-repair scrub runs, but this time it sets CORRUPT or XCORRUPT due to the events in (3). 5. Now the xchk_health_update sets the ZAPPED flag on the metadata we just repaired. This is not the correct state. Fix this by moving the "if healthy" mask to a separate field, and only ever using it to clear the sick state. Cc: <stable@vger.kernel.org> # v6.8 Fixes: `d9041681dd` ("xfs: set inode sick state flags when we zap either ondisk fork") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>	2024-12-12 17:45:09 -08:00
Darrick J. Wong	7ce31f20a0	xfs: don't drop errno values when we fail to ficlone the entire range Way back when we first implemented FICLONE for XFS, life was simple -- either the the entire remapping completed, or something happened and we had to return an errno explaining what happened. Neither of those ioctls support returning partial results, so it's all or nothing. Then things got complicated when copy_file_range came along, because it actually can return the number of bytes copied, so commit `3f68c1f562` tried to make it so that we could return a partial result if the REMAP_FILE_CAN_SHORTEN flag is set. This is also how FIDEDUPERANGE can indicate that the kernel performed a partial deduplication. Unfortunately, the logic is wrong if an error stops the remapping and CAN_SHORTEN is not set. Because those callers cannot return partial results, it is an error for ->remap_file_range to return a positive quantity that is less than the @len passed in. Implementations really should be returning a negative errno in this case, because that's what btrfs (which introduced FICLONE{,RANGE}) did. Therefore, ->remap_range implementations cannot silently drop an errno that they might have when the number of bytes remapped is less than the number of bytes requested and CAN_SHORTEN is not set. Found by running generic/562 on a 64k fsblock filesystem and wondering why it reported corrupt files. Cc: <stable@vger.kernel.org> # v4.20 Fixes: `3fc9f5e409` ("xfs: remove xfs_reflink_remap_range") Really-Fixes: `3f68c1f562` ("xfs: support returning partial reflink results") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>	2024-12-12 17:45:09 -08:00
Darrick J. Wong	bd27c7bcdc	xfs: return a 64-bit block count from xfs_btree_count_blocks With the nrext64 feature enabled, it's possible for a data fork to have 2^48 extent mappings. Even with a 64k fsblock size, that maps out to a bmbt containing more than 2^32 blocks. Therefore, this predicate must return a u64 count to avoid an integer wraparound that will cause scrub to do the wrong thing. It's unlikely that any such filesystem currently exists, because the incore bmbt would consume more than 64GB of kernel memory on its own, and so far nobody except me has driven a filesystem that far, judging from the lack of complaints. Cc: <stable@vger.kernel.org> # v5.19 Fixes: `df9ad5cc7a` ("xfs: Introduce macros to represent new maximum extent counts for data/attr forks") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>	2024-12-12 17:45:09 -08:00
Darrick J. Wong	e1d8602b6c	xfs: keep quota directory inode loaded In the same vein as the previous patch, there's no point in the metapath scrub setup function doing a lookup on the quota metadir just so it can validate that lookups work correctly. Instead, retain the quota directory inode in memory for the lifetime of the mount so that we can check this meaningfully. Cc: <stable@vger.kernel.org> # v6.13-rc1 Fixes: `128a055291` ("xfs: scrub quota file metapaths") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>	2024-12-12 17:45:08 -08:00
Darrick J. Wong	9b72800103	xfs: metapath scrubber should use the already loaded inodes Don't waste time in xchk_setup_metapath_dqinode doing a second lookup of the quota inodes, just grab them from the quotainfo structure. The whole point of this scrubber is to make sure that the dirents exist, so it's completely silly to do lookups. Cc: <stable@vger.kernel.org> # v6.13-rc1 Fixes: `128a055291` ("xfs: scrub quota file metapaths") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>	2024-12-12 17:45:08 -08:00
Darrick J. Wong	a440a28ddb	xfs: fix off-by-one error in fsmap's end_daddr usage In commit `ca6448aed4`, we created an "end_daddr" variable to fix fsmap reporting when the end of the range requested falls in the middle of an unknown (aka free on the rmapbt) region. Unfortunately, I didn't notice that the the code sets end_daddr to the last sector of the device but then uses that quantity to compute the length of the synthesized mapping. Zizhi Wo later observed that when end_daddr isn't set, we still don't report the last fsblock on a device because in that case (aka when info->last is true), the info->high mapping that we pass to xfs_getfsmap_group_helper has a startblock that points to the last fsblock. This is also wrong because the code uses startblock to compute the length of the synthesized mapping. Fix the second problem by setting end_daddr unconditionally, and fix the first problem by setting start_daddr to one past the end of the range to query. Cc: <stable@vger.kernel.org> # v6.11 Fixes: `ca6448aed4` ("xfs: Fix missing interval for missing_owner in xfs fsmap") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reported-by: Zizhi Wo <wozizhi@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2024-12-12 17:45:08 -08:00
Linus Torvalds	f932fb9b40	Merge tag 'v6.13-rc2-ksmbd-server-fixes' of git://git.samba.org/ksmbd Pull smb server fixes from Steve French: - fix ctime setting in setattr - fix reference count on user session to avoid potential race with session expire - fix query dir issue * tag 'v6.13-rc2-ksmbd-server-fixes' of git://git.samba.org/ksmbd: ksmbd: set ATTR_CTIME flags when setting mtime ksmbd: fix racy issue from session lookup and expire ksmbd: retry iterate_dir in smb2_query_dir	2024-12-12 17:33:20 -08:00
Linus Torvalds	01abac26dc	Merge tag 'perf-tools-fixes-for-v6.13-2024-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools fixes from Namhyung Kim: "A set of random fixes for this cycle. perf record: - Fix build-id event size calculation in perf record - Fix perf record -C/--cpu option on hybrid systems - Fix perf mem record with precise-ip on SapphireRapids perf test: - Refresh hwmon directory before reading the test files - Make sure system_tsc_freq event is tested on x86 only Others: - Usual header file sync - Fix undefined behavior in perf ftrace profile - Properly initialize a return variable in perf probe" * tag 'perf-tools-fixes-for-v6.13-2024-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (21 commits) perf probe: Fix uninitialized variable libperf: evlist: Fix --cpu argument on hybrid platform perf test expr: Fix system_tsc_freq for only x86 perf test hwmon_pmu: Fix event file location perf hwmon_pmu: Use openat rather than dup to refresh directory perf ftrace: Fix undefined behavior in cmp_profile_data() perf tools: Fix precise_ip fallback logic perf tools: Fix build error on generated/fs_at_flags_array.c tools headers: Sync uapi/linux/prctl.h with the kernel sources tools headers: Sync uapi/linux/mount.h with the kernel sources tools headers: Sync uapi/linux/fcntl.h with the kernel sources tools headers: Sync uapi/asm-generic/mman.h with the kernel sources tools headers: Sync *xattrat syscall changes with the kernel sources tools headers: Sync arm64 kvm header with the kernel sources tools headers: Sync x86 kvm and cpufeature headers with the kernel tools headers: Sync uapi/linux/kvm.h with the kernel sources tools headers: Sync uapi/linux/perf_event.h with the kernel sources tools headers: Sync uapi/drm/drm.h with the kernel sources perf machine: Initialize machine->env to address a segfault perf test: Don't signal all processes on system when interrupting tests ...	2024-12-12 17:26:55 -08:00
Dave Airlie	d172ea67db	Merge tag 'amd-drm-fixes-6.13-2024-12-11' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.13-2024-12-11: amdgpu: - ISP hw init fix - SR-IOV fixes - Fix contiguous VRAM mapping for UVD on older GPUs - Fix some regressions due to drm scheduler changes - Workload profile fixes - Cleaner shader fix amdkfd: - Fix DMA map direction for migration - Fix a potential null pointer dereference - Cacheline size fixes - Runtime PM fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241211215449.741848-1-alexander.deucher@amd.com	2024-12-13 09:43:20 +10:00
Dave Airlie	4dba1fd3fe	Merge tag 'drm-xe-fixes-2024-12-12' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes - Fix a KUNIT test error message (Mirsad Todorovac) - Fix an invalidation fence PM ref leak (Daniele) - Fix a register pool UAF (Lucas) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/Z1s5elHXOyeIHnE0@fedora	2024-12-13 09:15:48 +10:00
Dave Airlie	c98f9ab909	Merge tag 'drm-intel-fixes-2024-12-11' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes - Don't use indexed register writes needlessly [dsb] (Ville Syrjälä) - Stop using non-posted DSB writes for legacy LUT [color] (Ville Syrjälä) - Fix NULL pointer dereference in capture_engine (Eugene Kobyak) - Fix memory leak by correcting cache object name in error handler (Jiasheng Jiang) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Tvrtko Ursulin <tursulin@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/Z1nULMrutE4HERvB@linux	2024-12-13 08:53:17 +10:00
Linus Torvalds	eefa7a9c06	Merge tag 'for-linus' of https://github.com/openrisc/linux Pull OpenRISC fixes from Stafford Horne: - Fix from Masahiro Yamada to fix 6.13 OpenRISC boot issues after vmlinux.lds.h symbol ordering was changed - Code formatting fixups from Geert * tag 'for-linus' of https://github.com/openrisc/linux: openrisc: Fix misalignments in head.S openrisc: place exception table at the head of vmlinux	2024-12-12 13:20:53 -08:00
Alexei Starovoitov	e4c80f6975	Merge branch 'add-missing-size-check-for-btf-based-ctx-access' Kumar Kartikeya Dwivedi says: ==================== Add missing size check for BTF-based ctx access This set fixes a issue reported for tracing and struct ops programs using btf_ctx_access for ctx checks, where loading a pointer argument from the ctx doesn't enforce a BPF_DW access size check. The original report is at link [0]. Also add a regression test along with the fix. [0]: https://lore.kernel.org/bpf/51338.1732985814@localhost ==================== Link: https://patch.msgid.link/20241212092050.3204165-1-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-12-12 11:40:18 -08:00
Kumar Kartikeya Dwivedi	8025731c28	selftests/bpf: Add test for narrow ctx load for pointer args Ensure that performing narrow ctx loads other than size == 8 are rejected when the argument is a pointer type. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20241212092050.3204165-3-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-12-12 11:40:18 -08:00
Kumar Kartikeya Dwivedi	659b9ba7cb	bpf: Check size for BTF-based ctx access of pointer members Robert Morris reported the following program type which passes the verifier in [0]: SEC("struct_ops/bpf_cubic_init") void BPF_PROG(bpf_cubic_init, struct sock sk) { asm volatile("r2 = (u16)(r1 + 0)"); // verifier should demand u64 asm volatile("(u32 *)(r2 +1504) = 0"); // 1280 in some configs } The second line may or may not work, but the first instruction shouldn't pass, as it's a narrow load into the context structure of the struct ops callback. The code falls back to btf_ctx_access to ensure correctness and obtaining the types of pointers. Ensure that the size of the access is correctly checked to be 8 bytes, otherwise the verifier thinks the narrow load obtained a trusted BTF pointer and will permit loads/stores as it sees fit. Perform the check on size after we've verified that the load is for a pointer field, as for scalar values narrow loads are fine. Access to structs passed as arguments to a BPF program are also treated as scalars, therefore no adjustment is needed in their case. Existing verifier selftests are broken by this change, but because they were incorrect. Verifier tests for d_path were performing narrow load into context to obtain path pointer, had this program actually run it would cause a crash. The same holds for verifier_btf_ctx_access tests. [0]: https://lore.kernel.org/bpf/51338.1732985814@localhost Fixes: `9e15db6613` ("bpf: Implement accurate raw_tp context access via BTF") Reported-by: Robert Morris <rtm@mit.edu> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20241212092050.3204165-2-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-12-12 11:40:18 -08:00
Eduard Zingerman	04789af756	selftests/bpf: extend changes_pkt_data with cases w/o subprograms Extend changes_pkt_data tests with test cases freplacing the main program that does not have subprograms. Try four combinations when both main program and replacement do and do not change packet data. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20241212070711.427443-2-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-12-12 11:38:36 -08:00
Eduard Zingerman	ac6542ad92	bpf: fix null dereference when computing changes_pkt_data of prog w/o subprogs bpf_prog_aux->func field might be NULL if program does not have subprograms except for main sub-program. The fixed commit does bpf_prog_aux->func access unconditionally, which might lead to null pointer dereference. The bug could be triggered by replacing the following BPF program: SEC("tc") int main_changes(struct __sk_buff sk) { bpf_skb_pull_data(sk, 0); return 0; } With the following BPF program: SEC("freplace") long changes_pkt_data(struct __sk_buff sk) { return bpf_skb_pull_data(sk, 0); } bpf_prog_aux instance itself represents the main sub-program, use this property to fix the bug. Fixes: `81f6d0530b` ("bpf: check changes_pkt_data property for extension programs") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/r/202412111822.qGw6tOyB-lkp@intel.com/ Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20241212070711.427443-1-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-12-12 11:37:19 -08:00
Linus Torvalds	150b567e0d	Merge tag 'net-6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bluetooth, netfilter and wireless. Current release - fix to a fix: - rtnetlink: fix error code in rtnl_newlink() - tipc: fix NULL deref in cleanup_bearer() Current release - regressions: - ip: fix warning about invalid return from in ip_route_input_rcu() Current release - new code bugs: - udp: fix L4 hash after reconnect - eth: lan969x: fix cyclic dependency between modules - eth: bnxt_en: fix potential crash when dumping FW log coredump Previous releases - regressions: - wifi: mac80211: - fix a queue stall in certain cases of channel switch - wake the queues in case of failure in resume - splice: do not checksum AF_UNIX sockets - virtio_net: fix BUG()s in BQL support due to incorrect accounting of purged packets during interface stop - eth: - stmmac: fix TSO DMA API mis-usage causing oops - bnxt_en: fixes for HW GRO: GSO type on 5750X chips and oops due to incorrect aggregation ID mask on 5760X chips Previous releases - always broken: - Bluetooth: improve setsockopt() handling of malformed user input - eth: ocelot: fix PTP timestamping in presence of packet loss - ptp: kvm: x86: avoid "fail to initialize ptp_kvm" when simply not supported" * tag 'net-6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (81 commits) net: dsa: tag_ocelot_8021q: fix broken reception net: dsa: microchip: KSZ9896 register regmap alignment to 32 bit boundaries net: renesas: rswitch: fix initial MPIC register setting Bluetooth: btmtk: avoid UAF in btmtk_process_coredump Bluetooth: iso: Fix circular lock in iso_conn_big_sync Bluetooth: iso: Fix circular lock in iso_listen_bis Bluetooth: SCO: Add support for 16 bits transparent voice setting Bluetooth: iso: Fix recursive locking warning Bluetooth: iso: Always release hdev at the end of iso_listen_bis Bluetooth: hci_event: Fix using rcu_read_(un)lock while iterating Bluetooth: hci_core: Fix sleeping function called from invalid context team: Fix feature propagation of NETIF_F_GSO_ENCAP_ALL team: Fix initial vlan_feature set in __team_compute_features bonding: Fix feature propagation of NETIF_F_GSO_ENCAP_ALL bonding: Fix initial {vlan,mpls}_feature set in bond_compute_features net, team, bonding: Add netdev_base_features helper net/sched: netem: account for backlog updates from child qdisc net: dsa: felix: fix stuck CPU-injected packets with short taprio windows splice: do not checksum AF_UNIX sockets net: usb: qmi_wwan: add Telit FE910C04 compositions ...	2024-12-12 11:28:05 -08:00
Nathan Chancellor	57e420c84f	blk-iocost: Avoid using clamp() on inuse in __propagate_weights() After a recent change to clamp() and its variants [1] that increases the coverage of the check that high is greater than low because it can be done through inlining, certain build configurations (such as s390 defconfig) fail to build with clang with: block/blk-iocost.c:1101:11: error: call to '__compiletime_assert_557' declared with 'error' attribute: clamp() low limit 1 greater than high limit active 1101 \| inuse = clamp_t(u32, inuse, 1, active); \| ^ include/linux/minmax.h:218:36: note: expanded from macro 'clamp_t' 218 \| #define clamp_t(type, val, lo, hi) __careful_clamp(type, val, lo, hi) \| ^ include/linux/minmax.h:195:2: note: expanded from macro '__careful_clamp' 195 \| __clamp_once(type, val, lo, hi, __UNIQUE_ID(v_), __UNIQUE_ID(l_), __UNIQUE_ID(h_)) \| ^ include/linux/minmax.h:188:2: note: expanded from macro '__clamp_once' 188 \| BUILD_BUG_ON_MSG(statically_true(ulo > uhi), \ \| ^ __propagate_weights() is called with an active value of zero in ioc_check_iocgs(), which results in the high value being less than the low value, which is undefined because the value returned depends on the order of the comparisons. The purpose of this expression is to ensure inuse is not more than active and at least 1. This could be written more simply with a ternary expression that uses min(inuse, active) as the condition so that the value of that condition can be used if it is not zero and one if it is. Do this conversion to resolve the error and add a comment to deter people from turning this back into clamp(). Fixes: `7caa47151a` ("blkcg: implement blk-iocost") Link: https://lore.kernel.org/r/34d53778977747f19cce2abb287bb3e6@AcuMS.aculab.com/ [1] Suggested-by: David Laight <david.laight@aculab.com> Reported-by: Linux Kernel Functional Testing <lkft@linaro.org> Closes: https://lore.kernel.org/llvm/CA+G9fYsD7mw13wredcZn0L-KBA3yeoVSTuxnss-AEWMN3ha0cA@mail.gmail.com/ Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202412120322.3GfVe3vF-lkp@intel.com/ Signed-off-by: Nathan Chancellor <nathan@kernel.org> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-12 11:48:14 -07:00
Gao Xiang	e2de3c1bf6	erofs: add erofs_sb_free() helper Unify the common parts of erofs_fc_free() and erofs_kill_sb() as erofs_sb_free(). Thus, fput() in erofs_fc_get_tree() is no longer needed, too. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20241212133504.2047178-1-hsiangkao@linux.alibaba.com	2024-12-13 00:26:27 +08:00
Joerg Roedel	7460d43bd7	Merge tag 'arm-smmu-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux into fixes Arm SMMU fixes for 6.13-rc - Use raw_smp_processor_id() when balancing traffic for NVIDIA's custom command queue implementation.	2024-12-12 17:25:37 +01:00
Yue Hu	6d1917045e	MAINTAINERS: erofs: update Yue Hu's email address The current email address is no longer valid, use my gmail instead. Signed-off-by: Yue Hu <zbestahu@gmail.com> Acked-by: Gao Xiang <xiang@kernel.org> Acked-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20241211080918.8512-1-zbestahu@163.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2024-12-13 00:25:28 +08:00
Gao Xiang	1a2180f685	erofs: fix PSI memstall accounting Max Kellermann recently reported psi_group_cpu.tasks[NR_MEMSTALL] is incorrect in the 6.11.9 kernel. The root cause appears to be that, since the problematic commit, bio can be NULL, causing psi_memstall_leave() to be skipped in z_erofs_submit_queue(). Reported-by: Max Kellermann <max.kellermann@ionos.com> Closes: https://lore.kernel.org/r/CAKPOu+8tvSowiJADW2RuKyofL_CSkm_SuyZA7ME5vMLWmL6pqw@mail.gmail.com Fixes: `9e2f9d34dd` ("erofs: handle overlapped pclusters out of crafted images properly") Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20241127085236.3538334-1-hsiangkao@linux.alibaba.com	2024-12-13 00:24:40 +08:00
Gao Xiang	b10a1e5643	erofs: fix rare pcluster memory leak after unmounting There may still exist some pcluster with valid reference counts during unmounting. Instead of introducing another synchronization primitive, just try again as unmounting is relatively rare. This approach is similar to z_erofs_cache_invalidate_folio(). It was also reported by syzbot as a UAF due to commit `f5ad9f9a60` ("erofs: free pclusters if no cached folio is attached"): BUG: KASAN: slab-use-after-free in do_raw_spin_trylock+0x72/0x1f0 kernel/locking/spinlock_debug.c:123 .. queued_spin_trylock include/asm-generic/qspinlock.h:92 [inline] do_raw_spin_trylock+0x72/0x1f0 kernel/locking/spinlock_debug.c:123 __raw_spin_trylock include/linux/spinlock_api_smp.h:89 [inline] _raw_spin_trylock+0x20/0x80 kernel/locking/spinlock.c:138 spin_trylock include/linux/spinlock.h:361 [inline] z_erofs_put_pcluster fs/erofs/zdata.c:959 [inline] z_erofs_decompress_pcluster fs/erofs/zdata.c:1403 [inline] z_erofs_decompress_queue+0x3798/0x3ef0 fs/erofs/zdata.c:1425 z_erofs_decompressqueue_work+0x99/0xe0 fs/erofs/zdata.c:1437 process_one_work kernel/workqueue.c:3229 [inline] process_scheduled_works+0xa68/0x1840 kernel/workqueue.c:3310 worker_thread+0x870/0xd30 kernel/workqueue.c:3391 kthread+0x2f2/0x390 kernel/kthread.c:389 ret_from_fork+0x4d/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 </TASK> However, it seems a long outstanding memory leak. Fix it now. Fixes: `f5ad9f9a60` ("erofs: free pclusters if no cached folio is attached") Reported-by: syzbot+7ff87b095e7ca0c5ac39@syzkaller.appspotmail.com Closes: https://lore.kernel.org/r/674c1235.050a0220.ad585.0032.GAE@google.com Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20241203072821.1885740-1-hsiangkao@linux.alibaba.com	2024-12-13 00:24:12 +08:00
Mark Rutland	65ac33bed8	arm64: stacktrace: Don't WARN when unwinding other tasks The arm64 stacktrace code has a few error conditions where a WARN_ON_ONCE() is triggered before the stacktrace is terminated and an error is returned to the caller. The conditions shouldn't be triggered when unwinding the current task, but it is possible to trigger these when unwinding another task which is not blocked, as the stack of that task is concurrently modified. Kent reports that these warnings can be triggered while running filesystem tests on bcachefs, which calls the stacktrace code directly. To produce a meaningful stacktrace of another task, the task in question should be blocked, but the stacktrace code is expected to be robust to cases where it is not blocked. Note that this is purely about not unuduly scaring the user and/or crashing the kernel; stacktraces in such cases are meaningless and may leak kernel secrets from the stack of the task being unwound. Ideally we'd pin the task in a blocked state during the unwind, as we do for /proc/${PID}/wchan since commit: `42a20f86dc` ("sched: Add wrapper for get_wchan() to keep task blocked") ... but a bunch of places don't do that, notably /proc/${PID}/stack, where we don't pin the task in a blocked state, but do restrict the output to privileged users since commit: `f8a00cef17` ("proc: restrict kernel stack dumps to root") ... and so it's possible to trigger these warnings accidentally, e.g. by reading /proc//stack (as root): \| for n in $(seq 1 10); do \| while true; do cat /proc//stack > /dev/null 2>&1; done & \| done \| ------------[ cut here ]------------ \| WARNING: CPU: 3 PID: 166 at arch/arm64/kernel/stacktrace.c:207 arch_stack_walk+0x1c8/0x370 \| Modules linked in: \| CPU: 3 UID: 0 PID: 166 Comm: cat Not tainted 6.13.0-rc2-00003-g3dafa7a7925d #2 \| Hardware name: linux,dummy-virt (DT) \| pstate: 81400005 (Nzcv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--) \| pc : arch_stack_walk+0x1c8/0x370 \| lr : arch_stack_walk+0x1b0/0x370 \| sp : ffff800080773890 \| x29: ffff800080773930 x28: fff0000005c44500 x27: fff00000058fa038 \| x26: 000000007ffff000 x25: 0000000000000000 x24: 0000000000000000 \| x23: ffffa35a8d9600ec x22: 0000000000000000 x21: fff00000043a33c0 \| x20: ffff800080773970 x19: ffffa35a8d960168 x18: 0000000000000000 \| x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 \| x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000 \| x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000 \| x8 : ffff8000807738e0 x7 : ffff8000806e3800 x6 : ffff8000806e3818 \| x5 : ffff800080773920 x4 : ffff8000806e4000 x3 : ffff8000807738e0 \| x2 : 0000000000000018 x1 : ffff8000806e3800 x0 : 0000000000000000 \| Call trace: \| arch_stack_walk+0x1c8/0x370 (P) \| stack_trace_save_tsk+0x8c/0x108 \| proc_pid_stack+0xb0/0x134 \| proc_single_show+0x60/0x120 \| seq_read_iter+0x104/0x438 \| seq_read+0xf8/0x140 \| vfs_read+0xc4/0x31c \| ksys_read+0x70/0x108 \| __arm64_sys_read+0x1c/0x28 \| invoke_syscall+0x48/0x104 \| el0_svc_common.constprop.0+0x40/0xe0 \| do_el0_svc+0x1c/0x28 \| el0_svc+0x30/0xcc \| el0t_64_sync_handler+0x10c/0x138 \| el0t_64_sync+0x198/0x19c \| ---[ end trace 0000000000000000 ]--- Fix this by only warning when unwinding the current task. When unwinding another task the error conditions will be handled by returning an error without producing a warning. The two warnings in kunwind_next_frame_record_meta() were added recently as part of commit: `c2c6b27b5a` ("arm64: stacktrace: unwind exception boundaries") The warning when recovering the fgraph return address has changed form many times, but was originally introduced back in commit: `9f416319f4` ("arm64: fix unwind_frame() for filtered out fn for function graph tracing") Fixes: `c2c6b27b5a` ("arm64: stacktrace: unwind exception boundaries") Fixes: `9f416319f4` ("arm64: fix unwind_frame() for filtered out fn for function graph tracing") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reported-by: Kent Overstreet <kent.overstreet@linux.dev> Cc: Kees Cook <keescook@chromium.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20241211140704.2498712-3-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2024-12-12 16:23:15 +00:00
Mark Rutland	32ed120568	arm64: stacktrace: Skip reporting LR at exception boundaries Aishwarya reports that warnings are sometimes seen when running the ftrace kselftests, e.g. \| WARNING: CPU: 5 PID: 2066 at arch/arm64/kernel/stacktrace.c:141 arch_stack_walk+0x4a0/0x4c0 \| Modules linked in: \| CPU: 5 UID: 0 PID: 2066 Comm: ftracetest Not tainted 6.13.0-rc2 #2 \| Hardware name: linux,dummy-virt (DT) \| pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--) \| pc : arch_stack_walk+0x4a0/0x4c0 \| lr : arch_stack_walk+0x248/0x4c0 \| sp : ffff800083643d20 \| x29: ffff800083643dd0 x28: ffff00007b891400 x27: ffff00007b891928 \| x26: 0000000000000001 x25: 00000000000000c0 x24: ffff800082f39d80 \| x23: ffff80008003ee8c x22: ffff80008004baa8 x21: ffff8000800533e0 \| x20: ffff800083643e10 x19: ffff80008003eec8 x18: 0000000000000000 \| x17: 0000000000000000 x16: ffff800083640000 x15: 0000000000000000 \| x14: 02a37a802bbb8a92 x13: 00000000000001a9 x12: 0000000000000001 \| x11: ffff800082ffad60 x10: ffff800083643d20 x9 : ffff80008003eed0 \| x8 : ffff80008004baa8 x7 : ffff800086f2be80 x6 : ffff0000057cf000 \| x5 : 0000000000000000 x4 : 0000000000000000 x3 : ffff800086f2b690 \| x2 : ffff80008004baa8 x1 : ffff80008004baa8 x0 : ffff80008004baa8 \| Call trace: \| arch_stack_walk+0x4a0/0x4c0 (P) \| arch_stack_walk+0x248/0x4c0 (L) \| profile_pc+0x44/0x80 \| profile_tick+0x50/0x80 (F) \| tick_nohz_handler+0xcc/0x160 (F) \| __hrtimer_run_queues+0x2ac/0x340 (F) \| hrtimer_interrupt+0xf4/0x268 (F) \| arch_timer_handler_virt+0x34/0x60 (F) \| handle_percpu_devid_irq+0x88/0x220 (F) \| generic_handle_domain_irq+0x34/0x60 (F) \| gic_handle_irq+0x54/0x140 (F) \| call_on_irq_stack+0x24/0x58 (F) \| do_interrupt_handler+0x88/0x98 \| el1_interrupt+0x34/0x68 (F) \| el1h_64_irq_handler+0x18/0x28 \| el1h_64_irq+0x6c/0x70 \| queued_spin_lock_slowpath+0x78/0x460 (P) The warning in question is: WARN_ON_ONCE(state->common.pc == orig_pc)) ... in kunwind_recover_return_address(), which is triggered when return_to_handler() is encountered in the trace, but ftrace_graph_ret_addr() cannot find a corresponding original return address on the fgraph return stack. This happens because the stacktrace code encounters an exception boundary where the LR was not live at the time of the exception, but the LR happens to contain return_to_handler(); either because the task recently returned there, or due to unfortunate usage of the LR at a scratch register. In such cases attempts to recover the return address via ftrace_graph_ret_addr() may fail, triggering the WARN_ON_ONCE() above and aborting the unwind (hence the stacktrace terminating after reporting the PC at the time of the exception). Handling unreliable LR values in these cases is likely to require some larger rework, so for the moment avoid this problem by restoring the old behaviour of skipping the LR at exception boundaries, which the stacktrace code did prior to commit: `c2c6b27b5a` ("arm64: stacktrace: unwind exception boundaries") This commit is effectively a partial revert, keeping the structures and logic to explicitly identify exception boundaries while still skipping reporting of the LR. The logic to explicitly identify exception boundaries is still useful for general robustness and as a building block for future support for RELIABLE_STACKTRACE. Fixes: `c2c6b27b5a` ("arm64: stacktrace: unwind exception boundaries") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reported-by: Aishwarya TCV <aishwarya.tcv@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20241211140704.2498712-2-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2024-12-12 16:23:05 +00:00
Lucas De Marchi	d7b028656c	drm/xe/reg_sr: Remove register pool That pool implementation doesn't really work: if the krealloc happens to move the memory and return another address, the entries in the xarray become invalid, leading to use-after-free later: BUG: KASAN: slab-use-after-free in xe_reg_sr_apply_mmio+0x570/0x760 [xe] Read of size 4 at addr ffff8881244b2590 by task modprobe/2753 Allocated by task 2753: kasan_save_stack+0x39/0x70 kasan_save_track+0x14/0x40 kasan_save_alloc_info+0x37/0x60 __kasan_kmalloc+0xc3/0xd0 __kmalloc_node_track_caller_noprof+0x200/0x6d0 krealloc_noprof+0x229/0x380 Simplify the code to fix the bug. A better pooling strategy may be added back later if needed. Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241209232739.147417-2-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit `e5283bd4df`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2024-12-12 16:57:28 +01:00
Daniele Ceraolo Spurio	cefade70f3	drm/xe: Call invalidation_fence_fini for PT inval fences in error state Invalidation_fence_init takes a PM reference, which is released in its _fini counterpart, so we need to make sure that the latter is called, even if the fence is in an error state. Since we already have a function that calls _fini() and signals the fence in the tlb inval code, we can expose that and call it from the PT code. Fixes: `f002702290` ("drm/xe: Hold a PM ref when GT TLB invalidations are inflight") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: <stable@vger.kernel.org> # v6.11+ Cc: Matthew Brost <matthew.brost@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241206015022.1567113-1-daniele.ceraolospurio@intel.com (cherry picked from commit `65338639b7`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2024-12-12 16:57:28 +01:00
Mirsad Todorovac	ed69b28b3a	drm/xe: fix the ERR_PTR() returned on failure to allocate tiny pt Running coccinelle spatch gave the following warning: ./drivers/gpu/drm/xe/tests/xe_migrate.c:226:5-11: inconsistent IS_ERR and PTR_ERR on line 228. The code reports PTR_ERR(pt) when IS_ERR(tiny) is checked: → 211 pt = xe_bo_create_pin_map(xe, tile, m->q->vm, XE_PAGE_SIZE, 212 ttm_bo_type_kernel, 213 XE_BO_FLAG_VRAM_IF_DGFX(tile) \| 214 XE_BO_FLAG_PINNED); 215 if (IS_ERR(pt)) { 216 KUNIT_FAIL(test, "Failed to allocate fake pt: %li\n", 217 PTR_ERR(pt)); 218 goto free_big; 219 } 220 221 tiny = xe_bo_create_pin_map(xe, tile, m->q->vm, → 222 2 * SZ_4K, 223 ttm_bo_type_kernel, 224 XE_BO_FLAG_VRAM_IF_DGFX(tile) \| 225 XE_BO_FLAG_PINNED); → 226 if (IS_ERR(tiny)) { → 227 KUNIT_FAIL(test, "Failed to allocate fake pt: %li\n", → 228 PTR_ERR(pt)); 229 goto free_pt; 230 } Now, the IS_ERR(tiny) and the corresponding PTR_ERR(pt) do not match. Returning PTR_ERR(tiny), as the last failed function call, seems logical. Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Mirsad Todorovac <mtodorovac69@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241121212057.1526634-2-mtodorovac69@gmail.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit `cb57c75098`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2024-12-12 16:57:28 +01:00
John Garry	2f4873f9b5	block: Make bio_iov_bvec_set() accept pointer to const iov_iter Make bio_iov_bvec_set() accept a pointer to const iov_iter, which means that we can drop the undesirable casting to struct iov_iter pointer in blk_rq_map_user_bvec(). Signed-off-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20241202115727.2320401-1-john.g.garry@oracle.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-12 08:43:28 -07:00
Jakub Kicinski	ad913dfd8b	Merge tag 'for-net-2024-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Luiz Augusto von Dentz says: ==================== bluetooth pull request for net: - SCO: Fix transparent voice setting - ISO: Locking fixes - hci_core: Fix sleeping function called from invalid context - hci_event: Fix using rcu_read_(un)lock while iterating - btmtk: avoid UAF in btmtk_process_coredump * tag 'for-net-2024-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth: Bluetooth: btmtk: avoid UAF in btmtk_process_coredump Bluetooth: iso: Fix circular lock in iso_conn_big_sync Bluetooth: iso: Fix circular lock in iso_listen_bis Bluetooth: SCO: Add support for 16 bits transparent voice setting Bluetooth: iso: Fix recursive locking warning Bluetooth: iso: Always release hdev at the end of iso_listen_bis Bluetooth: hci_event: Fix using rcu_read_(un)lock while iterating Bluetooth: hci_core: Fix sleeping function called from invalid context Bluetooth: Improve setsockopt() handling of malformed user input ==================== Link: https://patch.msgid.link/20241212142806.2046274-1-luiz.dentz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-12 07:10:40 -08:00
Robert Hodaszi	36ff681d22	net: dsa: tag_ocelot_8021q: fix broken reception The blamed commit changed the dsa_8021q_rcv() calling convention to accept pre-populated source_port and switch_id arguments. If those are not available, as in the case of tag_ocelot_8021q, the arguments must be pre-initialized with -1. Due to the bug of passing uninitialized arguments in tag_ocelot_8021q, dsa_8021q_rcv() does not detect that it needs to populate the source_port and switch_id, and this makes dsa_conduit_find_user() fail, which leads to packet loss on reception. Fixes: `dcfe767378` ("net: dsa: tag_sja1105: absorb logic for not overwriting precise info into dsa_8021q_rcv()") Signed-off-by: Robert Hodaszi <robert.hodaszi@digi.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20241211144741.1415758-1-robert.hodaszi@digi.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-12 07:10:27 -08:00
Jesse Van Gavere	5af53577c6	net: dsa: microchip: KSZ9896 register regmap alignment to 32 bit boundaries Commit `8d7ae22ae9` ("net: dsa: microchip: KSZ9477 register regmap alignment to 32 bit boundaries") fixed an issue whereby regmap_reg_range did not allow writes as 32 bit words to KSZ9477 PHY registers, this fix for KSZ9896 is adapted from there as the same errata is present in KSZ9896C as "Module 5: Certain PHY registers must be written as pairs instead of singly" the explanation below is likewise taken from this commit. The commit provided code to apply "Module 6: Certain PHY registers must be written as pairs instead of singly" errata for KSZ9477 as this chip for certain PHY registers (0xN120 to 0xN13F, N=1,2,3,4,5) must be accessed as 32 bit words instead of 16 or 8 bit access. Otherwise, adjacent registers (no matter if reserved or not) are overwritten with 0x0. Without this patch some registers (e.g. 0x113c or 0x1134) required for 32 bit access are out of valid regmap ranges. As a result, following error is observed and KSZ9896 is not properly configured: ksz-switch spi1.0: can't rmw 32bit reg 0x113c: -EIO ksz-switch spi1.0: can't rmw 32bit reg 0x1134: -EIO ksz-switch spi1.0 lan1 (uninitialized): failed to connect to PHY: -EIO ksz-switch spi1.0 lan1 (uninitialized): error -5 setting up PHY for tree 0, switch 0, port 0 The solution is to modify regmap_reg_range to allow accesses with 4 bytes boundaries. Fixes: `5c844d57aa` ("net: dsa: microchip: fix writes to phy registers >= 0x10") Signed-off-by: Jesse Van Gavere <jesse.vangavere@scioteq.com> Link: https://patch.msgid.link/20241211092932.26881-1-jesse.vangavere@scioteq.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-12 07:10:11 -08:00
Jens Axboe	99d6af6e8a	io_uring/rsrc: don't put/free empty buffers If cloning of buffers fail and we have to put the ones already grabbed, check for NULL buffers and skip those. They used to be dummy ubufs, but now they are just NULL and that should be checked before reaping them. Reported-by: chase xd <sl1589472800@gmail.com> Link: https://lore.kernel.org/io-uring/CADZouDQ7TcKn8gz8_efnyAEp1JvU1ktRk8PWz-tO0FXUoh8VGQ@mail.gmail.com/ Fixes: `d50f94d761` ("io_uring/rsrc: get rid of the empty node and dummy_ubuf") Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-12 08:01:52 -07:00
Nikita Yushchenko	fb9e6039c3	net: renesas: rswitch: fix initial MPIC register setting MPIC.PIS must be set per phy interface type. MPIC.LSC must be set per speed. Do that strictly per datasheet, instead of hardcoding MPIC.PIS to GMII. Fixes: `3590918b5d` ("net: ethernet: renesas: Add support for "Ethernet Switch"") Signed-off-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://patch.msgid.link/20241211053012.368914-1-nikita.yoush@cogentembedded.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-12-12 15:32:22 +01:00
Thadeu Lima de Souza Cascardo	b548f5e945	Bluetooth: btmtk: avoid UAF in btmtk_process_coredump hci_devcd_append may lead to the release of the skb, so it cannot be accessed once it is called. ================================================================== BUG: KASAN: slab-use-after-free in btmtk_process_coredump+0x2a7/0x2d0 [btmtk] Read of size 4 at addr ffff888033cfabb0 by task kworker/0:3/82 CPU: 0 PID: 82 Comm: kworker/0:3 Tainted: G U 6.6.40-lockdep-03464-g1d8b4eb3060e #1 b0b3c1cc0c842735643fb411799d97921d1f688c Hardware name: Google Yaviks_Ufs/Yaviks_Ufs, BIOS Google_Yaviks_Ufs.15217.552.0 05/07/2024 Workqueue: events btusb_rx_work [btusb] Call Trace: <TASK> dump_stack_lvl+0xfd/0x150 print_report+0x131/0x780 kasan_report+0x177/0x1c0 btmtk_process_coredump+0x2a7/0x2d0 [btmtk 03edd567dd71a65958807c95a65db31d433e1d01] btusb_recv_acl_mtk+0x11c/0x1a0 [btusb 675430d1e87c4f24d0c1f80efe600757a0f32bec] btusb_rx_work+0x9e/0xe0 [btusb 675430d1e87c4f24d0c1f80efe600757a0f32bec] worker_thread+0xe44/0x2cc0 kthread+0x2ff/0x3a0 ret_from_fork+0x51/0x80 ret_from_fork_asm+0x1b/0x30 </TASK> Allocated by task 82: stack_trace_save+0xdc/0x190 kasan_set_track+0x4e/0x80 __kasan_slab_alloc+0x4e/0x60 kmem_cache_alloc+0x19f/0x360 skb_clone+0x132/0xf70 btusb_recv_acl_mtk+0x104/0x1a0 [btusb] btusb_rx_work+0x9e/0xe0 [btusb] worker_thread+0xe44/0x2cc0 kthread+0x2ff/0x3a0 ret_from_fork+0x51/0x80 ret_from_fork_asm+0x1b/0x30 Freed by task 1733: stack_trace_save+0xdc/0x190 kasan_set_track+0x4e/0x80 kasan_save_free_info+0x28/0xb0 ____kasan_slab_free+0xfd/0x170 kmem_cache_free+0x183/0x3f0 hci_devcd_rx+0x91a/0x2060 [bluetooth] worker_thread+0xe44/0x2cc0 kthread+0x2ff/0x3a0 ret_from_fork+0x51/0x80 ret_from_fork_asm+0x1b/0x30 The buggy address belongs to the object at ffff888033cfab40 which belongs to the cache skbuff_head_cache of size 232 The buggy address is located 112 bytes inside of freed 232-byte region [ffff888033cfab40, ffff888033cfac28) The buggy address belongs to the physical page: page:00000000a174ba93 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x33cfa head:00000000a174ba93 order:1 entire_mapcount:0 nr_pages_mapped:0 pincount:0 anon flags: 0x4000000000000840(slab\|head\|zone=1) page_type: 0xffffffff() raw: 4000000000000840 ffff888100848a00 0000000000000000 0000000000000001 raw: 0000000000000000 0000000080190019 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff888033cfaa80: fb fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc ffff888033cfab00: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb >ffff888033cfab80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff888033cfac00: fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc ffff888033cfac80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== Check if we need to call hci_devcd_complete before calling hci_devcd_append. That requires that we check data->cd_info.cnt >= MTK_COREDUMP_NUM instead of data->cd_info.cnt > MTK_COREDUMP_NUM, as we increment data->cd_info.cnt only once the call to hci_devcd_append succeeds. Fixes: `0b70151328` ("Bluetooth: btusb: mediatek: add MediaTek devcoredump support") Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2024-12-12 09:25:28 -05:00
Iulia Tanasescu	7a17308c17	Bluetooth: iso: Fix circular lock in iso_conn_big_sync This fixes the circular locking dependency warning below, by reworking iso_sock_recvmsg, to ensure that the socket lock is always released before calling a function that locks hdev. [ 561.670344] ====================================================== [ 561.670346] WARNING: possible circular locking dependency detected [ 561.670349] 6.12.0-rc6+ #26 Not tainted [ 561.670351] ------------------------------------------------------ [ 561.670353] iso-tester/3289 is trying to acquire lock: [ 561.670355] ffff88811f600078 (&hdev->lock){+.+.}-{3:3}, at: iso_conn_big_sync+0x73/0x260 [bluetooth] [ 561.670405] but task is already holding lock: [ 561.670407] ffff88815af58258 (sk_lock-AF_BLUETOOTH){+.+.}-{0:0}, at: iso_sock_recvmsg+0xbf/0x500 [bluetooth] [ 561.670450] which lock already depends on the new lock. [ 561.670452] the existing dependency chain (in reverse order) is: [ 561.670453] -> #2 (sk_lock-AF_BLUETOOTH){+.+.}-{0:0}: [ 561.670458] lock_acquire+0x7c/0xc0 [ 561.670463] lock_sock_nested+0x3b/0xf0 [ 561.670467] bt_accept_dequeue+0x1a5/0x4d0 [bluetooth] [ 561.670510] iso_sock_accept+0x271/0x830 [bluetooth] [ 561.670547] do_accept+0x3dd/0x610 [ 561.670550] __sys_accept4+0xd8/0x170 [ 561.670553] __x64_sys_accept+0x74/0xc0 [ 561.670556] x64_sys_call+0x17d6/0x25f0 [ 561.670559] do_syscall_64+0x87/0x150 [ 561.670563] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 561.670567] -> #1 (sk_lock-AF_BLUETOOTH-BTPROTO_ISO){+.+.}-{0:0}: [ 561.670571] lock_acquire+0x7c/0xc0 [ 561.670574] lock_sock_nested+0x3b/0xf0 [ 561.670577] iso_sock_listen+0x2de/0xf30 [bluetooth] [ 561.670617] __sys_listen_socket+0xef/0x130 [ 561.670620] __x64_sys_listen+0xe1/0x190 [ 561.670623] x64_sys_call+0x2517/0x25f0 [ 561.670626] do_syscall_64+0x87/0x150 [ 561.670629] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 561.670632] -> #0 (&hdev->lock){+.+.}-{3:3}: [ 561.670636] __lock_acquire+0x32ad/0x6ab0 [ 561.670639] lock_acquire.part.0+0x118/0x360 [ 561.670642] lock_acquire+0x7c/0xc0 [ 561.670644] __mutex_lock+0x18d/0x12f0 [ 561.670647] mutex_lock_nested+0x1b/0x30 [ 561.670651] iso_conn_big_sync+0x73/0x260 [bluetooth] [ 561.670687] iso_sock_recvmsg+0x3e9/0x500 [bluetooth] [ 561.670722] sock_recvmsg+0x1d5/0x240 [ 561.670725] sock_read_iter+0x27d/0x470 [ 561.670727] vfs_read+0x9a0/0xd30 [ 561.670731] ksys_read+0x1a8/0x250 [ 561.670733] __x64_sys_read+0x72/0xc0 [ 561.670736] x64_sys_call+0x1b12/0x25f0 [ 561.670738] do_syscall_64+0x87/0x150 [ 561.670741] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 561.670744] other info that might help us debug this: [ 561.670745] Chain exists of: &hdev->lock --> sk_lock-AF_BLUETOOTH-BTPROTO_ISO --> sk_lock-AF_BLUETOOTH [ 561.670751] Possible unsafe locking scenario: [ 561.670753] CPU0 CPU1 [ 561.670754] ---- ---- [ 561.670756] lock(sk_lock-AF_BLUETOOTH); [ 561.670758] lock(sk_lock AF_BLUETOOTH-BTPROTO_ISO); [ 561.670761] lock(sk_lock-AF_BLUETOOTH); [ 561.670764] lock(&hdev->lock); [ 561.670767] * DEADLOCK * Fixes: `07a9342b94` ("Bluetooth: ISO: Send BIG Create Sync via hci_sync") Signed-off-by: Iulia Tanasescu <iulia.tanasescu@nxp.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2024-12-12 09:25:13 -05:00
Iulia Tanasescu	168e28305b	Bluetooth: iso: Fix circular lock in iso_listen_bis This fixes the circular locking dependency warning below, by releasing the socket lock before enterning iso_listen_bis, to avoid any potential deadlock with hdev lock. [ 75.307983] ====================================================== [ 75.307984] WARNING: possible circular locking dependency detected [ 75.307985] 6.12.0-rc6+ #22 Not tainted [ 75.307987] ------------------------------------------------------ [ 75.307987] kworker/u81:2/2623 is trying to acquire lock: [ 75.307988] ffff8fde1769da58 (sk_lock-AF_BLUETOOTH-BTPROTO_ISO) at: iso_connect_cfm+0x253/0x840 [bluetooth] [ 75.308021] but task is already holding lock: [ 75.308022] ffff8fdd61a10078 (&hdev->lock) at: hci_le_per_adv_report_evt+0x47/0x2f0 [bluetooth] [ 75.308053] which lock already depends on the new lock. [ 75.308054] the existing dependency chain (in reverse order) is: [ 75.308055] -> #1 (&hdev->lock){+.+.}-{3:3}: [ 75.308057] __mutex_lock+0xad/0xc50 [ 75.308061] mutex_lock_nested+0x1b/0x30 [ 75.308063] iso_sock_listen+0x143/0x5c0 [bluetooth] [ 75.308085] __sys_listen_socket+0x49/0x60 [ 75.308088] __x64_sys_listen+0x4c/0x90 [ 75.308090] x64_sys_call+0x2517/0x25f0 [ 75.308092] do_syscall_64+0x87/0x150 [ 75.308095] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 75.308098] -> #0 (sk_lock-AF_BLUETOOTH-BTPROTO_ISO){+.+.}-{0:0}: [ 75.308100] __lock_acquire+0x155e/0x25f0 [ 75.308103] lock_acquire+0xc9/0x300 [ 75.308105] lock_sock_nested+0x32/0x90 [ 75.308107] iso_connect_cfm+0x253/0x840 [bluetooth] [ 75.308128] hci_connect_cfm+0x6c/0x190 [bluetooth] [ 75.308155] hci_le_per_adv_report_evt+0x27b/0x2f0 [bluetooth] [ 75.308180] hci_le_meta_evt+0xe7/0x200 [bluetooth] [ 75.308206] hci_event_packet+0x21f/0x5c0 [bluetooth] [ 75.308230] hci_rx_work+0x3ae/0xb10 [bluetooth] [ 75.308254] process_one_work+0x212/0x740 [ 75.308256] worker_thread+0x1bd/0x3a0 [ 75.308258] kthread+0xe4/0x120 [ 75.308259] ret_from_fork+0x44/0x70 [ 75.308261] ret_from_fork_asm+0x1a/0x30 [ 75.308263] other info that might help us debug this: [ 75.308264] Possible unsafe locking scenario: [ 75.308264] CPU0 CPU1 [ 75.308265] ---- ---- [ 75.308265] lock(&hdev->lock); [ 75.308267] lock(sk_lock- AF_BLUETOOTH-BTPROTO_ISO); [ 75.308268] lock(&hdev->lock); [ 75.308269] lock(sk_lock-AF_BLUETOOTH-BTPROTO_ISO); [ 75.308270] * DEADLOCK * [ 75.308271] 4 locks held by kworker/u81:2/2623: [ 75.308272] #0: ffff8fdd66e52148 ((wq_completion)hci0#2){+.+.}-{0:0}, at: process_one_work+0x443/0x740 [ 75.308276] #1: ffffafb488b7fe48 ((work_completion)(&hdev->rx_work)), at: process_one_work+0x1ce/0x740 [ 75.308280] #2: ffff8fdd61a10078 (&hdev->lock){+.+.}-{3:3} at: hci_le_per_adv_report_evt+0x47/0x2f0 [bluetooth] [ 75.308304] #3: ffffffffb6ba4900 (rcu_read_lock){....}-{1:2}, at: hci_connect_cfm+0x29/0x190 [bluetooth] Fixes: `02171da6e8` ("Bluetooth: ISO: Add hcon for listening bis sk") Signed-off-by: Iulia Tanasescu <iulia.tanasescu@nxp.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2024-12-12 09:24:57 -05:00
Frédéric Danis	29a651451e	Bluetooth: SCO: Add support for 16 bits transparent voice setting The voice setting is used by sco_connect() or sco_conn_defer_accept() after being set by sco_sock_setsockopt(). The PCM part of the voice setting is used for offload mode through PCM chipset port. This commits add support for mSBC 16 bits offloading, i.e. audio data not transported over HCI. The BCM4349B1 supports 16 bits transparent data on its I2S port. If BT_VOICE_TRANSPARENT is used when accepting a SCO connection, this gives only garbage audio while using BT_VOICE_TRANSPARENT_16BIT gives correct audio. This has been tested with connection to iPhone 14 and Samsung S24. Fixes: `ad10b1a487` ("Bluetooth: Add Bluetooth socket voice option") Signed-off-by: Frédéric Danis <frederic.danis@collabora.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2024-12-12 09:24:35 -05:00
Iulia Tanasescu	9bde7c3b3a	Bluetooth: iso: Fix recursive locking warning This updates iso_sock_accept to use nested locking for the parent socket, to avoid lockdep warnings caused because the parent and child sockets are locked by the same thread: [ 41.585683] ============================================ [ 41.585688] WARNING: possible recursive locking detected [ 41.585694] 6.12.0-rc6+ #22 Not tainted [ 41.585701] -------------------------------------------- [ 41.585705] iso-tester/3139 is trying to acquire lock: [ 41.585711] ffff988b29530a58 (sk_lock-AF_BLUETOOTH) at: bt_accept_dequeue+0xe3/0x280 [bluetooth] [ 41.585905] but task is already holding lock: [ 41.585909] ffff988b29533a58 (sk_lock-AF_BLUETOOTH) at: iso_sock_accept+0x61/0x2d0 [bluetooth] [ 41.586064] other info that might help us debug this: [ 41.586069] Possible unsafe locking scenario: [ 41.586072] CPU0 [ 41.586076] ---- [ 41.586079] lock(sk_lock-AF_BLUETOOTH); [ 41.586086] lock(sk_lock-AF_BLUETOOTH); [ 41.586093] * DEADLOCK * [ 41.586097] May be due to missing lock nesting notation [ 41.586101] 1 lock held by iso-tester/3139: [ 41.586107] #0: ffff988b29533a58 (sk_lock-AF_BLUETOOTH) at: iso_sock_accept+0x61/0x2d0 [bluetooth] Fixes: `ccf74f2390` ("Bluetooth: Add BTPROTO_ISO socket type") Signed-off-by: Iulia Tanasescu <iulia.tanasescu@nxp.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2024-12-12 09:24:20 -05:00
Iulia Tanasescu	9c76fff747	Bluetooth: iso: Always release hdev at the end of iso_listen_bis Since hci_get_route holds the device before returning, the hdev should be released with hci_dev_put at the end of iso_listen_bis even if the function returns with an error. Fixes: `02171da6e8` ("Bluetooth: ISO: Add hcon for listening bis sk") Signed-off-by: Iulia Tanasescu <iulia.tanasescu@nxp.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2024-12-12 09:24:05 -05:00
Luiz Augusto von Dentz	581dd2dc16	Bluetooth: hci_event: Fix using rcu_read_(un)lock while iterating The usage of rcu_read_(un)lock while inside list_for_each_entry_rcu is not safe since for the most part entries fetched this way shall be treated as rcu_dereference: Note that the value returned by rcu_dereference() is valid only within the enclosing RCU read-side critical section [1]_. For example, the following is not legal:: rcu_read_lock(); p = rcu_dereference(head.next); rcu_read_unlock(); x = p->address; /* BUG!!! / rcu_read_lock(); y = p->data; / BUG!!! */ rcu_read_unlock(); Fixes: `a0bfde167b` ("Bluetooth: ISO: Add support for connecting multiple BISes") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2024-12-12 09:23:49 -05:00
Luiz Augusto von Dentz	4d94f05558	Bluetooth: hci_core: Fix sleeping function called from invalid context This reworks hci_cb_list to not use mutex hci_cb_list_lock to avoid bugs like the bellow: BUG: sleeping function called from invalid context at kernel/locking/mutex.c:585 in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 5070, name: kworker/u9:2 preempt_count: 0, expected: 0 RCU nest depth: 1, expected: 0 4 locks held by kworker/u9:2/5070: #0: ffff888015be3948 ((wq_completion)hci0#2){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3229 [inline] #0: ffff888015be3948 ((wq_completion)hci0#2){+.+.}-{0:0}, at: process_scheduled_works+0x8e0/0x1770 kernel/workqueue.c:3335 #1: ffffc90003b6fd00 ((work_completion)(&hdev->rx_work)){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3230 [inline] #1: ffffc90003b6fd00 ((work_completion)(&hdev->rx_work)){+.+.}-{0:0}, at: process_scheduled_works+0x91b/0x1770 kernel/workqueue.c:3335 #2: ffff8880665d0078 (&hdev->lock){+.+.}-{3:3}, at: hci_le_create_big_complete_evt+0xcf/0xae0 net/bluetooth/hci_event.c:6914 #3: ffffffff8e132020 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:298 [inline] #3: ffffffff8e132020 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:750 [inline] #3: ffffffff8e132020 (rcu_read_lock){....}-{1:2}, at: hci_le_create_big_complete_evt+0xdb/0xae0 net/bluetooth/hci_event.c:6915 CPU: 0 PID: 5070 Comm: kworker/u9:2 Not tainted 6.8.0-syzkaller-08073-g480e035fc4c7 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024 Workqueue: hci0 hci_rx_work Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114 __might_resched+0x5d4/0x780 kernel/sched/core.c:10187 __mutex_lock_common kernel/locking/mutex.c:585 [inline] __mutex_lock+0xc1/0xd70 kernel/locking/mutex.c:752 hci_connect_cfm include/net/bluetooth/hci_core.h:2004 [inline] hci_le_create_big_complete_evt+0x3d9/0xae0 net/bluetooth/hci_event.c:6939 hci_event_func net/bluetooth/hci_event.c:7514 [inline] hci_event_packet+0xa53/0x1540 net/bluetooth/hci_event.c:7569 hci_rx_work+0x3e8/0xca0 net/bluetooth/hci_core.c:4171 process_one_work kernel/workqueue.c:3254 [inline] process_scheduled_works+0xa00/0x1770 kernel/workqueue.c:3335 worker_thread+0x86d/0xd70 kernel/workqueue.c:3416 kthread+0x2f0/0x390 kernel/kthread.c:388 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:243 </TASK> Reported-by: syzbot+2fb0835e0c9cefc34614@syzkaller.appspotmail.com Tested-by: syzbot+2fb0835e0c9cefc34614@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=2fb0835e0c9cefc34614 Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2024-12-12 09:23:28 -05:00
Takashi Iwai	7b26bc6582	Merge tag 'asoc-fix-v6.12-rc2' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus ASoC: Fixes for v6.13 A small pile of driver specific fixes, all quite small and not particularly major.	2024-12-12 14:49:35 +01:00
T.J. Mercier	0cff90dec6	dma-buf: Fix __dma_buf_debugfs_list_del argument for !CONFIG_DEBUG_FS The arguments for __dma_buf_debugfs_list_del do not match for both the CONFIG_DEBUG_FS case and the !CONFIG_DEBUG_FS case. The !CONFIG_DEBUG_FS case should take a struct dma_buf , but it's currently struct file . This can lead to the build error: error: passing argument 1 of ‘__dma_buf_debugfs_list_del’ from incompatible pointer type [-Werror=incompatible-pointer-types] dma-buf.c:63:53: note: expected ‘struct file ’ but argument is of type ‘struct dma_buf ’ 63 \| static void __dma_buf_debugfs_list_del(struct file *file) Fixes: `bfc7bc5393` ("dma-buf: Do not build debugfs related code when !CONFIG_DEBUG_FS") Signed-off-by: T.J. Mercier <tjmercier@google.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20241117170326.1971113-1-tjmercier@google.com	2024-12-12 18:53:45 +05:30
Daniil Tatianin	c53d96a448	ACPICA: events/evxfregn: don't release the ContextMutex that was never acquired This bug was first introduced in `c27f3d011b`, where the author of the patch probably meant to do DeleteMutex instead of ReleaseMutex. The mutex leak was noticed later on and fixed in `e4dfe10837`, but the bogus MutexRelease line was never removed, so do it now. Link: https://github.com/acpica/acpica/pull/982 Fixes: `c27f3d011b` ("ACPICA: Fix race in generic_serial_bus (I2C) and GPIO op_region parameter handling") Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru> Link: https://patch.msgid.link/20241122082954.658356-1-d-tatianin@yandex-team.ru Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2024-12-12 13:22:41 +01:00
Paolo Abeni	3d64c3d3c6	Merge tag 'nf-24-12-11' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Fix bogus test reports in rpath.sh selftest by adding permanent neighbor entries, from Phil Sutter. 2) Lockdep reports possible ABBA deadlock in xt_IDLETIMER, fix it by removing sysfs out of the mutex section, also from Phil Sutter. 3) It is illegal to release basechain via RCU callback, for several reasons. Keep it simple and safe by calling synchronize_rcu() instead. This is a partially reverting a botched recent attempt of me to fix this basechain release path on netdevice removal. From Florian Westphal. netfilter pull request 24-12-11 * tag 'nf-24-12-11' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nf_tables: do not defer rule destruction via call_rcu netfilter: IDLETIMER: Fix for possible ABBA deadlock selftests: netfilter: Stabilize rpath.sh ==================== Link: https://patch.msgid.link/20241211230130.176937-1-pablo@netfilter.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-12-12 13:11:38 +01:00
Geert Uytterhoeven	de6b43798d	i2c: riic: Always round-up when calculating bus period Currently, the RIIC driver may run the I2C bus faster than requested, which may cause subtle failures. E.g. Biju reported a measured bus speed of 450 kHz instead of the expected maximum of 400 kHz on RZ/G2L. The initial calculation of the bus period uses DIV_ROUND_UP(), to make sure the actual bus speed never becomes faster than the requested bus speed. However, the subsequent division-by-two steps do not use round-up, which may lead to a too-small period, hence a too-fast and possible out-of-spec bus speed. E.g. on RZ/Five, requesting a bus speed of 100 resp. 400 kHz will yield too-fast target bus speeds of 100806 resp. 403226 Hz instead of 97656 resp. 390625 Hz. Fix this by using DIV_ROUND_UP() in the subsequent divisions, too. Tested on RZ/A1H, RZ/A2M, and RZ/Five. Fixes: `d982d66514` ("i2c: riic: remove clock and frequency restrictions") Reported-by: Biju Das <biju.das.jz@bp.renesas.com> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Cc: <stable@vger.kernel.org> # v4.15+ Link: https://lore.kernel.org/r/c59aea77998dfea1b4456c4b33b55ab216fcbf5e.1732284746.git.geert+renesas@glider.be Signed-off-by: Andi Shyti <andi.shyti@kernel.org>	2024-12-12 12:54:02 +01:00
Charles Keepax	255cc582e6	ASoC: Intel: sof_sdw: Add space for a terminator into DAIs array The code uses the initialised member of the asoc_sdw_dailink struct to determine if a member of the array is in use. However in the case the array is completely full this will lead to an access 1 past the end of the array, expand the array by one entry to include a space for a terminator. Fixes: `27fd36aefa` ("ASoC: Intel: sof-sdw: Add new code for parsing the snd_soc_acpi structs") Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://patch.msgid.link/20241212105742.1508574-1-ckeepax@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-12-12 11:08:49 +00:00
Daniel Borkmann	9871284458	team: Fix feature propagation of NETIF_F_GSO_ENCAP_ALL Similar to bonding driver, add NETIF_F_GSO_ENCAP_ALL to TEAM_VLAN_FEATURES in order to support slave devices which propagate NETIF_F_GSO_UDP_TUNNEL & NETIF_F_GSO_UDP_TUNNEL_CSUM as vlan_features. Fixes: `3625920b62` ("teaming: fix vlan_features computing") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Nikolay Aleksandrov <razor@blackwall.org> Cc: Ido Schimmel <idosch@idosch.org> Cc: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20241210141245.327886-5-daniel@iogearbox.net Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-12-12 11:59:18 +01:00
Daniel Borkmann	396699ac2c	team: Fix initial vlan_feature set in __team_compute_features Similarly as with bonding, fix the calculation of vlan_features to reuse netdev_base_features() in order derive the set in the same way as ndo_fix_features before iterating through the slave devices to refine the feature set. Fixes: `3625920b62` ("teaming: fix vlan_features computing") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Nikolay Aleksandrov <razor@blackwall.org> Cc: Ido Schimmel <idosch@idosch.org> Cc: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20241210141245.327886-4-daniel@iogearbox.net Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-12-12 11:59:18 +01:00
Daniel Borkmann	77b11c8bf3	bonding: Fix feature propagation of NETIF_F_GSO_ENCAP_ALL Drivers like mlx5 expose NIC's vlan_features such as NETIF_F_GSO_UDP_TUNNEL & NETIF_F_GSO_UDP_TUNNEL_CSUM which are later not propagated when the underlying devices are bonded and a vlan device created on top of the bond. Right now, the more cumbersome workaround for this is to create the vlan on top of the mlx5 and then enslave the vlan devices to a bond. To fix this, add NETIF_F_GSO_ENCAP_ALL to BOND_VLAN_FEATURES such that bond_compute_features() can probe and propagate the vlan_features from the slave devices up to the vlan device. Given the following bond: # ethtool -i enp2s0f{0,1}np{0,1} driver: mlx5_core [...] # ethtool -k enp2s0f0np0 \| grep udp tx-udp_tnl-segmentation: on tx-udp_tnl-csum-segmentation: on tx-udp-segmentation: on rx-udp_tunnel-port-offload: on rx-udp-gro-forwarding: off # ethtool -k enp2s0f1np1 \| grep udp tx-udp_tnl-segmentation: on tx-udp_tnl-csum-segmentation: on tx-udp-segmentation: on rx-udp_tunnel-port-offload: on rx-udp-gro-forwarding: off # ethtool -k bond0 \| grep udp tx-udp_tnl-segmentation: on tx-udp_tnl-csum-segmentation: on tx-udp-segmentation: on rx-udp_tunnel-port-offload: off [fixed] rx-udp-gro-forwarding: off Before: # ethtool -k bond0.100 \| grep udp tx-udp_tnl-segmentation: off [requested on] tx-udp_tnl-csum-segmentation: off [requested on] tx-udp-segmentation: on rx-udp_tunnel-port-offload: off [fixed] rx-udp-gro-forwarding: off After: # ethtool -k bond0.100 \| grep udp tx-udp_tnl-segmentation: on tx-udp_tnl-csum-segmentation: on tx-udp-segmentation: on rx-udp_tunnel-port-offload: off [fixed] rx-udp-gro-forwarding: off Various users have run into this reporting performance issues when configuring Cilium in vxlan tunneling mode and having the combination of bond & vlan for the core devices connecting the Kubernetes cluster to the outside world. Fixes: `a9b3ace44c` ("bonding: fix vlan_features computing") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Nikolay Aleksandrov <razor@blackwall.org> Cc: Ido Schimmel <idosch@idosch.org> Cc: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20241210141245.327886-3-daniel@iogearbox.net Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-12-12 11:59:18 +01:00
Daniel Borkmann	d064ea7fe2	bonding: Fix initial {vlan,mpls}_feature set in bond_compute_features If a bonding device has slave devices, then the current logic to derive the feature set for the master bond device is limited in that flags which are fully supported by the underlying slave devices cannot be propagated up to vlan devices which sit on top of bond devices. Instead, these get blindly masked out via current NETIF_F_ALL_FOR_ALL logic. vlan_features and mpls_features should reuse netdev_base_features() in order derive the set in the same way as ndo_fix_features before iterating through the slave devices to refine the feature set. Fixes: `a9b3ace44c` ("bonding: fix vlan_features computing") Fixes: `2e770b507c` ("net: bonding: Inherit MPLS features from slave devices") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Nikolay Aleksandrov <razor@blackwall.org> Cc: Ido Schimmel <idosch@idosch.org> Cc: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20241210141245.327886-2-daniel@iogearbox.net Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-12-12 11:59:18 +01:00
Daniel Borkmann	d2516c3a53	net, team, bonding: Add netdev_base_features helper Both bonding and team driver have logic to derive the base feature flags before iterating over their slave devices to refine the set via netdev_increment_features(). Add a small helper netdev_base_features() so this can be reused instead of having it open-coded multiple times. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Nikolay Aleksandrov <razor@blackwall.org> Cc: Ido Schimmel <idosch@idosch.org> Cc: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20241210141245.327886-1-daniel@iogearbox.net Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-12-12 11:59:18 +01:00
Joanne Koong	7a4f541873	fuse: fix direct io folio offset and length calculation For the direct io case, the pages from userspace may be part of a huge folio, even if all folios in the page cache for fuse are small. Fix the logic for calculating the offset and length of the folio for the direct io case, which currently incorrectly assumes that all folios encountered are one page size. Fixes: `3b97c3652d` ("fuse: convert direct io to use folios") Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com> Reviewed-by: Bernd Schubert <bschubert@ddn.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2024-12-12 09:27:42 +01:00
James Clark	434fffa926	perf probe: Fix uninitialized variable Since the linked fixes: commit, err is returned uninitialized due to the removal of "return 0". Initialize err to fix it. This fixes the following intermittent test failure on release builds: $ perf test "testsuite_probe" ... -- [ FAIL ] -- perf_probe :: test_invalid_options :: mutually exclusive options :: -L foo -V bar (output regexp parsing) Regexp not found: \"Error: switch .+ cannot be used with switch .+\" ... Fixes: `080e47b2a2` ("perf probe: Introduce quotation marks support") Tested-by: Namhyung Kim <namhyung@kernel.org> Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20241211085525.519458-2-james.clark@linaro.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2024-12-11 21:40:46 -08:00
Martin Ottens	f8d4bc4550	net/sched: netem: account for backlog updates from child qdisc In general, 'qlen' of any classful qdisc should keep track of the number of packets that the qdisc itself and all of its children holds. In case of netem, 'qlen' only accounts for the packets in its internal tfifo. When netem is used with a child qdisc, the child qdisc can use 'qdisc_tree_reduce_backlog' to inform its parent, netem, about created or dropped SKBs. This function updates 'qlen' and the backlog statistics of netem, but netem does not account for changes made by a child qdisc. 'qlen' then indicates the wrong number of packets in the tfifo. If a child qdisc creates new SKBs during enqueue and informs its parent about this, netem's 'qlen' value is increased. When netem dequeues the newly created SKBs from the child, the 'qlen' in netem is not updated. If 'qlen' reaches the configured sch->limit, the enqueue function stops working, even though the tfifo is not full. Reproduce the bug: Ensure that the sender machine has GSO enabled. Configure netem as root qdisc and tbf as its child on the outgoing interface of the machine as follows: $ tc qdisc add dev <oif> root handle 1: netem delay 100ms limit 100 $ tc qdisc add dev <oif> parent 1:0 tbf rate 50Mbit burst 1542 latency 50ms Send bulk TCP traffic out via this interface, e.g., by running an iPerf3 client on the machine. Check the qdisc statistics: $ tc -s qdisc show dev <oif> Statistics after 10s of iPerf3 TCP test before the fix (note that netem's backlog > limit, netem stopped accepting packets): qdisc netem 1: root refcnt 2 limit 1000 delay 100ms Sent 2767766 bytes 1848 pkt (dropped 652, overlimits 0 requeues 0) backlog 4294528236b 1155p requeues 0 qdisc tbf 10: parent 1:1 rate 50Mbit burst 1537b lat 50ms Sent 2767766 bytes 1848 pkt (dropped 327, overlimits 7601 requeues 0) backlog 0b 0p requeues 0 Statistics after the fix: qdisc netem 1: root refcnt 2 limit 1000 delay 100ms Sent 37766372 bytes 24974 pkt (dropped 9, overlimits 0 requeues 0) backlog 0b 0p requeues 0 qdisc tbf 10: parent 1:1 rate 50Mbit burst 1537b lat 50ms Sent 37766372 bytes 24974 pkt (dropped 327, overlimits 96017 requeues 0) backlog 0b 0p requeues 0 tbf segments the GSO SKBs (tbf_segment) and updates the netem's 'qlen'. The interface fully stops transferring packets and "locks". In this case, the child qdisc and tfifo are empty, but 'qlen' indicates the tfifo is at its limit and no more packets are accepted. This patch adds a counter for the entries in the tfifo. Netem's 'qlen' is only decreased when a packet is returned by its dequeue function, and not during enqueuing into the child qdisc. External updates to 'qlen' are thus accounted for and only the behavior of the backlog statistics changes. As in other qdiscs, 'qlen' then keeps track of how many packets are held in netem and all of its children. As before, sch->limit remains as the maximum number of packets in the tfifo. The same applies to netem's backlog statistics. Fixes: `50612537e9` ("netem: fix classful handling") Signed-off-by: Martin Ottens <martin.ottens@fau.de> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://patch.msgid.link/20241210131412.1837202-1-martin.ottens@fau.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-11 20:26:37 -08:00
Jakub Kicinski	564d74d6ac	Merge tag 'batadv-net-pullrequest-20241210' of git://git.open-mesh.org/linux-merge Simon Wunderlich says: ==================== Here are some batman-adv bugfixes: - fix TT unitialized data and size limit issues, by Remi Pommarel (3 patches) * tag 'batadv-net-pullrequest-20241210' of git://git.open-mesh.org/linux-merge: batman-adv: Do not let TT changes list grows indefinitely batman-adv: Remove uninitialized data in full table TT response batman-adv: Do not send uninitialized TT changes ==================== Link: https://patch.msgid.link/20241210135024.39068-1-sw@simonwunderlich.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-11 20:26:00 -08:00
Vladimir Oltean	acfcdb78d5	net: dsa: felix: fix stuck CPU-injected packets with short taprio windows With this port schedule: tc qdisc replace dev $send_if parent root handle 100 taprio \ num_tc 8 queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \ map 0 1 2 3 4 5 6 7 \ base-time 0 cycle-time 10000 \ sched-entry S 01 1250 \ sched-entry S 02 1250 \ sched-entry S 04 1250 \ sched-entry S 08 1250 \ sched-entry S 10 1250 \ sched-entry S 20 1250 \ sched-entry S 40 1250 \ sched-entry S 80 1250 \ flags 2 ptp4l would fail to take TX timestamps of Pdelay_Resp messages like this: increasing tx_timestamp_timeout may correct this issue, but it is likely caused by a driver bug ptp4l[4134.168]: port 2: send peer delay response failed It turns out that the driver can't take their TX timestamps because it can't transmit them in the first place. And there's nothing special about the Pdelay_Resp packets - they're just regular 68 byte packets. But with this taprio configuration, the switch would refuse to send even the ETH_ZLEN minimum packet size. This should have definitely not been the case. When applying the taprio config, the driver prints: mscc_felix 0000:00:00.5: port 0 tc 0 min gate length 1250 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 132 octets including FCS mscc_felix 0000:00:00.5: port 0 tc 1 min gate length 1250 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 132 octets including FCS mscc_felix 0000:00:00.5: port 0 tc 2 min gate length 1250 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 132 octets including FCS mscc_felix 0000:00:00.5: port 0 tc 3 min gate length 1250 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 132 octets including FCS mscc_felix 0000:00:00.5: port 0 tc 4 min gate length 1250 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 132 octets including FCS mscc_felix 0000:00:00.5: port 0 tc 5 min gate length 1250 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 132 octets including FCS mscc_felix 0000:00:00.5: port 0 tc 6 min gate length 1250 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 132 octets including FCS mscc_felix 0000:00:00.5: port 0 tc 7 min gate length 1250 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 132 octets including FCS and thus, everything under 132 bytes - ETH_FCS_LEN should have been sent without problems. Yet it's not. For the forwarding path, the configuration is fine, yet packets injected from Linux get stuck with this schedule no matter what. The first hint that the static guard bands are the cause of the problem is that reverting Michael Walle's commit `297c4de6f7` ("net: dsa: felix: re-enable TAS guard band mode") made things work. It must be that the guard bands are calculated incorrectly. I remembered that there is a magic constant in the driver, set to 33 ns for no logical reason other than experimentation, which says "never let the static guard bands get so large as to leave less than this amount of remaining space in the time slot, because the queue system will refuse to schedule packets otherwise, and they will get stuck". I had a hunch that my previous experimentally-determined value was only good for packets coming from the forwarding path, and that the CPU injection path needed more. I came to the new value of 35 ns through binary search, after seeing that with 544 ns (the bit time required to send the Pdelay_Resp packet at gigabit) it works. Again, this is purely experimental, there's no logic and the manual doesn't say anything. The new driver prints for this schedule look like this: mscc_felix 0000:00:00.5: port 0 tc 0 min gate length 1250 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 131 octets including FCS mscc_felix 0000:00:00.5: port 0 tc 1 min gate length 1250 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 131 octets including FCS mscc_felix 0000:00:00.5: port 0 tc 2 min gate length 1250 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 131 octets including FCS mscc_felix 0000:00:00.5: port 0 tc 3 min gate length 1250 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 131 octets including FCS mscc_felix 0000:00:00.5: port 0 tc 4 min gate length 1250 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 131 octets including FCS mscc_felix 0000:00:00.5: port 0 tc 5 min gate length 1250 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 131 octets including FCS mscc_felix 0000:00:00.5: port 0 tc 6 min gate length 1250 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 131 octets including FCS mscc_felix 0000:00:00.5: port 0 tc 7 min gate length 1250 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 131 octets including FCS So yes, the maximum MTU is now even smaller by 1 byte than before. This is maybe counter-intuitive, but makes more sense with a diagram of one time slot. Before: Gate open Gate close \| \| v 1250 ns total time slot duration v <----------------------------------------------------> <----><----------------------------------------------> 33 ns 1217 ns static guard band useful Gate open Gate close \| \| v 1250 ns total time slot duration v <----------------------------------------------------> <-----><---------------------------------------------> 35 ns 1215 ns static guard band useful The static guard band implemented by this switch hardware directly determines the maximum allowable MTU for that traffic class. The larger it is, the earlier the switch will stop scheduling frames for transmission, because otherwise they might overrun the gate close time (and avoiding that is the entire purpose of Michael's patch). So, we now have guard bands smaller by 2 ns, thus, in this particular case, we lose a byte of the maximum MTU. Fixes: `11afdc6526` ("net: dsa: felix: tc-taprio intervals smaller than MTU should send at least one packet") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Michael Walle <mwalle@kernel.org> Link: https://patch.msgid.link/20241210132640.3426788-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-11 20:24:56 -08:00
Frederik Deweerdt	6bd8614fc2	splice: do not checksum AF_UNIX sockets When `skb_splice_from_iter` was introduced, it inadvertently added checksumming for AF_UNIX sockets. This resulted in significant slowdowns, for example when using sendfile over unix sockets. Using the test code in [1] in my test setup (2G single core qemu), the client receives a 1000M file in: - without the patch: 1482ms (+/- 36ms) - with the patch: 652.5ms (+/- 22.9ms) This commit addresses the issue by marking checksumming as unnecessary in `unix_stream_sendmsg` Cc: stable@vger.kernel.org Signed-off-by: Frederik Deweerdt <deweerdt.lkml@gmail.com> Fixes: `2e910b9532` ("net: Add a function to splice pages into an skbuff for MSG_SPLICE_PAGES") Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/Z1fMaHkRf8cfubuE@xiberoa Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-11 20:22:41 -08:00
Daniele Palmas	3b58b53a26	net: usb: qmi_wwan: add Telit FE910C04 compositions Add the following Telit FE910C04 compositions: 0x10c0: rmnet + tty (AT/NMEA) + tty (AT) + tty (diag) T: Bus=02 Lev=01 Prnt=03 Port=06 Cnt=01 Dev#= 13 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=10c0 Rev=05.15 S: Manufacturer=Telit Cinterion S: Product=FE910 S: SerialNumber=f71b8b32 C: #Ifs= 4 Cfg#= 1 Atr=e0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=82(I) Atr=03(Int.) MxPS= 8 Ivl=32ms I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=60 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=84(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=86(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms 0x10c4: rmnet + tty (AT) + tty (AT) + tty (diag) T: Bus=02 Lev=01 Prnt=03 Port=06 Cnt=01 Dev#= 14 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=10c4 Rev=05.15 S: Manufacturer=Telit Cinterion S: Product=FE910 S: SerialNumber=f71b8b32 C: #Ifs= 4 Cfg#= 1 Atr=e0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=82(I) Atr=03(Int.) MxPS= 8 Ivl=32ms I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=84(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=86(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms 0x10c8: rmnet + tty (AT) + tty (diag) + DPL (data packet logging) + adb T: Bus=02 Lev=01 Prnt=03 Port=06 Cnt=01 Dev#= 17 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=10c8 Rev=05.15 S: Manufacturer=Telit Cinterion S: Product=FE910 S: SerialNumber=f71b8b32 C: #Ifs= 5 Cfg#= 1 Atr=e0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=82(I) Atr=03(Int.) MxPS= 8 Ivl=32ms I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=84(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 3 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=80 Driver=(none) E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none) E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms Signed-off-by: Daniele Palmas <dnlplm@gmail.com> Link: https://patch.msgid.link/20241209151821.3688829-1-dnlplm@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-11 20:21:42 -08:00
Jakub Kicinski	563307e9c0	Merge branch 'mana-fix-few-memory-leaks-in-mana_gd_setup_irqs' Maxim Levitsky says: ==================== MANA: Fix few memory leaks in mana_gd_setup_irqs Fix 2 minor memory leaks in the mana driver, introduced by commit ==================== Link: https://patch.msgid.link/20241209175751.287738-1-mlevitsk@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-11 20:21:07 -08:00
Maxim Levitsky	9a5beb6ca6	net: mana: Fix irq_contexts memory leak in mana_gd_setup_irqs gc->irq_contexts is not freeded if one of the later operations fail. Suggested-by: Michael Kelley <mhklinux@outlook.com> Fixes: `8afefc3612` ("net: mana: Assigning IRQ affinity on HT cores") Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Saurabh Sengar <ssengar@linux.microsoft.com> Reviewed-by: Yury Norov <yury.norov@gmail.com> Link: https://patch.msgid.link/20241209175751.287738-3-mlevitsk@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-11 20:21:04 -08:00
Maxim Levitsky	bb1e3eb57d	net: mana: Fix memory leak in mana_gd_setup_irqs Commit `8afefc3612` ("net: mana: Assigning IRQ affinity on HT cores") added memory allocation in mana_gd_setup_irqs of 'irqs' but the code doesn't free this temporary array in the success path. This was caught by kmemleak. Fixes: `8afefc3612` ("net: mana: Assigning IRQ affinity on HT cores") Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Saurabh Sengar <ssengar@linux.microsoft.com> Reviewed-by: Yury Norov <yury.norov@gmail.com> Link: https://patch.msgid.link/20241209175751.287738-2-mlevitsk@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-11 20:21:04 -08:00
Simon Horman	15bfb14727	MAINTAINERS: Add ethtool.h to NETWORKING [GENERAL] This is part of an effort to assign a section in MAINTAINERS to header files related to Networking. In this case the files named ethool.h. Signed-off-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241210-mnt-ethtool-h-v1-1-2a40b567939d@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-11 19:55:07 -08:00
Jann Horn	f49856f525	udmabuf: fix memory leak on last export_udmabuf() error path In export_udmabuf(), if dma_buf_fd() fails because the FD table is full, a dma_buf owning the udmabuf has already been created; but the error handling in udmabuf_create() will tear down the udmabuf without doing anything about the containing dma_buf. This leaves a dma_buf in memory that contains a dangling pointer; though that doesn't seem to lead to anything bad except a memory leak. Fix it by moving the dma_buf_fd() call out of export_udmabuf() so that we can give it different error handling. Note that the shape of this code changed a lot in commit `5e72b2b41a` ("udmabuf: convert udmabuf driver to use folios"); but the memory leak seems to have existed since the introduction of udmabuf. Fixes: `fbb0de7950` ("Add udmabuf misc device") Acked-by: Vivek Kasireddy <vivek.kasireddy@intel.com> Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241204-udmabuf-fixes-v2-3-23887289de1c@google.com	2024-12-11 16:47:41 -08:00
Jann Horn	0a16e24e34	udmabuf: also check for F_SEAL_FUTURE_WRITE When F_SEAL_FUTURE_WRITE was introduced, it was overlooked that udmabuf must reject memfds with this flag, just like ones with F_SEAL_WRITE. Fix it by adding F_SEAL_FUTURE_WRITE to SEALS_DENIED. Fixes: `ab3948f58f` ("mm/memfd: add an F_SEAL_FUTURE_WRITE seal to memfd") Cc: stable@vger.kernel.org Acked-by: Vivek Kasireddy <vivek.kasireddy@intel.com> Signed-off-by: Jann Horn <jannh@google.com> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241204-udmabuf-fixes-v2-2-23887289de1c@google.com	2024-12-11 16:47:40 -08:00
Jann Horn	9cb189a882	udmabuf: fix racy memfd sealing check The current check_memfd_seals() is racy: Since we first do check_memfd_seals() and then udmabuf_pin_folios() without holding any relevant lock across both, F_SEAL_WRITE can be set in between. This is problematic because we can end up holding pins to pages in a write-sealed memfd. Fix it using the inode lock, that's probably the easiest way. In the future, we might want to consider moving this logic into memfd, especially if anyone else wants to use memfd_pin_folios(). Reported-by: Julian Orth <ju.orth@gmail.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219106 Closes: https://lore.kernel.org/r/CAG48ez0w8HrFEZtJkfmkVKFDhE5aP7nz=obrimeTgpD+StkV9w@mail.gmail.com Fixes: `fbb0de7950` ("Add udmabuf misc device") Cc: stable@vger.kernel.org Signed-off-by: Jann Horn <jannh@google.com> Acked-by: Joel Fernandes (Google) <joel@joelfernandes.org> Acked-by: Vivek Kasireddy <vivek.kasireddy@intel.com> Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241204-udmabuf-fixes-v2-1-23887289de1c@google.com	2024-12-11 16:47:40 -08:00
Robert Beckett	ebefac5647	nvme-pci: 512 byte aligned dma pool segment quirk We initially introduced a quick fix limiting the queue depth to 1 as experimentation showed that it fixed data corruption on 64GB steamdecks. Further experimentation revealed corruption only happens when the last PRP data element aligns to the end of the page boundary. The device appears to treat this as a PRP chain to a new list instead of the data element that it actually is. This implementation is in violation of the spec. Encountering this errata with the Linux driver requires the host request a 128k transfer and coincidently be handed the last small pool dma buffer within a page. The QD1 quirk effectly works around this because the last data PRP always was at a 248 byte offset from the page start, so it never appeared at the end of the page, but comes at the expense of throttling IO and wasting the remainder of the PRP page beyond 256 bytes. Also to note, the MDTS on these devices is small enough that the "large" prp pool can hold enough PRP elements to never reach the end, so that pool is not a problem either. Introduce a new quirk to ensure the small pool is always aligned such that the last PRP element can't appear a the end of the page. This comes at the expense of wasting 256 bytes per small pool page allocated. Link: https://lore.kernel.org/linux-nvme/20241113043151.GA20077@lst.de/T/#u Fixes: `83bdfcbdbe` ("nvme-pci: qdepth 1 quirk") Cc: Paweł Anikiel <panikiel@google.com> Signed-off-by: Robert Beckett <bob.beckett@collabora.com> Signed-off-by: Keith Busch <kbusch@kernel.org>	2024-12-11 14:46:23 -08:00
Florian Westphal	b04df3da1b	netfilter: nf_tables: do not defer rule destruction via call_rcu nf_tables_chain_destroy can sleep, it can't be used from call_rcu callbacks. Moreover, nf_tables_rule_release() is only safe for error unwinding, while transaction mutex is held and the to-be-desroyed rule was not exposed to either dataplane or dumps, as it deactives+frees without the required synchronize_rcu() in-between. nft_rule_expr_deactivate() callbacks will change ->use counters of other chains/sets, see e.g. nft_lookup .deactivate callback, these must be serialized via transaction mutex. Also add a few lockdep asserts to make this more explicit. Calling synchronize_rcu() isn't ideal, but fixing this without is hard and way more intrusive. As-is, we can get: WARNING: .. net/netfilter/nf_tables_api.c:5515 nft_set_destroy+0x.. Workqueue: events nf_tables_trans_destroy_work RIP: 0010:nft_set_destroy+0x3fe/0x5c0 Call Trace: <TASK> nf_tables_trans_destroy_work+0x6b7/0xad0 process_one_work+0x64a/0xce0 worker_thread+0x613/0x10d0 In case the synchronize_rcu becomes an issue, we can explore alternatives. One way would be to allocate nft_trans_rule objects + one nft_trans_chain object, deactivate the rules + the chain and then defer the freeing to the nft destroy workqueue. We'd still need to keep the synchronize_rcu path as a fallback to handle -ENOMEM corner cases though. Reported-by: syzbot+b26935466701e56cfdc2@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/67478d92.050a0220.253251.0062.GAE@google.com/T/ Fixes: `c03d278fdf` ("netfilter: nf_tables: wait for rcu grace period on net_device removal") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2024-12-11 23:27:50 +01:00
Phil Sutter	f36b01994d	netfilter: IDLETIMER: Fix for possible ABBA deadlock Deletion of the last rule referencing a given idletimer may happen at the same time as a read of its file in sysfs: \| ====================================================== \| WARNING: possible circular locking dependency detected \| 6.12.0-rc7-01692-g5e9a28f41134-dirty #594 Not tainted \| ------------------------------------------------------ \| iptables/3303 is trying to acquire lock: \| ffff8881057e04b8 (kn->active#48){++++}-{0:0}, at: __kernfs_remove+0x20 \| \| but task is already holding lock: \| ffffffffa0249068 (list_mutex){+.+.}-{3:3}, at: idletimer_tg_destroy_v] \| \| which lock already depends on the new lock. A simple reproducer is: \| #!/bin/bash \| \| while true; do \| iptables -A INPUT -i foo -j IDLETIMER --timeout 10 --label "testme" \| iptables -D INPUT -i foo -j IDLETIMER --timeout 10 --label "testme" \| done & \| while true; do \| cat /sys/class/xt_idletimer/timers/testme >/dev/null \| done Avoid this by freeing list_mutex right after deleting the element from the list, then continuing with the teardown. Fixes: `0902b469bd` ("netfilter: xtables: idletimer target implementation") Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2024-12-11 23:15:19 +01:00
Phil Sutter	d92906fd1b	selftests: netfilter: Stabilize rpath.sh On some systems, neighbor discoveries from ns1 for fec0:42::1 (i.e., the martian trap address) would happen at the wrong time and cause false-negative test result. Problem analysis also discovered that IPv6 martian ping test was broken in that sent neighbor discoveries, not echo requests were inadvertently trapped Avoid the race condition by introducing the neighbors to each other upfront. Also pin down the firewall rules to matching on echo requests only. Fixes: `efb056e5f1` ("netfilter: ip6t_rpfilter: Fix regression with VRF interfaces") Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2024-12-11 23:15:17 +01:00
Linus Torvalds	231825b2e1	Revert "unicode: Don't special case ignorable code points" This reverts commit `5c26d2f1d3`. It turns out that we can't do this, because while the old behavior of ignoring ignorable code points was most definitely wrong, we have case-folding filesystems with on-disk hash values with that wrong behavior. So now you can't look up those names, because they hash to something different. Of course, it's also entirely possible that in the meantime people have created new files with the new ("more correct") case folding logic, and reverting will just make other things break. The correct solution is to not do case folding in filesystems, but sadly, people seem to never really understand that. People still see it as a feature, not a bug. Reported-by: Qi Han <hanqi@vivo.com> Link: https://bugzilla.kernel.org/show_bug.cgi?id=219586 Cc: Gabriel Krisman Bertazi <krisman@suse.de> Requested-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2024-12-11 14:11:23 -08:00
Linus Torvalds	ec8e2d3889	Merge tag 'vfio-v6.13-rc3' of https://github.com/awilliam/linux-vfio Pull vfio fix from Alex Williamson: - Fix migration dirty page tracking support in the mlx5-vfio-pci variant driver in configurations where the system page size exceeds the device maximum message size, and anticipate device updates where the opposite may also be required (Yishai Hadas) * tag 'vfio-v6.13-rc3' of https://github.com/awilliam/linux-vfio: vfio/mlx5: Align the page tracking max message size with the device capability	2024-12-11 13:48:25 -08:00
Linus Torvalds	becb337c23	Merge tag 'linux_kselftest-fixes-6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull Kselftest fix from Shuah Khan: - fix the offset for kprobe syntax error test case when checking the BTF arguments on 64-bit powerpc * tag 'linux_kselftest-fixes-6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests/ftrace: adjust offset for kprobe syntax error test	2024-12-11 13:41:41 -08:00
Tejun Heo	18b2093f45	sched_ext: Fix invalid irq restore in scx_ops_bypass() While adding outer irqsave/restore locking, `0e7ffff1b8` ("scx: Fix raciness in scx_ops_bypass()") forgot to convert an inner rq_unlock_irqrestore() to rq_unlock() which could re-enable IRQ prematurely leading to the following warning: raw_local_irq_restore() called with IRQs enabled WARNING: CPU: 1 PID: 96 at kernel/locking/irqflag-debug.c:10 warn_bogus_irq_restore+0x30/0x40 ... Sched_ext: create_dsq (enabling) pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : warn_bogus_irq_restore+0x30/0x40 lr : warn_bogus_irq_restore+0x30/0x40 ... Call trace: warn_bogus_irq_restore+0x30/0x40 (P) warn_bogus_irq_restore+0x30/0x40 (L) scx_ops_bypass+0x224/0x3b8 scx_ops_enable.isra.0+0x2c8/0xaa8 bpf_scx_reg+0x18/0x30 ... irq event stamp: 33739 hardirqs last enabled at (33739): [<ffff8000800b699c>] scx_ops_bypass+0x174/0x3b8 hardirqs last disabled at (33738): [<ffff800080d48ad4>] _raw_spin_lock_irqsave+0xb4/0xd8 Drop the stray _irqrestore(). Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Ihor Solodrai <ihor.solodrai@pm.me> Link: http://lkml.kernel.org/r/qC39k3UsonrBYD_SmuxHnZIQLsuuccoCrkiqb_BT7DvH945A1_LZwE4g-5Pu9FcCtqZt4lY1HhIPi0homRuNWxkgo1rgP3bkxa0donw8kV4=@pm.me Fixes: `0e7ffff1b8` ("scx: Fix raciness in scx_ops_bypass()") Cc: stable@vger.kernel.org # v6.12	2024-12-11 11:02:35 -10:00
Borislav Petkov (AMD)	747367340c	EDAC/amd64: Simplify ECC check on unified memory controllers The intent of the check is to see whether at least one UMC has ECC enabled. So do that instead of tracking which ones are enabled in masks which are too small in size anyway and lead to not loading the driver on Zen4 machines with UMCs enabled over UMC8. Fixes: `e2be5955a8` ("EDAC/amd64: Add support for AMD Family 19h Models 10h-1Fh and A0h-AFh") Reported-by: Avadhut Naik <avadhut.naik@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: Avadhut Naik <avadhut.naik@amd.com> Reviewed-by: Avadhut Naik <avadhut.naik@amd.com> Cc: <stable@kernel.org> Link: https://lore.kernel.org/r/20241210212054.3895697-1-avadhut.naik@amd.com	2024-12-11 21:47:33 +01:00
Björn Töpel	21f1b85c89	riscv: mm: Do not call pmd dtor on vmemmap page table teardown The vmemmap's, which is used for RV64 with SPARSEMEM_VMEMMAP, page tables are populated using pmd (page middle directory) hugetables. However, the pmd allocation is not using the generic mechanism used by the VMA code (e.g. pmd_alloc()), or the RISC-V specific create_pgd_mapping()/alloc_pmd_late(). Instead, the vmemmap page table code allocates a page, and calls vmemmap_set_pmd(). This results in that the pmd ctor is not called, nor would it make sense to do so. Now, when tearing down a vmemmap page table pmd, the cleanup code would unconditionally, and incorrectly call the pmd dtor, which results in a crash (best case). This issue was found when running the HMM selftests: \| tools/testing/selftests/mm# ./test_hmm.sh smoke \| ... # when unloading the test_hmm.ko module \| page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x10915b \| flags: 0x1000000000000000(node=0\|zone=1) \| raw: 1000000000000000 0000000000000000 dead000000000122 0000000000000000 \| raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000 \| page dumped because: VM_BUG_ON_PAGE(ptdesc->pmd_huge_pte) \| ------------[ cut here ]------------ \| kernel BUG at include/linux/mm.h:3080! \| Kernel BUG [#1] \| Modules linked in: test_hmm(-) sch_fq_codel fuse drm drm_panel_orientation_quirks backlight dm_mod \| CPU: 1 UID: 0 PID: 514 Comm: modprobe Tainted: G W 6.12.0-00982-gf2a4f1682d07 #2 \| Tainted: [W]=WARN \| Hardware name: riscv-virtio qemu/qemu, BIOS 2024.10 10/01/2024 \| epc : remove_pgd_mapping+0xbec/0x1070 \| ra : remove_pgd_mapping+0xbec/0x1070 \| epc : ffffffff80010a68 ra : ffffffff80010a68 sp : ff20000000a73940 \| gp : ffffffff827b2d88 tp : ff6000008785da40 t0 : ffffffff80fbce04 \| t1 : 0720072007200720 t2 : 706d756420656761 s0 : ff20000000a73a50 \| s1 : ff6000008915cff8 a0 : 0000000000000039 a1 : 0000000000000008 \| a2 : ff600003fff0de20 a3 : 0000000000000000 a4 : 0000000000000000 \| a5 : 0000000000000000 a6 : c0000000ffffefff a7 : ffffffff824469b8 \| s2 : ff1c0000022456c0 s3 : ff1ffffffdbfffff s4 : ff6000008915c000 \| s5 : ff6000008915c000 s6 : ff6000008915c000 s7 : ff1ffffffdc00000 \| s8 : 0000000000000001 s9 : ff1ffffffdc00000 s10: ffffffff819a31f0 \| s11: ffffffffffffffff t3 : ffffffff8000c950 t4 : ff60000080244f00 \| t5 : ff60000080244000 t6 : ff20000000a73708 \| status: 0000000200000120 badaddr: ffffffff80010a68 cause: 0000000000000003 \| [<ffffffff80010a68>] remove_pgd_mapping+0xbec/0x1070 \| [<ffffffff80fd238e>] vmemmap_free+0x14/0x1e \| [<ffffffff8032e698>] section_deactivate+0x220/0x452 \| [<ffffffff8032ef7e>] sparse_remove_section+0x4a/0x58 \| [<ffffffff802f8700>] __remove_pages+0x7e/0xba \| [<ffffffff803760d8>] memunmap_pages+0x2bc/0x3fe \| [<ffffffff02a3ca28>] dmirror_device_remove_chunks+0x2ea/0x518 [test_hmm] \| [<ffffffff02a3e026>] hmm_dmirror_exit+0x3e/0x1018 [test_hmm] \| [<ffffffff80102c14>] __riscv_sys_delete_module+0x15a/0x2a6 \| [<ffffffff80fd020c>] do_trap_ecall_u+0x1f2/0x266 \| [<ffffffff80fde0a2>] _new_vmalloc_restore_context_a0+0xc6/0xd2 \| Code: bf51 7597 0184 8593 76a5 854a 4097 0029 80e7 2c00 (9002) 7597 \| ---[ end trace 0000000000000000 ]--- \| Kernel panic - not syncing: Fatal exception in interrupt Add a check to avoid calling the pmd dtor, if the calling context is vmemmap_free(). Fixes: `c75a74f4ba` ("riscv: mm: Add memory hotplugging support") Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20241120131203.1859787-1-bjorn@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-12-11 11:44:21 -08:00
Alexandre Ghiti	b3431a8bb3	riscv: Fix IPIs usage in kfence_protect_page() flush_tlb_kernel_range() may use IPIs to flush the TLBs of all the cores, which triggers the following warning when the irqs are disabled: [ 3.455330] WARNING: CPU: 1 PID: 0 at kernel/smp.c:815 smp_call_function_many_cond+0x452/0x520 [ 3.456647] Modules linked in: [ 3.457218] CPU: 1 UID: 0 PID: 0 Comm: swapper/1 Not tainted 6.12.0-rc7-00010-g91d3de7240b8 #1 [ 3.457416] Hardware name: QEMU QEMU Virtual Machine, BIOS [ 3.457633] epc : smp_call_function_many_cond+0x452/0x520 [ 3.457736] ra : on_each_cpu_cond_mask+0x1e/0x30 [ 3.457786] epc : ffffffff800b669a ra : ffffffff800b67c2 sp : ff2000000000bb50 [ 3.457824] gp : ffffffff815212b8 tp : ff6000008014f080 t0 : 000000000000003f [ 3.457859] t1 : ffffffff815221e0 t2 : 000000000000000f s0 : ff2000000000bc10 [ 3.457920] s1 : 0000000000000040 a0 : ffffffff815221e0 a1 : 0000000000000001 [ 3.457953] a2 : 0000000000010000 a3 : 0000000000000003 a4 : 0000000000000000 [ 3.458006] a5 : 0000000000000000 a6 : ffffffffffffffff a7 : 0000000000000000 [ 3.458042] s2 : ffffffff815223be s3 : 00fffffffffff000 s4 : ff600001ffe38fc0 [ 3.458076] s5 : ff600001ff950d00 s6 : 0000000200000120 s7 : 0000000000000001 [ 3.458109] s8 : 0000000000000001 s9 : ff60000080841ef0 s10: 0000000000000001 [ 3.458141] s11: ffffffff81524812 t3 : 0000000000000001 t4 : ff60000080092bc0 [ 3.458172] t5 : 0000000000000000 t6 : ff200000000236d0 [ 3.458203] status: 0000000200000100 badaddr: ffffffff800b669a cause: 0000000000000003 [ 3.458373] [<ffffffff800b669a>] smp_call_function_many_cond+0x452/0x520 [ 3.458593] [<ffffffff800b67c2>] on_each_cpu_cond_mask+0x1e/0x30 [ 3.458625] [<ffffffff8000e4ca>] __flush_tlb_range+0x118/0x1ca [ 3.458656] [<ffffffff8000e6b2>] flush_tlb_kernel_range+0x1e/0x26 [ 3.458683] [<ffffffff801ea56a>] kfence_protect+0xc0/0xce [ 3.458717] [<ffffffff801e9456>] kfence_guarded_free+0xc6/0x1c0 [ 3.458742] [<ffffffff801e9d6c>] __kfence_free+0x62/0xc6 [ 3.458764] [<ffffffff801c57d8>] kfree+0x106/0x32c [ 3.458786] [<ffffffff80588cf2>] detach_buf_split+0x188/0x1a8 [ 3.458816] [<ffffffff8058708c>] virtqueue_get_buf_ctx+0xb6/0x1f6 [ 3.458839] [<ffffffff805871da>] virtqueue_get_buf+0xe/0x16 [ 3.458880] [<ffffffff80613d6a>] virtblk_done+0x5c/0xe2 [ 3.458908] [<ffffffff8058766e>] vring_interrupt+0x6a/0x74 [ 3.458930] [<ffffffff800747d8>] __handle_irq_event_percpu+0x7c/0xe2 [ 3.458956] [<ffffffff800748f0>] handle_irq_event+0x3c/0x86 [ 3.458978] [<ffffffff800786cc>] handle_simple_irq+0x9e/0xbe [ 3.459004] [<ffffffff80073934>] generic_handle_domain_irq+0x1c/0x2a [ 3.459027] [<ffffffff804bf87c>] imsic_handle_irq+0xba/0x120 [ 3.459056] [<ffffffff80073934>] generic_handle_domain_irq+0x1c/0x2a [ 3.459080] [<ffffffff804bdb76>] riscv_intc_aia_irq+0x24/0x34 [ 3.459103] [<ffffffff809d0452>] handle_riscv_irq+0x2e/0x4c [ 3.459133] [<ffffffff809d923e>] call_on_irq_stack+0x32/0x40 So only flush the local TLB and let the lazy kfence page fault handling deal with the faults which could happen when a core has an old protected pte version cached in its TLB. That leads to potential inaccuracies which can be tolerated when using kfence. Fixes: `47513f243b` ("riscv: Enable KFENCE for riscv64") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20241209074125.52322-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-12-11 11:44:03 -08:00
Alexandre Ghiti	c796e18720	riscv: Fix wrong usage of __pa() on a fixmap address riscv uses fixmap addresses to map the dtb so we can't use __pa() which is reserved for linear mapping addresses. Fixes: `b2473a3597` ("of/fdt: add dt_phys arg to early_init_dt_scan and early_init_dt_verify") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20241209074508.53037-1-alexghiti@rivosinc.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-12-11 11:43:44 -08:00
Guo Ren	b3134b8c1a	riscv: Fixup boot failure when CONFIG_DEBUG_RT_MUTEXES=y When CONFIG_DEBUG_RT_MUTEXES=y, mutex_lock->rt_mutex_try_acquire would change from rt_mutex_cmpxchg_acquire to rt_mutex_slowtrylock(): raw_spin_lock_irqsave(&lock->wait_lock, flags); ret = __rt_mutex_slowtrylock(lock); raw_spin_unlock_irqrestore(&lock->wait_lock, flags); Because queued_spin_#ops to ticket_#ops is changed one by one by jump_label, raw_spin_lock/unlock would cause a deadlock during the changing. That means in arch/riscv/kernel/jump_label.c: 1. arch_jump_label_transform_queue() -> mutex_lock(&text_mutex); +-> raw_spin_lock -> queued_spin_lock \|-> raw_spin_unlock -> queued_spin_unlock patch_insn_write -> change the raw_spin_lock to ticket_lock mutex_unlock(&text_mutex); ... 2. /* Dirty the lock value / arch_jump_label_transform_queue() -> mutex_lock(&text_mutex); +-> raw_spin_lock -> ticket_lock* \|-> raw_spin_unlock -> queued_spin_unlock /* BUG: ticket_lock with queued_spin_unlock / patch_insn_write -> change the raw_spin_unlock to ticket_unlock mutex_unlock(&text_mutex); ... 3. / Dead lock / arch_jump_label_transform_queue() -> mutex_lock(&text_mutex); +-> raw_spin_lock -> ticket_lock / deadlock! */ \|-> raw_spin_unlock -> ticket_unlock patch_insn_write -> change other raw_spin_#op -> ticket_#op mutex_unlock(&text_mutex); So, the solution is to disable mutex usage of arch_jump_label_transform_queue() during early_boot_irqs_disabled, just like we have done for stop_machine. Reported-by: Conor Dooley <conor@kernel.org> Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Fixes: `ab83647fad` ("riscv: Add qspinlock support") Link: https://lore.kernel.org/linux-riscv/CAJF2gTQwYTGinBmCSgVUoPv0_q4EPt_+WiyfUA1HViAKgUzxAg@mail.gmail.com/T/#mf488e6347817fca03bb93a7d34df33d8615b3775 Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Tested-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com> Tested-by: Nam Cao <namcao@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20241130153310.3349484-1-guoren@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-12-11 11:43:39 -08:00
Shengjiu Wang	bb76e82bfe	ASoC: fsl_spdif: change IFACE_PCM to IFACE_MIXER As the snd_soc_card_get_kcontrol() is updated to use snd_ctl_find_id_mixer() in commit `897cc72b08` ("ASoC: soc-card: Use snd_ctl_find_id_mixer() instead of open-coding") which make the iface fix to be IFACE_MIXER. Fixes: `897cc72b08` ("ASoC: soc-card: Use snd_ctl_find_id_mixer() instead of open-coding") Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com> Link: https://patch.msgid.link/20241126053254.3657344-3-shengjiu.wang@nxp.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-12-11 18:45:27 +00:00
Shengjiu Wang	7c17f7780a	ASoC: fsl_xcvr: change IFACE_PCM to IFACE_MIXER As the snd_soc_card_get_kcontrol() is updated to use snd_ctl_find_id_mixer() in commit `897cc72b08` ("ASoC: soc-card: Use snd_ctl_find_id_mixer() instead of open-coding") which make the iface fix to be IFACE_MIXER. Fixes: `897cc72b08` ("ASoC: soc-card: Use snd_ctl_find_id_mixer() instead of open-coding") Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com> Link: https://patch.msgid.link/20241126053254.3657344-2-shengjiu.wang@nxp.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-12-11 18:45:26 +00:00
James Clark	f7e36d02d7	libperf: evlist: Fix --cpu argument on hybrid platform Since the linked fixes: commit, specifying a CPU on hybrid platforms results in an error because Perf tries to open an extended type event on "any" CPU which isn't valid. Extended type events can only be opened on CPUs that match the type. Before (working): $ perf record --cpu 1 -- true [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 2.385 MB perf.data (7 samples) ] After (not working): $ perf record -C 1 -- true WARNING: A requested CPU in '1' is not supported by PMU 'cpu_atom' (CPUs 16-27) for event 'cycles:P' Error: The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu_atom/cycles:P/). /bin/dmesg \| grep -i perf may provide additional information. (Ignore the warning message, that's expected and not particularly relevant to this issue). This is because perf_cpu_map__intersect() of the user specified CPU (1) and one of the PMU's CPUs (16-27) correctly results in an empty (NULL) CPU map. However for the purposes of opening an event, libperf converts empty CPU maps into an any CPU (-1) which the kernel rejects. Fix it by deleting evsels with empty CPU maps in the specific case where user requested CPU maps are evaluated. Fixes: `251aa04024` ("perf parse-events: Wildcard most "numeric" events") Reviewed-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Signed-off-by: James Clark <james.clark@linaro.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/r/20241114160450.295844-2-james.clark@linaro.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2024-12-11 09:19:44 -08:00
Ian Rogers	a93a620c38	perf test expr: Fix system_tsc_freq for only x86 The refactoring of tool PMU events to have a PMU then adding the expr literals to the tool PMU made it so that the literal system_tsc_freq was only supported on x86. Update the test expectations to match - namely the parsing is x86 specific and only yields a non-zero value on Intel. Fixes: `609aa2667f` ("perf tool_pmu: Switch to standard pmu functions and json descriptions") Reported-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Closes: https://lore.kernel.org/linux-perf-users/20241022140156.98854-1-atrajeev@linux.vnet.ibm.com/ Co-developed-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@linaro.org> Cc: akanksha@linux.ibm.com Cc: hbathini@linux.ibm.com Cc: kjain@linux.ibm.com Cc: maddy@linux.ibm.com Cc: disgoel@linux.vnet.ibm.com Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20241205022305.158202-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2024-12-11 09:19:44 -08:00
Hari Bathini	777f290ab3	selftests/ftrace: adjust offset for kprobe syntax error test In 'NOFENTRY_ARGS' test case for syntax check, any offset X of `vfs_read+X` except function entry offset (0) fits the criterion, even if that offset is not at instruction boundary, as the parser comes before probing. But with "ENDBR64" instruction on x86, offset 4 is treated as function entry. So, X can't be 4 as well. Thus, 8 was used as offset for the test case. On 64-bit powerpc though, any offset <= 16 can be considered function entry depending on build configuration (see arch_kprobe_on_func_entry() for implementation details). So, use `vfs_read+20` to accommodate that scenario too. Link: https://lore.kernel.org/r/20241129202621.721159-1-hbathini@linux.ibm.com Fixes: `4231f30fcc` ("selftests/ftrace: Add BTF arguments test cases") Suggested-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2024-12-11 10:08:04 -07:00
Alexander Gordeev	5fa49dd8e5	s390/ipl: Fix never less than zero warning DEFINE_IPL_ATTR_STR_RW() macro produces "unsigned 'len' is never less than zero." warning when sys_vmcmd_on_*_store() callbacks are defined. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202412081614.5uel8F6W-lkp@intel.com/ Fixes: `247576bf62` ("s390/ipl: Do not accept z/VM CP diag X'008' cmds longer than max length") Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>	2024-12-11 18:05:49 +01:00
Michal Luczaj	3e643e4efa	Bluetooth: Improve setsockopt() handling of malformed user input The bt_copy_from_sockptr() return value is being misinterpreted by most users: a non-zero result is mistakenly assumed to represent an error code, but actually indicates the number of bytes that could not be copied. Remove bt_copy_from_sockptr() and adapt callers to use copy_safe_from_sockptr(). For sco_sock_setsockopt() (case BT_CODEC) use copy_struct_from_sockptr() to scrub parts of uninitialized buffer. Opportunistically, rename `len` to `optlen` in hci_sock_setsockopt_old() and hci_sock_setsockopt(). Fixes: `51eda36d33` ("Bluetooth: SCO: Fix not validating setsockopt user input") Fixes: `a97de7bff1` ("Bluetooth: RFCOMM: Fix not validating setsockopt user input") Fixes: `4f3951242a` ("Bluetooth: L2CAP: Fix not validating setsockopt user input") Fixes: `9e8742cdfc` ("Bluetooth: ISO: Fix not validating setsockopt user input") Fixes: `b2186061d6` ("Bluetooth: hci_sock: Fix not validating setsockopt user input") Reviewed-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Reviewed-by: David Wei <dw@davidwei.uk> Signed-off-by: Michal Luczaj <mhal@rbox.co> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2024-12-11 11:54:57 -05:00
Nikita Zhandarovich	2dd59fe0e1	media: dvb-frontends: dib3000mb: fix uninit-value in dib3000_write_reg Syzbot reports [1] an uninitialized value issue found by KMSAN in dib3000_read_reg(). Local u8 rb[2] is used in i2c_transfer() as a read buffer; in case that call fails, the buffer may end up with some undefined values. Since no elaborate error handling is expected in dib3000_write_reg(), simply zero out rb buffer to mitigate the problem. [1] Syzkaller report dvb-usb: bulk message failed: -22 (6/0) ===================================================== BUG: KMSAN: uninit-value in dib3000mb_attach+0x2d8/0x3c0 drivers/media/dvb-frontends/dib3000mb.c:758 dib3000mb_attach+0x2d8/0x3c0 drivers/media/dvb-frontends/dib3000mb.c:758 dibusb_dib3000mb_frontend_attach+0x155/0x2f0 drivers/media/usb/dvb-usb/dibusb-mb.c:31 dvb_usb_adapter_frontend_init+0xed/0x9a0 drivers/media/usb/dvb-usb/dvb-usb-dvb.c:290 dvb_usb_adapter_init drivers/media/usb/dvb-usb/dvb-usb-init.c:90 [inline] dvb_usb_init drivers/media/usb/dvb-usb/dvb-usb-init.c:186 [inline] dvb_usb_device_init+0x25a8/0x3760 drivers/media/usb/dvb-usb/dvb-usb-init.c:310 dibusb_probe+0x46/0x250 drivers/media/usb/dvb-usb/dibusb-mb.c:110 ... Local variable rb created at: dib3000_read_reg+0x86/0x4e0 drivers/media/dvb-frontends/dib3000mb.c:54 dib3000mb_attach+0x123/0x3c0 drivers/media/dvb-frontends/dib3000mb.c:758 ... Fixes: `74340b0a8b` ("V4L/DVB (4457): Remove dib3000-common-module") Reported-by: syzbot+c88fc0ebe0d5935c70da@syzkaller.appspotmail.com Signed-off-by: Nikita Zhandarovich <n.zhandarovich@fintech.ru> Link: https://lore.kernel.org/r/20240517155800.9881-1-n.zhandarovich@fintech.ru Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>	2024-12-11 17:54:19 +01:00
Akira Yokosawa	d933949672	scripts/kernel-doc: Get -export option working again Since commit `cdd30ebb1b` ("module: Convert symbol namespace to string literal"), exported symbols marked by EXPORT_SYMBOL_NS(_GPL) are ignored by "kernel-doc -export" in fresh build of "make htmldocs". This is because regex in the perl script for those markers fails to match the new signatures: - EXPORT_SYMBOL_NS(symbol, "ns"); - EXPORT_SYMBOL_NS_GPL(symbol, "ns"); Update the regex so that it matches quoted string. Note: Escape sequence of \w is good for C identifiers, but can be too strict for quoted strings. Instead, use \S, which matches any non-whitespace character, for compatibility with possible extension of namespace convention in the future [1]. Fixes: `cdd30ebb1b` ("module: Convert symbol namespace to string literal") Link: https://lore.kernel.org/CAK7LNATMufXP0EA6QUE9hBkZMa6vJO6ZiaYuak2AhOrd2nSVKQ@mail.gmail.com/ [1] Signed-off-by: Akira Yokosawa <akiyks@gmail.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/e5c43f36-45cd-49f4-b7b8-ff342df3c7a4@gmail.com	2024-12-11 09:15:26 -07:00
Waiman Long	9b496a8bbe	cgroup/cpuset: Prevent leakage of isolated CPUs into sched domains Isolated CPUs are not allowed to be used in a non-isolated partition. The only exception is the top cpuset which is allowed to contain boot time isolated CPUs. Commit `ccac8e8de9` ("cgroup/cpuset: Fix remote root partition creation problem") introduces a simplified scheme of including only partition roots in sched domain generation. However, it does not properly account for this exception case. This can result in leakage of isolated CPUs into a sched domain. Fix it by making sure that isolated CPUs are excluded from the top cpuset before generating sched domains. Also update the way the boot time isolated CPUs are handled in test_cpuset_prs.sh to make sure that those isolated CPUs are really isolated instead of just skipping them in the tests. Fixes: `ccac8e8de9` ("cgroup/cpuset: Fix remote root partition creation problem") Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2024-12-11 05:45:52 -10:00
Changwoo Min	48ca421268	MAINTAINERS: add me as reviewer for sched_ext Add me as a reviewer for sched_ext. I have been actively working on the project and would like to help review patches and address related kernel issues/features. Signed-off-by: Changwoo Min <changwoo@igalia.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2024-12-11 05:28:17 -10:00
Xi Pardee	83848e37f6	platform/x86/intel/vsec: Add support for Panther Lake Add Panther Lake PMT telemetry support. Signed-off-by: Xi Pardee <xi.pardee@linux.intel.com> Link: https://lore.kernel.org/r/20241210212646.239211-1-xi.pardee@linux.intel.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>	2024-12-11 16:00:56 +02:00
Jithu Joseph	6c0a473fc5	platform/x86/intel/ifs: Add Clearwater Forest to CPU support list Add Clearwater Forest (INTEL_ATOM_DARKMONT_X) to the x86 match table of Intel In Field Scan (IFS) driver, enabling IFS functionality on this processor. Signed-off-by: Jithu Joseph <jithu.joseph@intel.com> Link: https://lore.kernel.org/r/20241210203152.1136463-1-jithu.joseph@intel.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>	2024-12-11 16:00:36 +02:00
Huy Minh	220326c465	platform/x86: touchscreen_dmi: Add info for SARY Tab 3 tablet There's no info about the OEM behind the tablet, only online stores listing. This tablet uses an Intel Atom x5-Z8300, 4GB of RAM & 64GB of storage. Signed-off-by: Huy Minh <buingoc67@gmail.com> Link: https://lore.kernel.org/r/20241210154500.32124-1-buingoc67@gmail.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>	2024-12-11 15:58:35 +02:00
Shenghao Ding	2aa13da97e	ASoC: tas2781: Fix calibration issue in stress test One specific test condition: the default registers of p[j].reg ~ p[j+3].reg are 0, TASDEVICE_REG(0x00, 0x14, 0x38)(PLT_FLAG_REG), TASDEVICE_REG(0x00, 0x14, 0x40)(SINEGAIN_REG), and TASDEVICE_REG(0x00, 0x14, 0x44)(SINEGAIN2_REG). After first calibration, they are freshed to TASDEVICE_REG(0x00, 0x1a, 0x20), TASDEVICE_REG(0x00, 0x16, 0x58)(PLT_FLAG_REG), TASDEVICE_REG(0x00, 0x14, 0x44)(SINEGAIN_REG), and TASDEVICE_REG(0x00, 0x16, 0x64)(SINEGAIN2_REG) via "Calibration Start" kcontrol. In second calibration, the p[j].reg ~ p[j+3].reg have already become tas2781_cali_start_reg. However, p[j+2].reg, TASDEVICE_REG(0x00, 0x14, 0x44)(SINEGAIN_REG), will be freshed to TASDEVICE_REG(0x00, 0x16, 0x64), which is the third register in the input params of the kcontrol. This is why only first calibration can work, the second-time, third-time or more-time calibration always failed without reboot. Of course, if no p[j].reg is in the list of tas2781_cali_start_reg, this stress test can work well. Fixes: `49e2e353fb` ("ASoC: tas2781: Add Calibration Kcontrols for Chromebook") Signed-off-by: Shenghao Ding <shenghao-ding@ti.com> Link: https://patch.msgid.link/20241211043859.1328-1-shenghao-ding@ti.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-12-11 13:27:45 +00:00
Christian Brauner	867f85679c	Merge patch series "iomap: fix zero padding data issue in concurrent append writes" Long Li <leo.lilong@huawei.com> says: This patch series fixes zero padding data issues in concurrent append write scenarios. A detailed problem description and solution can be found in patch 2. Patch 1 is introduced as preparation for the fix in patch 2, eliminating the need to resample inode size for io_size trimming and avoiding issues caused by inode size changes during concurrent writeback and truncate operations. * patches from https://lore.kernel.org/r/20241209114241.3725722-1-leo.lilong@huawei.com: iomap: fix zero padding data issue in concurrent append writes iomap: pass byte granular end position to iomap_add_to_ioend Link: https://lore.kernel.org/r/20241209114241.3725722-1-leo.lilong@huawei.com Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-12-11 11:09:16 +01:00
Long Li	51d20d1dac	iomap: fix zero padding data issue in concurrent append writes During concurrent append writes to XFS filesystem, zero padding data may appear in the file after power failure. This happens due to imprecise disk size updates when handling write completion. Consider this scenario with concurrent append writes same file: Thread 1: Thread 2: ------------ ----------- write [A, A+B] update inode size to A+B submit I/O [A, A+BS] write [A+B, A+B+C] update inode size to A+B+C <I/O completes, updates disk size to min(A+B+C, A+BS)> <power failure> After reboot: 1) with A+B+C < A+BS, the file has zero padding in range [A+B, A+B+C] \|< Block Size (BS) >\| \|DDDDDDDDDDDDDDDD0000000000000000\| ^ ^ ^ A A+B A+B+C (EOF) 2) with A+B+C > A+BS, the file has zero padding in range [A+B, A+BS] \|< Block Size (BS) >\|< Block Size (BS) >\| \|DDDDDDDDDDDDDDDD0000000000000000\|00000000000000000000000000000000\| ^ ^ ^ ^ A A+B A+BS A+B+C (EOF) D = Valid Data 0 = Zero Padding The issue stems from disk size being set to min(io_offset + io_size, inode->i_size) at I/O completion. Since io_offset+io_size is block size granularity, it may exceed the actual valid file data size. In the case of concurrent append writes, inode->i_size may be larger than the actual range of valid file data written to disk, leading to inaccurate disk size updates. This patch modifies the meaning of io_size to represent the size of valid data within EOF in an ioend. If the ioend spans beyond i_size, io_size will be trimmed to provide the file with more accurate size information. This is particularly useful for on-disk size updates at completion time. After this change, ioends that span i_size will not grow or merge with other ioends in concurrent scenarios. However, these cases that need growth/merging rarely occur and it seems no noticeable performance impact. Although rounding up io_size could enable ioend growth/merging in these scenarios, we decided to keep the code simple after discussion [1]. Another benefit is that it makes the xfs_ioend_is_append() check more accurate, which can reduce unnecessary end bio callbacks of xfs_end_bio() in certain scenarios, such as repeated writes at the file tail without extending the file size. Link [1]: https://patchwork.kernel.org/project/xfs/patch/20241113091907.56937-1-leo.lilong@huawei.com Fixes: `ae259a9c85` ("fs: introduce iomap infrastructure") # goes further back than this Signed-off-by: Long Li <leo.lilong@huawei.com> Link: https://lore.kernel.org/r/20241209114241.3725722-3-leo.lilong@huawei.com Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-12-11 11:09:05 +01:00
Long Li	b44679c63e	iomap: pass byte granular end position to iomap_add_to_ioend This is a preparatory patch for fixing zero padding issues in concurrent append write scenarios. In the following patches, we need to obtain byte-granular writeback end position for io_size trimming after EOF handling. Due to concurrent writeback and truncate operations, inode size may shrink. Resampling inode size would force writeback code to handle the newly appeared post-EOF blocks, which is undesirable. As Dave explained in [1]: "Really, the issue is that writeback mappings have to be able to handle the range being mapped suddenly appear to be beyond EOF. This behaviour is a longstanding writeback constraint, and is what iomap_writepage_handle_eof() is attempting to handle. We handle this by only sampling i_size_read() whilst we have the folio locked and can determine the action we should take with that folio (i.e. nothing, partial zeroing, or skip altogether). Once we've made the decision that the folio is within EOF and taken action on it (i.e. moved the folio to writeback state), we cannot then resample the inode size because a truncate may have started and changed the inode size." To avoid resampling inode size after EOF handling, we convert end_pos to byte-granular writeback position and return it from EOF handling function. Since iomap_set_range_dirty() can handle unaligned lengths, this conversion has no impact on it. However, iomap_find_dirty_range() requires aligned start and end range to find dirty blocks within the given range, so the end position needs to be rounded up when passed to it. LINK [1]: https://lore.kernel.org/linux-xfs/Z1Gg0pAa54MoeYME@localhost.localdomain/ Signed-off-by: Long Li <leo.lilong@huawei.com> Link: https://lore.kernel.org/r/20241209114241.3725722-2-leo.lilong@huawei.com Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-12-11 11:09:02 +01:00
Daniele Palmas	8366e64a44	USB: serial: option: add Telit FE910C04 rmnet compositions Add the following Telit FE910C04 compositions: 0x10c0: rmnet + tty (AT/NMEA) + tty (AT) + tty (diag) T: Bus=02 Lev=01 Prnt=03 Port=06 Cnt=01 Dev#= 13 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=10c0 Rev=05.15 S: Manufacturer=Telit Cinterion S: Product=FE910 S: SerialNumber=f71b8b32 C: #Ifs= 4 Cfg#= 1 Atr=e0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=82(I) Atr=03(Int.) MxPS= 8 Ivl=32ms I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=60 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=84(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=86(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms 0x10c4: rmnet + tty (AT) + tty (AT) + tty (diag) T: Bus=02 Lev=01 Prnt=03 Port=06 Cnt=01 Dev#= 14 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=10c4 Rev=05.15 S: Manufacturer=Telit Cinterion S: Product=FE910 S: SerialNumber=f71b8b32 C: #Ifs= 4 Cfg#= 1 Atr=e0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=82(I) Atr=03(Int.) MxPS= 8 Ivl=32ms I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=84(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=86(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms 0x10c8: rmnet + tty (AT) + tty (diag) + DPL (data packet logging) + adb T: Bus=02 Lev=01 Prnt=03 Port=06 Cnt=01 Dev#= 17 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=10c8 Rev=05.15 S: Manufacturer=Telit Cinterion S: Product=FE910 S: SerialNumber=f71b8b32 C: #Ifs= 5 Cfg#= 1 Atr=e0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=82(I) Atr=03(Int.) MxPS= 8 Ivl=32ms I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=84(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 3 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=80 Driver=(none) E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none) E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms Signed-off-by: Daniele Palmas <dnlplm@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold <johan@kernel.org>	2024-12-11 10:37:28 +01:00
Jack Wu	f07dfa6a1b	USB: serial: option: add MediaTek T7XX compositions Add the MediaTek T7XX compositions: T: Bus=03 Lev=01 Prnt=01 Port=05 Cnt=01 Dev#= 74 Spd=480 MxCh= 0 D: Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=0e8d ProdID=7129 Rev= 0.01 S: Manufacturer=MediaTek Inc. S: Product=USB DATA CARD S: SerialNumber=004402459035402 C:* #Ifs=10 Cfg#= 1 Atr=a0 MxPwr=500mA A: FirstIf#= 0 IfCount= 2 Cls=02(comm.) Sub=0e Prot=00 I:* If#= 0 Alt= 0 #EPs= 1 Cls=02(comm.) Sub=0e Prot=00 Driver=cdc_mbim E: Ad=82(I) Atr=03(Int.) MxPS= 64 Ivl=32ms I: If#= 1 Alt= 0 #EPs= 0 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim I:* If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none) E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 6 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=06(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 7 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=88(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=07(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 8 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=89(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=08(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 9 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=8a(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=09(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms ------------------------------- \| If Number \| Function \| ------------------------------- \| 2 \| USB AP Log Port \| ------------------------------- \| 3 \| USB AP GNSS Port\| ------------------------------- \| 4 \| USB AP META Port\| ------------------------------- \| 5 \| ADB port \| ------------------------------- \| 6 \| USB MD AT Port \| ------------------------------ \| 7 \| USB MD META Port\| ------------------------------- \| 8 \| USB NTZ Port \| ------------------------------- \| 9 \| USB Debug port \| ------------------------------- Signed-off-by: Jack Wu <wojackbb@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold <johan@kernel.org>	2024-12-11 10:26:51 +01:00
Mank Wang	aa954ae082	USB: serial: option: add Netprisma LCUK54 modules for WWAN Ready LCUK54-WRD's pid/vid 0x3731/0x010a 0x3731/0x010c LCUK54-WWD's pid/vid 0x3731/0x010b 0x3731/0x010d Above products use the exact same interface layout and option driver: MBIM + GNSS + DIAG + NMEA + AT + QDSS + DPL T: Bus=01 Lev=01 Prnt=01 Port=01 Cnt=02 Dev#= 5 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=3731 ProdID=0101 Rev= 5.04 S: Manufacturer=NetPrisma S: Product=LCUK54-WRD S: SerialNumber=feeba631 C:* #Ifs= 8 Cfg#= 1 Atr=a0 MxPwr=500mA A: FirstIf#= 0 IfCount= 2 Cls=02(comm.) Sub=0e Prot=00 I:* If#= 0 Alt= 0 #EPs= 1 Cls=02(comm.) Sub=0e Prot=00 Driver=cdc_mbim E: Ad=81(I) Atr=03(Int.) MxPS= 64 Ivl=32ms I: If#= 1 Alt= 0 #EPs= 0 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim I:* If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim E: Ad=8e(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=0f(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 2 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none) E: Ad=82(I) Atr=03(Int.) MxPS= 64 Ivl=32ms I:* If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=40 Driver=option E: Ad=85(I) Atr=03(Int.) MxPS= 10 Ivl=32ms E: Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 5 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=87(I) Atr=03(Int.) MxPS= 10 Ivl=32ms E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 6 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=70 Driver=(none) E: Ad=88(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 7 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=80 Driver=(none) E: Ad=8f(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms Signed-off-by: Mank Wang <mank.wang@netprisma.com> [ johan: use lower case hex notation ] Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold <johan@kernel.org>	2024-12-11 10:26:18 +01:00
Michal Hrusecky	724d461e44	USB: serial: option: add MeiG Smart SLM770A Update the USB serial option driver to support MeiG Smart SLM770A. ID 2dee:4d57 Marvell Mobile Composite Device Bus T: Bus=02 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=2dee ProdID=4d57 Rev= 1.00 S: Manufacturer=Marvell S: Product=Mobile Composite Device Bus C:* #Ifs= 6 Cfg#= 1 Atr=c0 MxPwr=500mA A: FirstIf#= 0 IfCount= 2 Cls=e0(wlcon) Sub=01 Prot=03 I:* If#= 0 Alt= 0 #EPs= 1 Cls=e0(wlcon) Sub=01 Prot=03 Driver=rndis_host E: Ad=87(I) Atr=03(Int.) MxPS= 64 Ivl=4096ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=rndis_host E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=0c(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=0b(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=88(I) Atr=03(Int.) MxPS= 64 Ivl=4096ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=0a(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=89(I) Atr=03(Int.) MxPS= 64 Ivl=4096ms E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=0f(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=0e(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms Tested successfully connecting to the Internet via rndis interface after dialing via AT commands on If#=3 or If#=4. Not sure of the purpose of the other serial interfaces. Signed-off-by: Michal Hrusecky <michal.hrusecky@turris.com> Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold <johan@kernel.org>	2024-12-11 10:05:43 +01:00
Daniel Swanemar	fdad4fb7c5	USB: serial: option: add TCL IK512 MBIM & ECM Add the following TCL IK512 compositions: 0x0530: Modem + Diag + AT + MBIM T: Bus=04 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 3 Spd=10000 MxCh= 0 D: Ver= 3.20 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 9 #Cfgs= 1 P: Vendor=1bbb ProdID=0530 Rev=05.04 S: Manufacturer=TCL S: Product=TCL 5G USB Dongle S: SerialNumber=3136b91a C: #Ifs= 5 Cfg#= 1 Atr=80 MxPwr=896mA I: If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=01(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=82(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=84(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=85(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 3 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim E: Ad=86(I) Atr=03(Int.) MxPS= 64 Ivl=32ms I: If#= 4 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim E: Ad=0f(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=8e(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms 0x0640: ECM + Modem + Diag + AT T: Bus=04 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 4 Spd=10000 MxCh= 0 D: Ver= 3.20 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 9 #Cfgs= 1 P: Vendor=1bbb ProdID=0640 Rev=05.04 S: Manufacturer=TCL S: Product=TCL 5G USB Dongle S: SerialNumber=3136b91a C: #Ifs= 5 Cfg#= 1 Atr=80 MxPwr=896mA I: If#= 0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=06 Prot=00 Driver=cdc_ether E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=32ms I: If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=cdc_ether E: Ad=0f(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=8e(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=01(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=82(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=83(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=84(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms I: If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=86(I) Atr=03(Int.) MxPS= 10 Ivl=32ms Signed-off-by: Daniel Swanemar <d.swanemar@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold <johan@kernel.org>	2024-12-11 09:57:34 +01:00
Mario Limonciello	e34f1717ef	thunderbolt: Don't display nvm_version unless upgrade supported The read will never succeed if NVM wasn't initialized due to an unknown format. Add a new callback for visibility to only show when supported. Cc: stable@vger.kernel.org Fixes: `aef9c693e7` ("thunderbolt: Move vendor specific NVM handling into nvm.c") Reported-by: Richard Hughes <hughsient@gmail.com> Closes: https://github.com/fwupd/fwupd/issues/8200 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>	2024-12-11 09:11:51 +02:00
Costa Shulyupin	eb1dd15fb2	cgroup/cpuset: Remove stale text Task's cpuset pointer was removed by commit `8793d854ed` ("Task Control Groups: make cpusets a client of cgroups") Paragraph "The task_lock() exception ...." was removed by commit `2df167a300` ("cgroups: update comments in cpuset.c") Remove stale text: We also require taking task_lock() when dereferencing a task's cpuset pointer. See "The task_lock() exception", at the end of this comment. Accessing a task's cpuset should be done in accordance with the guidelines for accessing subsystem state in kernel/cgroup.c and reformat. Co-developed-by: Michal Koutný <mkoutny@suse.com> Co-developed-by: Waiman Long <longman@redhat.com> Signed-off-by: Costa Shulyupin <costa.shul@redhat.com> Acked-by: Waiman Long <longman@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2024-12-10 20:38:41 -10:00
Andrea Righi	2d2f25405a	MAINTAINERS: add self as reviewer for sched_ext Add myself as a reviewer for sched_ext, as I am actively working on this project and would like to help review relevant patches and address any related kernel issues. Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2024-12-10 20:36:05 -10:00
David Vernet	b8f614207b	scx: Fix maximal BPF selftest prog maximal.bpf.c is still dispatching to and consuming from SCX_DSQ_GLOBAL. Let's have it use its own DSQ to avoid any runtime errors. Signed-off-by: David Vernet <void@manifault.com> Tested-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2024-12-10 20:30:50 -10:00
Nikita Yushchenko	3dd002f200	net: renesas: rswitch: handle stop vs interrupt race Currently the stop routine of rswitch driver does not immediately prevent hardware from continuing to update descriptors and requesting interrupts. It can happen that when rswitch_stop() executes the masking of interrupts from the queues of the port being closed, napi poll for that port is already scheduled or running on a different CPU. When execution of this napi poll completes, it will unmask the interrupts. And unmasked interrupt can fire after rswitch_stop() returns from napi_disable() call. Then, the handler won't mask it, because napi_schedule_prep() will return false, and interrupt storm will happen. This can't be fixed by making rswitch_stop() call napi_disable() before masking interrupts. In this case, the interrupt storm will happen if interrupt fires between napi_disable() and masking. Fix this by checking for priv->opened_ports bit when unmasking interrupts after napi poll. For that to be consistent, move priv->opened_ports changes into spinlock-protected areas, and reorder other operations in rswitch_open() and rswitch_stop() accordingly. Signed-off-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com> Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Fixes: `3590918b5d` ("net: ethernet: renesas: Add support for "Ethernet Switch"") Link: https://patch.msgid.link/20241209113204.175015-1-nikita.yoush@cogentembedded.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-10 19:08:00 -08:00
Jakub Kicinski	93763e68f1	Merge branch 'net-renesas-rswitch-several-fixes' Nikita Yushchenko says: ==================== net: renesas: rswitch: several fixes This series fixes several glitches found in the rswitch driver. Repost of https://lore.kernel.org/20241206190015.4194153-1-nikita.yoush@cogentembedded.com ==================== Link: https://patch.msgid.link/20241208095004.69468-1-nikita.yoush@cogentembedded.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-10 19:02:49 -08:00
Nikita Yushchenko	66b7e9f85b	net: renesas: rswitch: avoid use-after-put for a device tree node The device tree node saved in the rswitch_device structure is used at several driver locations. So passing this node to of_node_put() after the first use is wrong. Move of_node_put() for this node to exit paths. Fixes: `b46f1e5793` ("net: renesas: rswitch: Simplify struct phy * handling") Signed-off-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com> Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Link: https://patch.msgid.link/20241208095004.69468-5-nikita.yoush@cogentembedded.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-10 19:02:47 -08:00
Nikita Yushchenko	bb617328ba	net: renesas: rswitch: fix leaked pointer on error path If error path is taken while filling descriptor for a frame, skb pointer is left in the entry. Later, on the ring entry reuse, the same entry could be used as a part of a multi-descriptor frame, and skb for that new frame could be stored in a different entry. Then, the stale pointer will reach the completion routine, and passed to the release operation. Fix that by clearing the saved skb pointer at the error path. Fixes: `d2c96b9d5f` ("net: rswitch: Add jumbo frames handling for TX") Signed-off-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com> Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Link: https://patch.msgid.link/20241208095004.69468-4-nikita.yoush@cogentembedded.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-10 19:02:47 -08:00
Nikita Yushchenko	0c9547e6cc	net: renesas: rswitch: fix race window between tx start and complete If hardware is already transmitting, it can start handling the descriptor being written to immediately after it observes updated DT field, before the queue is kicked by a write to GWTRC. If the start_xmit() execution is preempted at unfortunate moment, this transmission can complete, and interrupt handled, before gq->cur gets updated. With the current implementation of completion, this will cause the last entry not completed. Fix that by changing completion loop to check DT values directly, instead of depending on gq->cur. Fixes: `3590918b5d` ("net: ethernet: renesas: Add support for "Ethernet Switch"") Signed-off-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com> Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Link: https://patch.msgid.link/20241208095004.69468-3-nikita.yoush@cogentembedded.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-10 19:02:47 -08:00
Nikita Yushchenko	5cb099902b	net: renesas: rswitch: fix possible early skb release When sending frame split into multiple descriptors, hardware processes descriptors one by one, including writing back DT values. The first descriptor could be already marked as completed when processing of next descriptors for the same frame is still in progress. Although only the last descriptor is configured to generate interrupt, completion of the first descriptor could be noticed by the driver when handling interrupt for the previous frame. Currently, driver stores skb in the entry that corresponds to the first descriptor. This results into skb could be unmapped and freed when hardware did not complete the send yet. This opens a window for corrupting the data being sent. Fix this by saving skb in the entry that corresponds to the last descriptor used to send the frame. Fixes: `d2c96b9d5f` ("net: rswitch: Add jumbo frames handling for TX") Signed-off-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com> Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Link: https://patch.msgid.link/20241208095004.69468-2-nikita.yoush@cogentembedded.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-10 19:02:47 -08:00
Enzo Matsumiya	633609c48a	smb: client: destroy cfid_put_wq on module exit Fix potential problem in rmmod Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>	2024-12-10 20:47:39 -06:00
Thorsten Blum	8676c4dfae	cifs: Use str_yes_no() helper in cifs_ses_add_channel() Remove hard-coded strings by using the str_yes_no() helper function. Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Signed-off-by: Steve French <stfrench@microsoft.com>	2024-12-10 20:47:37 -06:00
David Howells	bb57c81e97	cifs: Fix rmdir failure due to ongoing I/O on deleted file The cifs_io_request struct (a wrapper around netfs_io_request) holds open the file on the server, even beyond the local Linux file being closed. This can cause problems with Windows-based filesystems as the file's name still exists after deletion until the file is closed, preventing the parent directory from being removed and causing spurious test failures in xfstests due to inability to remove a directory. The symptom looks something like this in the test output: rm: cannot remove '/mnt/scratch/test/p0/d3': Directory not empty rm: cannot remove '/mnt/scratch/test/p1/dc/dae': Directory not empty Fix this by waiting in unlink and rename for any outstanding I/O requests to be completed on the target file before removing that file. Note that this doesn't prevent Linux from trying to start new requests after deletion if it still has the file open locally - something that's perfectly acceptable on a UNIX system. Note also that whilst I've marked this as fixing the commit to make cifs use netfslib, I don't know that it won't occur before that. Fixes: `3ee1a1fc39` ("cifs: Cut over to using netfslib") Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>	2024-12-10 20:47:34 -06:00
Jakub Kicinski	d02af27fa2	Merge tag 'wireless-2024-12-10' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Johannes Berg says: ==================== A small set of fixes: - avoid CSA warnings during link removal (by changing link bitmap after remove) - fix # of spatial streams initialisation - fix queues getting stuck in some CSA cases and resume failures - fix interface address when switching monitor mode - fix MBSS change flags 32-bit stack corruption - more UBSAN __counted_by "fixes" ... - fix link ID netlink validation * tag 'wireless-2024-12-10' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: wifi: cfg80211: sme: init n_channels before channels[] access wifi: mac80211: fix station NSS capability initialization order wifi: mac80211: fix vif addr when switching from monitor to station wifi: mac80211: fix a queue stall in certain cases of CSA wifi: mac80211: wake the queues in case of failure in resume wifi: cfg80211: clear link ID from bitmap during link delete after clean up wifi: mac80211: init cnt before accessing elem in ieee80211_copy_mbssid_beacon wifi: mac80211: fix mbss changed flags corruption on 32 bit systems wifi: nl80211: fix NL80211_ATTR_MLO_LINK_ID off-by-one ==================== Link: https://patch.msgid.link/20241210130145.28618-3-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-10 18:44:25 -08:00
MoYuanhao	06d64ab46f	tcp: check space before adding MPTCP SYN options Ensure there is enough space before adding MPTCP options in tcp_syn_options(). Without this check, 'remaining' could underflow, and causes issues. If there is not enough space, MPTCP should not be used. Signed-off-by: MoYuanhao <moyuanhao3676@163.com> Fixes: `cec37a6e41` ("mptcp: Handle MP_CAPABLE options for outgoing connections") Cc: stable@vger.kernel.org Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> [ Matt: Add Fixes, cc Stable, update Description ] Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241209-net-mptcp-check-space-syn-v1-1-2da992bb6f74@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-10 18:26:52 -08:00
Petr Machata	bbe4b41259	Documentation: networking: Add a caveat to nexthop_compat_mode sysctl net.ipv4.nexthop_compat_mode was added when nexthop objects were added to provide the view of nexthop objects through the usual lens of the route UAPI. As nexthop objects evolved, the information provided through this lens became incomplete. For example, details of resilient nexthop groups are obviously omitted. Now that 16-bit nexthop group weights are a thing, the 8-bit UAPI cannot convey the >8-bit weight accurately. Instead of inventing workarounds for an obsolete interface, just document the expectations of inaccuracy. Fixes: `b72a6a7ab9` ("net: nexthop: Increase weight to u16") Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/b575e32399ccacd09079b2a218255164535123bd.1733740749.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-10 18:26:24 -08:00
Michael Chan	24c6843b73	bnxt_en: Fix aggregation ID mask to prevent oops on 5760X chips The 5760X (P7) chip's HW GRO/LRO interface is very similar to that of the previous generation (5750X or P5). However, the aggregation ID fields in the completion structures on P7 have been redefined from 16 bits to 12 bits. The freed up 4 bits are redefined for part of the metadata such as the VLAN ID. The aggregation ID mask was not modified when adding support for P7 chips. Including the extra 4 bits for the aggregation ID can potentially cause the driver to store or fetch the packet header of GRO/LRO packets in the wrong TPA buffer. It may hit the BUG() condition in __skb_pull() because the SKB contains no valid packet header: kernel BUG at include/linux/skbuff.h:2766! Oops: invalid opcode: 0000 1 PREEMPT SMP NOPTI CPU: 4 UID: 0 PID: 0 Comm: swapper/4 Kdump: loaded Tainted: G OE 6.12.0-rc2+ #7 Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE Hardware name: Dell Inc. PowerEdge R760/0VRV9X, BIOS 1.0.1 12/27/2022 RIP: 0010:eth_type_trans+0xda/0x140 Code: 80 00 00 00 eb c1 8b 47 70 2b 47 74 48 8b 97 d0 00 00 00 83 f8 01 7e 1b 48 85 d2 74 06 66 83 3a ff 74 09 b8 00 04 00 00 eb a5 <0f> 0b b8 00 01 00 00 eb 9c 48 85 ff 74 eb 31 f6 b9 02 00 00 00 48 RSP: 0018:ff615003803fcc28 EFLAGS: 00010283 RAX: 00000000000022d2 RBX: 0000000000000003 RCX: ff2e8c25da334040 RDX: 0000000000000040 RSI: ff2e8c25c1ce8000 RDI: ff2e8c25869f9000 RBP: ff2e8c258c31c000 R08: ff2e8c25da334000 R09: 0000000000000001 R10: ff2e8c25da3342c0 R11: ff2e8c25c1ce89c0 R12: ff2e8c258e0990b0 R13: ff2e8c25bb120000 R14: ff2e8c25c1ce89c0 R15: ff2e8c25869f9000 FS: 0000000000000000(0000) GS:ff2e8c34be300000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055f05317e4c8 CR3: 000000108bac6006 CR4: 0000000000773ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <IRQ> ? die+0x33/0x90 ? do_trap+0xd9/0x100 ? eth_type_trans+0xda/0x140 ? do_error_trap+0x65/0x80 ? eth_type_trans+0xda/0x140 ? exc_invalid_op+0x4e/0x70 ? eth_type_trans+0xda/0x140 ? asm_exc_invalid_op+0x16/0x20 ? eth_type_trans+0xda/0x140 bnxt_tpa_end+0x10b/0x6b0 [bnxt_en] ? bnxt_tpa_start+0x195/0x320 [bnxt_en] bnxt_rx_pkt+0x902/0xd90 [bnxt_en] ? __bnxt_tx_int.constprop.0+0x89/0x300 [bnxt_en] ? kmem_cache_free+0x343/0x440 ? __bnxt_tx_int.constprop.0+0x24f/0x300 [bnxt_en] __bnxt_poll_work+0x193/0x370 [bnxt_en] bnxt_poll_p5+0x9a/0x300 [bnxt_en] ? try_to_wake_up+0x209/0x670 __napi_poll+0x29/0x1b0 Fix it by redefining the aggregation ID mask for P5_PLUS chips to be 12 bits. This will work because the maximum aggregation ID is less than 4096 on all P5_PLUS chips. Fixes: `13d2d3d381` ("bnxt_en: Add new P7 hardware interface definitions") Reviewed-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241209015448.1937766-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-10 18:25:38 -08:00
Linus Torvalds	f92f474986	Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fixes from Stephen Boyd: "Two reverts and two EN7581 driver fixes: - Revert the attempt to make CLK_GET_RATE_NOCACHE flag work in clk_set_rate() because it led to problems with the Qualcomm CPUFreq driver - Revert Amlogic reset driver back to the initial implementation. This broke probe of the audio subsystem on axg based platforms and also had compilation problems. We'll try again next time. - Fix a clk frequency and fix array bounds runtime checks in the Airoha EN7581 driver" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: en7523: Initialize num before accessing hws in en7523_register_clocks() clk: en7523: Fix wrong BUS clock for EN7581 clk: amlogic: axg-audio: revert reset implementation Revert "clk: Fix invalid execution of clk_set_rate"	2024-12-10 18:21:40 -08:00
Linus Torvalds	5a087a6b17	Merge tag 'for-6.13-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "A few more fixes. Apart from the one liners and updated bio splitting error handling there's a fix for subvolume mount with different flags. This was known and fixed for some time but I've delayed it to give it more testing. - fix unbalanced locking when swapfile activation fails when the subvolume gets deleted in the meantime - add btrfs error handling after bio_split() calls that got error handling recently - during unmount, flush delalloc workers at the right time before the cleaner thread is shut down - fix regression in buffered write folio conversion, explicitly wait for writeback as FGP_STABLE flag is currently a no-op on btrfs - handle race in subvolume mount with different flags, the conversion to the new mount API did not handle the case where multiple subvolumes get mounted in parallel, which is a distro use case" * tag 'for-6.13-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: flush delalloc workers queue before stopping cleaner kthread during unmount btrfs: handle bio_split() errors btrfs: properly wait for writeback before buffered write btrfs: fix missing snapshot drew unlock when root is dead during swap activation btrfs: fix mount failure due to remount races	2024-12-10 18:18:01 -08:00
Linus Torvalds	1594c49394	Merge tag 'probes-fixes-v6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull eprobes fix from Masami Hiramatsu: - release eprobe when failing to add dyn_event. This unregisters event call and release eprobe when it fails to add a dynamic event. Found in cleaning up. * tag 'probes-fixes-v6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing/eprobe: Fix to release eprobe when failed to add dyn_event	2024-12-10 18:15:25 -08:00
Namjae Jeon	21e46a79bb	ksmbd: set ATTR_CTIME flags when setting mtime David reported that the new warning from setattr_copy_mgtime is coming like the following. [ 113.215316] ------------[ cut here ]------------ [ 113.215974] WARNING: CPU: 1 PID: 31 at fs/attr.c:300 setattr_copy+0x1ee/0x200 [ 113.219192] CPU: 1 UID: 0 PID: 31 Comm: kworker/1:1 Not tainted 6.13.0-rc1+ #234 [ 113.220127] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014 [ 113.221530] Workqueue: ksmbd-io handle_ksmbd_work [ksmbd] [ 113.222220] RIP: 0010:setattr_copy+0x1ee/0x200 [ 113.222833] Code: 24 28 49 8b 44 24 30 48 89 53 58 89 43 6c 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc 48 89 df e8 77 d6 ff ff e9 cd fe ff ff <0f> 0b e9 be fe ff ff 66 0 [ 113.225110] RSP: 0018:ffffaf218010fb68 EFLAGS: 00010202 [ 113.225765] RAX: 0000000000000120 RBX: ffffa446815f8568 RCX: 0000000000000003 [ 113.226667] RDX: ffffaf218010fd38 RSI: ffffa446815f8568 RDI: ffffffff94eb03a0 [ 113.227531] RBP: ffffaf218010fb90 R08: 0000001a251e217d R09: 00000000675259fa [ 113.228426] R10: 0000000002ba8a6d R11: ffffa4468196c7a8 R12: ffffaf218010fd38 [ 113.229304] R13: 0000000000000120 R14: ffffffff94eb03a0 R15: 0000000000000000 [ 113.230210] FS: 0000000000000000(0000) GS:ffffa44739d00000(0000) knlGS:0000000000000000 [ 113.231215] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 113.232055] CR2: 00007efe0053d27e CR3: 000000000331a000 CR4: 00000000000006b0 [ 113.232926] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 113.233812] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 113.234797] Call Trace: [ 113.235116] <TASK> [ 113.235393] ? __warn+0x73/0xd0 [ 113.235802] ? setattr_copy+0x1ee/0x200 [ 113.236299] ? report_bug+0xf3/0x1e0 [ 113.236757] ? handle_bug+0x4d/0x90 [ 113.237202] ? exc_invalid_op+0x13/0x60 [ 113.237689] ? asm_exc_invalid_op+0x16/0x20 [ 113.238185] ? setattr_copy+0x1ee/0x200 [ 113.238692] btrfs_setattr+0x80/0x820 [btrfs] [ 113.239285] ? get_stack_info_noinstr+0x12/0xf0 [ 113.239857] ? __module_address+0x22/0xa0 [ 113.240368] ? handle_ksmbd_work+0x6e/0x460 [ksmbd] [ 113.240993] ? __module_text_address+0x9/0x50 [ 113.241545] ? __module_address+0x22/0xa0 [ 113.242033] ? unwind_next_frame+0x10e/0x920 [ 113.242600] ? __pfx_stack_trace_consume_entry+0x10/0x10 [ 113.243268] notify_change+0x2c2/0x4e0 [ 113.243746] ? stack_depot_save_flags+0x27/0x730 [ 113.244339] ? set_file_basic_info+0x130/0x2b0 [ksmbd] [ 113.244993] set_file_basic_info+0x130/0x2b0 [ksmbd] [ 113.245613] ? process_scheduled_works+0xbe/0x310 [ 113.246181] ? worker_thread+0x100/0x240 [ 113.246696] ? kthread+0xc8/0x100 [ 113.247126] ? ret_from_fork+0x2b/0x40 [ 113.247606] ? ret_from_fork_asm+0x1a/0x30 [ 113.248132] smb2_set_info+0x63f/0xa70 [ksmbd] ksmbd is trying to set the atime and mtime via notify_change without also setting the ctime. so This patch add ATTR_CTIME flags when setting mtime to avoid a warning. Reported-by: David Disseldorp <ddiss@suse.de> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2024-12-10 17:48:06 -06:00
Namjae Jeon	b95629435b	ksmbd: fix racy issue from session lookup and expire Increment the session reference count within the lock for lookup to avoid racy issue with session expire. Cc: stable@vger.kernel.org Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-25737 Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2024-12-10 17:48:06 -06:00
Hobin Woo	2b904d61a9	ksmbd: retry iterate_dir in smb2_query_dir Some file systems do not ensure that the single call of iterate_dir reaches the end of the directory. For example, FUSE fetches entries from a daemon using 4KB buffer and stops fetching if entries exceed the buffer. And then an actor of caller, KSMBD, is used to fill the entries from the buffer. Thus, pattern searching on FUSE, files located after the 4KB could not be found and STATUS_NO_SUCH_FILE was returned. Signed-off-by: Hobin Woo <hobin.woo@samsung.com> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Reviewed-by: Namjae Jeon <linkinjeon@kernel.org> Tested-by: Yoonho Shin <yoonho.shin@samsung.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2024-12-10 17:48:06 -06:00
Danilo Krummrich	2872e21c47	MAINTAINERS: align Danilo's maintainer entries Some entries use my kernel.org address, while others use my Red Hat one. Since this is a bit of an inconvinience for me, align them to all use the same (kernel.org) address. Acked-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Danilo Krummrich <dakr@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20241204152248.8644-1-dakr@kernel.org	2024-12-10 23:56:37 +01:00
Huaisheng Ye	76467a9481	cxl/region: Fix region creation for greater than x2 switches The cxl_port_setup_targets() algorithm fails to identify valid target list ordering in the presence of 4-way and above switches resulting in 'cxl create-region' failures of the form: $ cxl create-region -d decoder0.0 -g 1024 -s 2G -t ram -w 8 -m mem4 mem1 mem6 mem3 mem2 mem5 mem7 mem0 cxl region: create_region: region0: failed to set target7 to mem0 cxl region: cmd_create_region: created 0 regions [kernel debug message] check_last_peer:1213: cxl region0: pci0000:0c:port1: cannot host mem6:decoder7.0 at 2 bus_remove_device:574: bus: 'cxl': remove device region0 QEMU can create this failing topology: ACPI0017:00 [root0] \| HB_0 [port1] / \ RP_0 RP_1 \| \| USP [port2] USP [port3] / / \ \ / / \ \ DSP DSP DSP DSP DSP DSP DSP DSP \| \| \| \| \| \| \| \| mem4 mem6 mem2 mem7 mem1 mem3 mem5 mem0 Pos: 0 2 4 6 1 3 5 7 HB: Host Bridge RP: Root Port USP: Upstream Port DSP: Downstream Port ...with the following command steps: $ qemu-system-x86_64 -machine q35,cxl=on,accel=tcg \ -smp cpus=8 \ -m 8G \ -hda /home/work/vm-images/centos-stream8-02.qcow2 \ -object memory-backend-ram,size=4G,id=m0 \ -object memory-backend-ram,size=4G,id=m1 \ -object memory-backend-ram,size=2G,id=cxl-mem0 \ -object memory-backend-ram,size=2G,id=cxl-mem1 \ -object memory-backend-ram,size=2G,id=cxl-mem2 \ -object memory-backend-ram,size=2G,id=cxl-mem3 \ -object memory-backend-ram,size=2G,id=cxl-mem4 \ -object memory-backend-ram,size=2G,id=cxl-mem5 \ -object memory-backend-ram,size=2G,id=cxl-mem6 \ -object memory-backend-ram,size=2G,id=cxl-mem7 \ -numa node,memdev=m0,cpus=0-3,nodeid=0 \ -numa node,memdev=m1,cpus=4-7,nodeid=1 \ -netdev user,id=net0,hostfwd=tcp::2222-:22 \ -device virtio-net-pci,netdev=net0 \ -device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \ -device cxl-rp,port=0,bus=cxl.1,id=root_port0,chassis=0,slot=0 \ -device cxl-rp,port=1,bus=cxl.1,id=root_port1,chassis=0,slot=1 \ -device cxl-upstream,bus=root_port0,id=us0 \ -device cxl-downstream,port=0,bus=us0,id=swport0,chassis=0,slot=4 \ -device cxl-type3,bus=swport0,volatile-memdev=cxl-mem0,id=cxl-vmem0 \ -device cxl-downstream,port=1,bus=us0,id=swport1,chassis=0,slot=5 \ -device cxl-type3,bus=swport1,volatile-memdev=cxl-mem1,id=cxl-vmem1 \ -device cxl-downstream,port=2,bus=us0,id=swport2,chassis=0,slot=6 \ -device cxl-type3,bus=swport2,volatile-memdev=cxl-mem2,id=cxl-vmem2 \ -device cxl-downstream,port=3,bus=us0,id=swport3,chassis=0,slot=7 \ -device cxl-type3,bus=swport3,volatile-memdev=cxl-mem3,id=cxl-vmem3 \ -device cxl-upstream,bus=root_port1,id=us1 \ -device cxl-downstream,port=4,bus=us1,id=swport4,chassis=0,slot=8 \ -device cxl-type3,bus=swport4,volatile-memdev=cxl-mem4,id=cxl-vmem4 \ -device cxl-downstream,port=5,bus=us1,id=swport5,chassis=0,slot=9 \ -device cxl-type3,bus=swport5,volatile-memdev=cxl-mem5,id=cxl-vmem5 \ -device cxl-downstream,port=6,bus=us1,id=swport6,chassis=0,slot=10 \ -device cxl-type3,bus=swport6,volatile-memdev=cxl-mem6,id=cxl-vmem6 \ -device cxl-downstream,port=7,bus=us1,id=swport7,chassis=0,slot=11 \ -device cxl-type3,bus=swport7,volatile-memdev=cxl-mem7,id=cxl-vmem7 \ -M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=32G & In Guest OS: $ cxl create-region -d decoder0.0 -g 1024 -s 2G -t ram -w 8 -m mem4 mem1 mem6 mem3 mem2 mem5 mem7 mem0 Fix the method to calculate @distance by iterativeley multiplying the number of targets per switch port. This also follows the algorithm recommended here [1]. Fixes: `27b3f8d138` ("cxl/region: Program target lists") Link: http://lore.kernel.org/6538824b52349_7258329466@dwillia2-xfh.jf.intel.com.notmuch [1] Signed-off-by: Huaisheng Ye <huaisheng.ye@intel.com> Tested-by: Li Zhijian <lizhijian@fujitsu.com> [djbw: add a comment explaining 'distance'] Signed-off-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/173378716722.1270362.9546805175813426729.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2024-12-10 14:50:34 -07:00
Li Ming	09ceba3a93	cxl/pci: Check dport->regs.rcd_pcie_cap availability before accessing RCD Upstream Port's PCI Express Capability is a component registers block stored in RCD Upstream Port RCRB. CXL PCI driver helps to map it during the RCD probing, but mapping failure is allowed for component registers blocks in CXL PCI driver. dport->regs.rcd_pcie_cap is used to store the virtual address of the RCD Upstream Port's PCI Express Capability, add a dport->regs.rcd_pcie_cap checking in rcd_pcie_cap_emit() just in case user accesses a invalid address via RCD sysfs. Fixes: `c5eaec79fa` ("cxl/pci: Add sysfs attribute for CXL 1.1 device link status") Signed-off-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20241129132825.569237-1-ming.li@zohomail.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2024-12-10 14:49:14 -07:00
Davidlohr Bueso	da4d8c8335	cxl/pci: Fix potential bogus return value upon successful probing If cxl_pci_ras_unmask() returns non-zero, cxl_pci_probe() will end up returning that value, instead of zero. Fixes: `248529edc8` ("cxl: add RAS status unmasking for CXL") Reviewed-by: Fan Ni <fan.ni@samsung.com> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://patch.msgid.link/20241115170032.108445-1-dave@stgolabs.net Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2024-12-10 14:30:51 -07:00
Jann Horn	7d0d673627	bpf: Fix theoretical prog_array UAF in __uprobe_perf_func() Currently, the pointer stored in call->prog_array is loaded in __uprobe_perf_func(), with no RCU annotation and no immediately visible RCU protection, so it looks as if the loaded pointer can immediately be dangling. Later, bpf_prog_run_array_uprobe() starts a RCU-trace read-side critical section, but this is too late. It then uses rcu_dereference_check(), but this use of rcu_dereference_check() does not actually dereference anything. Fix it by aligning the semantics to bpf_prog_run_array(): Let the caller provide rcu_read_lock_trace() protection and then load call->prog_array with rcu_dereference_check(). This issue seems to be theoretical: I don't know of any way to reach this code without having handle_swbp() further up the stack, which is already holding a rcu_read_lock_trace() lock, so where we take rcu_read_lock_trace() in __uprobe_perf_func()/bpf_prog_run_array_uprobe() doesn't actually have any effect. Fixes: `8c7dcb84e3` ("bpf: implement sleepable uprobes by chaining gps") Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241210-bpf-fix-uprobe-uaf-v4-1-5fc8959b2b74@google.com	2024-12-10 13:06:51 -08:00
LongPing Wei	790eb09e59	block: get wp_offset by bdev_offset_from_zone_start Call bdev_offset_from_zone_start() instead of open-coding it. Fixes: `dd291d77cc` ("block: Introduce zone write plugging") Signed-off-by: LongPing Wei <weilongping@oppo.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20241107020439.1644577-1-weilongping@oppo.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-10 13:24:24 -07:00
Anton Protopopov	c4441ca86a	bpf: fix potential error return The bpf_remove_insns() function returns WARN_ON_ONCE(error), where error is a result of bpf_adj_branches(), and thus should be always 0 However, if for any reason it is not 0, then it will be converted to boolean by WARN_ON_ONCE and returned to user space as 1, not an actual error value. Fix this by returning the original err after the WARN check. Signed-off-by: Anton Protopopov <aspsk@isovalent.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20241210114245.836164-1-aspsk@isovalent.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-12-10 11:17:53 -08:00
Paul Barker	ccb84dc8f4	Documentation: PM: Clarify pm_runtime_resume_and_get() return value Update the documentation to match the behaviour of the code. pm_runtime_resume_and_get() always returns 0 on success, even if __pm_runtime_resume() returns 1. Fixes: `2c412337cf` ("PM: runtime: Add documentation for pm_runtime_resume_and_get()") Signed-off-by: Paul Barker <paul.barker.ct@bp.renesas.com> Link: https://patch.msgid.link/20241203143729.478-1-paul.barker.ct@bp.renesas.com [ rjw: Subject and changelog edits, adjusted new comment formatting ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2024-12-10 20:14:22 +01:00
Alexei Starovoitov	cf8b876363	Merge branch 'bpf-track-changes_pkt_data-property-for-global-functions' Eduard Zingerman says: ==================== bpf: track changes_pkt_data property for global functions Nick Zavaritsky reported [0] a bug in verifier, where the following unsafe program is not rejected: __attribute__((__noinline__)) long skb_pull_data(struct __sk_buff sk, __u32 len) { return bpf_skb_pull_data(sk, len); } SEC("tc") int test_invalidate_checks(struct __sk_buff sk) { int p = (void )(long)sk->data; if ((void )(p + 1) > (void )(long)sk->data_end) return TCX_DROP; skb_pull_data(sk, 0); /* not safe, p is invalid after bpf_skb_pull_data call / p = 42; return TCX_PASS; } This happens because verifier does not track package invalidation effect of global sub-programs. This patch-set fixes the issue by modifying check_cfg() to compute whether or not each sub-program calls (directly or indirectly) helper invalidating packet pointers. As global functions could be replaced with extension programs, a new field 'changes_pkt_data' is added to struct bpf_prog_aux. Verifier only allows replacing functions that do not change packet data with functions that do not change packet data. In case if there is a need to a have a global function that does not change packet data, but allow replacing it with function that does, the recommendation is to add a noop call to a helper, e.g.: - for skb do 'bpf_skb_change_proto(skb, 0, 0)'; - for xdp do 'bpf_xdp_adjust_meta(xdp, 0)'. Functions also can do tail calls. Effects of the tail call cannot be analyzed before-hand, thus verifier assumes that tail calls always change packet data. Changes v1 [1] -> v2: - added handling of extension programs and tail calls (thanks, Alexei, for all the input). [0] https://lore.kernel.org/bpf/0498CA22-5779-4767-9C0C-A9515CEA711F@gmail.com/ [1] https://lore.kernel.org/bpf/20241206040307.568065-1-eddyz87@gmail.com/ ==================== Link: https://patch.msgid.link/20241210041100.1898468-1-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-12-10 10:24:58 -08:00
Eduard Zingerman	d9706b56e1	selftests/bpf: validate that tail call invalidates packet pointers Add a test case with a tail call done from a global sub-program. Such tails calls should be considered as invalidating packet pointers. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20241210041100.1898468-9-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-12-10 10:24:58 -08:00
Eduard Zingerman	1a4607ffba	bpf: consider that tail calls invalidate packet pointers Tail-called programs could execute any of the helpers that invalidate packet pointers. Hence, conservatively assume that each tail call invalidates packet pointers. Making the change in bpf_helper_changes_pkt_data() automatically makes use of check_cfg() logic that computes 'changes_pkt_data' effect for global sub-programs, such that the following program could be rejected: int tail_call(struct __sk_buff sk) { bpf_tail_call_static(sk, &jmp_table, 0); return 0; } SEC("tc") int not_safe(struct __sk_buff sk) { int p = (void )(long)sk->data; ... make p valid ... tail_call(sk); p = 42; / this is unsafe */ ... } The tc_bpf2bpf.c:subprog_tc() needs change: mark it as a function that can invalidate packet pointers. Otherwise, it can't be freplaced with tailcall_freplace.c:entry_freplace() that does a tail call. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20241210041100.1898468-8-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-12-10 10:24:57 -08:00
Eduard Zingerman	89ff40890d	selftests/bpf: freplace tests for tracking of changes_packet_data Try different combinations of global functions replacement: - replace function that changes packet data with one that doesn't; - replace function that changes packet data with one that does; - replace function that doesn't change packet data with one that does; - replace function that doesn't change packet data with one that doesn't; Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20241210041100.1898468-7-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-12-10 10:24:57 -08:00
Eduard Zingerman	81f6d0530b	bpf: check changes_pkt_data property for extension programs When processing calls to global sub-programs, verifier decides whether to invalidate all packet pointers in current state depending on the changes_pkt_data property of the global sub-program. Because of this, an extension program replacing a global sub-program must be compatible with changes_pkt_data property of the sub-program being replaced. This commit: - adds changes_pkt_data flag to struct bpf_prog_aux: - this flag is set in check_cfg() for main sub-program; - in jit_subprogs() for other sub-programs; - modifies bpf_check_attach_btf_id() to check changes_pkt_data flag; - moves call to check_attach_btf_id() after the call to check_cfg(), because it needs changes_pkt_data flag to be set: bpf_check: ... ... - check_attach_btf_id resolve_pseudo_ldimm64 resolve_pseudo_ldimm64 --> bpf_prog_is_offloaded bpf_prog_is_offloaded check_cfg check_cfg + check_attach_btf_id ... ... The following fields are set by check_attach_btf_id(): - env->ops - prog->aux->attach_btf_trace - prog->aux->attach_func_name - prog->aux->attach_func_proto - prog->aux->dst_trampoline - prog->aux->mod - prog->aux->saved_dst_attach_type - prog->aux->saved_dst_prog_type - prog->expected_attach_type Neither of these fields are used by resolve_pseudo_ldimm64() or bpf_prog_offload_verifier_prep() (for netronome and netdevsim drivers), so the reordering is safe. Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20241210041100.1898468-6-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-12-10 10:24:57 -08:00
Eduard Zingerman	3f23ee5590	selftests/bpf: test for changing packet data from global functions Check if verifier is aware of packet pointers invalidation done in global functions. Based on a test shared by Nick Zavaritsky in [0]. [0] https://lore.kernel.org/bpf/0498CA22-5779-4767-9C0C-A9515CEA711F@gmail.com/ Suggested-by: Nick Zavaritsky <mejedi@gmail.com> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20241210041100.1898468-5-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-12-10 10:24:57 -08:00
Eduard Zingerman	51081a3f25	bpf: track changes_pkt_data property for global functions When processing calls to certain helpers, verifier invalidates all packet pointers in a current state. For example, consider the following program: __attribute__((__noinline__)) long skb_pull_data(struct __sk_buff sk, __u32 len) { return bpf_skb_pull_data(sk, len); } SEC("tc") int test_invalidate_checks(struct __sk_buff sk) { int p = (void )(long)sk->data; if ((void )(p + 1) > (void )(long)sk->data_end) return TCX_DROP; skb_pull_data(sk, 0); *p = 42; return TCX_PASS; } After a call to bpf_skb_pull_data() the pointer 'p' can't be used safely. See function filter.c:bpf_helper_changes_pkt_data() for a list of such helpers. At the moment verifier invalidates packet pointers when processing helper function calls, and does not traverse global sub-programs when processing calls to global sub-programs. This means that calls to helpers done from global sub-programs do not invalidate pointers in the caller state. E.g. the program above is unsafe, but is not rejected by verifier. This commit fixes the omission by computing field bpf_subprog_info->changes_pkt_data for each sub-program before main verification pass. changes_pkt_data should be set if: - subprogram calls helper for which bpf_helper_changes_pkt_data returns true; - subprogram calls a global function, for which bpf_subprog_info->changes_pkt_data should be set. The verifier.c:check_cfg() pass is modified to compute this information. The commit relies on depth first instruction traversal done by check_cfg() and absence of recursive function calls: - check_cfg() would eventually visit every call to subprogram S in a state when S is fully explored; - when S is fully explored: - every direct helper call within S is explored (and thus changes_pkt_data is set if needed); - every call to subprogram S1 called by S was visited with S1 fully explored (and thus S inherits changes_pkt_data from S1). The downside of such approach is that dead code elimination is not taken into account: if a helper call inside global function is dead because of current configuration, verifier would conservatively assume that the call occurs for the purpose of the changes_pkt_data computation. Reported-by: Nick Zavaritsky <mejedi@gmail.com> Closes: https://lore.kernel.org/bpf/0498CA22-5779-4767-9C0C-A9515CEA711F@gmail.com/ Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20241210041100.1898468-4-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-12-10 10:24:57 -08:00
Eduard Zingerman	b238e187b4	bpf: refactor bpf_helper_changes_pkt_data to use helper number Use BPF helper number instead of function pointer in bpf_helper_changes_pkt_data(). This would simplify usage of this function in verifier.c:check_cfg() (in a follow-up patch), where only helper number is easily available and there is no real need to lookup helper proto. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20241210041100.1898468-3-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-12-10 10:24:57 -08:00
Eduard Zingerman	27e88bc4df	bpf: add find_containing_subprog() utility function Add a utility function, looking for a subprogram containing a given instruction index, rewrite find_subprog() to use this function. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20241210041100.1898468-2-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-12-10 10:24:57 -08:00
Jiri Olsa	978c4486cc	bpf,perf: Fix invalid prog_array access in perf_event_detach_bpf_prog Syzbot reported [1] crash that happens for following tracing scenario: - create tracepoint perf event with attr.inherit=1, attach it to the process and set bpf program to it - attached process forks -> chid creates inherited event the new child event shares the parent's bpf program and tp_event (hence prog_array) which is global for tracepoint - exit both process and its child -> release both events - first perf_event_detach_bpf_prog call will release tp_event->prog_array and second perf_event_detach_bpf_prog will crash, because tp_event->prog_array is NULL The fix makes sure the perf_event_detach_bpf_prog checks prog_array is valid before it tries to remove the bpf program from it. [1] https://lore.kernel.org/bpf/Z1MR6dCIKajNS6nU@krava/T/#m91dbf0688221ec7a7fc95e896a7ef9ff93b0b8ad Fixes: `0ee288e69d` ("bpf,perf: Fix perf_event_detach_bpf_prog error handling") Reported-by: syzbot+2e0d2840414ce817aaac@syzkaller.appspotmail.com Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241208142507.1207698-1-jolsa@kernel.org	2024-12-10 10:16:28 -08:00
Jann Horn	ef1b808e3b	bpf: Fix UAF via mismatching bpf_prog/attachment RCU flavors Uprobes always use bpf_prog_run_array_uprobe() under tasks-trace-RCU protection. But it is possible to attach a non-sleepable BPF program to a uprobe, and non-sleepable BPF programs are freed via normal RCU (see __bpf_prog_put_noref()). This leads to UAF of the bpf_prog because a normal RCU grace period does not imply a tasks-trace-RCU grace period. Fix it by explicitly waiting for a tasks-trace-RCU grace period after removing the attachment of a bpf_prog to a perf_event. Fixes: `8c7dcb84e3` ("bpf: implement sleepable uprobes by chaining gps") Suggested-by: Andrii Nakryiko <andrii@kernel.org> Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/bpf/20241210-bpf-fix-actual-uprobe-uaf-v1-1-19439849dd44@google.com	2024-12-10 10:14:02 -08:00
Hardevsinh Palaniya	7dd9eb6ba8	ARC: bpf: Correct conditional check in 'check_jmp_32' The original code checks 'if (ARC_CC_AL)', which is always true since ARC_CC_AL is a constant. This makes the check redundant and likely obscures the intention of verifying whether the jump is conditional. Updates the code to check cond == ARC_CC_AL instead, reflecting the intent to differentiate conditional from unconditional jumps. Suggested-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Acked-by: Shahab Vahedi <list+bpf@vahedi.org> Signed-off-by: Hardevsinh Palaniya <hardevsinh.palaniya@siliconsignals.io> Signed-off-by: Vineet Gupta <vgupta@kernel.org>	2024-12-10 10:12:56 -08:00
Uwe Kleine-König	4d93ffe661	ARC: dts: Replace deprecated snps,nr-gpios property for snps,dw-apb-gpio-port devices snps,dw-apb-gpio-port is deprecated since commit `ef42a8da3c` ("dt-bindings: gpio: dwapb: Add ngpios property support"). The respective driver supports this since commit `7569486d79` ("gpio: dwapb: Add ngpios DT-property support") which is included in Linux v5.10-rc1. This change was created using git grep -l snps,nr-gpios arch/arc/boot/dts \| xargs perl -p -i -e 's/\bsnps,nr-gpios\b/ngpios/ . Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Signed-off-by: Vineet Gupta <vgupta@kernel.org>	2024-12-10 10:12:56 -08:00
Paul E. McKenney	1e8af9f043	ARC: build: Use __force to suppress per-CPU cmpxchg warnings Currently, the cast of the first argument to cmpxchg_emu_u8() drops the __percpu address-space designator, which results in sparse complaints when applying cmpxchg() to per-CPU variables in ARC. Therefore, use __force to suppress these complaints, given that this does not pertain to cmpxchg() semantics, which are plently well-defined on variables in general, whether per-CPU or otherwise. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202409251336.ToC0TvWB-lkp@intel.com/ Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: <linux-snps-arc@lists.infradead.org> Signed-off-by: Vineet Gupta <vgupta@kernel.org>	2024-12-10 10:12:56 -08:00
Lukas Bulwahn	dd2b2302ef	ARC: fix reference of dependency for PAE40 config Commit d71e629bed5b ("ARC: build: disallow invalid PAE40 + 4K page config") reworks the build dependencies for ARC_HAS_PAE40, and accidentally refers to the non-existing config option MMU_V4 rather than the intended option ARC_MMU_V4. Note the missing prefix in the name here. Refer to the intended config option in the dependency of the ARC_HAS_PAE40 config. Fixes: d71e629bed5b ("ARC: build: disallow invalid PAE40 + 4K page config") Signed-off-by: Lukas Bulwahn <lukas.bulwahn@redhat.com> Signed-off-by: Vineet Gupta <vgupta@kernel.org>	2024-12-10 10:12:56 -08:00
Vineet Gupta	8871331b17	ARC: build: disallow invalid PAE40 + 4K page config The config option being built was \| CONFIG_ARC_MMU_V4=y \| CONFIG_ARC_PAGE_SIZE_4K=y \| CONFIG_HIGHMEM=y \| CONFIG_ARC_HAS_PAE40=y This was hitting a BUILD_BUG_ON() since a 4K page can't hoist 1k, 8-byte PTE entries (8 byte due to PAE40). BUILD_BUG_ON() is a good last ditch resort, but such a config needs to be disallowed explicitly in Kconfig. Side-note: the actual fix is single liner dependency, but while at it cleaned out a few things: - 4K dependency on MMU v3 or v4 is always true, since `288ff7de62` ("ARC: retire MMUv1 and MMUv2 support") - PAE40 dependency in on MMU ver not really ISA, although that follows eventually. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202409160223.xydgucbY-lkp@intel.com/ Signed-off-by: Vineet Gupta <vgupta@kernel.org>	2024-12-10 10:12:56 -08:00
Benjamin Szőke	c0cd2941bc	arc: rename aux.h to arc_aux.h The goal is to clean-up Linux repository from AUX file names, because the use of such file names is prohibited on other operating systems such as Windows, so the Linux repository cannot be cloned and edited on them. Reviewed-by: Shahab Vahedi <list+bpf@vahedi.org> Signed-off-by: Benjamin Szőke <egyszeregy@freemail.hu> Signed-off-by: Vineet Gupta <vgupta@kernel.org>	2024-12-10 10:12:56 -08:00
Zijun Hu	0f7ca6f693	of/irq: Fix using uninitialized variable @addr_len in API of_irq_parse_one() of_irq_parse_one() may use uninitialized variable @addr_len as shown below: // @addr_len is uninitialized int addr_len; // This operation does not touch @addr_len if it fails. addr = of_get_property(device, "reg", &addr_len); // Use uninitialized @addr_len if the operation fails. if (addr_len > sizeof(addr_buf)) addr_len = sizeof(addr_buf); // Check the operation result here. if (addr) memcpy(addr_buf, addr, addr_len); Fix by initializing @addr_len before the operation. Fixes: `b739dffa5d` ("of/irq: Prevent device address out-of-bounds read in interrupt map walk") Cc: stable@vger.kernel.org Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Link: https://lore.kernel.org/r/20241209-of_irq_fix-v1-4-782f1419c8a1@quicinc.com Signed-off-by: Rob Herring (Arm) <robh@kernel.org>	2024-12-10 10:52:45 -06:00
Zijun Hu	fec3edc47d	of/irq: Fix interrupt-map cell length check in of_irq_parse_imap_parent() On a malformed interrupt-map property which is shorter than expected by 1 cell, we may read bogus data past the end of the property instead of returning an error in of_irq_parse_imap_parent(). Decrement the remaining length when skipping over the interrupt parent phandle cell. Fixes: `935df1bd40` ("of/irq: Factor out parsing of interrupt-map parent phandle+args from of_irq_parse_raw()") Cc: stable@vger.kernel.org Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Link: https://lore.kernel.org/r/20241209-of_irq_fix-v1-1-782f1419c8a1@quicinc.com [rh: reword commit msg] Signed-off-by: Rob Herring (Arm) <robh@kernel.org>	2024-12-10 10:52:45 -06:00
Zijun Hu	5d009e0240	of: Fix refcount leakage for OF node returned by __of_get_dma_parent() __of_get_dma_parent() returns OF device node @args.np, but the node's refcount is increased twice, by both of_parse_phandle_with_args() and of_node_get(), so causes refcount leakage for the node. Fix by directly returning the node got by of_parse_phandle_with_args(). Fixes: `f83a6e5dea` ("of: address: Add support for the parent DMA bus") Cc: stable@vger.kernel.org Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Link: https://lore.kernel.org/r/20241206-of_core_fix-v1-4-dc28ed56bec3@quicinc.com Signed-off-by: Rob Herring (Arm) <robh@kernel.org>	2024-12-10 10:52:45 -06:00
Michal Luczaj	11d5245f60	selftests/bpf: Extend test for sockmap update with same Verify that the sockmap link was not severed, and socket's entry is indeed removed from the map when the corresponding descriptor gets closed. Signed-off-by: Michal Luczaj <mhal@rbox.co> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20241202-sockmap-replace-v1-2-1e88579e7bd5@rbox.co	2024-12-10 17:38:55 +01:00
Michal Luczaj	ed1fc5d76b	bpf, sockmap: Fix race between element replace and close() Element replace (with a socket different from the one stored) may race with socket's close() link popping & unlinking. __sock_map_delete() unconditionally unrefs the (wrong) element: // set map[0] = s0 map_update_elem(map, 0, s0) // drop fd of s0 close(s0) sock_map_close() lock_sock(sk) (s0!) sock_map_remove_links(sk) link = sk_psock_link_pop() sock_map_unlink(sk, link) sock_map_delete_from_link // replace map[0] with s1 map_update_elem(map, 0, s1) sock_map_update_elem (s1!) lock_sock(sk) sock_map_update_common psock = sk_psock(sk) spin_lock(&stab->lock) osk = stab->sks[idx] sock_map_add_link(..., &stab->sks[idx]) sock_map_unref(osk, &stab->sks[idx]) psock = sk_psock(osk) sk_psock_put(sk, psock) if (refcount_dec_and_test(&psock)) sk_psock_drop(sk, psock) spin_unlock(&stab->lock) unlock_sock(sk) __sock_map_delete spin_lock(&stab->lock) sk = *psk // s1 replaced s0; sk == s1 if (!sk_test \|\| sk_test == sk) // sk_test (s0) != sk (s1); no branch sk = xchg(psk, NULL) if (sk) sock_map_unref(sk, psk) // unref s1; sks[idx] will dangle psock = sk_psock(sk) sk_psock_put(sk, psock) if (refcount_dec_and_test()) sk_psock_drop(sk, psock) spin_unlock(&stab->lock) release_sock(sk) Then close(map) enqueues bpf_map_free_deferred, which finally calls sock_map_free(). This results in some refcount_t warnings along with a KASAN splat [1]. Fix __sock_map_delete(), do not allow sock_map_unref() on elements that may have been replaced. [1]: BUG: KASAN: slab-use-after-free in sock_map_free+0x10e/0x330 Write of size 4 at addr ffff88811f5b9100 by task kworker/u64:12/1063 CPU: 14 UID: 0 PID: 1063 Comm: kworker/u64:12 Not tainted 6.12.0+ #125 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 Workqueue: events_unbound bpf_map_free_deferred Call Trace: <TASK> dump_stack_lvl+0x68/0x90 print_report+0x174/0x4f6 kasan_report+0xb9/0x190 kasan_check_range+0x10f/0x1e0 sock_map_free+0x10e/0x330 bpf_map_free_deferred+0x173/0x320 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 </TASK> Allocated by task 1202: kasan_save_stack+0x1e/0x40 kasan_save_track+0x10/0x30 __kasan_slab_alloc+0x85/0x90 kmem_cache_alloc_noprof+0x131/0x450 sk_prot_alloc+0x5b/0x220 sk_alloc+0x2c/0x870 unix_create1+0x88/0x8a0 unix_create+0xc5/0x180 __sock_create+0x241/0x650 __sys_socketpair+0x1ce/0x420 __x64_sys_socketpair+0x92/0x100 do_syscall_64+0x93/0x180 entry_SYSCALL_64_after_hwframe+0x76/0x7e Freed by task 46: kasan_save_stack+0x1e/0x40 kasan_save_track+0x10/0x30 kasan_save_free_info+0x37/0x60 __kasan_slab_free+0x4b/0x70 kmem_cache_free+0x1a1/0x590 __sk_destruct+0x388/0x5a0 sk_psock_destroy+0x73e/0xa50 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 The buggy address belongs to the object at ffff88811f5b9080 which belongs to the cache UNIX-STREAM of size 1984 The buggy address is located 128 bytes inside of freed 1984-byte region [ffff88811f5b9080, ffff88811f5b9840) The buggy address belongs to the physical page: page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11f5b8 head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 memcg:ffff888127d49401 flags: 0x17ffffc0000040(head\|node=0\|zone=2\|lastcpupid=0x1fffff) page_type: f5(slab) raw: 0017ffffc0000040 ffff8881042e4500 dead000000000122 0000000000000000 raw: 0000000000000000 00000000800f000f 00000001f5000000 ffff888127d49401 head: 0017ffffc0000040 ffff8881042e4500 dead000000000122 0000000000000000 head: 0000000000000000 00000000800f000f 00000001f5000000 ffff888127d49401 head: 0017ffffc0000003 ffffea00047d6e01 ffffffffffffffff 0000000000000000 head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88811f5b9000: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88811f5b9080: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff88811f5b9180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88811f5b9200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb Disabling lock debugging due to kernel taint refcount_t: addition on 0; use-after-free. WARNING: CPU: 14 PID: 1063 at lib/refcount.c:25 refcount_warn_saturate+0xce/0x150 CPU: 14 UID: 0 PID: 1063 Comm: kworker/u64:12 Tainted: G B 6.12.0+ #125 Tainted: [B]=BAD_PAGE Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 Workqueue: events_unbound bpf_map_free_deferred RIP: 0010:refcount_warn_saturate+0xce/0x150 Code: 34 73 eb 03 01 e8 82 53 ad fe 0f 0b eb b1 80 3d 27 73 eb 03 00 75 a8 48 c7 c7 80 bd 95 84 c6 05 17 73 eb 03 01 e8 62 53 ad fe <0f> 0b eb 91 80 3d 06 73 eb 03 00 75 88 48 c7 c7 e0 bd 95 84 c6 05 RSP: 0018:ffff88815c49fc70 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff88811f5b9100 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000001 RBP: 0000000000000002 R08: 0000000000000001 R09: ffffed10bcde6349 R10: ffff8885e6f31a4b R11: 0000000000000000 R12: ffff88813be0b000 R13: ffff88811f5b9100 R14: ffff88811f5b9080 R15: ffff88813be0b024 FS: 0000000000000000(0000) GS:ffff8885e6f00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055dda99b0250 CR3: 000000015dbac000 CR4: 0000000000752ef0 PKRU: 55555554 Call Trace: <TASK> ? __warn.cold+0x5f/0x1ff ? refcount_warn_saturate+0xce/0x150 ? report_bug+0x1ec/0x390 ? handle_bug+0x58/0x90 ? exc_invalid_op+0x13/0x40 ? asm_exc_invalid_op+0x16/0x20 ? refcount_warn_saturate+0xce/0x150 sock_map_free+0x2e5/0x330 bpf_map_free_deferred+0x173/0x320 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 </TASK> irq event stamp: 10741 hardirqs last enabled at (10741): [<ffffffff84400ec6>] asm_sysvec_apic_timer_interrupt+0x16/0x20 hardirqs last disabled at (10740): [<ffffffff811e532d>] handle_softirqs+0x60d/0x770 softirqs last enabled at (10506): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 softirqs last disabled at (10301): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 refcount_t: underflow; use-after-free. WARNING: CPU: 14 PID: 1063 at lib/refcount.c:28 refcount_warn_saturate+0xee/0x150 CPU: 14 UID: 0 PID: 1063 Comm: kworker/u64:12 Tainted: G B W 6.12.0+ #125 Tainted: [B]=BAD_PAGE, [W]=WARN Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 Workqueue: events_unbound bpf_map_free_deferred RIP: 0010:refcount_warn_saturate+0xee/0x150 Code: 17 73 eb 03 01 e8 62 53 ad fe 0f 0b eb 91 80 3d 06 73 eb 03 00 75 88 48 c7 c7 e0 bd 95 84 c6 05 f6 72 eb 03 01 e8 42 53 ad fe <0f> 0b e9 6e ff ff ff 80 3d e6 72 eb 03 00 0f 85 61 ff ff ff 48 c7 RSP: 0018:ffff88815c49fc70 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff88811f5b9100 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000001 RBP: 0000000000000003 R08: 0000000000000001 R09: ffffed10bcde6349 R10: ffff8885e6f31a4b R11: 0000000000000000 R12: ffff88813be0b000 R13: ffff88811f5b9100 R14: ffff88811f5b9080 R15: ffff88813be0b024 FS: 0000000000000000(0000) GS:ffff8885e6f00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055dda99b0250 CR3: 000000015dbac000 CR4: 0000000000752ef0 PKRU: 55555554 Call Trace: <TASK> ? __warn.cold+0x5f/0x1ff ? refcount_warn_saturate+0xee/0x150 ? report_bug+0x1ec/0x390 ? handle_bug+0x58/0x90 ? exc_invalid_op+0x13/0x40 ? asm_exc_invalid_op+0x16/0x20 ? refcount_warn_saturate+0xee/0x150 sock_map_free+0x2d3/0x330 bpf_map_free_deferred+0x173/0x320 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 </TASK> irq event stamp: 10741 hardirqs last enabled at (10741): [<ffffffff84400ec6>] asm_sysvec_apic_timer_interrupt+0x16/0x20 hardirqs last disabled at (10740): [<ffffffff811e532d>] handle_softirqs+0x60d/0x770 softirqs last enabled at (10506): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 softirqs last disabled at (10301): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 Fixes: `604326b41a` ("bpf, sockmap: convert to generic sk_msg interface") Signed-off-by: Michal Luczaj <mhal@rbox.co> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20241202-sockmap-replace-v1-3-1e88579e7bd5@rbox.co	2024-12-10 17:38:05 +01:00
Michal Luczaj	75e072a390	bpf, sockmap: Fix update element with same Consider a sockmap entry being updated with the same socket: osk = stab->sks[idx]; sock_map_add_link(psock, link, map, &stab->sks[idx]); stab->sks[idx] = sk; if (osk) sock_map_unref(osk, &stab->sks[idx]); Due to sock_map_unref(), which invokes sock_map_del_link(), all the psock's links for stab->sks[idx] are torn: list_for_each_entry_safe(link, tmp, &psock->link, list) { if (link->link_raw == link_raw) { ... list_del(&link->list); sk_psock_free_link(link); } } And that includes the new link sock_map_add_link() added just before the unref. This results in a sockmap holding a socket, but without the respective link. This in turn means that close(sock) won't trigger the cleanup, i.e. a closed socket will not be automatically removed from the sockmap. Stop tearing the links when a matching link_raw is found. Fixes: `604326b41a` ("bpf, sockmap: convert to generic sk_msg interface") Signed-off-by: Michal Luczaj <mhal@rbox.co> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20241202-sockmap-replace-v1-1-1e88579e7bd5@rbox.co	2024-12-10 17:37:02 +01:00
Mario Limonciello	2993b29b2a	cpufreq/amd-pstate: Use boost numerator for upper bound of frequencies commit `18d9b52271` ("cpufreq/amd-pstate: Use nominal perf for limits when boost is disabled") introduced different semantics for min/max limits based upon whether the user turned off boost from sysfs. This however is not necessary when the highest perf value is the boost numerator. Suggested-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Fixes: `18d9b52271` ("cpufreq/amd-pstate: Use nominal perf for limits when boost is disabled") Link: https://lore.kernel.org/r/20241209185248.16301-3-mario.limonciello@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>	2024-12-10 10:17:43 -06:00
Mario Limonciello	50a062a762	cpufreq/amd-pstate: Store the boost numerator as highest perf again commit `ad4caad58d` ("cpufreq: amd-pstate: Merge amd_pstate_highest_perf_set() into amd_get_boost_ratio_numerator()") changed the semantics for highest perf and commit `18d9b52271` ("cpufreq/amd-pstate: Use nominal perf for limits when boost is disabled") worked around those semantic changes. This however is a confusing result and furthermore makes it awkward to change frequency limits and boost due to the scaling differences. Restore the boost numerator to highest perf again. Suggested-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Fixes: `ad4caad58d` ("cpufreq: amd-pstate: Merge amd_pstate_highest_perf_set() into amd_get_boost_ratio_numerator()") Link: https://lore.kernel.org/r/20241209185248.16301-2-mario.limonciello@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>	2024-12-10 10:17:43 -06:00
Tejun Heo	86e6ca55b8	blk-cgroup: Fix UAF in blkcg_unpin_online() blkcg_unpin_online() walks up the blkcg hierarchy putting the online pin. To walk up, it uses blkcg_parent(blkcg) but it was calling that after blkcg_destroy_blkgs(blkcg) which could free the blkcg, leading to the following UAF: ================================================================== BUG: KASAN: slab-use-after-free in blkcg_unpin_online+0x15a/0x270 Read of size 8 at addr ffff8881057678c0 by task kworker/9:1/117 CPU: 9 UID: 0 PID: 117 Comm: kworker/9:1 Not tainted 6.13.0-rc1-work-00182-gb8f52214c61a-dirty #48 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS unknown 02/02/2022 Workqueue: cgwb_release cgwb_release_workfn Call Trace: <TASK> dump_stack_lvl+0x27/0x80 print_report+0x151/0x710 kasan_report+0xc0/0x100 blkcg_unpin_online+0x15a/0x270 cgwb_release_workfn+0x194/0x480 process_scheduled_works+0x71b/0xe20 worker_thread+0x82a/0xbd0 kthread+0x242/0x2c0 ret_from_fork+0x33/0x70 ret_from_fork_asm+0x1a/0x30 </TASK> ... Freed by task 1944: kasan_save_track+0x2b/0x70 kasan_save_free_info+0x3c/0x50 __kasan_slab_free+0x33/0x50 kfree+0x10c/0x330 css_free_rwork_fn+0xe6/0xb30 process_scheduled_works+0x71b/0xe20 worker_thread+0x82a/0xbd0 kthread+0x242/0x2c0 ret_from_fork+0x33/0x70 ret_from_fork_asm+0x1a/0x30 Note that the UAF is not easy to trigger as the free path is indirected behind a couple RCU grace periods and a work item execution. I could only trigger it with artifical msleep() injected in blkcg_unpin_online(). Fix it by reading the parent pointer before destroying the blkcg's blkg's. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Abagail ren <renzezhongucas@gmail.com> Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org> Fixes: `4308a434e5` ("blkcg: don't offline parent blkcg first") Cc: stable@vger.kernel.org # v5.7+ Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-10 09:17:00 -07:00
Coly Li	6f604b59c2	MAINTAINERS: update Coly Li's email address This patch updates Coly Li's email addres to colyli@kernel.org. Signed-off-by: Coly Li <colyli@kernel.org> Link: https://lore.kernel.org/r/20241208115350.85103-1-colyli@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-10 09:16:24 -07:00
Damien Le Moal	fe0418eb9b	block: Prevent potential deadlocks in zone write plug error recovery Zone write plugging for handling writes to zones of a zoned block device always execute a zone report whenever a write BIO to a zone fails. The intent of this is to ensure that the tracking of a zone write pointer is always correct to ensure that the alignment to a zone write pointer of write BIOs can be checked on submission and that we can always correctly emulate zone append operations using regular write BIOs. However, this error recovery scheme introduces a potential deadlock if a device queue freeze is initiated while BIOs are still plugged in a zone write plug and one of these write operation fails. In such case, the disk zone write plug error recovery work is scheduled and executes a report zone. This in turn can result in a request allocation in the underlying driver to issue the report zones command to the device. But with the device queue freeze already started, this allocation will block, preventing the report zone execution and the continuation of the processing of the plugged BIOs. As plugged BIOs hold a queue usage reference, the queue freeze itself will never complete, resulting in a deadlock. Avoid this problem by completely removing from the zone write plugging code the use of report zones operations after a failed write operation, instead relying on the device user to either execute a report zones, reset the zone, finish the zone, or give up writing to the device (which is a fairly common pattern for file systems which degrade to read-only after write failures). This is not an unreasonnable requirement as all well-behaved applications, FSes and device mapper already use report zones to recover from write errors whenever possible by comparing the current position of a zone write pointer with what their assumption about the position is. The changes to remove the automatic error recovery are as follows: - Completely remove the error recovery work and its associated resources (zone write plug list head, disk error list, and disk zone_wplugs_work work struct). This also removes the functions disk_zone_wplug_set_error() and disk_zone_wplug_clear_error(). - Change the BLK_ZONE_WPLUG_ERROR zone write plug flag into BLK_ZONE_WPLUG_NEED_WP_UPDATE. This new flag is set for a zone write plug whenever a write opration targetting the zone of the zone write plug fails. This flag indicates that the zone write pointer offset is not reliable and that it must be updated when the next report zone, reset zone, finish zone or disk revalidation is executed. - Modify blk_zone_write_plug_bio_endio() to set the BLK_ZONE_WPLUG_NEED_WP_UPDATE flag for the target zone of a failed write BIO. - Modify the function disk_zone_wplug_set_wp_offset() to clear this new flag, thus implementing recovery of a correct write pointer offset with the reset (all) zone and finish zone operations. - Modify blkdev_report_zones() to always use the disk_report_zones_cb() callback so that disk_zone_wplug_sync_wp_offset() can be called for any zone marked with the BLK_ZONE_WPLUG_NEED_WP_UPDATE flag. This implements recovery of a correct write pointer offset for zone write plugs marked with BLK_ZONE_WPLUG_NEED_WP_UPDATE and within the range of the report zones operation executed by the user. - Modify blk_revalidate_seq_zone() to call disk_zone_wplug_sync_wp_offset() for all sequential write required zones when a zoned block device is revalidated, thus always resolving any inconsistency between the write pointer offset of zone write plugs and the actual write pointer position of sequential zones. Fixes: `dd291d77cc` ("block: Introduce zone write plugging") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20241209122357.47838-5-dlemoal@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-10 09:15:33 -07:00
Damien Le Moal	b76b840fd9	dm: Fix dm-zoned-reclaim zone write pointer alignment The zone reclaim processing of the dm-zoned device mapper uses blkdev_issue_zeroout() to align the write pointer of a zone being used for reclaiming another zone, to write the valid data blocks from the zone being reclaimed at the same position relative to the zone start in the reclaim target zone. The first call to blkdev_issue_zeroout() will try to use hardware offload using a REQ_OP_WRITE_ZEROES operation if the device reports a non-zero max_write_zeroes_sectors queue limit. If this operation fails because of the lack of hardware support, blkdev_issue_zeroout() falls back to using a regular write operation with the zero-page as buffer. Currently, such REQ_OP_WRITE_ZEROES failure is automatically handled by the block layer zone write plugging code which will execute a report zones operation to ensure that the write pointer of the target zone of the failed operation has not changed and to "rewind" the zone write pointer offset of the target zone as it was advanced when the write zero operation was submitted. So the REQ_OP_WRITE_ZEROES failure does not cause any issue and blkdev_issue_zeroout() works as expected. However, since the automatic recovery of zone write pointers by the zone write plugging code can potentially cause deadlocks with queue freeze operations, a different recovery must be implemented in preparation for the removal of zone write plugging report zones based recovery. Do this by introducing the new function blk_zone_issue_zeroout(). This function first calls blkdev_issue_zeroout() with the flag BLKDEV_ZERO_NOFALLBACK to intercept failures on the first execution which attempt to use the device hardware offload with the REQ_OP_WRITE_ZEROES operation. If this attempt fails, a report zone operation is issued to restore the zone write pointer offset of the target zone to the correct position and blkdev_issue_zeroout() is called again without the BLKDEV_ZERO_NOFALLBACK flag. The report zones operation performing this recovery is implemented using the helper function disk_zone_sync_wp_offset() which calls the gendisk report_zones file operation with the callback disk_report_zones_cb(). This callback updates the target write pointer offset of the target zone using the new function disk_zone_wplug_sync_wp_offset(). dmz_reclaim_align_wp() is modified to change its call to blkdev_issue_zeroout() to a call to blk_zone_issue_zeroout() without any other change needed as the two functions are functionnally equivalent. Fixes: `dd291d77cc` ("block: Introduce zone write plugging") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Mike Snitzer <snitzer@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20241209122357.47838-4-dlemoal@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-10 09:15:33 -07:00
Damien Le Moal	5eb3317aa5	block: Ignore REQ_NOWAIT for zone reset and zone finish operations There are currently any issuer of REQ_OP_ZONE_RESET and REQ_OP_ZONE_FINISH operations that set REQ_NOWAIT. However, as we cannot handle this flag correctly due to the potential request allocation failure that may happen in blk_mq_submit_bio() after blk_zone_plug_bio() has handled the zone write plug write pointer updates for the targeted zones, modify blk_zone_wplug_handle_reset_or_finish() to warn if this flag is set and ignore it. Fixes: `dd291d77cc` ("block: Introduce zone write plugging") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20241209122357.47838-3-dlemoal@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-10 09:15:33 -07:00
Damien Le Moal	cae0056708	block: Use a zone write plug BIO work for REQ_NOWAIT BIOs For zoned block devices, a write BIO issued to a zone that has no on-going writes will be prepared for execution and allowed to execute immediately by blk_zone_wplug_handle_write() (called from blk_zone_plug_bio()). However, if this BIO specifies REQ_NOWAIT, the allocation of a request for its execution in blk_mq_submit_bio() may fail after blk_zone_plug_bio() completed, marking the target zone of the BIO as plugged. When this BIO is retried later on, it will be blocked as the zone write plug of the target zone is in a plugged state without any on-going write operation (completion of write operations trigger unplugging of the next write BIOs for a zone). This leads to a BIO that is stuck in a zone write plug and never completes, which results in various issues such as hung tasks. Avoid this problem by always executing REQ_NOWAIT write BIOs using the BIO work of a zone write plug. This ensure that we never block the BIO issuer and can thus safely ignore the REQ_NOWAIT flag when executing the BIO from the zone write plug BIO work. Since such BIO may be the first write BIO issued to a zone with no on-going write, modify disk_zone_wplug_add_bio() to schedule the zone write plug BIO work if the write plug is not already marked with the BLK_ZONE_WPLUG_PLUGGED flag. This scheduling is otherwise not necessary as the completion of the on-going write for the zone will schedule the execution of the next plugged BIOs. blk_zone_wplug_handle_write() is also fixed to better handle zone write plug allocation failures for REQ_NOWAIT BIOs by failing a write BIO using bio_wouldblock_error() instead of bio_io_error(). Reported-by: Bart Van Assche <bvanassche@acm.org> Fixes: `dd291d77cc` ("block: Introduce zone write plugging") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20241209122357.47838-2-dlemoal@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-12-10 09:15:33 -07:00
Jesse.zhang@amd.com	438b39ac74	drm/amdkfd: pause autosuspend when creating pdd When using MES creating a pdd will require talking to the GPU to setup the relevant context. The code here forgot to wake up the GPU in case it was in suspend, this causes KVM to EFAULT for passthrough GPU for example. This issue can be masked if the GPU was woken up by other things (e.g. opening the KMS node) first and have not yet gone to sleep. v4: do the allocation of proc_ctx_bo in a lazy fashion when the first queue is created in a process (Felix) Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-by: Yunxiang Li <Yunxiang.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2024-12-10 10:26:18 -05:00
Christian König	f4df208177	drm/amdgpu: fix when the cleaner shader is emitted Emitting the cleaner shader must come after the check if a VM switch is necessary or not. Otherwise we will emit the cleaner shader every time and not just when it is necessary because we switched between applications. This can otherwise crash on gang submit and probably decreases performance quite a bit. v2: squash in fix from Srini (Alex) Signed-off-by: Christian König <christian.koenig@amd.com> Fixes: `ee7a846ea2` ("drm/amdgpu: Emit cleaner shader at end of IB submission") Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2024-12-10 10:26:18 -05:00
Pratap Nirujogi	ee2003d5fd	drm/amdgpu: Fix ISP HW init issue ISP hw_init is not called with the recent changes related to hw init levels. AMDGPU_INIT_LEVEL_DEFAULT is ignoring the ISP IP block as AMDGPU_IP_BLK_MASK_ALL is derived using incorrect max number of IP blocks. Update AMDGPU_IP_BLK_MASK_ALL to use AMD_IP_BLOCK_TYPE_NUM instead of AMDGPU_MAX_IP_NUM to fix the issue. Fixes: `14f2fe34f5` ("drm/amdgpu: Add init levels") Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Pratap Nirujogi <pratap.nirujogi@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-12-10 10:19:20 -05:00
Harish Kasiviswanathan	d50bf3f0fa	drm/amdkfd: hard-code MALL cacheline size for gfx11, gfx12 This information is not available in ip discovery table. Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Reviewed-by: David Belanger <david.belanger@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2024-12-10 10:19:05 -05:00
Harish Kasiviswanathan	321048c4a3	drm/amdkfd: hard-code cacheline size for gfx11 This information is not available in ip discovery table. Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Reviewed-by: David Belanger <david.belanger@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2024-12-10 10:17:56 -05:00
Andrew Martin	a592bb19ab	drm/amdkfd: Dereference null return value In the function pqm_uninit there is a call-assignment of "pdd = kfd_get_process_device_data" which could be null, and this value was later dereferenced without checking. Fixes: `fb91065851` ("drm/amdkfd: Refactor queue wptr_bo GART mapping") Signed-off-by: Andrew Martin <Andrew.Martin@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2024-12-10 10:16:09 -05:00
Geert Uytterhoeven	5751eee5c6	i2c: nomadik: Add missing sentinel to match table The OF match table is not NULL-terminated. Fix this by adding a sentinel to nmk_i2c_eyeq_match_table[]. Fixes: `a0d15cc47f` ("i2c: nomadik: switch from of_device_is_compatible() to of_match_device()") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Théo Lebrun <theo.lebrun@bootlin.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>	2024-12-10 16:07:53 +01:00
Joe Hattori	f3d87abe11	mmc: mtk-sd: disable wakeup in .remove() and in the error path of .probe() Current implementation leaves pdev->dev as a wakeup source. Add a device_init_wakeup(&pdev->dev, false) call in the .remove() function and in the error path of the .probe() function. Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp> Fixes: `527f36f5ef` ("mmc: mediatek: add support for SDIO eint wakup IRQ") Cc: stable@vger.kernel.org Message-ID: <20241203023442.2434018-1-joe@pf.is.s.u-tokyo.ac.jp> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>	2024-12-10 16:02:34 +01:00
Prathamesh Shete	a56335c85b	mmc: sdhci-tegra: Remove SDHCI_QUIRK_BROKEN_ADMA_ZEROLEN_DESC quirk Value 0 in ADMA length descriptor is interpreted as 65536 on new Tegra chips, remove SDHCI_QUIRK_BROKEN_ADMA_ZEROLEN_DESC quirk to make sure max ADMA2 length is 65536. Fixes: `4346b7c794` ("mmc: tegra: Add Tegra186 support") Cc: stable@vger.kernel.org Signed-off-by: Prathamesh Shete <pshete@nvidia.com> Acked-by: Thierry Reding <treding@nvidia.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Message-ID: <20241209101009.22710-1-pshete@nvidia.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>	2024-12-10 16:00:07 +01:00
Vladimir Riabchun	7363f2d4c1	i2c: pnx: Fix timeout in wait functions Since commit `f63b94be69` ("i2c: pnx: Fix potential deadlock warning from del_timer_sync() call in isr") jiffies are stored in i2c_pnx_algo_data.timeout, but wait_timeout and wait_reset are still using it as milliseconds. Convert jiffies back to milliseconds to wait for the expected amount of time. Fixes: `f63b94be69` ("i2c: pnx: Fix potential deadlock warning from del_timer_sync() call in isr") Signed-off-by: Vladimir Riabchun <ferr.lambarginio@gmail.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>	2024-12-10 15:50:50 +01:00
Heiko Carstens	41856638e6	s390/mm: Fix DirectMap accounting With uncoupling of physical and virtual address spaces population of the identity mapping was changed to use the type POPULATE_IDENTITY instead of POPULATE_DIRECT. This breaks DirectMap accounting: > cat /proc/meminfo DirectMap4k: 55296 kB DirectMap1M: 18446744073709496320 kB Adjust all locations of update_page_count() in vmem.c to use POPULATE_IDENTITY instead of POPULATE_DIRECT as well. With this accounting is correct again: > cat /proc/meminfo DirectMap4k: 54264 kB DirectMap1M: 8334336 kB Fixes: `c98d2ecae0` ("s390/mm: Uncouple physical vs virtual address spaces") Cc: stable@vger.kernel.org Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>	2024-12-10 15:40:41 +01:00
Paolo Abeni	51a00be6a0	udp: fix l4 hash after reconnect After the blamed commit below, udp_rehash() is supposed to be called with both local and remote addresses set. Currently that is already the case for IPv6 sockets, but for IPv4 the destination address is updated after rehashing. Address the issue moving the destination address and port initialization before rehashing. Fixes: `1b29a730ef` ("ipv6/udp: Add 4-tuple hash for connected socket") Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/4761e466ab9f7542c68cdc95f248987d127044d2.1733499715.git.pabeni@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-12-10 15:30:36 +01:00
Shin'ichiro Kawasaki	360c400d0f	p2sb: Do not scan and remove the P2SB device when it is unhidden When drivers access P2SB device resources, it calls p2sb_bar(). Before the commit `5913320eb0` ("platform/x86: p2sb: Allow p2sb_bar() calls during PCI device probe"), p2sb_bar() obtained the resources and then called pci_stop_and_remove_bus_device() for clean up. Then the P2SB device disappeared. The commit `5913320eb0` introduced the P2SB device resource cache feature in the boot process. During the resource cache, pci_stop_and_remove_bus_device() is called for the P2SB device, then the P2SB device disappears regardless of whether p2sb_bar() is called or not. Such P2SB device disappearance caused a confusion [1]. To avoid the confusion, avoid the pci_stop_and_remove_bus_device() call when the BIOS does not hide the P2SB device. For that purpose, cache the P2SB device resources only if the BIOS hides the P2SB device. Call p2sb_scan_and_cache() only if p2sb_hidden_by_bios is true. This allows removing two branches from p2sb_scan_and_cache(). When p2sb_bar() is called, get the resources from the cache if the P2SB device is hidden. Otherwise, read the resources from the unhidden P2SB device. Reported-by: Daniel Walker (danielwa) <danielwa@cisco.com> Closes: https://lore.kernel.org/lkml/ZzTI+biIUTvFT6NC@goliath/ [1] Fixes: `5913320eb0` ("platform/x86: p2sb: Allow p2sb_bar() calls during PCI device probe") Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20241128002836.373745-5-shinichiro.kawasaki@wdc.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>	2024-12-10 16:24:51 +02:00
Shin'ichiro Kawasaki	0286070c74	p2sb: Move P2SB hide and unhide code to p2sb_scan_and_cache() To prepare for the following fix, move the code to hide and unhide the P2SB device from p2sb_cache_resources() to p2sb_scan_and_cache(). Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20241128002836.373745-4-shinichiro.kawasaki@wdc.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>	2024-12-10 16:24:49 +02:00
Shin'ichiro Kawasaki	ae3e6ebc5a	p2sb: Introduce the global flag p2sb_hidden_by_bios To prepare for the following fix, introduce the global flag p2sb_hidden_by_bios. Check if the BIOS hides the P2SB device and store the result in the flag. This allows to refer to the check result across functions. Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20241128002836.373745-3-shinichiro.kawasaki@wdc.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>	2024-12-10 16:24:48 +02:00
Shin'ichiro Kawasaki	9244524d60	p2sb: Factor out p2sb_read_from_cache() To prepare for the following fix, factor out the code to read the P2SB resource from the cache to the new function p2sb_read_from_cache(). Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20241128002836.373745-2-shinichiro.kawasaki@wdc.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>	2024-12-10 16:24:45 +02:00
Kurt Borja	54a8cada2f	alienware-wmi: Adds support to Alienware m16 R1 AMD Adds support to Alienware m16 R1 AMD. Tested-by: Cihan Ozakca <cozakca@outlook.com> Signed-off-by: Kurt Borja <kuurtb@gmail.com> Reviewed-by: Armin Wolf <W_Armin@gmx.de> Link: https://lore.kernel.org/r/20241208003013.6490-3-kuurtb@gmail.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>	2024-12-10 16:22:55 +02:00
Kurt Borja	c1043cdb01	alienware-wmi: Fix X Series and G Series quirks Devices that are known to support the WMI thermal interface do not support the legacy LED control interface. Make `.num_zones = 0` and avoid calling alienware_zone_init() if that's the case. Fixes: `9f6c430415` ("alienware-wmi: added platform profile support") Fixes: `1c1eb70e7d` ("alienware-wmi: extends the list of supported models") Suggested-by: Armin Wolf <W_Armin@gmx.de> Reviewed-by: Armin Wolf <W_Armin@gmx.de> Signed-off-by: Kurt Borja <kuurtb@gmail.com> Link: https://lore.kernel.org/r/20241208002652.5885-4-kuurtb@gmail.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>	2024-12-10 16:22:52 +02:00
Paolo Bonzini	3154bddf8c	Merge tag 'kvmarm-fixes-6.13-2' of https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 6.13, part #2 - Fix confusion with implicitly-shifted MDCR_EL2 masks breaking SPE/TRBE initialization - Align nested page table walker with the intended memory attribute combining rules of the architecture - Prevent userspace from constraining the advertised ASID width, avoiding horrors of guest TLBIs not matching the intended context in hardware - Don't leak references on LPIs when insertion into the translation cache fails	2024-12-10 08:50:55 -05:00
Stephen Gordon	687630aa58	ASoC: audio-graph-card: Call of_node_put() on correct node Signed-off-by: Stephen Gordon <gordoste@iinet.net.au> Acked-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Link: https://patch.msgid.link/20241207122257.165096-1-gordoste@iinet.net.au Signed-off-by: Mark Brown <broonie@kernel.org>	2024-12-10 12:44:16 +00:00
Venkata Prasad Potturu	984795e76d	ASoC: amd: yc: Fix the wrong return value With the current implementation, when ACP driver fails to read ACPI _WOV entry then the DMI overrides code won't invoke, may cause regressions for some BIOS versions. Add a condition check to jump to check the DMI entries incase of ACP driver fail to read ACPI _WOV method. Fixes: `4095cf8720` (ASoC: amd: yc: Fix for enabling DMIC on acp6x via _DSD entry) Signed-off-by: Venkata Prasad Potturu <venkataprasad.potturu@amd.com> Link: https://patch.msgid.link/20241210091026.996860-1-venkataprasad.potturu@amd.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-12-10 12:44:15 +00:00
Geert Uytterhoeven	c8f8d4344d	openrisc: Fix misalignments in head.S Align all line continuations and (sub)section headers in a consistent way. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Stafford Horne <shorne@gmail.com>	2024-12-10 12:04:19 +00:00
Masahiro Yamada	a412f04070	openrisc: place exception table at the head of vmlinux Since commit `0043ecea23` ("vmlinux.lds.h: Adjust symbol ordering in text output section"), the exception table in arch/openrisc/kernel/head.S is no longer positioned at the very beginning of the kernel image, which causes a boot failure. Currently, the exception table resides in the regular .text section. Previously, it was placed at the head by relying on the linker receiving arch/openrisc/kernel/head.o as the first object. However, this behavior has changed because sections like .text.{asan,unknown,unlikely,hot} now precede the regular .text section. The .head.text section is intended for entry points requiring special placement. However, in OpenRISC, this section has been misused: instead of the entry points, it contains boot code meant to be discarded after booting. This feature is typically handled by the .init.text section. This commit addresses the issue by replacing the current __HEAD marker with __INIT and re-annotating the entry points with __HEAD. Additionally, it adds __REF to entry.S to suppress the following modpost warning: WARNING: modpost: vmlinux: section mismatch in reference: _tng_kernel_start+0x70 (section: .text) -> _start (section: .init.text) Fixes: `0043ecea23` ("vmlinux.lds.h: Adjust symbol ordering in text output section") Reported-by: Guenter Roeck <linux@roeck-us.net> Closes: https://lore.kernel.org/all/5e032233-5b65-4ad5-ac50-d2eb6c00171c@roeck-us.net/#t Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Tested-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Rong Xu <xur@google.com> Signed-off-by: Stafford Horne <shorne@gmail.com>	2024-12-10 12:04:19 +00:00
Takashi Iwai	b2e538a982	ALSA: control: Avoid WARN() for symlink errors Using WARN() for showing the error of symlink creations don't give more information than telling that something goes wrong, since the usual code path is a lregister callback from each control element creation. More badly, the use of WARN() rather confuses fuzzer as if it were serious issues. This patch downgrades the warning messages to use the normal dev_err() instead of WARN(). For making it clearer, add the function name to the prefix, too. Fixes: `a135dfb5de` ("ALSA: led control - add sysfs kcontrol LED marking layer") Reported-by: syzbot+4e7919b09c67ffd198ae@syzkaller.appspotmail.com Closes: https://lore.kernel.org/675664c7.050a0220.a30f1.018c.GAE@google.com Link: https://patch.msgid.link/20241209095614.4273-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-12-10 12:32:34 +01:00
Uwe Kleine-König	9ac4b58fce	gpio: idio-16: Actually make use of the GPIO_IDIO_16 symbol namespace DEFAULT_SYMBOL_NAMESPACE must already be defined when <linux/export.h> is included. So move the define above the include block. Fixes: `b9b1fc1ae1` ("gpio: idio-16: Introduce the ACCES IDIO-16 GPIO library module") Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Acked-by: William Breathitt Gray <wbg@kernel.org> Link: https://lore.kernel.org/r/20241203172631.1647792-2-u.kleine-koenig@baylibre.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>	2024-12-10 11:38:57 +01:00
Bartosz Golaszewski	dd4d315f38	Merge tag 'v6.13-rc2' into gpio/for-current Linux 6.13-rc2	2024-12-10 11:38:38 +01:00
Paolo Abeni	00301ef4b2	Merge branch 'virtio_net-correct-netdev_tx_reset_queue-invocation-points' Koichiro Den says: ==================== virtio_net: correct netdev_tx_reset_queue() invocation points When virtnet_close is followed by virtnet_open, some TX completions can possibly remain unconsumed, until they are finally processed during the first NAPI poll after the netdev_tx_reset_queue(), resulting in a crash [1]. Commit `b96ed2c97c` ("virtio_net: move netdev_tx_reset_queue() call before RX napi enable") was not sufficient to eliminate all BQL crash scenarios for virtio-net. This issue can be reproduced with the latest net-next master by running: `while :; do ip l set DEV down; ip l set DEV up; done` under heavy network TX load from inside the machine. This patch series resolves the issue and also addresses similar existing problems: (a). Drop netdev_tx_reset_queue() from open/close path. This eliminates the BQL crashes due to the problematic open/close path. (b). As a result of (a), netdev_tx_reset_queue() is now explicitly required in freeze/restore path. Add netdev_tx_reset_queue() immediately after free_unused_bufs() invocation. (c). Fix missing resetting in virtnet_tx_resize(). virtnet_tx_resize() has lacked proper resetting since commit `c8bd1f7f3e` ("virtio_net: add support for Byte Queue Limits"). (d). Fix missing resetting in the XDP_SETUP_XSK_POOL path. Similar to (c), this path lacked proper resetting. Call netdev_tx_reset_queue() when virtqueue_reset() has actually recycled unused buffers. This patch series consists of six commits: [1/6]: Resolves (a) and (b). # also -stable 6.11.y [2/6]: Minor fix to make [4/6] streamlined. [3/6]: Prerequisite for (c). # also -stable 6.11.y [4/6]: Resolves (c) (incl. Prerequisite for (d)) # also -stable 6.11.y [5/6]: Preresuisite for (d). [6/6]: Resolves (d). Changes for v4: - move netdev_tx_reset_queue() out of free_unused_bufs() - submit to net, not net-next Changes for v3: - replace 'flushed' argument with 'recycle_done' Changes for v2: - add tx queue resetting for (b) to (d) above v3: https://lore.kernel.org/all/20241204050724.307544-1-koichiro.den@canonical.com/ v2: https://lore.kernel.org/all/20241203073025.67065-1-koichiro.den@canonical.com/ v1: https://lore.kernel.org/all/20241130181744.3772632-1-koichiro.den@canonical.com/ [1]: ------------[ cut here ]------------ kernel BUG at lib/dynamic_queue_limits.c:99! Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI CPU: 7 UID: 0 PID: 1598 Comm: ip Tainted: G N 6.12.0net-next_main+ #2 Tainted: [N]=TEST Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), \ BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014 RIP: 0010:dql_completed+0x26b/0x290 Code: b7 c2 49 89 e9 44 89 da 89 c6 4c 89 d7 e8 ed 17 47 00 58 65 ff 0d 4d 27 90 7e 0f 85 fd fe ff ff e8 ea 53 8d ff e9 f3 fe ff ff <0f> 0b 01 d2 44 89 d1 29 d1 ba 00 00 00 00 0f 48 ca e9 28 ff ff ff RSP: 0018:ffffc900002b0d08 EFLAGS: 00010297 RAX: 0000000000000000 RBX: ffff888102398c80 RCX: 0000000080190009 RDX: 0000000000000000 RSI: 000000000000006a RDI: 0000000000000000 RBP: ffff888102398c00 R08: 0000000000000000 R09: 0000000000000000 R10: 00000000000000ca R11: 0000000000015681 R12: 0000000000000001 R13: ffffc900002b0d68 R14: ffff88811115e000 R15: ffff8881107aca40 FS: 00007f41ded69500(0000) GS:ffff888667dc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000556ccc2dc1a0 CR3: 0000000104fd8003 CR4: 0000000000772ef0 PKRU: 55555554 Call Trace: <IRQ> ? die+0x32/0x80 ? do_trap+0xd9/0x100 ? dql_completed+0x26b/0x290 ? dql_completed+0x26b/0x290 ? do_error_trap+0x6d/0xb0 ? dql_completed+0x26b/0x290 ? exc_invalid_op+0x4c/0x60 ? dql_completed+0x26b/0x290 ? asm_exc_invalid_op+0x16/0x20 ? dql_completed+0x26b/0x290 __free_old_xmit+0xff/0x170 [virtio_net] free_old_xmit+0x54/0xc0 [virtio_net] virtnet_poll+0xf4/0xe30 [virtio_net] ? __update_load_avg_cfs_rq+0x264/0x2d0 ? update_curr+0x35/0x260 ? reweight_entity+0x1be/0x260 __napi_poll.constprop.0+0x28/0x1c0 net_rx_action+0x329/0x420 ? enqueue_hrtimer+0x35/0x90 ? trace_hardirqs_on+0x1d/0x80 ? kvm_sched_clock_read+0xd/0x20 ? sched_clock+0xc/0x30 ? kvm_sched_clock_read+0xd/0x20 ? sched_clock+0xc/0x30 ? sched_clock_cpu+0xd/0x1a0 handle_softirqs+0x138/0x3e0 do_softirq.part.0+0x89/0xc0 </IRQ> <TASK> __local_bh_enable_ip+0xa7/0xb0 virtnet_open+0xc8/0x310 [virtio_net] __dev_open+0xfa/0x1b0 __dev_change_flags+0x1de/0x250 dev_change_flags+0x22/0x60 do_setlink.isra.0+0x2df/0x10b0 ? rtnetlink_rcv_msg+0x34f/0x3f0 ? netlink_rcv_skb+0x54/0x100 ? netlink_unicast+0x23e/0x390 ? netlink_sendmsg+0x21e/0x490 ? ____sys_sendmsg+0x31b/0x350 ? avc_has_perm_noaudit+0x67/0xf0 ? cred_has_capability.isra.0+0x75/0x110 ? __nla_validate_parse+0x5f/0xee0 ? __pfx___probestub_irq_enable+0x3/0x10 ? __create_object+0x5e/0x90 ? security_capable+0x3b/0x7[I0 rtnl_newlink+0x784/0xaf0 ? avc_has_perm_noaudit+0x67/0xf0 ? cred_has_capability.isra.0+0x75/0x110 ? stack_depot_save_flags+0x24/0x6d0 ? __pfx_rtnl_newlink+0x10/0x10 rtnetlink_rcv_msg+0x34f/0x3f0 ? do_syscall_64+0x6c/0x180 ? entry_SYSCALL_64_after_hwframe+0x76/0x7e ? __pfx_rtnetlink_rcv_msg+0x10/0x10 netlink_rcv_skb+0x54/0x100 netlink_unicast+0x23e/0x390 netlink_sendmsg+0x21e/0x490 ____sys_sendmsg+0x31b/0x350 ? copy_msghdr_from_user+0x6d/0xa0 ___sys_sendmsg+0x86/0xd0 ? __pte_offset_map+0x17/0x160 ? preempt_count_add+0x69/0xa0 ? __call_rcu_common.constprop.0+0x147/0x610 ? preempt_count_add+0x69/0xa0 ? preempt_count_add+0x69/0xa0 ? _raw_spin_trylock+0x13/0x60 ? trace_hardirqs_on+0x1d/0x80 __sys_sendmsg+0x66/0xc0 do_syscall_64+0x6c/0x180 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7f41defe5b34 Code: 15 e1 12 0f 00 f7 d8 64 89 02 b8 ff ff ff ff eb bf 0f 1f 44 00 00 f3 0f 1e fa 80 3d 35 95 0f 00 00 74 13 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 4c c3 0f 1f 00 55 48 89 e5 48 83 ec 20 89 55 RSP: 002b:00007ffe5336ecc8 EFLAGS: 00000202 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f41defe5b34 RDX: 0000000000000000 RSI: 00007ffe5336ed30 RDI: 0000000000000003 RBP: 00007ffe5336eda0 R08: 0000000000000010 R09: 0000000000000001 R10: 00007ffe5336f6f9 R11: 0000000000000202 R12: 0000000000000003 R13: 0000000067452259 R14: 0000556ccc28b040 R15: 0000000000000000 </TASK> [...] ==================== Link: https://patch.msgid.link/20241206011047.923923-1-koichiro.den@canonical.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-12-10 11:23:56 +01:00
Koichiro Den	76a771ec4c	virtio_net: ensure netdev_tx_reset_queue is called on bind xsk for tx virtnet_sq_bind_xsk_pool() flushes tx skbs and then resets tx queue, so DQL counters need to be reset when flushing has actually occurred, Add virtnet_sq_free_unused_buf_done() as a callback for virtqueue_resize() to handle this. Fixes: `21a4e3ce6d` ("virtio_net: xsk: bind/unbind xsk for tx") Signed-off-by: Koichiro Den <koichiro.den@canonical.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-12-10 11:22:21 +01:00
Koichiro Den	8d2da07c81	virtio_ring: add a func argument 'recycle_done' to virtqueue_reset() When virtqueue_reset() has actually recycled all unused buffers, additional work may be required in some cases. Relying solely on its return status is fragile, so introduce a new function argument 'recycle_done', which is invoked when it really occurs. Signed-off-by: Koichiro Den <koichiro.den@canonical.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-12-10 11:22:21 +01:00
Koichiro Den	1480f0f61b	virtio_net: ensure netdev_tx_reset_queue is called on tx ring resize virtnet_tx_resize() flushes remaining tx skbs, requiring DQL counters to be reset when flushing has actually occurred. Add virtnet_sq_free_unused_buf_done() as a callback for virtqueue_reset() to handle this. Fixes: `c8bd1f7f3e` ("virtio_net: add support for Byte Queue Limits") Cc: <stable@vger.kernel.org> # v6.11+ Signed-off-by: Koichiro Den <koichiro.den@canonical.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-12-10 11:22:21 +01:00
Koichiro Den	8d6712c892	virtio_ring: add a func argument 'recycle_done' to virtqueue_resize() When virtqueue_resize() has actually recycled all unused buffers, additional work may be required in some cases. Relying solely on its return status is fragile, so introduce a new function argument 'recycle_done', which is invoked when the recycle really occurs. Cc: <stable@vger.kernel.org> # v6.11+ Signed-off-by: Koichiro Den <koichiro.den@canonical.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-12-10 11:22:21 +01:00
Koichiro Den	4571dc7272	virtio_net: replace vq2rxq with vq2txq where appropriate While not harmful, using vq2rxq where it's always sq appears odd. Replace it with the more appropriate vq2txq for clarity and correctness. Fixes: `89f86675cb` ("virtio_net: xsk: tx: support xmit xsk buffer") Signed-off-by: Koichiro Den <koichiro.den@canonical.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-12-10 11:22:21 +01:00
Koichiro Den	3ddccbefeb	virtio_net: correct netdev_tx_reset_queue() invocation point When virtnet_close is followed by virtnet_open, some TX completions can possibly remain unconsumed, until they are finally processed during the first NAPI poll after the netdev_tx_reset_queue(), resulting in a crash [1]. Commit `b96ed2c97c` ("virtio_net: move netdev_tx_reset_queue() call before RX napi enable") was not sufficient to eliminate all BQL crash cases for virtio-net. This issue can be reproduced with the latest net-next master by running: `while :; do ip l set DEV down; ip l set DEV up; done` under heavy network TX load from inside the machine. netdev_tx_reset_queue() can actually be dropped from virtnet_open path; the device is not stopped in any case. For BQL core part, it's just like traffic nearly ceases to exist for some period. For stall detector added to BQL, even if virtnet_close could somehow lead to some TX completions delayed for long, followed by virtnet_open, we can just take it as stall as mentioned in commit `6025b9135f` ("net: dqs: add NIC stall detector based on BQL"). Note also that users can still reset stall_max via sysfs. So, drop netdev_tx_reset_queue() from virtnet_enable_queue_pair(). This eliminates the BQL crashes. As a result, netdev_tx_reset_queue() is now explicitly required in freeze/restore path. This patch adds it to immediately after free_unused_bufs(), following the rule of thumb: netdev_tx_reset_queue() should follow any SKB freeing not followed by netdev_tx_completed_queue(). This seems the most consistent and streamlined approach, and now netdev_tx_reset_queue() runs whenever free_unused_bufs() is done. [1]: ------------[ cut here ]------------ kernel BUG at lib/dynamic_queue_limits.c:99! Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI CPU: 7 UID: 0 PID: 1598 Comm: ip Tainted: G N 6.12.0net-next_main+ #2 Tainted: [N]=TEST Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), \ BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014 RIP: 0010:dql_completed+0x26b/0x290 Code: b7 c2 49 89 e9 44 89 da 89 c6 4c 89 d7 e8 ed 17 47 00 58 65 ff 0d 4d 27 90 7e 0f 85 fd fe ff ff e8 ea 53 8d ff e9 f3 fe ff ff <0f> 0b 01 d2 44 89 d1 29 d1 ba 00 00 00 00 0f 48 ca e9 28 ff ff ff RSP: 0018:ffffc900002b0d08 EFLAGS: 00010297 RAX: 0000000000000000 RBX: ffff888102398c80 RCX: 0000000080190009 RDX: 0000000000000000 RSI: 000000000000006a RDI: 0000000000000000 RBP: ffff888102398c00 R08: 0000000000000000 R09: 0000000000000000 R10: 00000000000000ca R11: 0000000000015681 R12: 0000000000000001 R13: ffffc900002b0d68 R14: ffff88811115e000 R15: ffff8881107aca40 FS: 00007f41ded69500(0000) GS:ffff888667dc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000556ccc2dc1a0 CR3: 0000000104fd8003 CR4: 0000000000772ef0 PKRU: 55555554 Call Trace: <IRQ> ? die+0x32/0x80 ? do_trap+0xd9/0x100 ? dql_completed+0x26b/0x290 ? dql_completed+0x26b/0x290 ? do_error_trap+0x6d/0xb0 ? dql_completed+0x26b/0x290 ? exc_invalid_op+0x4c/0x60 ? dql_completed+0x26b/0x290 ? asm_exc_invalid_op+0x16/0x20 ? dql_completed+0x26b/0x290 __free_old_xmit+0xff/0x170 [virtio_net] free_old_xmit+0x54/0xc0 [virtio_net] virtnet_poll+0xf4/0xe30 [virtio_net] ? __update_load_avg_cfs_rq+0x264/0x2d0 ? update_curr+0x35/0x260 ? reweight_entity+0x1be/0x260 __napi_poll.constprop.0+0x28/0x1c0 net_rx_action+0x329/0x420 ? enqueue_hrtimer+0x35/0x90 ? trace_hardirqs_on+0x1d/0x80 ? kvm_sched_clock_read+0xd/0x20 ? sched_clock+0xc/0x30 ? kvm_sched_clock_read+0xd/0x20 ? sched_clock+0xc/0x30 ? sched_clock_cpu+0xd/0x1a0 handle_softirqs+0x138/0x3e0 do_softirq.part.0+0x89/0xc0 </IRQ> <TASK> __local_bh_enable_ip+0xa7/0xb0 virtnet_open+0xc8/0x310 [virtio_net] __dev_open+0xfa/0x1b0 __dev_change_flags+0x1de/0x250 dev_change_flags+0x22/0x60 do_setlink.isra.0+0x2df/0x10b0 ? rtnetlink_rcv_msg+0x34f/0x3f0 ? netlink_rcv_skb+0x54/0x100 ? netlink_unicast+0x23e/0x390 ? netlink_sendmsg+0x21e/0x490 ? ____sys_sendmsg+0x31b/0x350 ? avc_has_perm_noaudit+0x67/0xf0 ? cred_has_capability.isra.0+0x75/0x110 ? __nla_validate_parse+0x5f/0xee0 ? __pfx___probestub_irq_enable+0x3/0x10 ? __create_object+0x5e/0x90 ? security_capable+0x3b/0x70 rtnl_newlink+0x784/0xaf0 ? avc_has_perm_noaudit+0x67/0xf0 ? cred_has_capability.isra.0+0x75/0x110 ? stack_depot_save_flags+0x24/0x6d0 ? __pfx_rtnl_newlink+0x10/0x10 rtnetlink_rcv_msg+0x34f/0x3f0 ? do_syscall_64+0x6c/0x180 ? entry_SYSCALL_64_after_hwframe+0x76/0x7e ? __pfx_rtnetlink_rcv_msg+0x10/0x10 netlink_rcv_skb+0x54/0x100 netlink_unicast+0x23e/0x390 netlink_sendmsg+0x21e/0x490 ____sys_sendmsg+0x31b/0x350 ? copy_msghdr_from_user+0x6d/0xa0 ___sys_sendmsg+0x86/0xd0 ? __pte_offset_map+0x17/0x160 ? preempt_count_add+0x69/0xa0 ? __call_rcu_common.constprop.0+0x147/0x610 ? preempt_count_add+0x69/0xa0 ? preempt_count_add+0x69/0xa0 ? _raw_spin_trylock+0x13/0x60 ? trace_hardirqs_on+0x1d/0x80 __sys_sendmsg+0x66/0xc0 do_syscall_64+0x6c/0x180 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7f41defe5b34 Code: 15 e1 12 0f 00 f7 d8 64 89 02 b8 ff ff ff ff eb bf 0f 1f 44 00 00 f3 0f 1e fa 80 3d 35 95 0f 00 00 74 13 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 4c c3 0f 1f 00 55 48 89 e5 48 83 ec 20 89 55 RSP: 002b:00007ffe5336ecc8 EFLAGS: 00000202 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f41defe5b34 RDX: 0000000000000000 RSI: 00007ffe5336ed30 RDI: 0000000000000003 RBP: 00007ffe5336eda0 R08: 0000000000000010 R09: 0000000000000001 R10: 00007ffe5336f6f9 R11: 0000000000000202 R12: 0000000000000003 R13: 0000000067452259 R14: 0000556ccc28b040 R15: 0000000000000000 </TASK> [...] Fixes: `c8bd1f7f3e` ("virtio_net: add support for Byte Queue Limits") Cc: <stable@vger.kernel.org> # v6.11+ Signed-off-by: Koichiro Den <koichiro.den@canonical.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> [ pabeni: trimmed possibly troublesome separator ] Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-12-10 11:20:50 +01:00
Arnd Bergmann	efb113fc30	drm: rework FB_CORE dependency The 'select FB_CORE' statement moved from CONFIG_DRM to DRM_CLIENT_LIB, but there are now configurations that have code calling into fb_core as built-in even though the client_lib itself is a loadable module: x86_64-linux-ld: drivers/gpu/drm/drm_fbdev_shmem.o: in function `drm_fbdev_shmem_driver_fbdev_probe': drm_fbdev_shmem.c:(.text+0x1fc): undefined reference to `fb_deferred_io_init' x86_64-linux-ld: drivers/gpu/drm/drm_fbdev_shmem.o: in function `drm_fbdev_shmem_fb_destroy': drm_fbdev_shmem.c:(.text+0x2e1): undefined reference to `fb_deferred_io_cleanup' In addition to DRM_CLIENT_LIB, the 'select' needs to be at least in two more parts, DRM_KMS_HELPER and DRM_GEM_SHMEM_HELPER, so add those here. v3: - Remove FB_CORE from DRM_KMS_HELPER to avoid circular dependency Fixes: `dadd28d414` ("drm/client: Add client-lib module") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20241115162323.3555229-1-arnd@kernel.org	2024-12-10 11:11:39 +01:00
Alan Borzeszkowski	0bb18e34ab	gpio: graniterapids: Fix GPIO Ack functionality Interrupt status (GPI_IS) register is cleared by writing 1 to it, not 0. Cc: stable@vger.kernel.org Signed-off-by: Alan Borzeszkowski <alan.borzeszkowski@linux.intel.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Acked-by: Andy Shevchenko <andy@kernel.org> Link: https://lore.kernel.org/r/20241204070415.1034449-8-mika.westerberg@linux.intel.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>	2024-12-10 10:35:13 +01:00
Alan Borzeszkowski	c0ec4890d6	gpio: graniterapids: Check if GPIO line can be used for IRQs GPIO line can only be used as interrupt if its INTSEL register is programmed by the BIOS. Cc: stable@vger.kernel.org Signed-off-by: Alan Borzeszkowski <alan.borzeszkowski@linux.intel.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Acked-by: Andy Shevchenko <andy@kernel.org> Link: https://lore.kernel.org/r/20241204070415.1034449-7-mika.westerberg@linux.intel.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>	2024-12-10 10:35:13 +01:00
Alan Borzeszkowski	0588504d28	gpio: graniterapids: Determine if GPIO pad can be used by driver Add check of HOSTSW_MODE bit to determine if GPIO pad can be used by the driver. Cc: stable@vger.kernel.org Signed-off-by: Alan Borzeszkowski <alan.borzeszkowski@linux.intel.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Acked-by: Andy Shevchenko <andy@kernel.org> Link: https://lore.kernel.org/r/20241204070415.1034449-6-mika.westerberg@linux.intel.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>	2024-12-10 10:35:13 +01:00
Shankar Bandal	15636b00a0	gpio: graniterapids: Fix invalid RXEVCFG register bitmask Correct RX Level/Edge Configuration register (RXEVCFG) bitmask. Cc: stable@vger.kernel.org Signed-off-by: Shankar Bandal <shankar.bandal@intel.com> Signed-off-by: Alan Borzeszkowski <alan.borzeszkowski@linux.intel.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Acked-by: Andy Shevchenko <andy@kernel.org> Link: https://lore.kernel.org/r/20241204070415.1034449-5-mika.westerberg@linux.intel.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>	2024-12-10 10:35:13 +01:00
Shankar Bandal	0fe329b552	gpio: graniterapids: Fix invalid GPI_IS register offset Update GPI Interrupt Status register offset to correct value. Cc: stable@vger.kernel.org Signed-off-by: Shankar Bandal <shankar.bandal@intel.com> Signed-off-by: Alan Borzeszkowski <alan.borzeszkowski@linux.intel.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Acked-by: Andy Shevchenko <andy@kernel.org> Link: https://lore.kernel.org/r/20241204070415.1034449-4-mika.westerberg@linux.intel.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>	2024-12-10 10:35:13 +01:00
Alan Borzeszkowski	7382d2f0e8	gpio: graniterapids: Fix incorrect BAR assignment Base Address of vGPIO MMIO register is provided directly by the BIOS instead of using offsets. Update address assignment to reflect this change in driver. Cc: stable@vger.kernel.org Signed-off-by: Alan Borzeszkowski <alan.borzeszkowski@linux.intel.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Acked-by: Andy Shevchenko <andy@kernel.org> Link: https://lore.kernel.org/r/20241204070415.1034449-3-mika.westerberg@linux.intel.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>	2024-12-10 10:35:12 +01:00
Alan Borzeszkowski	eb9640fd1c	gpio: graniterapids: Fix vGPIO driver crash Move setting irq_chip.name from probe() function to the initialization of "irq_chip" struct in order to fix vGPIO driver crash during bootup. Crash was caused by unauthorized modification of irq_chip.name field where irq_chip struct was initialized as const. This behavior is a consequence of suboptimal implementation of gpio_irq_chip_set_chip(), which should be changed to avoid casting away const qualifier. Crash log: BUG: unable to handle page fault for address: ffffffffc0ba81c0 /#PF: supervisor write access in kernel mode /#PF: error_code(0x0003) - permissions violation CPU: 33 UID: 0 PID: 1075 Comm: systemd-udevd Not tainted 6.12.0-rc6-00077-g2e1b3cc9d7f7 #1 Hardware name: Intel Corporation Kaseyville RP/Kaseyville RP, BIOS KVLDCRB1.PGS.0026.D73.2410081258 10/08/2024 RIP: 0010:gnr_gpio_probe+0x171/0x220 [gpio_graniterapids] Cc: stable@vger.kernel.org Signed-off-by: Alan Borzeszkowski <alan.borzeszkowski@linux.intel.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Acked-by: Andy Shevchenko <andy@kernel.org> Link: https://lore.kernel.org/r/20241204070415.1034449-2-mika.westerberg@linux.intel.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>	2024-12-10 10:35:12 +01:00
Jason Gunthorpe	91da87d38c	iommu/amd: Add lockdep asserts for domain->dev_list Add an assertion to all the iteration points that don't obviously have the lock held already. These all take the locker higher in their call chains. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/2-v1-3b9edcf8067d+3975-amd_dev_list_locking_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2024-12-10 10:12:06 +01:00
Jason Gunthorpe	4a552f7890	iommu/amd: Put list_add/del(dev_data) back under the domain->lock The list domain->dev_list is protected by the domain->lock spinlock. Any iteration, addition or removal must be under the lock. Move the list_del() up into the critical section. pdom_is_sva_capable(), and destroy_gcr3_table() do not interact with the list element. Wrap the list_add() in a lock, it would make more sense if this was under the same critical section as adjusting the refcounts earlier, but that requires more complications. Fixes: `d6b47dec36` ("iommu/amd: Reduce domain lock scope in attach device path") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/1-v1-3b9edcf8067d+3975-amd_dev_list_locking_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2024-12-10 10:12:05 +01:00
Geetha sowjanya	af47a328e8	octeontx2-af: Fix installation of PF multicast rule Due to target variable is being reassigned in npc_install_flow() function, PF multicast rules are not getting installed. This patch addresses the issue by fixing the "IF" condition checks when rules are installed by AF. Fixes: `6c40ca957f` ("octeontx2-pf: Adds TC offload support"). Signed-off-by: Geetha sowjanya <gakula@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241205113435.10601-1-gakula@marvell.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-12-10 10:11:07 +01:00
Anumula Murali Mohan Reddy	a4048c83fd	RDMA/core: Fix ENODEV error for iWARP test over vlan If traffic is over vlan, cma_validate_port() fails to match net_device ifindex with bound_if_index and results in ENODEV error. As iWARP gid table is static, it contains entry corresponding to only one net device which is either real netdev or vlan netdev for cases like siw attached to a vlan interface. This patch fixes the issue by assigning bound_if_index with net device index, if real net device obtained from bound if index matches with net device retrieved from gid table Fixes: `f8ef1be816` ("RDMA/cma: Avoid GID lookups on iWARP devices") Link: https://lore.kernel.org/all/ZzNgdrjo1kSCGbRz@chelsio.com/ Signed-off-by: Anumula Murali Mohan Reddy <anumula@chelsio.com> Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Link: https://patch.msgid.link/20241203140052.3985-1-anumula@chelsio.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-12-10 03:58:03 -05:00
Shakeel Butt	b7ffecbe19	memcg: slub: fix SUnreclaim for post charged objects Large kmalloc directly allocates from the page allocator and then use lruvec_stat_mod_folio() to increment the unreclaimable slab stats for global and memcg. However when post memcg charging of slab objects was added in commit `9028cdeb38` ("memcg: add charging of already allocated slab objects"), it missed to correctly handle the unreclaimable slab stats for memcg. One user visisble effect of that bug is that the node level unreclaimable slab stat will work correctly but the memcg level stat can underflow as kernel correctly handles the free path but the charge path missed to increment the memcg level unreclaimable slab stat. Let's fix by correctly handle in the post charge code path. Fixes: `9028cdeb38` ("memcg: add charging of already allocated slab objects") Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev> Cc: <stable@vger.kernel.org> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>	2024-12-10 09:25:39 +01:00
Mika Westerberg	8644b48714	thunderbolt: Add support for Intel Panther Lake-M/P Intel Panther Lake-M/P has the same integrated Thunderbolt/USB4 controller as Lunar Lake. Add these PCI IDs to the driver list of supported devices. Cc: stable@vger.kernel.org Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>	2024-12-10 08:02:17 +02:00
Chenghai Huang	cd26cd6547	crypto: hisilicon/debugfs - fix the struct pointer incorrectly offset problem Offset based on (id * size) is wrong for sqc and cqc. (sqc/cqc + 1) can already offset sizeof(struct(Xqc)) length. Fixes: `15f112f9ce` ("crypto: hisilicon/debugfs - mask the unnecessary info from the dump") Cc: <stable@vger.kernel.org> Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2024-12-10 13:40:25 +08:00
Herbert Xu	8552cb04e0	crypto: rsassa-pkcs1 - Copy source data for SG list As virtual addresses in general may not be suitable for DMA, always perform a copy before using them in an SG list. Fixes: `1e562deace` ("crypto: rsassa-pkcs1 - Migrate to sig_alg backend") Reported-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2024-12-10 13:34:05 +08:00
K Prateek Nayak	919bfa9b2d	cpufreq/amd-pstate: Detect preferred core support before driver registration Booting with amd-pstate on 3rd Generation EPYC system incorrectly enabled ITMT support despite the system not supporting Preferred Core ranking. amd_pstate_init_prefcore() called during amd_pstate_cpu_init() requires "amd_pstate_prefcore" to be set correctly however the preferred core support is detected only after driver registration which is too late. Swap the function calls around to detect preferred core support before registring the driver via amd_pstate_register_driver(). This ensures amd_pstate_cpu_init() sees the correct value of "amd_pstate_prefcore" considering the platform support. Fixes: `279f838a61` ("x86/amd: Detect preferred cores in amd_get_boost_ratio_numerator()") Fixes: `ff2653ded4` ("cpufreq/amd-pstate: Move registration after static function call update") Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com> Acked-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20241210032557.754-1-kprateek.nayak@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>	2024-12-09 21:57:34 -06:00
liuderong	f103396ae3	scsi: ufs: core: Update compl_time_stamp_local_clock after completing a cqe lrbp->compl_time_stamp_local_clock is set to zero after sending a sqe but it is not updated after completing a cqe. Thus the printed information in ufshcd_print_tr() will always be zero. Update lrbp->cmpl_time_stamp_local_clock after completing a cqe. Log sample: ufshcd-qcom 1d84000.ufshc: UPIU[8] - issue time 8750227249 us ufshcd-qcom 1d84000.ufshc: UPIU[8] - complete time 0 us Fixes: `c30d8d010b` ("scsi: ufs: core: Prepare for completion in MCQ") Reviewed-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Peter Wang <peter.wang@mediatek.com> Signed-off-by: liuderong <liuderong@oppo.com> Link: https://lore.kernel.org/r/1733470182-220841-1-git-send-email-liuderong@oppo.com Reviewed-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-12-09 21:49:08 -05:00
Jakub Kicinski	f136552b7c	Merge branch 'qca_spi-fix-spi-specific-issues' Stefan Wahren says: ==================== qca_spi: Fix SPI specific issues This small series address two annoying SPI specific issues of the qca_spi driver. ==================== Link: https://patch.msgid.link/20241206184643.123399-1-wahrenst@gmx.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-09 18:26:53 -08:00
Stefan Wahren	becc6399ce	qca_spi: Make driver probing reliable The module parameter qcaspi_pluggable controls if QCA7000 signature should be checked at driver probe (current default) or not. Unfortunately this could fail in case the chip is temporary in reset, which isn't under total control by the Linux host. So disable this check per default in order to avoid unexpected probe failures. Fixes: `291ab06ecf` ("net: qualcomm: new Ethernet over SPI driver for QCA7000") Signed-off-by: Stefan Wahren <wahrenst@gmx.net> Link: https://patch.msgid.link/20241206184643.123399-3-wahrenst@gmx.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-09 18:26:48 -08:00
Stefan Wahren	4dba406fac	qca_spi: Fix clock speed for multiple QCA7000 Storing the maximum clock speed in module parameter qcaspi_clkspeed has the unintended side effect that the first probed instance defines the value for all other instances. Fix this issue by storing it in max_speed_hz of the relevant SPI device. This fix keeps the priority of the speed parameter (module parameter, device tree property, driver default). Btw this uses the opportunity to get the rid of the unused member clkspeed. Fixes: `291ab06ecf` ("net: qualcomm: new Ethernet over SPI driver for QCA7000") Signed-off-by: Stefan Wahren <wahrenst@gmx.net> Link: https://patch.msgid.link/20241206184643.123399-2-wahrenst@gmx.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-09 18:26:48 -08:00
Thomas Weißschuh	c28dc9fc24	power: supply: cros_charge-control: hide start threshold on v2 cmd ECs implementing the v2 command will not stop charging when the end threshold is reached. Instead they will begin discharging until the start threshold is reached, leading to permanent charge and discharge cycles. This defeats the point of the charge control mechanism. Avoid the issue by hiding the start threshold on v2 systems. Instead on those systems program the EC with start == end which forces the EC to reach and stay at that level. v1 does not support thresholds and v3 works correctly, at least judging from the code. Reported-by: Thomas Koch <linrunner@gmx.net> Fixes: `c6ed48ef52` ("power: supply: add ChromeOS EC based charge control driver") Cc: stable@vger.kernel.org Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20241208-cros_charge-control-v2-v1-3-8d168d0f08a3@weissschuh.net Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>	2024-12-10 02:51:45 +01:00
Thomas Weißschuh	e65a1b7fad	power: supply: cros_charge-control: allow start_threshold == end_threshold Allow setting the start and stop thresholds to the same value. There is no reason to disallow it. Suggested-by: Thomas Koch <linrunner@gmx.net> Fixes: `c6ed48ef52` ("power: supply: add ChromeOS EC based charge control driver") Cc: stable@vger.kernel.org Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20241208-cros_charge-control-v2-v1-2-8d168d0f08a3@weissschuh.net Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>	2024-12-10 02:51:45 +01:00
Thomas Weißschuh	e5f84d1cf5	power: supply: cros_charge-control: add mutex for driver data Concurrent accesses through sysfs may lead to inconsistent state in the priv data. Introduce a mutex to avoid this. Fixes: `c6ed48ef52` ("power: supply: add ChromeOS EC based charge control driver") Cc: stable@vger.kernel.org Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20241208-cros_charge-control-v2-v1-1-8d168d0f08a3@weissschuh.net Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>	2024-12-10 02:51:45 +01:00
Dimitri Fedrau	afc6e39e82	power: supply: gpio-charger: Fix set charge current limits Fix set charge current limits for devices which allow to set the lowest charge current limit to be greater zero. If requested charge current limit is below lowest limit, the index equals current_limit_map_size which leads to accessing memory beyond allocated memory. Fixes: `be2919d835` ("power: supply: gpio-charger: add charge-current-limit feature") Cc: stable@vger.kernel.org Signed-off-by: Dimitri Fedrau <dimitri.fedrau@liebherr.com> Link: https://lore.kernel.org/r/20241209-fix-charge-current-limit-v1-1-760d9b8f2af3@liebherr.com Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>	2024-12-10 02:51:24 +01:00
Anumula Murali Mohan Reddy	356983f569	cxgb4: use port number to set mac addr t4_set_vf_mac_acl() uses pf to set mac addr, but t4vf_get_vf_mac_acl() uses port number to get mac addr, this leads to error when an attempt to set MAC address on VF's of PF2 and PF3. This patch fixes the issue by using port number to set mac address. Fixes: `e0cdac65ba` ("cxgb4vf: configure ports accessible by the VF") Signed-off-by: Anumula Murali Mohan Reddy <anumula@chelsio.com> Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241206062014.49414-1-anumula@chelsio.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-09 16:12:43 -08:00
Miguel Ojeda	7a5f93ea58	rust: kbuild: set `bindgen`'s Rust target version Each `bindgen` release may upgrade the list of Rust targets. For instance, currently, in their master branch [1], the latest ones are: Nightly => { vectorcall_abi: #124485, ptr_metadata: #81513, layout_for_ptr: #69835, }, Stable_1_77(77) => { offset_of: #106655 }, Stable_1_73(73) => { thiscall_abi: #42202 }, Stable_1_71(71) => { c_unwind_abi: #106075 }, Stable_1_68(68) => { abi_efiapi: #105795 }, By default, the highest stable release in their list is used, and users are expected to set one if they need to support older Rust versions (e.g. see [2]). Thus, over time, new Rust features are used by default, and at some point, it is likely that `bindgen` will emit Rust code that requires a Rust version higher than our minimum (or perhaps enabling an unstable feature). Currently, there is no problem because the maximum they have, as seen above, is Rust 1.77.0, and our current minimum is Rust 1.78.0. Therefore, set a Rust target explicitly now to prevent going forward in time too much and thus getting potential build failures at some point. Since we also support a minimum `bindgen` version, and since `bindgen` does not support passing unknown Rust target versions, we need to use the list of our minimum `bindgen` version, rather than the latest. So, since `bindgen` 0.65.1 had this list [3], we need to use Rust 1.68.0: /// Rust stable 1.64 /// * `core_ffi_c` ([Tracking issue](https://github.com/rust-lang/rust/issues/94501)) => Stable_1_64 => 1.64; /// Rust stable 1.68 /// * `abi_efiapi` calling convention ([Tracking issue](https://github.com/rust-lang/rust/issues/65815)) => Stable_1_68 => 1.68; /// Nightly rust /// * `thiscall` calling convention ([Tracking issue](https://github.com/rust-lang/rust/issues/42202)) /// * `vectorcall` calling convention (no tracking issue) /// * `c_unwind` calling convention ([Tracking issue](https://github.com/rust-lang/rust/issues/74990)) => Nightly => nightly; ... /// Latest stable release of Rust pub const LATEST_STABLE_RUST: RustTarget = RustTarget::Stable_1_68; Thus add the `--rust-target 1.68` parameter. Add a comment as well explaining this. An alternative would be to use the currently running (i.e. actual) `rustc` and `bindgen` versions to pick a "better" Rust target version. However, that would introduce more moving parts depending on the user setup and is also more complex to implement. Starting with `bindgen` 0.71.0 [4], we will be able to set any future Rust version instead, i.e. we will be able to set here our minimum supported Rust version. Christian implemented it [5] after seeing this patch. Thanks! Cc: Christian Poveda <git@pvdrz.com> Cc: Emilio Cobos Álvarez <emilio@crisal.io> Cc: stable@vger.kernel.org # needed for 6.12.y; unneeded for 6.6.y; do not apply to 6.1.y Fixes: `c844fa64a2` ("rust: start supporting several `bindgen` versions") Link: `21c60f473f/bindgen/features.rs (L97-L105)` [1] Link: https://github.com/rust-lang/rust-bindgen/issues/2960 [2] Link: `7d243056d3/bindgen/features.rs (L131-L150)` [3] Link: https://github.com/rust-lang/rust-bindgen/blob/main/CHANGELOG.md#0710-2024-12-06 [4] Link: https://github.com/rust-lang/rust-bindgen/pull/2993 [5] Reviewed-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20241123180323.255997-1-ojeda@kernel.org Signed-off-by: Miguel Ojeda <ojeda@kernel.org>	2024-12-10 01:06:10 +01:00
Luis Claudio R. Goncalves	1f80621816	iommu/tegra241-cmdqv: do not use smp_processor_id in preemptible context During boot some of the calls to tegra241_cmdqv_get_cmdq() will happen in preemptible context. As this function calls smp_processor_id(), if CONFIG_DEBUG_PREEMPT is enabled, these calls will trigger a series of "BUG: using smp_processor_id() in preemptible" backtraces. As tegra241_cmdqv_get_cmdq() only calls smp_processor_id() to use the CPU number as a factor to balance out traffic on cmdq usage, it is safe to use raw_smp_processor_id() here. Cc: <stable@vger.kernel.org> Fixes: `918eb5c856` ("iommu/arm-smmu-v3: Add in-kernel support for NVIDIA Tegra241 (Grace) CMDQV") Signed-off-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/r/Z1L1mja3nXzsJ0Pk@uudg.org Signed-off-by: Will Deacon <will@kernel.org>	2024-12-09 23:38:30 +00:00
Miguel Ojeda	4011b351b1	drm/panic: remove spurious empty line to clean warning Clippy in the upcoming Rust 1.83.0 spots a spurious empty line since the `clippy::empty_line_after_doc_comments` warning is now enabled by default given it is part of the `suspicious` group [1]: error: empty line after doc comment --> drivers/gpu/drm/drm_panic_qr.rs:931:1 \| 931 \| / /// They must remain valid for the duration of the function call. 932 \| \| \| \|_ 933 \| #[no_mangle] 934 \| / pub unsafe extern "C" fn drm_panic_qr_generate( 935 \| \| url: const i8, 936 \| \| data: mut u8, 937 \| \| data_len: usize, ... \| 940 \| \| tmp_size: usize, 941 \| \| ) -> u8 { \| \|_______- the comment documents this function \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#empty_line_after_doc_comments = note: `-D clippy::empty-line-after-doc-comments` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(clippy::empty_line_after_doc_comments)]` = help: if the empty line is unintentional remove it Thus remove the empty line. Cc: stable@vger.kernel.org Fixes: `cb5164ac43` ("drm/panic: Add a QR code panic screen") Link: https://github.com/rust-lang/rust-clippy/pull/13091 [1] Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://lore.kernel.org/r/20241125233332.697497-1-ojeda@kernel.org Signed-off-by: Miguel Ojeda <ojeda@kernel.org>	2024-12-10 00:32:38 +01:00
Ian Rogers	d4e17a322a	perf test hwmon_pmu: Fix event file location The temp directory is made and a known fake hwmon PMU created within it. Prior to this fix the events were being incorrectly written to the temp directory rather than the fake PMU directory. This didn't impact the test as the directory fd matched the wrong location, but it doesn't mirror what a hwmon PMU would actually look like. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@kernel.org> Link: https://lore.kernel.org/r/20241206042306.1055913-2-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2024-12-09 15:00:26 -08:00
Ian Rogers	3f61a12b08	perf hwmon_pmu: Use openat rather than dup to refresh directory The hwmon PMU test will make a temp directory, open the directory with O_DIRECTORY then fill it with contents. As the open is before the filling the contents the later fdopendir may reflect the initial empty state, meaning no events are seen. Change to re-open the directory, rather than dup the fd, so the latest contents are seen. Minor tweaks/additions to debug messages. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@kernel.org> Link: https://lore.kernel.org/r/20241206042306.1055913-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2024-12-09 15:00:03 -08:00
Kuan-Wei Chiu	246dfe3dc1	perf ftrace: Fix undefined behavior in cmp_profile_data() The comparison function cmp_profile_data() violates the C standard's requirements for qsort() comparison functions, which mandate symmetry and transitivity: * Symmetry: If x < y, then y > x. * Transitivity: If x < y and y < z, then x < z. When v1 and v2 are equal, the function incorrectly returns 1, breaking symmetry and transitivity. This causes undefined behavior, which can lead to memory corruption in certain versions of glibc [1]. Fix the issue by returning 0 when v1 and v2 are equal, ensuring compliance with the C standard and preventing undefined behavior. Link: https://www.qualys.com/2024/01/30/qsort.txt [1] Fixes: `0f223813ed` ("perf ftrace: Add 'profile' command") Fixes: `74ae366c37` ("perf ftrace profile: Add -s/--sort option") Cc: stable@vger.kernel.org Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: jserv@ccns.ncku.edu.tw Cc: chuang@cs.nycu.edu.tw Link: https://lore.kernel.org/r/20241209134226.1939163-1-visitorckw@gmail.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2024-12-09 13:54:08 -08:00
Steve French	6d44a78063	smb3: fix compiler warning in reparse code utf8s_to_utf16s() specifies pwcs as a wchar_t pointer (whether big endian or little endian is passed in as an additional parm), so to remove a distracting compile warning it needs to be cast as (wchar_t *) in parse_reparse_wsl_symlink() as done by other callers. Fixes: `06a7adf318` ("cifs: Add support for parsing WSL-style symlinks") Reviewed-by: Pali Rohár <pali@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2024-12-09 15:20:58 -06:00
Ilpo Järvinen	7899ca9f3b	ACPI: resource: Fix memory resource type union access In acpi_decode_space() addr->info.mem.caching is checked on main level for any resource type but addr->info.mem is part of union and thus valid only if the resource type is memory range. Move the check inside the preceeding switch/case to only execute it when the union is of correct type. Fixes: `fcb29bbcd5` ("ACPI: Add prefetch decoding to the address space parser") Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://patch.msgid.link/20241202100614.20731-1-ilpo.jarvinen@linux.intel.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2024-12-09 21:30:10 +01:00
Olaf Hering	175c71c2ac	tools/hv: reduce resource usage in hv_kvp_daemon hv_kvp_daemon uses popen(3) and system(3) as convinience helper to launch external helpers. These helpers are invoked via a temporary shell process. There is no need to keep this temporary process around while the helper runs. Replace this temporary shell with the actual helper process via 'exec'. Signed-off-by: Olaf Hering <olaf@aepfle.de> Link: https://lore.kernel.org/linux-hyperv/20241202123520.27812-1-olaf@aepfle.de/ Signed-off-by: Wei Liu <wei.liu@kernel.org>	2024-12-09 18:44:15 +00:00
Olaf Hering	becc7fe329	tools/hv: add a .gitignore file Remove generated files from 'git status' output after 'make -C tools/hv'. Signed-off-by: Olaf Hering <olaf@aepfle.de> Link: https://lore.kernel.org/r/20241202124107.28650-1-olaf@aepfle.de Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20241202124107.28650-1-olaf@aepfle.de>	2024-12-09 18:44:15 +00:00
Olaf Hering	a4d024fe2e	tools/hv: reduce resouce usage in hv_get_dns_info helper Remove the usage of cat. Replace the shell process with awk with 'exec'. Also use a generic shell because no bash specific features will be used. Signed-off-by: Olaf Hering <olaf@aepfle.de> Acked-by: Wei Liu <wei.liu@kernel.org> Link: https://lore.kernel.org/r/20241202120432.21115-1-olaf@aepfle.de Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20241202120432.21115-1-olaf@aepfle.de>	2024-12-09 18:44:15 +00:00
Vitaly Kuznetsov	07dfa6e821	hv/hv_kvp_daemon: Pass NIC name to hv_get_dns_info as well The reference implementation of hv_get_dns_info which is in the tree uses /etc/resolv.conf to get DNS servers and this does not require to know which NIC is queried. Distro specific implementations, however, may want to provide per-NIC, fine grained information. E.g. NetworkManager keeps track of DNS servers per connection. Similar to hv_get_dhcp_info, pass NIC name as a parameter to hv_get_dns_info script. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Link: https://lore.kernel.org/r/20241112150401.217094-1-vkuznets@redhat.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20241112150401.217094-1-vkuznets@redhat.com>	2024-12-09 18:44:15 +00:00
Michael Kelley	07a756a49f	Drivers: hv: util: Avoid accessing a ringbuffer not initialized yet If the KVP (or VSS) daemon starts before the VMBus channel's ringbuffer is fully initialized, we can hit the panic below: hv_utils: Registering HyperV Utility Driver hv_vmbus: registering driver hv_utils ... BUG: kernel NULL pointer dereference, address: 0000000000000000 CPU: 44 UID: 0 PID: 2552 Comm: hv_kvp_daemon Tainted: G E 6.11.0-rc3+ #1 RIP: 0010:hv_pkt_iter_first+0x12/0xd0 Call Trace: ... vmbus_recvpacket hv_kvp_onchannelcallback vmbus_on_event tasklet_action_common tasklet_action handle_softirqs irq_exit_rcu sysvec_hyperv_stimer0 </IRQ> <TASK> asm_sysvec_hyperv_stimer0 ... kvp_register_done hvt_op_read vfs_read ksys_read __x64_sys_read This can happen because the KVP/VSS channel callback can be invoked even before the channel is fully opened: 1) as soon as hv_kvp_init() -> hvutil_transport_init() creates /dev/vmbus/hv_kvp, the kvp daemon can open the device file immediately and register itself to the driver by writing a message KVP_OP_REGISTER1 to the file (which is handled by kvp_on_msg() ->kvp_handle_handshake()) and reading the file for the driver's response, which is handled by hvt_op_read(), which calls hvt->on_read(), i.e. kvp_register_done(). 2) the problem with kvp_register_done() is that it can cause the channel callback to be called even before the channel is fully opened, and when the channel callback is starting to run, util_probe()-> vmbus_open() may have not initialized the ringbuffer yet, so the callback can hit the panic of NULL pointer dereference. To reproduce the panic consistently, we can add a "ssleep(10)" for KVP in __vmbus_open(), just before the first hv_ringbuffer_init(), and then we unload and reload the driver hv_utils, and run the daemon manually within the 10 seconds. Fix the panic by reordering the steps in util_probe() so the char dev entry used by the KVP or VSS daemon is not created until after vmbus_open() has completed. This reordering prevents the race condition from happening. Reported-by: Dexuan Cui <decui@microsoft.com> Fixes: `e0fa3e5e7d` ("Drivers: hv: utils: fix a race on userspace daemons registration") Cc: stable@vger.kernel.org Signed-off-by: Michael Kelley <mhklinux@outlook.com> Acked-by: Wei Liu <wei.liu@kernel.org> Link: https://lore.kernel.org/r/20241106154247.2271-3-mhklinux@outlook.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20241106154247.2271-3-mhklinux@outlook.com>	2024-12-09 18:44:15 +00:00
Michael Kelley	96e052d147	Drivers: hv: util: Don't force error code to ENODEV in util_probe() If the util_init function call in util_probe() returns an error code, util_probe() always return ENODEV, and the error code from the util_init function is lost. The error message output in the caller, vmbus_probe(), doesn't show the real error code. Fix this by just returning the error code from the util_init function. There doesn't seem to be a reason to force ENODEV, as other errors such as ENOMEM can already be returned from util_probe(). And the code in call_driver_probe() implies that ENODEV should mean that a matching driver wasn't found, which is not the case here. Suggested-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Michael Kelley <mhklinux@outlook.com> Acked-by: Wei Liu <wei.liu@kernel.org> Link: https://lore.kernel.org/r/20241106154247.2271-2-mhklinux@outlook.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20241106154247.2271-2-mhklinux@outlook.com>	2024-12-09 18:44:14 +00:00
Olaf Hering	a9640fcdd4	tools/hv: terminate fcopy daemon if read from uio fails Terminate endless loop in reading fails, to avoid flooding syslog. This happens if the state of "Guest services" integration service is changed from "enabled" to "disabled" at runtime in the VM settings. In this case pread returns EIO. Also handle an interrupted system call, and continue in this case. Signed-off-by: Olaf Hering <olaf@aepfle.de> Reviewed-by: Saurabh Sengar <ssengar@linux.microsoft.com> Link: https://lore.kernel.org/r/20241105081437.15689-1-olaf@aepfle.de Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20241105081437.15689-1-olaf@aepfle.de>	2024-12-09 18:44:14 +00:00
Easwar Hariharan	67b5e1042d	drivers: hv: Convert open-coded timeouts to secs_to_jiffies() We have several places where timeouts are open-coded as N (seconds) * HZ, but best practice is to use the utility functions from jiffies.h. Convert the timeouts to be compliant. This doesn't fix any bugs, it's a simple code improvement. Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/20241030-open-coded-timeouts-v3-2-9ba123facf88@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20241030-open-coded-timeouts-v3-2-9ba123facf88@linux.microsoft.com>	2024-12-09 18:44:14 +00:00
Olaf Hering	91ae69c7ed	tools: hv: change permissions of NetworkManager configuration file Align permissions of the resulting .nmconnection file, instead of the input file from hv_kvp_daemon. To avoid the tiny time frame where the output file is world-readable, use umask instead of chmod. Fixes: `42999c9046` ("hv/hv_kvp_daemon:Support for keyfile based connection profile") Signed-off-by: Olaf Hering <olaf@aepfle.de> Reviewed-by: Shradha Gupta <shradhagupta@linux.microsoft.com> Link: https://lore.kernel.org/r/20241016143521.3735-1-olaf@aepfle.de Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20241016143521.3735-1-olaf@aepfle.de>	2024-12-09 18:42:52 +00:00
Naman Jain	bcc80dec91	x86/hyperv: Fix hv tsc page based sched_clock for hibernation read_hv_sched_clock_tsc() assumes that the Hyper-V clock counter is bigger than the variable hv_sched_clock_offset, which is cached during early boot, but depending on the timing this assumption may be false when a hibernated VM starts again (the clock counter starts from 0 again) and is resuming back (Note: hv_init_tsc_clocksource() is not called during hibernation/resume); consequently, read_hv_sched_clock_tsc() may return a negative integer (which is interpreted as a huge positive integer since the return type is u64) and new kernel messages are prefixed with huge timestamps before read_hv_sched_clock_tsc() grows big enough (which typically takes several seconds). Fix the issue by saving the Hyper-V clock counter just before the suspend, and using it to correct the hv_sched_clock_offset in resume. This makes hv tsc page based sched_clock continuous and ensures that post resume, it starts from where it left off during suspend. Override x86_platform.save_sched_clock_state and x86_platform.restore_sched_clock_state routines to correct this as soon as possible. Note: if Invariant TSC is available, the issue doesn't happen because 1) we don't register read_hv_sched_clock_tsc() for sched clock: See commit `e5313f1c54` ("clocksource/drivers/hyper-v: Rework clocksource and sched clock setup"); 2) the common x86 code adjusts TSC similarly: see __restore_processor_state() -> tsc_verify_tsc_adjust(true) and x86_platform.restore_sched_clock_state(). Cc: stable@vger.kernel.org Fixes: `1349401ff1` ("clocksource/drivers/hyper-v: Suspend/resume Hyper-V clocksource for hibernation") Co-developed-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Naman Jain <namjain@linux.microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/20240917053917.76787-1-namjain@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20240917053917.76787-1-namjain@linux.microsoft.com>	2024-12-09 18:42:42 +00:00
Dexuan Cui	cb1b78f1c7	tools: hv: Fix a complier warning in the fcopy uio daemon hv_fcopy_uio_daemon.c:436:53: warning: '%s' directive output may be truncated writing up to 14 bytes into a region of size 10 [-Wformat-truncation=] 436 \| snprintf(uio_dev_path, sizeof(uio_dev_path), "/dev/%s", uio_name); Also added 'static' for the array 'desc[]'. Fixes: `82b0945ce2` ("tools: hv: Add new fcopy application based on uio driver") Cc: stable@vger.kernel.org # 6.10+ Signed-off-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Saurabh Sengar <ssengar@linux.microsoft.com> Link: https://lore.kernel.org/r/20240910004433.50254-1-decui@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20240910004433.50254-1-decui@microsoft.com>	2024-12-09 18:42:42 +00:00
Linus Torvalds	7cb1b46631	Merge tag 'locking_urgent_for_v6.13_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Borislav Petkov: - Remove if_not_guard() as it is generating incorrect code - Fix the initialization of the fake lockdep_map for the first locked ww_mutex * tag 'locking_urgent_for_v6.13_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: headers/cleanup.h: Remove the if_not_guard() facility locking/ww_mutex: Fix ww_mutex dummy lockdep map selftest warnings	2024-12-09 10:34:41 -08:00
Linus Torvalds	e4c995f92b	Merge tag 'perf_urgent_for_v6.13_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 perf fixes from Borislav Petkov: - Make sure the PEBS buffer is drained before reconfiguring the hardware - Add Arrow Lake U support * tag 'perf_urgent_for_v6.13_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel/ds: Unconditionally drain PEBS DS when changing PEBS_DATA_CFG perf/x86/intel: Add Arrow Lake U support	2024-12-09 10:31:35 -08:00
Linus Torvalds	df9e2102de	Merge tag 'sched_urgent_for_v6.13_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Borislav Petkov: - Remove wrong enqueueing of a task for a later wakeup when a task blocks on a RT mutex - Do not setup a new deadline entity on a boosted task as that has happened already - Update preempt= kernel command line param - Prevent needless softirqd wakeups in the idle task's context - Detect the case where the idle load balancer CPU becomes busy and avoid unnecessary load balancing invocation - Remove an unnecessary load balancing need_resched() call in nohz_csd_func() - Allow for raising of SCHED_SOFTIRQ softirq type on RT but retain the warning to catch any other cases - Remove a wrong warning when a cpuset update makes the task affinity no longer a subset of the cpuset * tag 'sched_urgent_for_v6.13_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking: rtmutex: Fix wake_q logic in task_blocks_on_rt_mutex sched/deadline: Fix warning in migrate_enable for boosted tasks sched/core: Update kernel boot parameters for LAZY preempt. sched/core: Prevent wakeup of ksoftirqd during idle load balance sched/fair: Check idle_cpu() before need_resched() to detect ilb CPU turning busy sched/core: Remove the unnecessary need_resched() check in nohz_csd_func() softirq: Allow raising SCHED_SOFTIRQ from SMP-call-function on RT kernel sched: fix warning in sched_setaffinity sched/deadline: Fix replenish_dl_new_period dl_server condition	2024-12-09 10:28:55 -08:00
Damien Le Moal	aeb6893761	x86: Fix build regression with CONFIG_KEXEC_JUMP enabled Build 6.13-rc12 for x86_64 with gcc 14.2.1 fails with the error: ld: vmlinux.o: in function `virtual_mapped': linux/arch/x86/kernel/relocate_kernel_64.S:249:(.text+0x5915b): undefined reference to `saved_context_gdt_desc' when CONFIG_KEXEC_JUMP is enabled. This was introduced by commit `07fa619f2a` ("x86/kexec: Restore GDT on return from ::preserve_context kexec") which introduced a use of saved_context_gdt_desc without a declaration for it. Fix that by including asm/asm-offsets.h where saved_context_gdt_desc is defined (indirectly in include/generated/asm-offsets.h which asm/asm-offsets.h includes). Fixes: `07fa619f2a` ("x86/kexec: Restore GDT on return from ::preserve_context kexec") Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Closes: https://lore.kernel.org/oe-kbuild-all/202411270006.ZyyzpYf8-lkp@intel.com/ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2024-12-09 10:13:28 -08:00
Linus Torvalds	32913f3482	futex: fix user access on powerpc The powerpc user access code is special, and unlike other architectures distinguishes between user access for reading and writing. And commit `43a43faf53` ("futex: improve user space accesses") messed that up. It went undetected elsewhere, but caused ppc32 to fail early during boot, because the user access had been started with user_read_access_begin(), but then finished off with just a plain "user_access_end()". Note that the address-masking user access helpers don't even have that read-vs-write distinction, so if powerpc ever wants to do address masking tricks, we'll have to do some extra work for it. [ Make sure to also do it for the EFAULT case, as pointed out by Christophe Leroy ] Reported-by: Andreas Schwab <schwab@linux-m68k.org> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Link: https://lore.kernel.org/all/87bjxl6b0i.fsf@igel.home/ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2024-12-09 10:00:25 -08:00
Arnd Bergmann	c9bc45b346	Merge tag 'scmi-fix-6.13' of https://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into arm/fixes Arm SCMI fix for v6.13 Fix for the build issue in the ASoC driver with the SCMI support by enforcing the link-time dependency if IMX_SCMI_MISC_DRV is a loadable module but not if that is disabled. * tag 'scmi-fix-6.13' of https://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux: firmware: arm_scmi: Fix i.MX build dependency Link: https://lore.kernel.org/r/20241205114348.708618-1-sudeep.holla@arm.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2024-12-09 16:05:07 +01:00
Arnd Bergmann	90386e1ba4	Merge tag 'juno-fix-6.13' of https://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into arm/fixes Armv8 Juno fix for v6.13 Just a single fix updating the PCIe bus address range to accommodate the full ECAM window of 256MB available on most of the recent versions of RevC FVP models. * tag 'juno-fix-6.13' of https://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux: arm64: dts: fvp: Update PCIe bus-range property Link: https://lore.kernel.org/r/20241205114302.708433-1-sudeep.holla@arm.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2024-12-09 16:04:48 +01:00
David S. Miller	b4906787d4	Merge branch 'net-sparx5-lan969x-fixes' Daniel Machon says: ==================== net: sparx5: misc fixes for sparx5 and lan969x This series fixes various issues in the Sparx5 and lan969x drivers. Most of the fixes are for new issues introduced by the recent series adding lan969x switch support in the Sparx5 driver. Most notable is patch 1/5 that moves the lan969x dir into the sparx5 dir, in order to address a cyclic dependency issue reported by depmod, when installing modules. Details are in the commit descriptions. To: Andrew Lunn <andrew+netdev@lunn.ch> To: David S. Miller <davem@davemloft.net> To: Eric Dumazet <edumazet@google.com> To: Jakub Kicinski <kuba@kernel.org> To: Paolo Abeni <pabeni@redhat.com> To: Lars Povlsen <lars.povlsen@microchip.com> To: Steen Hegelund <Steen.Hegelund@microchip.com> To: UNGLinuxDriver@microchip.com To: Richard Cochran <richardcochran@gmail.com> To: Bjarni Jonasson <bjarni.jonasson@microchip.com> To: jensemil.schulzostergaard@microchip.com To: horatiu.vultur@microchip.com To: arnd@arndb.de To: jacob.e.keller@intel.com To: Parthiban.Veerasooran@microchip.com Cc: Calvin Owens <calvin@wbinvd.org> Cc: Muhammad Usama Anjum <Usama.Anjum@collabora.com> Cc: linux-kernel@vger.kernel.org Cc: netdev@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org ==================== Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-12-09 13:30:17 +00:00
Daniel Machon	ddd7ba0060	net: sparx5: fix the maximum frame length register On port initialization, we configure the maximum frame length accepted by the receive module associated with the port. This value is currently written to the MAX_LEN field of the DEV10G_MAC_ENA_CFG register, when in fact, it should be written to the DEV10G_MAC_MAXLEN_CFG register. Fix this. Fixes: `946e7fd505` ("net: sparx5: add port module support") Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-12-09 13:30:17 +00:00
Daniel Machon	e4d505fda6	net: sparx5: fix default value of monitor ports When doing port mirroring, the physical port to send the frame to, is written to the FRMC_PORT_VAL field of the QFWD_FRAME_COPY_CFG register. This field is 7 bits wide on sparx5 and 6 bits wide on lan969x, and has a default value of 65 and 30, respectively (the number of front ports). On mirror deletion, we set the default value of the monitor port to 65 for this field, in case no more ports exists for the mirror. Needless to say, this will not fit the 6 bits on lan969x. Fix this by correctly using the n_ports constant instead. Fixes: `3f9e46347a` ("net: sparx5: use SPX5_CONST for constants which already have a symbol") Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-12-09 13:30:16 +00:00
Daniel Machon	f004f2e535	net: sparx5: fix FDMA performance issue The FDMA handler is responsible for scheduling a NAPI poll, which will eventually fetch RX packets from the FDMA queue. Currently, the FDMA handler is run in a threaded context. For some reason, this kills performance. Admittedly, I did not do a thorough investigation to see exactly what causes the issue, however, I noticed that in the other driver utilizing the same FDMA engine, we run the FDMA handler in hard IRQ context. Fix this performance issue, by running the FDMA handler in hard IRQ context, not deferring any work to a thread. Prior to this change, the RX UDP performance was: Interval Transfer Bitrate Jitter 0.00-10.20 sec 44.6 MBytes 36.7 Mbits/sec 0.027 ms After this change, the rx UDP performance is: Interval Transfer Bitrate Jitter 0.00-9.12 sec 1.01 GBytes 953 Mbits/sec 0.020 ms Fixes: `10615907e9` ("net: sparx5: switchdev: adding frame DMA functionality") Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-12-09 13:30:16 +00:00
Daniel Machon	aa5fc88984	net: lan969x: fix the use of spin_lock in PTP handler We are mixing the use of spin_lock() and spin_lock_irqsave() functions in the PTP handler of lan969x. Fix this by correctly using the _irqsave variants. Fixes: `24fe835417` ("net: lan969x: add PTP handler function") Signed-off-by: Daniel Machon <daniel.machon@microchip.com> [1]: https://lore.kernel.org/netdev/20241024-sparx5-lan969x-switch-driver-2-v2-10-a0b5fae88a0f@microchip.com/ Signed-off-by: David S. Miller <davem@davemloft.net>	2024-12-09 13:30:16 +00:00
Daniel Machon	1cd7523f4b	net: lan969x: fix cyclic dependency reported by depmod Depmod reports a cyclic dependency between modules sparx5-switch.ko and lan969x-switch.ko: depmod: ERROR: Cycle detected: lan969x_switch -> sparx5_switch -> lan969x_switch depmod: ERROR: Found 2 modules in dependency cycles! make[2]: * [scripts/Makefile.modinst:132: depmod] Error 1 make: * [Makefile:224: __sub-make] Error 2 This makes sense, as they both require symbols from each other. Fix this by compiling lan969x support into the sparx5-switch.ko module. In order to do this, in a sensible way, we move the lan969x/ dir into the sparx5/ dir and do some code cleanup of code that is no longer required. After this patch, depmod will no longer complain, as lan969x support is compiled into the sparx5-swicth.ko module, and can no longer be compiled as a standalone module. Fixes: `98a0111960` ("net: sparx5: add compatible string for lan969x") Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-12-09 13:30:16 +00:00
Niravkumar L Rabara	25fb0e77b9	spi: spi-cadence-qspi: Disable STIG mode for Altera SoCFPGA. STIG mode is enabled by default for less than 8 bytes data read/write. STIG mode doesn't work with Altera SocFPGA platform due hardware limitation. Add a quirks to disable STIG mode for Altera SoCFPGA platform. Signed-off-by: Niravkumar L Rabara <niravkumar.l.rabara@intel.com> Link: https://patch.msgid.link/20241204063338.296959-1-niravkumar.l.rabara@intel.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-12-09 13:06:24 +00:00
Christian Loehle	0bb394067a	spi: rockchip: Fix PM runtime count on no-op cs The early bail out that caused an out-of-bounds write was removed with commit `5c018e378f` ("spi: spi-rockchip: Fix out of bounds array access") Unfortunately that caused the PM runtime count to be unbalanced and underflowed on the first call. To fix that reintroduce a no-op check by reading the register directly. Cc: stable@vger.kernel.org Fixes: `5c018e378f` ("spi: spi-rockchip: Fix out of bounds array access") Signed-off-by: Christian Loehle <christian.loehle@arm.com> Link: https://patch.msgid.link/1f2b3af4-2b7a-4ac8-ab95-c80120ebf44c@arm.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-12-09 13:06:23 +00:00
Christophe JAILLET	c84dda3751	spi: aspeed: Fix an error handling path in aspeed_spi_[read\|write]_user() A aspeed_spi_start_user() is not balanced by a corresponding aspeed_spi_stop_user(). Add the missing call. Fixes: `e3228ed928` ("spi: spi-mem: Convert Aspeed SMC driver to spi-mem") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://patch.msgid.link/4052aa2f9a9ea342fa6af83fa991b55ce5d5819e.1732051814.git.christophe.jaillet@wanadoo.fr Signed-off-by: Mark Brown <broonie@kernel.org>	2024-12-09 13:06:22 +00:00
Philippe Simons	f07ae52f5c	regulator: axp20x: AXP717: set ramp_delay AXP717 datasheet says that regulator ramp delay is 15.625 us/step, which is 10mV in our case. Add a AXP_DESC_RANGES_DELAY macro and update AXP_DESC_RANGES macro to expand to AXP_DESC_RANGES_DELAY with ramp_delay = 0 For DCDC4, steps is 100mv Add a AXP_DESC_DELAY macro and update AXP_DESC macro to expand to AXP_DESC_DELAY with ramp_delay = 0 This patch fix crashes when using CPU DVFS. Signed-off-by: Philippe Simons <simons.philippe@gmail.com> Tested-by: Hironori KIKUCHI <kikuchan98@gmail.com> Tested-by: Chris Morgan <macromorgan@hotmail.com> Reviewed-by: Chen-Yu Tsai <wens@csie.org> Fixes: `d2ac3df75c` ("regulator: axp20x: add support for the AXP717") Link: https://patch.msgid.link/20241208124308.5630-1-simons.philippe@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-12-09 13:06:07 +00:00
Janaki Ramaiah Thota	8099b1f7e3	regulator: dt-bindings: qcom,qca6390-pmu: document wcn6750-pmu Add description of the PMU node for the WCN6750B module. Signed-off-by: Janaki Ramaiah Thota <quic_janathot@quicinc.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://patch.msgid.link/20241209103455.9675-2-quic_janathot@quicinc.com Signed-off-by: Mark Brown <broonie@kernel.org>	2024-12-09 13:06:06 +00:00
Jesse Taube	5f12203006	ARM: dts: imxrt1050: Fix clocks for mmc One of the usdhc1 controller's clocks should be IMXRT1050_CLK_AHB_PODF not IMXRT1050_CLK_OSC. Fixes: `1c4f01be34` ("ARM: dts: imx: Add i.MXRT1050-EVK support") Signed-off-by: Jesse Taube <Mr.Bossman075@gmail.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org>	2024-12-09 20:40:23 +08:00
Peter Zijlstra	76f2f78329	sched/eevdf: More PELT vs DELAYED_DEQUEUE Vincent and Dietmar noted that while commit `fc1892becd` ("sched/eevdf: Fixup PELT vs DELAYED_DEQUEUE") fixes the entity runnable stats, it does not adjust the cfs_rq runnable stats, which are based off of h_nr_running. Track h_nr_delayed such that we can discount those and adjust the signal. Fixes: `fc1892becd` ("sched/eevdf: Fixup PELT vs DELAYED_DEQUEUE") Closes: https://lore.kernel.org/lkml/a9a45193-d0c6-4ba2-a822-464ad30b550e@arm.com/ Closes: https://lore.kernel.org/lkml/CAKfTPtCNUvWE_GX5LyvTF-WdxUT=ZgvZZv-4t=eWntg5uOFqiQ@mail.gmail.com/ [ Fixes checkpatch warnings and rebased ] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reported-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Reported-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: "Peter Zijlstra (Intel)" <peterz@infradead.org> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Tested-by: K Prateek Nayak <kprateek.nayak@amd.com> Link: https://lore.kernel.org/r/20241202174606.4074512-3-vincent.guittot@linaro.org	2024-12-09 11:48:09 +01:00
Vincent Guittot	c1f43c342e	sched/fair: Fix sched_can_stop_tick() for fair tasks We can't stop the tick of a rq if there are at least 2 tasks enqueued in the whole hierarchy and not only at the root cfs rq. rq->cfs.nr_running tracks the number of sched_entity at one level whereas rq->cfs.h_nr_running tracks all queued tasks in the hierarchy. Fixes: `11cc374f46` ("sched_ext: Simplify scx_can_stop_tick() invocation in sched_can_stop_tick()") Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Link: https://lore.kernel.org/r/20241202174606.4074512-2-vincent.guittot@linaro.org	2024-12-09 11:48:09 +01:00
K Prateek Nayak	493afbd187	sched/fair: Fix NEXT_BUDDY Adam reports that enabling NEXT_BUDDY insta triggers a WARN in pick_next_entity(). Moving clear_buddies() up before the delayed dequeue bits ensures no ->next buddy becomes delayed. Further ensure no new ->next buddy ever starts as delayed. Fixes: `152e11f6df` ("sched/fair: Implement delayed dequeue") Reported-by: Adam Li <adamli@os.amperecomputing.com> Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Adam Li <adamli@os.amperecomputing.com> Link: https://lkml.kernel.org/r/670a0d54-e398-4b1f-8a6e-90784e2fdf89@amd.com	2024-12-09 11:48:09 +01:00
Jiasheng Jiang	2828e5808b	drm/i915: Fix memory leak by correcting cache object name in error handler Replace "slab_priorities" with "slab_dependencies" in the error handler to avoid memory leak. Fixes: `32eb6bcfdd` ("drm/i915: Make request allocation caches global") Cc: <stable@vger.kernel.org> # v5.2+ Signed-off-by: Jiasheng Jiang <jiashengjiangcool@outlook.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241127201042.29620-1-jiashengjiangcool@gmail.com (cherry picked from commit `9bc5e7dc69`) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>	2024-12-09 10:29:06 +00:00
Eugene Kobyak	da0b986256	drm/i915: Fix NULL pointer dereference in capture_engine When the intel_context structure contains NULL, it raises a NULL pointer dereference error in drm_info(). Fixes: `e8a3319c31` ("drm/i915: Allow error capture without a request") Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12309 Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: <stable@vger.kernel.org> # v6.3+ Signed-off-by: Eugene Kobyak <eugene.kobyak@intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/xmsgfynkhycw3cf56akp4he2ffg44vuratocsysaowbsnhutzi@augnqbm777at (cherry picked from commit `754302a5bc`) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>	2024-12-09 10:29:03 +00:00
Ville Syrjälä	cd3da567e2	drm/i915/color: Stop using non-posted DSB writes for legacy LUT DSB LUT register writes vs. palette anti-collision logic appear to interact in interesting ways: - posted DSB writes simply vanish into thin air while anti-collision is active - non-posted DSB writes actually get blocked by the anti-collision logic, but unfortunately this ends up hogging the bus for long enough that unrelated parallel CPU MMIO accesses start to disappear instead Even though we are updating the LUT during vblank we aren't immune to the anti-collision logic because it kicks in briefly for pipe prefill (initiated at frame start). The safe time window for performing the LUT update is thus between the undelayed vblank and frame start. Turns out that with low enough CDCLK frequency (DSB execution speed depends on CDCLK) we can exceed that. As we are currently using non-posted writes for the legacy LUT updates, in which case we can hit the far more severe failure mode. The problem is exacerbated by the fact that non-posted writes are much slower than posted writes (~4x it seems). To mititage the problem let's switch to using posted DSB writes for legacy LUT updates (which will involve using the double write approach to avoid other problems with DSB vs. legacy LUT writes). Despite writing each register twice this will in fact make the legacy LUT update faster when compared to the non-posted write approach, making the problem less likely to appear. The failure mode is also less severe. This isn't the 100% solution we need though. That will involve estimating how long the LUT update will take, and pushing frame start and/or delayed vblank forward to guarantee that the update will have finished by the time the pipe prefill starts... Cc: stable@vger.kernel.org Fixes: `34d8311f4a` ("drm/i915/dsb: Re-instate DSB for LUT updates") Fixes: `25ea3411bd` ("drm/i915/dsb: Use non-posted register writes for legacy LUT") Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12494 Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241120164123.12706-3-ville.syrjala@linux.intel.com Reviewed-by: Uma Shankar <uma.shankar@intel.com> (cherry picked from commit `2504a316b3`) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>	2024-12-09 10:28:59 +00:00
Ville Syrjälä	70ec2e8be7	drm/i915/dsb: Don't use indexed register writes needlessly Turns out the DSB indexed register write command has rather significant initial overhead compared to the normal MMIO write command. Based on some quick experiments on TGL you have to write the register at least ~5 times for the indexed write command to come out ahead. If you write the register less times than that the MMIO write is faster. So it seems my automagic indexed write logic was a bit misguided. Go back to the original approach only use indexed writes for the cases we know will benefit from it (indexed LUT register updates). Currently we shouldn't have any cases where this truly matters (just some rare double writes to the precision LUT index registers), but we will need to switch the legacy LUT updates to write each LUT register twice (to avoid some palette anti-collision logic troubles). This would be close to the worst case for using indexed writes (two writes per register, and 256 separate registers). Using the MMIO write command should shave off around 30% of the execution time compared to using the indexed write command. Cc: stable@vger.kernel.org Fixes: `34d8311f4a` ("drm/i915/dsb: Re-instate DSB for LUT updates") Fixes: `25ea3411bd` ("drm/i915/dsb: Use non-posted register writes for legacy LUT") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241120164123.12706-2-ville.syrjala@linux.intel.com Reviewed-by: Uma Shankar <uma.shankar@intel.com> (cherry picked from commit `ecba559a88`) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>	2024-12-09 10:28:56 +00:00
Adrian Ratiu	b50a3e9844	sound: usb: format: don't warn that raw DSD is unsupported UAC 2 & 3 DAC's set bit 31 of the format to signal support for a RAW_DATA type, typically used for DSD playback. This is correctly tested by (format & UAC*_FORMAT_TYPE_I_RAW_DATA), fp->dsd_raw = true; and call snd_usb_interface_dsd_format_quirks(), however a confusing and unnecessary message gets printed because the bit is not properly tested in the last "unsupported" if test: if (format & ~0x3F) { ... } For example the output: usb 7-1: new high-speed USB device number 5 using xhci_hcd usb 7-1: New USB device found, idVendor=262a, idProduct=9302, bcdDevice=0.01 usb 7-1: New USB device strings: Mfr=1, Product=2, SerialNumber=6 usb 7-1: Product: TC44C usb 7-1: Manufacturer: TC44C usb 7-1: SerialNumber: 5000000001 hid-generic 0003:262A:9302.001E: No inputs registered, leaving hid-generic 0003:262A:9302.001E: hidraw6: USB HID v1.00 Device [DDHIFI TC44C] on usb-0000:08:00.3-1/input0 usb 7-1: 2:4 : unsupported format bits 0x100000000 This last "unsupported format" is actually wrong: we know the format is a RAW_DATA which we assume is DSD, so there is no need to print the confusing message. This we unset bit 31 of the format after recognizing it, to avoid the message. Suggested-by: Takashi Iwai <tiwai@suse.com> Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com> Link: https://patch.msgid.link/20241209090529.16134-2-adrian.ratiu@collabora.com Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-12-09 10:57:20 +01:00
Adrian Ratiu	c84bd6c810	sound: usb: enable DSD output for ddHiFi TC44C This is a UAC 2 DAC capable of raw DSD on intf 2 alt 4: Bus 007 Device 004: ID 262a:9302 SAVITECH Corp. TC44C Device Descriptor: bLength 18 bDescriptorType 1 bcdUSB 2.00 bDeviceClass 239 Miscellaneous Device bDeviceSubClass 2 [unknown] bDeviceProtocol 1 Interface Association bMaxPacketSize0 64 idVendor 0x262a SAVITECH Corp. idProduct 0x9302 TC44C bcdDevice 0.01 iManufacturer 1 DDHIFI iProduct 2 TC44C iSerial 6 5000000001 ....... Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 2 bAlternateSetting 4 bNumEndpoints 2 bInterfaceClass 1 Audio bInterfaceSubClass 2 Streaming bInterfaceProtocol 32 iInterface 0 AudioStreaming Interface Descriptor: bLength 16 bDescriptorType 36 bDescriptorSubtype 1 (AS_GENERAL) bTerminalLink 3 bmControls 0x00 bFormatType 1 bmFormats 0x80000000 bNrChannels 2 bmChannelConfig 0x00000000 iChannelNames 0 ....... Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com> Link: https://patch.msgid.link/20241209090529.16134-1-adrian.ratiu@collabora.com Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-12-09 10:57:10 +01:00
Stefan Eichenberger	1ddb61a7c0	ARM: imx_v6_v7_defconfig: enable SND_SOC_SPDIF Enable SND_SOC_SPDIF in imx_v6_v7_defconfig to support SPDIF audio. With commit `d469b771af` ("ARM: dts: imx6: update spdif sound card node properties"), the more generic audio-codec property is used instead of the old spdif-controller property. Since most i.MX6 boards now use the audio-codec property together with the linux,spdif-dit and linux,spdif-dir compatible driver, it makes sense to enable SND_SOC_SPDIF in the imx_v6_v7_defconfig. This will ensure compatibility with the updated device tree. Without this change, boards that use the audio-codec property will show the following error message during boot when using the imx_v6_v7_defconfig and spdif audio is not working: [ 24.165534] platform sound-spdif: deferred probe pending: fsl-asoc-card: snd_soc_register_card failed Signed-off-by: Stefan Eichenberger <stefan.eichenberger@toradex.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org>	2024-12-09 17:33:24 +08:00
James Bottomley	2ab0837cb9	efivarfs: Fix error on non-existent file When looking up a non-existent file, efivarfs returns -EINVAL if the file does not conform to the NAME-GUID format and -ENOENT if it does. This is caused by efivars_d_hash() returning -EINVAL if the name is not formatted correctly. This error is returned before simple_lookup() returns a negative dentry, and is the error value that the user sees. Fix by removing this check. If the file does not exist, simple_lookup() will return a negative dentry leading to -ENOENT and efivarfs_create() already has a validity check before it creates an entry (and will correctly return -EINVAL) Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: <stable@vger.kernel.org> [ardb: make efivarfs_valid_name() static] Signed-off-by: Ard Biesheuvel <ardb@kernel.org>	2024-12-09 10:00:04 +01:00
Wei Fang	c5b8d2c370	arm64: dts: imx95: correct the address length of netcmix_blk_ctrl The netc_blk_ctrl is controlled by the imx95-blk-ctl clock driver and provides relevant clock configurations for NETC, SAI and MQS. Its address length should be 8 bytes instead of 0x1000. Fixes: `7764fef26e` ("arm64: dts: imx95: Add NETCMIX block control support") Signed-off-by: Wei Fang <wei.fang@nxp.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org>	2024-12-09 16:27:05 +08:00
Vasiliy Kovalev	50db91fcce	ALSA: hda/realtek: Add new alc2xx-fixup-headset-mic model Introduces the alc2xx-fixup-headset-mic model to simplify enabling headset microphones on ALC2XX codecs. Many recent configurations, as well as older systems that lacked this fix for a long time, leave headset microphones inactive by default. This addition provides a flexible workaround using the existing ALC2XX_FIXUP_HEADSET_MIC quirk. Signed-off-by: Vasiliy Kovalev <kovalev@altlinux.org> Link: https://patch.msgid.link/20241207201836.6879-1-kovalev@altlinux.org Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-12-09 09:25:27 +01:00
Takashi Iwai	7c005292e2	ALSA: hda/ca0132: Use standard HD-audio quirk matching helpers CA0132 used the PCI SSID lookup helper that doesn't support the model string matching or quirk aliasing. Replace it with the standard HD-audio quirk helpers for supporting those, and add the definition of the model strings for supported quirks, too. There should be no visible change to the outside for the working system, but the driver will parse the model option and apply the quirk based on it from now on. Link: https://patch.msgid.link/20241207133754.3658-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-12-09 09:24:38 +01:00
Frank Li	c70812cb28	arm64: dts: imx8-ss-audio: add fallback compatible string fsl,imx6ull-esai for esai The ESAI of i.MX8QM is the same as i.MX6ULL. So add fsl,imx6ull-esai for esai. Signed-off-by: Frank Li <Frank.Li@nxp.com> Acked-by: Rob Herring (Arm) <robh@kernel.org> Fixes: `adf7ea48ce` ("ASoC: dt-bindings: fsl-esai: allow fsl,imx8qm-esai fallback to fsl,imx6ull-esai") Signed-off-by: Shawn Guo <shawnguo@kernel.org>	2024-12-09 15:53:53 +08:00
Joe Hattori	676fe1f6f7	ata: sata_highbank: fix OF node reference leak in highbank_initialize_phys() The OF node reference obtained by of_parse_phandle_with_args() is not released on early return. Add a of_node_put() call before returning. Fixes: `8996b89d6b` ("ata: add platform driver for Calxeda AHCI controller") Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>	2024-12-09 09:06:58 +09:00
Javier Carrasco	54d394905c	iio: adc: ti-ads1119: fix sample size in scan struct for triggered buffer This device returns signed, 16-bit samples as stated in its datasheet (see 8.5.2 Data Format). That is in line with the scan_type definition for the IIO_VOLTAGE channel, but 'unsigned int' is being used to read and push the data to userspace. Given that the size of that type depends on the architecture (at least 2 bytes to store values up to 65535, but its actual size is often 4 bytes), use the 's16' type to provide the same structure in all cases. Fixes: `a9306887eb` ("iio: adc: ti-ads1119: Add driver") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Reviewed-by: Francesco Dolcini <francesco.dolcini@toradex.com> Link: https://patch.msgid.link/20241202-ti-ads1119_s16_chan-v1-1-fafe3136dc90@gmail.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-12-08 16:59:13 +00:00
Javier Carrasco	2f43d5200c	iio: temperature: tmp006: fix information leak in triggered buffer The 'scan' local struct is used to push data to user space from a triggered buffer, but it has a hole between the two 16-bit data channels and the timestamp. This hole is never initialized. Initialize the struct to zero before using it to avoid pushing uninitialized information to userspace. Fixes: `91f75ccf9f` ("iio: temperature: tmp006: add triggered buffer support") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Link: https://patch.msgid.link/20241204-iio_memset_scan_holes-v2-1-3f941592a76d@gmail.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-12-08 16:51:35 +00:00
Joe Hattori	64f43895b4	iio: inkern: call iio_device_put() only on mapped devices In the error path of iio_channel_get_all(), iio_device_put() is called on all IIO devices, which can cause a refcount imbalance. Fix this error by calling iio_device_put() only on IIO devices whose refcounts were previously incremented by iio_device_get(). Fixes: `314be14bb8` ("iio: Rename _st_ functions to loose the bit that meant the staging version.") Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp> Link: https://patch.msgid.link/20241204111342.1246706-1-joe@pf.is.s.u-tokyo.ac.jp Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-12-08 16:49:09 +00:00
Cristian Ciocaltea	9d23e48654	phy: rockchip: samsung-hdptx: Set drvdata before enabling runtime PM In some cases, rk_hdptx_phy_runtime_resume() may be invoked before platform_set_drvdata() is executed in ->probe(), leading to a NULL pointer dereference when using the return of dev_get_drvdata(). Ensure platform_set_drvdata() is called before devm_pm_runtime_enable(). Reported-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Fixes: `553be2830c` ("phy: rockchip: Add Samsung HDMI/eDP Combo PHY driver") Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20241023-phy-sam-hdptx-rpm-fix-v1-1-87f4c994e346@collabora.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2024-12-08 21:42:07 +05:30
Masami Hiramatsu (Google)	494b332064	tracing/eprobe: Fix to release eprobe when failed to add dyn_event Fix eprobe event to unregister event call and release eprobe when it fails to add dynamic event correctly. Link: https://lore.kernel.org/all/173289886698.73724.1959899350183686006.stgit@devnote2/ Fixes: `7491e2c442` ("tracing: Add a probe that attaches to trace events") Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>	2024-12-08 23:25:09 +09:00
Dan Carpenter	09310cfd4e	rtnetlink: fix error code in rtnl_newlink() If rtnl_get_peer_net() fails, then propagate the error code. Don't return success. Fixes: `4832756676` ("rtnetlink: fix double call of rtnl_link_get_net_ifla()") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/a2d20cd4-387a-4475-887c-bb7d0e88e25a@stanley.mountain Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-07 18:25:09 -08:00
Jakub Kicinski	ab80e715b7	Merge branch 'ocelot-ptp-fixes' Vladimir Oltean says: ==================== Ocelot PTP fixes This is another attempt at "net: mscc: ocelot: be resilient to loss of PTP packets during transmission". https://lore.kernel.org/netdev/20241203164755.16115-1-vladimir.oltean@nxp.com/ The central change is in patch 4/5. It recovers a port from a very reproducible condition where previously, it would become unable to take any hardware TX timestamp. Then we have patches 1/5 and 5/5 which are extra bug fixes. Patches 2/5 and 3/5 are logical changes split out of the monolithic patch previously submitted as v1, so that patch 4/5 is hopefully easier to follow. ==================== Link: https://patch.msgid.link/20241205145519.1236778-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-07 17:56:49 -08:00
Vladimir Oltean	43a4166349	net: mscc: ocelot: perform error cleanup in ocelot_hwstamp_set() An unsupported RX filter will leave the port with TX timestamping still applied as per the new request, rather than the old setting. When parsing the tx_type, don't apply it just yet, but delay that until after we've parsed the rx_filter as well (and potentially returned -ERANGE for that). Similarly, copy_to_user() may fail, which is a rare occurrence, but should still be treated by unwinding what was done. Fixes: `96ca08c058` ("net: mscc: ocelot: set up traps for PTP packets") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20241205145519.1236778-6-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-07 17:56:46 -08:00
Vladimir Oltean	b454abfab5	net: mscc: ocelot: be resilient to loss of PTP packets during transmission The Felix DSA driver presents unique challenges that make the simplistic ocelot PTP TX timestamping procedure unreliable: any transmitted packet may be lost in hardware before it ever leaves our local system. This may happen because there is congestion on the DSA conduit, the switch CPU port or even user port (Qdiscs like taprio may delay packets indefinitely by design). The technical problem is that the kernel, i.e. ocelot_port_add_txtstamp_skb(), runs out of timestamp IDs eventually, because it never detects that packets are lost, and keeps the IDs of the lost packets on hold indefinitely. The manifestation of the issue once the entire timestamp ID range becomes busy looks like this in dmesg: mscc_felix 0000:00:00.5: port 0 delivering skb without TX timestamp mscc_felix 0000:00:00.5: port 1 delivering skb without TX timestamp At the surface level, we need a timeout timer so that the kernel knows a timestamp ID is available again. But there is a deeper problem with the implementation, which is the monotonically increasing ocelot_port->ts_id. In the presence of packet loss, it will be impossible to detect that and reuse one of the holes created in the range of free timestamp IDs. What we actually need is a bitmap of 63 timestamp IDs tracking which one is available. That is able to use up holes caused by packet loss, but also gives us a unique opportunity to not implement an actual timer_list for the timeout timer (very complicated in terms of locking). We could only declare a timestamp ID stale on demand (lazily), aka when there's no other timestamp ID available. There are pros and cons to this approach: the implementation is much more simple than per-packet timers would be, but most of the stale packets would be quasi-leaked - not really leaked, but blocked in driver memory, since this algorithm sees no reason to free them. An improved technique would be to check for stale timestamp IDs every time we allocate a new one. Assuming a constant flux of PTP packets, this avoids stale packets being blocked in memory, but of course, packets lost at the end of the flux are still blocked until the flux resumes (nobody left to kick them out). Since implementing per-packet timers is way too complicated, this should be good enough. Testing procedure: Persistently block traffic class 5 and try to run PTP on it: $ tc qdisc replace dev swp3 parent root taprio num_tc 8 \ map 0 1 2 3 4 5 6 7 queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \ base-time 0 sched-entry S 0xdf 100000 flags 0x2 [ 126.948141] mscc_felix 0000:00:00.5: port 3 tc 5 min gate length 0 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 1 octets including FCS $ ptp4l -i swp3 -2 -P -m --socket_priority 5 --fault_reset_interval ASAP --logSyncInterval -3 ptp4l[70.351]: port 1 (swp3): INITIALIZING to LISTENING on INIT_COMPLETE ptp4l[70.354]: port 0 (/var/run/ptp4l): INITIALIZING to LISTENING on INIT_COMPLETE ptp4l[70.358]: port 0 (/var/run/ptp4lro): INITIALIZING to LISTENING on INIT_COMPLETE [ 70.394583] mscc_felix 0000:00:00.5: port 3 timestamp id 0 ptp4l[70.406]: timed out while polling for tx timestamp ptp4l[70.406]: increasing tx_timestamp_timeout or increasing kworker priority may correct this issue, but a driver bug likely causes it ptp4l[70.406]: port 1 (swp3): send peer delay response failed ptp4l[70.407]: port 1 (swp3): clearing fault immediately ptp4l[70.952]: port 1 (swp3): new foreign master d858d7.fffe.00ca6d-1 [ 71.394858] mscc_felix 0000:00:00.5: port 3 timestamp id 1 ptp4l[71.400]: timed out while polling for tx timestamp ptp4l[71.400]: increasing tx_timestamp_timeout or increasing kworker priority may correct this issue, but a driver bug likely causes it ptp4l[71.401]: port 1 (swp3): send peer delay response failed ptp4l[71.401]: port 1 (swp3): clearing fault immediately [ 72.393616] mscc_felix 0000:00:00.5: port 3 timestamp id 2 ptp4l[72.401]: timed out while polling for tx timestamp ptp4l[72.402]: increasing tx_timestamp_timeout or increasing kworker priority may correct this issue, but a driver bug likely causes it ptp4l[72.402]: port 1 (swp3): send peer delay response failed ptp4l[72.402]: port 1 (swp3): clearing fault immediately ptp4l[72.952]: port 1 (swp3): new foreign master d858d7.fffe.00ca6d-1 [ 73.395291] mscc_felix 0000:00:00.5: port 3 timestamp id 3 ptp4l[73.400]: timed out while polling for tx timestamp ptp4l[73.400]: increasing tx_timestamp_timeout or increasing kworker priority may correct this issue, but a driver bug likely causes it ptp4l[73.400]: port 1 (swp3): send peer delay response failed ptp4l[73.400]: port 1 (swp3): clearing fault immediately [ 74.394282] mscc_felix 0000:00:00.5: port 3 timestamp id 4 ptp4l[74.400]: timed out while polling for tx timestamp ptp4l[74.401]: increasing tx_timestamp_timeout or increasing kworker priority may correct this issue, but a driver bug likely causes it ptp4l[74.401]: port 1 (swp3): send peer delay response failed ptp4l[74.401]: port 1 (swp3): clearing fault immediately ptp4l[74.953]: port 1 (swp3): new foreign master d858d7.fffe.00ca6d-1 [ 75.396830] mscc_felix 0000:00:00.5: port 3 invalidating stale timestamp ID 0 which seems lost [ 75.405760] mscc_felix 0000:00:00.5: port 3 timestamp id 0 ptp4l[75.410]: timed out while polling for tx timestamp ptp4l[75.411]: increasing tx_timestamp_timeout or increasing kworker priority may correct this issue, but a driver bug likely causes it ptp4l[75.411]: port 1 (swp3): send peer delay response failed ptp4l[75.411]: port 1 (swp3): clearing fault immediately (...) Remove the blocking condition and see that the port recovers: $ same tc command as above, but use "sched-entry S 0xff" instead $ same ptp4l command as above ptp4l[99.489]: port 1 (swp3): INITIALIZING to LISTENING on INIT_COMPLETE ptp4l[99.490]: port 0 (/var/run/ptp4l): INITIALIZING to LISTENING on INIT_COMPLETE ptp4l[99.492]: port 0 (/var/run/ptp4lro): INITIALIZING to LISTENING on INIT_COMPLETE [ 100.403768] mscc_felix 0000:00:00.5: port 3 invalidating stale timestamp ID 0 which seems lost [ 100.412545] mscc_felix 0000:00:00.5: port 3 invalidating stale timestamp ID 1 which seems lost [ 100.421283] mscc_felix 0000:00:00.5: port 3 invalidating stale timestamp ID 2 which seems lost [ 100.430015] mscc_felix 0000:00:00.5: port 3 invalidating stale timestamp ID 3 which seems lost [ 100.438744] mscc_felix 0000:00:00.5: port 3 invalidating stale timestamp ID 4 which seems lost [ 100.447470] mscc_felix 0000:00:00.5: port 3 timestamp id 0 [ 100.505919] mscc_felix 0000:00:00.5: port 3 timestamp id 0 ptp4l[100.963]: port 1 (swp3): new foreign master d858d7.fffe.00ca6d-1 [ 101.405077] mscc_felix 0000:00:00.5: port 3 timestamp id 0 [ 101.507953] mscc_felix 0000:00:00.5: port 3 timestamp id 0 [ 102.405405] mscc_felix 0000:00:00.5: port 3 timestamp id 0 [ 102.509391] mscc_felix 0000:00:00.5: port 3 timestamp id 0 [ 103.406003] mscc_felix 0000:00:00.5: port 3 timestamp id 0 [ 103.510011] mscc_felix 0000:00:00.5: port 3 timestamp id 0 [ 104.405601] mscc_felix 0000:00:00.5: port 3 timestamp id 0 [ 104.510624] mscc_felix 0000:00:00.5: port 3 timestamp id 0 ptp4l[104.965]: selected best master clock d858d7.fffe.00ca6d ptp4l[104.966]: port 1 (swp3): assuming the grand master role ptp4l[104.967]: port 1 (swp3): LISTENING to GRAND_MASTER on RS_GRAND_MASTER [ 105.106201] mscc_felix 0000:00:00.5: port 3 timestamp id 0 [ 105.232420] mscc_felix 0000:00:00.5: port 3 timestamp id 0 [ 105.359001] mscc_felix 0000:00:00.5: port 3 timestamp id 0 [ 105.405500] mscc_felix 0000:00:00.5: port 3 timestamp id 0 [ 105.485356] mscc_felix 0000:00:00.5: port 3 timestamp id 0 [ 105.511220] mscc_felix 0000:00:00.5: port 3 timestamp id 0 [ 105.610938] mscc_felix 0000:00:00.5: port 3 timestamp id 0 [ 105.737237] mscc_felix 0000:00:00.5: port 3 timestamp id 0 (...) Notice that in this new usage pattern, a non-congested port should basically use timestamp ID 0 all the time, progressing to higher numbers only if there are unacknowledged timestamps in flight. Compare this to the old usage, where the timestamp ID used to monotonically increase modulo OCELOT_MAX_PTP_ID. In terms of implementation, this simplifies the bookkeeping of the ocelot_port :: ts_id and ptp_skbs_in_flight. Since we need to traverse the list of two-step timestampable skbs for each new packet anyway, the information can already be computed and does not need to be stored. Also, ocelot_port->tx_skbs is always accessed under the switch-wide ocelot->ts_id_lock IRQ-unsafe spinlock, so we don't need the skb queue's lock and can use the unlocked primitives safely. This problem was actually detected using the tc-taprio offload, and is causing trouble in TSN scenarios, which Felix (NXP LS1028A / VSC9959) supports but Ocelot (VSC7514) does not. Thus, I've selected the commit to blame as the one adding initial timestamping support for the Felix switch. Fixes: `c0bcf53766` ("net: dsa: ocelot: add hardware timestamping support for Felix") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20241205145519.1236778-5-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-07 17:56:46 -08:00
Vladimir Oltean	0c53cdb95e	net: mscc: ocelot: ocelot->ts_id_lock and ocelot_port->tx_skbs.lock are IRQ-safe ocelot_get_txtstamp() is a threaded IRQ handler, requested explicitly as such by both ocelot_ptp_rdy_irq_handler() and vsc9959_irq_handler(). As such, it runs with IRQs enabled, and not in hardirq context. Thus, ocelot_port_add_txtstamp_skb() has no reason to turn off IRQs, it cannot be preempted by ocelot_get_txtstamp(). For the same reason, dev_kfree_skb_any_reason() will always evaluate as kfree_skb_reason() in this calling context, so just simplify the dev_kfree_skb_any() call to kfree_skb(). Also, ocelot_port_txtstamp_request() runs from NET_TX softirq context, not with hardirqs enabled. Thus, ocelot_get_txtstamp() which shares the ocelot_port->tx_skbs.lock lock with it, has no reason to disable hardirqs. This is part of a larger rework of the TX timestamping procedure. A logical subportion of the rework has been split into a separate change. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20241205145519.1236778-4-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-07 17:56:46 -08:00
Vladimir Oltean	b6fba4b3f0	net: mscc: ocelot: improve handling of TX timestamp for unknown skb This condition, theoretically impossible to trigger, is not really handled well. By "continuing", we are skipping the write to SYS_PTP_NXT which advances the timestamp FIFO to the next entry. So we are reading the same FIFO entry all over again, printing stack traces and eventually killing the kernel. No real problem has been observed here. This is part of a larger rework of the timestamp IRQ procedure, with this logical change split out into a patch of its own. We will need to "goto next_ts" for other conditions as well. Fixes: `9fde506e0c` ("net: mscc: ocelot: warn when a PTP IRQ is raised for an unknown skb") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20241205145519.1236778-3-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-07 17:56:46 -08:00
Vladimir Oltean	4b01bec25b	net: mscc: ocelot: fix memory leak on ocelot_port_add_txtstamp_skb() If ocelot_port_add_txtstamp_skb() fails, for example due to a full PTP timestamp FIFO, we must undo the skb_clone_sk() call with kfree_skb(). Otherwise, the reference to the skb clone is lost. Fixes: `52849bcf00` ("net: mscc: ocelot: avoid overflowing the PTP timestamp FIFO") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20241205145519.1236778-2-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-07 17:56:46 -08:00
Kuniyuki Iwashima	cdd0b9132d	ip: Return drop reason if in_dev is NULL in ip_route_input_rcu(). syzkaller reported a warning in __sk_skb_reason_drop(). Commit `61b95c70f3` ("net: ip: make ip_route_input_rcu() return drop reasons") missed a path where -EINVAL is returned. Then, the cited commit started to trigger the warning with the invalid error. Let's fix it by returning SKB_DROP_REASON_NOT_SPECIFIED. [0]: WARNING: CPU: 0 PID: 10 at net/core/skbuff.c:1216 __sk_skb_reason_drop net/core/skbuff.c:1216 [inline] WARNING: CPU: 0 PID: 10 at net/core/skbuff.c:1216 sk_skb_reason_drop+0x97/0x1b0 net/core/skbuff.c:1241 Modules linked in: CPU: 0 UID: 0 PID: 10 Comm: kworker/0:1 Not tainted 6.12.0-10686-gbb18265c3aba #10 1c308307628619808b5a4a0495c4aab5637b0551 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014 Workqueue: wg-crypt-wg2 wg_packet_decrypt_worker RIP: 0010:__sk_skb_reason_drop net/core/skbuff.c:1216 [inline] RIP: 0010:sk_skb_reason_drop+0x97/0x1b0 net/core/skbuff.c:1241 Code: 5d 41 5c 41 5d 41 5e e9 e7 9e 95 fd e8 e2 9e 95 fd 31 ff 44 89 e6 e8 58 a1 95 fd 45 85 e4 0f 85 a2 00 00 00 e8 ca 9e 95 fd 90 <0f> 0b 90 e8 c1 9e 95 fd 44 89 e6 bf 01 00 00 00 e8 34 a1 95 fd 41 RSP: 0018:ffa0000000007650 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 000000000000ffff RCX: ffffffff83bc3592 RDX: ff110001002a0000 RSI: ffffffff83bc34d6 RDI: 0000000000000007 RBP: ff11000109ee85f0 R08: 0000000000000001 R09: ffe21c00213dd0da R10: 000000000000ffff R11: 0000000000000000 R12: 00000000ffffffea R13: 0000000000000000 R14: ff11000109ee86d4 R15: ff11000109ee8648 FS: 0000000000000000(0000) GS:ff1100011a000000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020177000 CR3: 0000000108a3d006 CR4: 0000000000771ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000600 PKRU: 55555554 Call Trace: <IRQ> kfree_skb_reason include/linux/skbuff.h:1263 [inline] ip_rcv_finish_core.constprop.0+0x896/0x2320 net/ipv4/ip_input.c:424 ip_list_rcv_finish.constprop.0+0x1b2/0x710 net/ipv4/ip_input.c:610 ip_sublist_rcv net/ipv4/ip_input.c:636 [inline] ip_list_rcv+0x34a/0x460 net/ipv4/ip_input.c:670 __netif_receive_skb_list_ptype net/core/dev.c:5715 [inline] __netif_receive_skb_list_core+0x536/0x900 net/core/dev.c:5762 __netif_receive_skb_list net/core/dev.c:5814 [inline] netif_receive_skb_list_internal+0x77c/0xdc0 net/core/dev.c:5905 gro_normal_list include/net/gro.h:515 [inline] gro_normal_list include/net/gro.h:511 [inline] napi_complete_done+0x219/0x8c0 net/core/dev.c:6256 wg_packet_rx_poll+0xbff/0x1e40 drivers/net/wireguard/receive.c:488 __napi_poll.constprop.0+0xb3/0x530 net/core/dev.c:6877 napi_poll net/core/dev.c:6946 [inline] net_rx_action+0x9eb/0xe30 net/core/dev.c:7068 handle_softirqs+0x1ac/0x740 kernel/softirq.c:554 do_softirq kernel/softirq.c:455 [inline] do_softirq+0x48/0x80 kernel/softirq.c:442 </IRQ> <TASK> __local_bh_enable_ip+0xed/0x110 kernel/softirq.c:382 spin_unlock_bh include/linux/spinlock.h:396 [inline] ptr_ring_consume_bh include/linux/ptr_ring.h:367 [inline] wg_packet_decrypt_worker+0x3ba/0x580 drivers/net/wireguard/receive.c:499 process_one_work+0x940/0x1a70 kernel/workqueue.c:3229 process_scheduled_works kernel/workqueue.c:3310 [inline] worker_thread+0x639/0xe30 kernel/workqueue.c:3391 kthread+0x283/0x350 kernel/kthread.c:389 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:244 </TASK> Fixes: `82d9983ebe` ("net: ip: make ip_route_input_noref() return drop reasons") Reported-by: syzkaller <syzkaller@googlegroups.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20241206020715.80207-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-07 17:55:37 -08:00
Russell King (Oracle)	4c49f38e20	net: stmmac: fix TSO DMA API usage causing oops Commit `66600fac7a` ("net: stmmac: TSO: Fix unbalanced DMA map/unmap for non-paged SKB data") moved the assignment of tx_skbuff_dma[]'s members to be later in stmmac_tso_xmit(). The buf (dma cookie) and len stored in this structure are passed to dma_unmap_single() by stmmac_tx_clean(). The DMA API requires that the dma cookie passed to dma_unmap_single() is the same as the value returned from dma_map_single(). However, by moving the assignment later, this is not the case when priv->dma_cap.addr64 > 32 as "des" is offset by proto_hdr_len. This causes problems such as: dwc-eth-dwmac 2490000.ethernet eth0: Tx DMA map failed and with DMA_API_DEBUG enabled: DMA-API: dwc-eth-dwmac 2490000.ethernet: device driver tries to +free DMA memory it has not allocated [device address=0x000000ffffcf65c0] [size=66 bytes] Fix this by maintaining "des" as the original DMA cookie, and use tso_des to pass the offset DMA cookie to stmmac_tso_allocator(). Full details of the crashes can be found at: https://lore.kernel.org/all/d8112193-0386-4e14-b516-37c2d838171a@nvidia.com/ https://lore.kernel.org/all/klkzp5yn5kq5efgtrow6wbvnc46bcqfxs65nz3qy77ujr5turc@bwwhelz2l4dw/ Reported-by: Jon Hunter <jonathanh@nvidia.com> Reported-by: Thierry Reding <thierry.reding@gmail.com> Fixes: `66600fac7a` ("net: stmmac: TSO: Fix unbalanced DMA map/unmap for non-paged SKB data") Tested-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Furong Xu <0x1207@gmail.com> Link: https://patch.msgid.link/E1tJXcx-006N4Z-PC@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-07 17:28:53 -08:00
Christophe JAILLET	bbf6b6d53e	iio: adc: ad9467: Fix the "don't allow reading vref if not available" case The commit in Fixes adds a special case when only one possible scale is available. If several scales are available, it sets the .read_avail field of the struct iio_info to ad9467_read_avail(). However, this field already holds this function pointer, so the code is a no-op. Use another struct iio_info instead to actually reflect the intent described in the commit message. This way, the structure to use is selected at runtime and they can be kept as const. This is safer because modifying static structs that are shared between all instances like this, based on the properties of a single instance, is asking for trouble down the road. Fixes: `b92f94f748` ("iio: adc: ad9467: don't allow reading vref if not available") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://patch.msgid.link/cc65da19e0578823d29e11996f86042e84d5715c.1733503146.git.christophe.jaillet@wanadoo.fr Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-12-07 17:57:19 +00:00
Joe Hattori	de6a73bad1	iio: adc: at91: call input_free_device() on allocated iio_dev Current implementation of at91_ts_register() calls input_free_deivce() on st->ts_input, however, the err label can be reached before the allocated iio_dev is stored to st->ts_input. Thus call input_free_device() on input instead of st->ts_input. Fixes: `84882b0603` ("iio: adc: at91_adc: Add support for touchscreens without TSMR") Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp> Link: https://patch.msgid.link/20241207043045.1255409-1-joe@pf.is.s.u-tokyo.ac.jp Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-12-07 17:28:39 +00:00
David Lechner	36a44e05cd	iio: adc: ad7173: fix using shared static info struct Fix a possible race condition during driver probe in the ad7173 driver due to using a shared static info struct. If more that one instance of the driver is probed at the same time, some of the info could be overwritten by the other instance, leading to incorrect operation. To fix this, make the static info struct const so that it is read-only and make a copy of the info struct for each instance of the driver that can be modified. Reported-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Fixes: `76a1e6a428` ("iio: adc: ad7173: add AD7173 driver") Signed-off-by: David Lechner <dlechner@baylibre.com> Tested-by: Guillaume Ranquet <granquet@baylibre.com> Link: https://patch.msgid.link/20241127-iio-adc-ad7313-fix-non-const-info-struct-v2-1-b6d7022b7466@baylibre.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-12-07 17:23:42 +00:00
Fabio Estevam	2a8e34096e	iio: adc: ti-ads124s08: Use gpiod_set_value_cansleep() Using gpiod_set_value() to control the reset GPIO causes some verbose warnings during boot when the reset GPIO is controlled by an I2C IO expander. As the caller can sleep, use the gpiod_set_value_cansleep() variant to fix the issue. Tested on a custom i.MX93 board with a ADS124S08 ADC. Cc: stable@kernel.org Fixes: `e717f8c6df` ("iio: adc: Add the TI ads124s08 ADC code") Signed-off-by: Fabio Estevam <festevam@gmail.com> Link: https://patch.msgid.link/20241122164308.390340-1-festevam@gmail.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-12-07 17:23:42 +00:00
Javier Carrasco	75f339d3ec	iio: adc: ti-ads1119: fix information leak in triggered buffer The 'scan' local struct is used to push data to user space from a triggered buffer, but it has a hole between the sample (unsigned int) and the timestamp. This hole is never initialized. Initialize the struct to zero before using it to avoid pushing uninitialized information to userspace. Cc: stable@vger.kernel.org Fixes: `a9306887eb` ("iio: adc: ti-ads1119: Add driver") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Reviewed-by: Francesco Dolcini <francesco.dolcini@toradex.com> Link: https://patch.msgid.link/20241125-iio_memset_scan_holes-v1-2-0cb6e98d895c@gmail.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-12-07 17:23:42 +00:00
Javier Carrasco	6007d10c52	iio: pressure: zpa2326: fix information leak in triggered buffer The 'sample' local struct is used to push data to user space from a triggered buffer, but it has a hole between the temperature and the timestamp (u32 pressure, u16 temperature, GAP, u64 timestamp). This hole is never initialized. Initialize the struct to zero before using it to avoid pushing uninitialized information to userspace. Cc: stable@vger.kernel.org Fixes: `03b262f2bb` ("iio:pressure: initial zpa2326 barometer support") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Link: https://patch.msgid.link/20241125-iio_memset_scan_holes-v1-3-0cb6e98d895c@gmail.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-12-07 17:23:42 +00:00
Javier Carrasco	3872459136	iio: adc: rockchip_saradc: fix information leak in triggered buffer The 'data' local struct is used to push data to user space from a triggered buffer, but it does not set values for inactive channels, as it only uses iio_for_each_active_channel() to assign new values. Initialize the struct to zero before using it to avoid pushing uninitialized information to userspace. Cc: stable@vger.kernel.org Fixes: `4e130dc7b4` ("iio: adc: rockchip_saradc: Add support iio buffers") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Link: https://patch.msgid.link/20241125-iio_memset_scan_holes-v1-4-0cb6e98d895c@gmail.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-12-07 17:23:42 +00:00
Javier Carrasco	6ae053113f	iio: imu: kmx61: fix information leak in triggered buffer The 'buffer' local array is used to push data to user space from a triggered buffer, but it does not set values for inactive channels, as it only uses iio_for_each_active_channel() to assign new values. Initialize the array to zero before using it to avoid pushing uninitialized information to userspace. Cc: stable@vger.kernel.org Fixes: `c3a23ecc09` ("iio: imu: kmx61: Add support for data ready triggers") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Link: https://patch.msgid.link/20241125-iio_memset_scan_holes-v1-5-0cb6e98d895c@gmail.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-12-07 17:23:42 +00:00
Javier Carrasco	47b43e53c0	iio: light: vcnl4035: fix information leak in triggered buffer The 'buffer' local array is used to push data to userspace from a triggered buffer, but it does not set an initial value for the single data element, which is an u16 aligned to 8 bytes. That leaves at least 4 bytes uninitialized even after writing an integer value with regmap_read(). Initialize the array to zero before using it to avoid pushing uninitialized information to userspace. Cc: stable@vger.kernel.org Fixes: `ec90b52c07` ("iio: light: vcnl4035: Fix buffer alignment in iio_push_to_buffers_with_timestamp()") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Link: https://patch.msgid.link/20241125-iio_memset_scan_holes-v1-6-0cb6e98d895c@gmail.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-12-07 17:23:41 +00:00
Javier Carrasco	b62fbe3b8e	iio: light: bh1745: fix information leak in triggered buffer The 'scan' local struct is used to push data to user space from a triggered buffer, but it does not set values for inactive channels, as it only uses iio_for_each_active_channel() to assign new values. Initialize the struct to zero before using it to avoid pushing uninitialized information to userspace. Cc: stable@vger.kernel.org Fixes: `eab35358aa` ("iio: light: ROHM BH1745 colour sensor") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Link: https://patch.msgid.link/20241125-iio_memset_scan_holes-v1-7-0cb6e98d895c@gmail.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-12-07 17:23:41 +00:00
Javier Carrasco	2a7377ccfd	iio: adc: ti-ads8688: fix information leak in triggered buffer The 'buffer' local array is used to push data to user space from a triggered buffer, but it does not set values for inactive channels, as it only uses iio_for_each_active_channel() to assign new values. Initialize the array to zero before using it to avoid pushing uninitialized information to userspace. Cc: stable@vger.kernel.org Fixes: `61fa5dfa5f` ("iio: adc: ti-ads8688: Fix alignment of buffer in iio_push_to_buffers_with_timestamp()") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Link: https://patch.msgid.link/20241125-iio_memset_scan_holes-v1-8-0cb6e98d895c@gmail.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-12-07 17:23:41 +00:00
Javier Carrasco	333be433ee	iio: dummy: iio_simply_dummy_buffer: fix information leak in triggered buffer The 'data' array is allocated via kmalloc() and it is used to push data to user space from a triggered buffer, but it does not set values for inactive channels, as it only uses iio_for_each_active_channel() to assign new values. Use kzalloc for the memory allocation to avoid pushing uninitialized information to userspace. Cc: stable@vger.kernel.org Fixes: `415f792447` ("iio: Move IIO Dummy Driver out of staging") Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Link: https://patch.msgid.link/20241125-iio_memset_scan_holes-v1-9-0cb6e98d895c@gmail.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-12-07 17:23:41 +00:00
Matti Vaittinen	fbeba4364c	iio: test: Fix GTS test config The test config contained a copy-paste error. The IIO GTS helper test was errorneously titled as "Test IIO formatting functions" in the menuconfig. Change the title of the tests to reflect what is tested. Fixes: `cf996f0396` ("iio: test: test gain-time-scale helpers") Signed-off-by: Matti Vaittinen <mazziesaccount@gmail.com> Link: https://patch.msgid.link/Z0gt5R86WdeK73u2@mva-rohm Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-12-07 17:23:41 +00:00
Sean Nyekjaer	55d82a7ac7	dt-bindings: iio: st-sensors: Re-add IIS2MDC magnetometer "iio: st-sensors: Update ST Sensor bindings" accidentially dropped the compatible for the IIS2MDC magnetometer. Fixes: `0cd7114580` ("iio: st-sensors: Update ST Sensor bindings") Signed-off-by: Sean Nyekjaer <sean@geanix.com> Acked-by: Conor Dooley <conor.dooley@microchip.com> Link: https://patch.msgid.link/20241129-stmagdt-v1-1-963f0347fb0a@geanix.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-12-07 17:23:41 +00:00
Charles Han	bcb394bb28	iio: adc: ti-ads1298: Add NULL check in ads1298_init devm_kasprintf() can return a NULL pointer on failure. A check on the return value of such a call in ads1298_init() is missing. Add it. Fixes: `00ef7708fa` ("iio: adc: ti-ads1298: Add driver") Signed-off-by: Charles Han <hanchunchao@inspur.com> Link: https://patch.msgid.link/20241118090208.14586-1-hanchunchao@inspur.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-12-07 17:23:41 +00:00
Olivier Moysan	ad8479ac08	iio: adc: stm32-dfsdm: handle label as an optional property The label property is defined as optional in the DFSDM binding. Parse the label property only when it is defined in the device tree. Fixes: `3208fa0cd9` ("iio: adc: stm32-dfsdm: adopt generic channels bindings") Signed-off-by: Olivier Moysan <olivier.moysan@foss.st.com> Link: https://patch.msgid.link/20241114102459.2497178-1-olivier.moysan@foss.st.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-12-07 17:23:41 +00:00
Trevor Gamblin	dddfd0c489	iio: adc: ad4695: fix buffered read, single sample timings Modify ad4695_buffer_preenable() by adding an extra SPI transfer after each data read to help ensure that the timing requirement between the last SCLK rising edge and the next CNV rising edge is met. This requires a restructure of the buf_read_xfer array in ad4695_state. Also define AD4695_T_SCK_CNV_DELAY_NS to use for each added transfer. Without this change it is possible for the data to become corrupted on sequential buffered reads due to the device not properly exiting conversion mode. Similarly, make adjustments to ad4695_read_one_sample() so that timings are respected, and clean up the function slightly in the process. Fixes: `6cc7e4bf2e` ("iio: adc: ad4695: implement triggered buffer") Co-developed-by: David Lechner <dlechner@baylibre.com> Signed-off-by: David Lechner <dlechner@baylibre.com> Signed-off-by: Trevor Gamblin <tgamblin@baylibre.com> Reviewed-by: David Lechner <dlechner@baylibre.com> Tested-by: David Lechner <dlechner@baylibre.com> Link: https://patch.msgid.link/20241113-tgamblin-ad4695_improvements-v2-1-b6bb7c758fc4@baylibre.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-12-07 17:23:41 +00:00
Jean-Baptiste Maneyrol	65a60a5901	iio: imu: inv_icm42600: fix timestamps after suspend if sensor is on Currently suspending while sensors are one will result in timestamping continuing without gap at resume. It can work with monotonic clock but not with other clocks. Fix that by resetting timestamping. Fixes: `ec74ae9fd3` ("iio: imu: inv_icm42600: add accurate timestamping") Cc: stable@vger.kernel.org Signed-off-by: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol@tdk.com> Link: https://patch.msgid.link/20241113-inv_icm42600-fix-timestamps-after-suspend-v1-1-dfc77c394173@tdk.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-12-07 17:23:41 +00:00
Jean-Baptiste Maneyrol	c0f866de4c	iio: imu: inv_icm42600: fix spi burst write not supported Burst write with SPI is not working for all icm42600 chips. It was only used for setting user offsets with regmap_bulk_write. Add specific SPI regmap config for using only single write with SPI. Fixes: `9f9ff91b77` ("iio: imu: inv_icm42600: add SPI driver for inv_icm42600 driver") Cc: stable@vger.kernel.org Signed-off-by: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol@tdk.com> Link: https://patch.msgid.link/20241112-inv-icm42600-fix-spi-burst-write-not-supported-v2-1-97690dc03607@tdk.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-12-07 17:22:54 +00:00
Carlos Song	fa13ac6cdf	iio: gyro: fxas21002c: Fix missing data update in trigger handler The fxas21002c_trigger_handler() may fail to acquire sample data because the runtime PM enters the autosuspend state and sensor can not return sample data in standby mode.. Resume the sensor before reading the sample data into the buffer within the trigger handler. After the data is read, place the sensor back into the autosuspend state. Fixes: `a0701b6263` ("iio: gyro: add core driver for fxas21002c") Signed-off-by: Carlos Song <carlos.song@nxp.com> Signed-off-by: Frank Li <Frank.Li@nxp.com> Link: https://patch.msgid.link/20241116152945.4006374-1-Frank.Li@nxp.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-12-07 17:13:25 +00:00
Pei Xiao	aaa90d0751	iio: test : check null return of kunit_kmalloc in iio_rescale_test_scale kunit_kmalloc may fail, return value might be NULL and will cause NULL pointer dereference.Add KUNIT_ASSERT_NOT_ERR_OR_NULL fix it. Signed-off-by: Pei Xiao <xiaopei01@kylinos.cn> Fixes: `8e74a48d17` ("iio: test: add basic tests for the iio-rescale driver") Link: https://patch.msgid.link/ecd56a85e54a96c2f0313c114075a21a76071ea2.1730259869.git.xiaopei01@kylinos.cn Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-12-07 17:13:25 +00:00
Uwe Kleine-König	4be339af33	iio: adc: ad7124: Disable all channels at probe time When during a measurement two channels are enabled, two measurements are done that are reported sequencially in the DATA register. As the code triggered by reading one of the sysfs properties expects that only one channel is enabled it only reads the first data set which might or might not belong to the intended channel. To prevent this situation disable all channels during probe. This fixes a problem in practise because the reset default for channel 0 is enabled. So all measurements before the first measurement on channel 0 (which disables channel 0 at the end) might report wrong values. Fixes: `7b8d045e49` ("iio: adc: ad7124: allow more than 8 channels") Reviewed-by: Nuno Sa <nuno.sa@analog.com> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Link: https://patch.msgid.link/20241104101905.845737-2-u.kleine-koenig@baylibre.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-12-07 17:13:25 +00:00
Zicheng Qu	4636e859eb	staging: iio: ad9832: Correct phase range check User Perspective: When a user sets the phase value, the ad9832_write_phase() is called. The phase register has a 12-bit resolution, so the valid range is 0 to 4095. If the phase offset value of 4096 is input, it effectively exactly equals 0 in the lower 12 bits, meaning no offset. Reasons for the Change: 1) Original Condition (phase > BIT(AD9832_PHASE_BITS)): This condition allows a phase value equal to 2^12, which is 4096. However, this value exceeds the valid 12-bit range, as the maximum valid phase value should be 4095. 2) Modified Condition (phase >= BIT(AD9832_PHASE_BITS)): Ensures that the phase value is within the valid range, preventing invalid datafrom being written. Impact on Subsequent Logic: st->data = cpu_to_be16(addr \| phase): If the phase value is 2^12, i.e., 4096 (0001 0000 0000 0000), and addr is AD9832_REG_PHASE0 (1100 0000 0000 0000), then addr \| phase results in 1101 0000 0000 0000, occupying DB12. According to the section of WRITING TO A PHASE REGISTER in the datasheet, the MSB 12 PHASE0 bits should be DB11. The original condition leads to incorrect DB12 usage, which contradicts the datasheet and could pose potential issues for future updates if DB12 is used in such related cases. Fixes: `ea707584ba` ("Staging: IIO: DDS: AD9832 / AD9835 driver") Cc: stable@vger.kernel.org Signed-off-by: Zicheng Qu <quzicheng@huawei.com> Link: https://patch.msgid.link/20241107011015.2472600-3-quzicheng@huawei.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-12-07 17:13:25 +00:00
Zicheng Qu	c0599762f0	staging: iio: ad9834: Correct phase range check User Perspective: When a user sets the phase value, the ad9834_write_phase() is called. The phase register has a 12-bit resolution, so the valid range is 0 to 4095. If the phase offset value of 4096 is input, it effectively exactly equals 0 in the lower 12 bits, meaning no offset. Reasons for the Change: 1) Original Condition (phase > BIT(AD9834_PHASE_BITS)): This condition allows a phase value equal to 2^12, which is 4096. However, this value exceeds the valid 12-bit range, as the maximum valid phase value should be 4095. 2) Modified Condition (phase >= BIT(AD9834_PHASE_BITS)): Ensures that the phase value is within the valid range, preventing invalid datafrom being written. Impact on Subsequent Logic: st->data = cpu_to_be16(addr \| phase): If the phase value is 2^12, i.e., 4096 (0001 0000 0000 0000), and addr is AD9834_REG_PHASE0 (1100 0000 0000 0000), then addr \| phase results in 1101 0000 0000 0000, occupying DB12. According to the section of WRITING TO A PHASE REGISTER in the datasheet, the MSB 12 PHASE0 bits should be DB11. The original condition leads to incorrect DB12 usage, which contradicts the datasheet and could pose potential issues for future updates if DB12 is used in such related cases. Fixes: `12b9d5bf76` ("Staging: IIO: DDS: AD9833 / AD9834 driver") Cc: stable@vger.kernel.org Signed-off-by: Zicheng Qu <quzicheng@huawei.com> Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://patch.msgid.link/20241107011015.2472600-2-quzicheng@huawei.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	2024-12-07 17:13:24 +00:00
Jaakko Salo	82fdcf9b51	ALSA: usb-audio: Add implicit feedback quirk for Yamaha THR5 Use implicit feedback from the capture endpoint to fix popping sounds during playback. Link: https://bugzilla.kernel.org/show_bug.cgi?id=219567 Signed-off-by: Jaakko Salo <jaakkos@gmail.com> Cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/20241206164448.8136-1-jaakkos@gmail.com Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-12-07 14:03:34 +01:00
Ingo Molnar	b4d83c8323	headers/cleanup.h: Remove the if_not_guard() facility Linus noticed that the new if_not_guard() definition is fragile: "This macro generates actively wrong code if it happens to be inside an if-statement or a loop without a block. IOW, code like this: for (iterate-over-something) if_not_guard(a) return -BUSY; looks like will build fine, but will generate completely incorrect code." The reason is that the __if_not_guard() macro is multi-statement, so while most kernel developers expect macros to be simple or at least compound statements - but for __if_not_guard() it is not so: #define __if_not_guard(_name, _id, args...) \ BUILD_BUG_ON(!__is_cond_ptr(_name)); \ CLASS(_name, _id)(args); \ if (!__guard_ptr(_name)(&_id)) To add insult to injury, the placement of the BUILD_BUG_ON() line makes the macro appear to compile fine, but it will generate incorrect code as Linus reported, for example if used within iteration or conditional statements that will use the first statement of a macro as a loop body or conditional statement body. [ I'd also like to note that the original submission by David Lechner did not contain the BUILD_BUG_ON() line, so it was safer than what we ended up committing. Mea culpa. ] It doesn't appear to be possible to turn this macro into a robust single or compound statement that could be used in single statements, due to the necessity to define an auto scope variable with an open scope and the necessity of it having to expand to a partial 'if' statement with no body. Instead of trying to work around this fragility, just remove the construct before it gets used. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: David Lechner <dlechner@baylibre.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/Z1LBnX9TpZLR5Dkf@gmail.com	2024-12-07 11:15:14 +01:00
Eric Dumazet	0f6ede9fbc	net: defer final 'struct net' free in netns dismantle Ilya reported a slab-use-after-free in dst_destroy [1] Issue is in xfrm6_net_init() and xfrm4_net_init() : They copy xfrm[46]_dst_ops_template into net->xfrm.xfrm[46]_dst_ops. But net structure might be freed before all the dst callbacks are called. So when dst_destroy() calls later : if (dst->ops->destroy) dst->ops->destroy(dst); dst->ops points to the old net->xfrm.xfrm[46]_dst_ops, which has been freed. See a relevant issue fixed in : `ac888d5886` ("net: do not delay dst_entries_add() in dst_release()") A fix is to queue the 'struct net' to be freed after one another cleanup_net() round (and existing rcu_barrier()) [1] BUG: KASAN: slab-use-after-free in dst_destroy (net/core/dst.c:112) Read of size 8 at addr ffff8882137ccab0 by task swapper/37/0 Dec 03 05:46:18 kernel: CPU: 37 UID: 0 PID: 0 Comm: swapper/37 Kdump: loaded Not tainted 6.12.0 #67 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.1-1.el9 04/01/2014 Call Trace: <IRQ> dump_stack_lvl (lib/dump_stack.c:124) print_address_description.constprop.0 (mm/kasan/report.c:378) ? dst_destroy (net/core/dst.c:112) print_report (mm/kasan/report.c:489) ? dst_destroy (net/core/dst.c:112) ? kasan_addr_to_slab (mm/kasan/common.c:37) kasan_report (mm/kasan/report.c:603) ? dst_destroy (net/core/dst.c:112) ? rcu_do_batch (kernel/rcu/tree.c:2567) dst_destroy (net/core/dst.c:112) rcu_do_batch (kernel/rcu/tree.c:2567) ? __pfx_rcu_do_batch (kernel/rcu/tree.c:2491) ? lockdep_hardirqs_on_prepare (kernel/locking/lockdep.c:4339 kernel/locking/lockdep.c:4406) rcu_core (kernel/rcu/tree.c:2825) handle_softirqs (kernel/softirq.c:554) __irq_exit_rcu (kernel/softirq.c:589 kernel/softirq.c:428 kernel/softirq.c:637) irq_exit_rcu (kernel/softirq.c:651) sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1049 arch/x86/kernel/apic/apic.c:1049) </IRQ> <TASK> asm_sysvec_apic_timer_interrupt (./arch/x86/include/asm/idtentry.h:702) RIP: 0010:default_idle (./arch/x86/include/asm/irqflags.h:37 ./arch/x86/include/asm/irqflags.h:92 arch/x86/kernel/process.c:743) Code: 00 4d 29 c8 4c 01 c7 4c 29 c2 e9 6e ff ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 66 90 0f 00 2d c7 c9 27 00 fb f4 <fa> c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 90 RSP: 0018:ffff888100d2fe00 EFLAGS: 00000246 RAX: 00000000001870ed RBX: 1ffff110201a5fc2 RCX: ffffffffb61a3e46 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffb3d4d123 RBP: 0000000000000000 R08: 0000000000000001 R09: ffffed11c7e1835d R10: ffff888e3f0c1aeb R11: 0000000000000000 R12: 0000000000000000 R13: ffff888100d20000 R14: dffffc0000000000 R15: 0000000000000000 ? ct_kernel_exit.constprop.0 (kernel/context_tracking.c:148) ? cpuidle_idle_call (kernel/sched/idle.c:186) default_idle_call (./include/linux/cpuidle.h:143 kernel/sched/idle.c:118) cpuidle_idle_call (kernel/sched/idle.c:186) ? __pfx_cpuidle_idle_call (kernel/sched/idle.c:168) ? lock_release (kernel/locking/lockdep.c:467 kernel/locking/lockdep.c:5848) ? lockdep_hardirqs_on_prepare (kernel/locking/lockdep.c:4347 kernel/locking/lockdep.c:4406) ? tsc_verify_tsc_adjust (arch/x86/kernel/tsc_sync.c:59) do_idle (kernel/sched/idle.c:326) cpu_startup_entry (kernel/sched/idle.c:423 (discriminator 1)) start_secondary (arch/x86/kernel/smpboot.c:202 arch/x86/kernel/smpboot.c:282) ? __pfx_start_secondary (arch/x86/kernel/smpboot.c:232) ? soft_restart_cpu (arch/x86/kernel/head_64.S:452) common_startup_64 (arch/x86/kernel/head_64.S:414) </TASK> Dec 03 05:46:18 kernel: Allocated by task 12184: kasan_save_stack (mm/kasan/common.c:48) kasan_save_track (./arch/x86/include/asm/current.h:49 mm/kasan/common.c:60 mm/kasan/common.c:69) __kasan_slab_alloc (mm/kasan/common.c:319 mm/kasan/common.c:345) kmem_cache_alloc_noprof (mm/slub.c:4085 mm/slub.c:4134 mm/slub.c:4141) copy_net_ns (net/core/net_namespace.c:421 net/core/net_namespace.c:480) create_new_namespaces (kernel/nsproxy.c:110) unshare_nsproxy_namespaces (kernel/nsproxy.c:228 (discriminator 4)) ksys_unshare (kernel/fork.c:3313) __x64_sys_unshare (kernel/fork.c:3382) do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) Dec 03 05:46:18 kernel: Freed by task 11: kasan_save_stack (mm/kasan/common.c:48) kasan_save_track (./arch/x86/include/asm/current.h:49 mm/kasan/common.c:60 mm/kasan/common.c:69) kasan_save_free_info (mm/kasan/generic.c:582) __kasan_slab_free (mm/kasan/common.c:271) kmem_cache_free (mm/slub.c:4579 mm/slub.c:4681) cleanup_net (net/core/net_namespace.c:456 net/core/net_namespace.c:446 net/core/net_namespace.c:647) process_one_work (kernel/workqueue.c:3229) worker_thread (kernel/workqueue.c:3304 kernel/workqueue.c:3391) kthread (kernel/kthread.c:389) ret_from_fork (arch/x86/kernel/process.c:147) ret_from_fork_asm (arch/x86/entry/entry_64.S:257) Dec 03 05:46:18 kernel: Last potentially related work creation: kasan_save_stack (mm/kasan/common.c:48) __kasan_record_aux_stack (mm/kasan/generic.c:541) insert_work (./include/linux/instrumented.h:68 ./include/asm-generic/bitops/instrumented-non-atomic.h:141 kernel/workqueue.c:788 kernel/workqueue.c:795 kernel/workqueue.c:2186) __queue_work (kernel/workqueue.c:2340) queue_work_on (kernel/workqueue.c:2391) xfrm_policy_insert (net/xfrm/xfrm_policy.c:1610) xfrm_add_policy (net/xfrm/xfrm_user.c:2116) xfrm_user_rcv_msg (net/xfrm/xfrm_user.c:3321) netlink_rcv_skb (net/netlink/af_netlink.c:2536) xfrm_netlink_rcv (net/xfrm/xfrm_user.c:3344) netlink_unicast (net/netlink/af_netlink.c:1316 net/netlink/af_netlink.c:1342) netlink_sendmsg (net/netlink/af_netlink.c:1886) sock_write_iter (net/socket.c:729 net/socket.c:744 net/socket.c:1165) vfs_write (fs/read_write.c:590 fs/read_write.c:683) ksys_write (fs/read_write.c:736) do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) Dec 03 05:46:18 kernel: Second to last potentially related work creation: kasan_save_stack (mm/kasan/common.c:48) __kasan_record_aux_stack (mm/kasan/generic.c:541) insert_work (./include/linux/instrumented.h:68 ./include/asm-generic/bitops/instrumented-non-atomic.h:141 kernel/workqueue.c:788 kernel/workqueue.c:795 kernel/workqueue.c:2186) __queue_work (kernel/workqueue.c:2340) queue_work_on (kernel/workqueue.c:2391) __xfrm_state_insert (./include/linux/workqueue.h:723 net/xfrm/xfrm_state.c:1150 net/xfrm/xfrm_state.c:1145 net/xfrm/xfrm_state.c:1513) xfrm_state_update (./include/linux/spinlock.h:396 net/xfrm/xfrm_state.c:1940) xfrm_add_sa (net/xfrm/xfrm_user.c:912) xfrm_user_rcv_msg (net/xfrm/xfrm_user.c:3321) netlink_rcv_skb (net/netlink/af_netlink.c:2536) xfrm_netlink_rcv (net/xfrm/xfrm_user.c:3344) netlink_unicast (net/netlink/af_netlink.c:1316 net/netlink/af_netlink.c:1342) netlink_sendmsg (net/netlink/af_netlink.c:1886) sock_write_iter (net/socket.c:729 net/socket.c:744 net/socket.c:1165) vfs_write (fs/read_write.c:590 fs/read_write.c:683) ksys_write (fs/read_write.c:736) do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) Fixes: `a8a572a6b5` ("xfrm: dst_entries_init() per-net dst_ops") Reported-by: Ilya Maximets <i.maximets@ovn.org> Closes: https://lore.kernel.org/netdev/CANn89iKKYDVpB=MtmfH7nyv2p=rJWSLedO5k7wSZgtY_tO8WQg@mail.gmail.com/T/#m02c98c3009fe66382b73cfb4db9cf1df6fab3fbf Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20241204125455.3871859-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-06 17:45:08 -08:00
Eric Dumazet	a6d75ecee2	net: lapb: increase LAPB_HEADER_LEN It is unclear if net/lapb code is supposed to be ready for 8021q. We can at least avoid crashes like the following : skbuff: skb_under_panic: text:ffffffff8aabe1f6 len:24 put:20 head:ffff88802824a400 data:ffff88802824a3fe tail:0x16 end:0x140 dev:nr0.2 ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:206 ! Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI CPU: 1 UID: 0 PID: 5508 Comm: dhcpcd Not tainted 6.12.0-rc7-syzkaller-00144-g66418447d27b #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/30/2024 RIP: 0010:skb_panic net/core/skbuff.c:206 [inline] RIP: 0010:skb_under_panic+0x14b/0x150 net/core/skbuff.c:216 Code: 0d 8d 48 c7 c6 2e 9e 29 8e 48 8b 54 24 08 8b 0c 24 44 8b 44 24 04 4d 89 e9 50 41 54 41 57 41 56 e8 1a 6f 37 02 48 83 c4 20 90 <0f> 0b 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 RSP: 0018:ffffc90002ddf638 EFLAGS: 00010282 RAX: 0000000000000086 RBX: dffffc0000000000 RCX: 7a24750e538ff600 RDX: 0000000000000000 RSI: 0000000000000201 RDI: 0000000000000000 RBP: ffff888034a86650 R08: ffffffff8174b13c R09: 1ffff920005bbe60 R10: dffffc0000000000 R11: fffff520005bbe61 R12: 0000000000000140 R13: ffff88802824a400 R14: ffff88802824a3fe R15: 0000000000000016 FS: 00007f2a5990d740(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000110c2631fd CR3: 0000000029504000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> skb_push+0xe5/0x100 net/core/skbuff.c:2636 nr_header+0x36/0x320 net/netrom/nr_dev.c:69 dev_hard_header include/linux/netdevice.h:3148 [inline] vlan_dev_hard_header+0x359/0x480 net/8021q/vlan_dev.c:83 dev_hard_header include/linux/netdevice.h:3148 [inline] lapbeth_data_transmit+0x1f6/0x2a0 drivers/net/wan/lapbether.c:257 lapb_data_transmit+0x91/0xb0 net/lapb/lapb_iface.c:447 lapb_transmit_buffer+0x168/0x1f0 net/lapb/lapb_out.c:149 lapb_establish_data_link+0x84/0xd0 lapb_device_event+0x4e0/0x670 notifier_call_chain+0x19f/0x3e0 kernel/notifier.c:93 __dev_notify_flags+0x207/0x400 dev_change_flags+0xf0/0x1a0 net/core/dev.c:8922 devinet_ioctl+0xa4e/0x1aa0 net/ipv4/devinet.c:1188 inet_ioctl+0x3d7/0x4f0 net/ipv4/af_inet.c:1003 sock_do_ioctl+0x158/0x460 net/socket.c:1227 sock_ioctl+0x626/0x8e0 net/socket.c:1346 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:907 [inline] __se_sys_ioctl+0xf9/0x170 fs/ioctl.c:893 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Reported-by: syzbot+fb99d1b0c0f81d94a5e2@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/67506220.050a0220.17bd51.006c.GAE@google.com/T/#u Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241204141031.4030267-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-06 17:43:08 -08:00
Jakub Kicinski	ff9b305315	Merge branch 'bnxt_en-bug-fixes' Michael Chan says: ==================== bnxt_en: Bug fixes There are 2 bug fixes in this series. This first one fixes the issue of setting the gso_type incorrectly for HW GRO packets on 5750X (Thor) chips. This can cause HW GRO packets to be dropped by the stack if they are re-segmented. The second one fixes a potential division by zero crash when dumping FW log coredump. ==================== Link: https://patch.msgid.link/20241204215918.1692597-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-06 17:39:18 -08:00
Hongguang Gao	fab4b4d2c9	bnxt_en: Fix potential crash when dumping FW log coredump If the FW log context memory is retained after FW reset, the existing code is not handling the condition correctly and zeroes out the data structures. This potentially will cause a division by zero crash when the user runs ethtool -w. The last_type is also not set correctly when the context memory is retained. This will cause errors because the last_type signals to the FW that all context memory types have been configured. Oops: divide error: 0000 1 PREEMPT SMP NOPTI CPU: 53 UID: 0 PID: 7019 Comm: ethtool Kdump: loaded Tainted: G OE 6.12.0-rc7+ #1 Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE Hardware name: Supermicro SYS-621C-TN12R/X13DDW-A, BIOS 1.4 08/10/2023 RIP: 0010:__bnxt_copy_ctx_mem.constprop.0.isra.0+0x86/0x160 [bnxt_en] Code: 0a 31 d2 4c 89 6c 24 10 45 8b a5 fc df ff ff 4c 8b 74 24 20 31 db 66 89 44 24 06 48 63 c5 c1 e5 09 4c 0f af e0 48 8b 44 24 30 <49> f7 f4 4c 89 64 24 08 48 63 c5 4d 89 ec 31 ed 48 89 44 24 18 49 RSP: 0018:ff480591603d78b8 EFLAGS: 00010206 RAX: 0000000000100000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ff23959e46740000 RDI: 0000000000000000 RBP: 0000000000000000 R08: 0000000000100000 R09: ff23959e46740000 R10: ff480591603d7a18 R11: 0000000000000010 R12: 0000000000000000 R13: ff23959e46742008 R14: 0000000000000000 R15: 0000000000000000 FS: 00007f04227c1740(0000) GS:ff2395adbf680000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f04225b33a5 CR3: 000000108b9a4001 CR4: 0000000000773ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> ? die+0x33/0x90 ? do_trap+0xd9/0x100 ? __bnxt_copy_ctx_mem.constprop.0.isra.0+0x86/0x160 [bnxt_en] ? do_error_trap+0x65/0x80 ? __bnxt_copy_ctx_mem.constprop.0.isra.0+0x86/0x160 [bnxt_en] ? exc_divide_error+0x36/0x50 ? __bnxt_copy_ctx_mem.constprop.0.isra.0+0x86/0x160 [bnxt_en] ? asm_exc_divide_error+0x16/0x20 ? __bnxt_copy_ctx_mem.constprop.0.isra.0+0x86/0x160 [bnxt_en] ? __bnxt_copy_ctx_mem.constprop.0.isra.0+0xda/0x160 [bnxt_en] bnxt_get_ctx_coredump.constprop.0+0x1ed/0x390 [bnxt_en] ? __memcg_slab_post_alloc_hook+0x21c/0x3c0 ? __bnxt_get_coredump+0x473/0x4b0 [bnxt_en] __bnxt_get_coredump+0x473/0x4b0 [bnxt_en] ? security_file_alloc+0x74/0xe0 ? cred_has_capability.isra.0+0x78/0x120 bnxt_get_coredump_length+0x4b/0xf0 [bnxt_en] bnxt_get_dump_flag+0x40/0x60 [bnxt_en] __dev_ethtool+0x17e4/0x1fc0 ? syscall_exit_to_user_mode+0xc/0x1d0 ? do_syscall_64+0x85/0x150 ? unmap_page_range+0x299/0x4b0 ? vma_interval_tree_remove+0x215/0x2c0 ? __kmalloc_cache_noprof+0x10a/0x300 dev_ethtool+0xa8/0x170 dev_ioctl+0x1b5/0x580 ? sk_ioctl+0x4a/0x110 sock_do_ioctl+0xab/0xf0 sock_ioctl+0x1ca/0x2e0 __x64_sys_ioctl+0x87/0xc0 do_syscall_64+0x79/0x150 Fixes: `24d694aec1` ("bnxt_en: Allocate backing store memory for FW trace logs") Signed-off-by: Hongguang Gao <hongguang.gao@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20241204215918.1692597-3-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-06 17:39:13 -08:00
Michael Chan	de37faf41a	bnxt_en: Fix GSO type for HW GRO packets on 5750X chips The existing code is using RSS profile to determine IPV4/IPV6 GSO type on all chips older than 5760X. This won't work on 5750X chips that may be using modified RSS profiles. This commit from 2018 has updated the driver to not use RSS profile for HW GRO packets on newer chips: `50f011b63d` ("bnxt_en: Update RSS setup and GRO-HW logic according to the latest spec.") However, a recent commit to add support for the newest 5760X chip broke the logic. If the GRO packet needs to be re-segmented by the stack, the wrong GSO type will cause the packet to be dropped. Fix it to only use RSS profile to determine GSO type on the oldest 5730X/5740X chips which cannot use the new method and is safe to use the RSS profiles. Also fix the L3/L4 hash type for RX packets by not using the RSS profile for the same reason. Use the ITYPE field in the RX completion to determine L3/L4 hash types correctly. Fixes: `a7445d6980` ("bnxt_en: Add support for new RX and TPA_START completion types for P7") Reviewed-by: Colin Winegarden <colin.winegarden@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20241204215918.1692597-2-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-06 17:39:13 -08:00
Thomas Weißschuh	5e7aa97c7a	ptp: kvm: x86: Return EOPNOTSUPP instead of ENODEV from kvm_arch_ptp_init() The caller, ptp_kvm_init(), emits a warning if kvm_arch_ptp_init() exits with any error which is not EOPNOTSUPP: "fail to initialize ptp_kvm" Replace ENODEV with EOPNOTSUPP to avoid this spurious warning, aligning with the ARM implementation. Fixes: `a86ed2cfa1` ("ptp: Don't print an error if ptp_kvm is not supported") Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://patch.msgid.link/20241203-kvm_ptp-eopnotsuppp-v2-1-d1d060f27aa6@weissschuh.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-06 17:38:11 -08:00
Jakub Kicinski	bd0c907043	Merge branch 'selftests-mlxsw-add-few-fixes-for-sharedbuffer-test' Petr Machata says: ==================== selftests: mlxsw: Add few fixes for sharedbuffer test Danielle Ratson writes: Currently, the sharedbuffer test fails sometimes because it is reading a maximum occupancy that is larger than expected on some different cases. This is happening because the test assumes that the packet it is sending is the only packet being passed to the device. In addition, some duplications on one hand, and redundant test cases on the other hand, were found in the test. Add egress filters on h1 and h2 that will guarantee that the packets in the buffer are sent in the test, and remove the redundant test cases. ==================== Link: https://patch.msgid.link/cover.1733414773.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-06 17:37:46 -08:00
Danielle Ratson	5f2c7ab15f	selftests: mlxsw: sharedbuffer: Ensure no extra packets are counted The test assumes that the packet it is sending is the only packet being passed to the device. However, it is not the case and so other packets are filling the buffers as well. Therefore, the test sometimes fails because it is reading a maximum occupancy that is larger than expected. Add egress filters on $h1 and $h2 that will guarantee the above. Fixes: `a865ad9996` ("selftests: mlxsw: Add shared buffer traffic test") Signed-off-by: Danielle Ratson <danieller@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/64c28bc9b1cc1d78c4a73feda7cedbe9526ccf8b.1733414773.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-06 17:37:37 -08:00
Danielle Ratson	6c46ad4d1b	selftests: mlxsw: sharedbuffer: Remove duplicate test cases On both port_tc_ip_test() and port_tc_arp_test(), the max occupancy is checked on $h2 twice, when only the error message is different and does not match the check itself. Remove the two duplicated test cases from the test. Fixes: `a865ad9996` ("selftests: mlxsw: Add shared buffer traffic test") Signed-off-by: Danielle Ratson <danieller@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/d9eb26f6fc16a06a30b5c2c16ad80caf502bc561.1733414773.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-06 17:37:37 -08:00
Danielle Ratson	cf3515c556	selftests: mlxsw: sharedbuffer: Remove h1 ingress test case The test is sending only one packet generated with mausezahn from $h1 to $h2. However, for some reason, it is testing for non-zero maximum occupancy in both the ingress pool of $h1 and $h2. The former only passes when $h2 happens to send a packet. Avoid intermittent failures by removing unintentional test case regarding the ingress pool of $h1. Fixes: `a865ad9996` ("selftests: mlxsw: Add shared buffer traffic test") Signed-off-by: Danielle Ratson <danieller@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/5b7344608d5e06f38209e48d8af8c92fa11b6742.1733414773.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-06 17:37:37 -08:00
Haoyu Li	3396995f9f	gpio: ljca: Initialize num before accessing item in ljca_gpio_config With the new __counted_by annocation in ljca_gpio_packet, the "num" struct member must be set before accessing the "item" array. Failing to do so will trigger a runtime warning when enabling CONFIG_UBSAN_BOUNDS and CONFIG_FORTIFY_SOURCE. Fixes: `1034cc423f` ("gpio: update Intel LJCA USB GPIO driver") Cc: stable@vger.kernel.org Signed-off-by: Haoyu Li <lihaoyu499@gmail.com> Link: https://lore.kernel.org/stable/20241203141451.342316-1-lihaoyu499%40gmail.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>	2024-12-06 19:04:42 +01:00
Ard Biesheuvel	0b2c29fb68	efi/zboot: Limit compression options to GZIP and ZSTD For historical reasons, the legacy decompressor code on various architectures supports 7 different compression types for the compressed kernel image. EFI zboot is not a compression library museum, and so the options can be limited to what is likely to be useful in practice: - GZIP is tried and tested, and is still one of the fastest at decompression time, although the compression ratio is not very high; moreover, Fedora is already shipping EFI zboot kernels for arm64 that use GZIP, and QEMU implements direct support for it when booting a kernel without firmware loaded; - ZSTD has a very high compression ratio (although not the highest), and is almost as fast as GZIP at decompression time. Reducing the number of options makes it less of a hassle for other consumers of the EFI zboot format (such as QEMU today, and kexec in the future) to support it transparently without having to carry 7 different decompression libraries. Acked-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>	2024-12-06 16:59:56 +01:00
Filipe Manana	f10bef73fb	btrfs: flush delalloc workers queue before stopping cleaner kthread during unmount During the unmount path, at close_ctree(), we first stop the cleaner kthread, using kthread_stop() which frees the associated task_struct, and then stop and destroy all the work queues. However after we stopped the cleaner we may still have a worker from the delalloc_workers queue running inode.c:submit_compressed_extents(), which calls btrfs_add_delayed_iput(), which in turn tries to wake up the cleaner kthread - which was already destroyed before, resulting in a use-after-free on the task_struct. Syzbot reported this with the following stack traces: BUG: KASAN: slab-use-after-free in __lock_acquire+0x78/0x2100 kernel/locking/lockdep.c:5089 Read of size 8 at addr ffff8880259d2818 by task kworker/u8:3/52 CPU: 1 UID: 0 PID: 52 Comm: kworker/u8:3 Not tainted 6.13.0-rc1-syzkaller-00002-gcdd30ebb1b9f #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 Workqueue: btrfs-delalloc btrfs_work_helper Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:378 [inline] print_report+0x169/0x550 mm/kasan/report.c:489 kasan_report+0x143/0x180 mm/kasan/report.c:602 __lock_acquire+0x78/0x2100 kernel/locking/lockdep.c:5089 lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5849 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0xd5/0x120 kernel/locking/spinlock.c:162 class_raw_spinlock_irqsave_constructor include/linux/spinlock.h:551 [inline] try_to_wake_up+0xc2/0x1470 kernel/sched/core.c:4205 submit_compressed_extents+0xdf/0x16e0 fs/btrfs/inode.c:1615 run_ordered_work fs/btrfs/async-thread.c:288 [inline] btrfs_work_helper+0x96f/0xc40 fs/btrfs/async-thread.c:324 process_one_work kernel/workqueue.c:3229 [inline] process_scheduled_works+0xa66/0x1840 kernel/workqueue.c:3310 worker_thread+0x870/0xd30 kernel/workqueue.c:3391 kthread+0x2f0/0x390 kernel/kthread.c:389 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 </TASK> Allocated by task 2: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x3f/0x80 mm/kasan/common.c:68 unpoison_slab_object mm/kasan/common.c:319 [inline] __kasan_slab_alloc+0x66/0x80 mm/kasan/common.c:345 kasan_slab_alloc include/linux/kasan.h:250 [inline] slab_post_alloc_hook mm/slub.c:4104 [inline] slab_alloc_node mm/slub.c:4153 [inline] kmem_cache_alloc_node_noprof+0x1d9/0x380 mm/slub.c:4205 alloc_task_struct_node kernel/fork.c:180 [inline] dup_task_struct+0x57/0x8c0 kernel/fork.c:1113 copy_process+0x5d1/0x3d50 kernel/fork.c:2225 kernel_clone+0x223/0x870 kernel/fork.c:2807 kernel_thread+0x1bc/0x240 kernel/fork.c:2869 create_kthread kernel/kthread.c:412 [inline] kthreadd+0x60d/0x810 kernel/kthread.c:767 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 Freed by task 24: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x3f/0x80 mm/kasan/common.c:68 kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:582 poison_slab_object mm/kasan/common.c:247 [inline] __kasan_slab_free+0x59/0x70 mm/kasan/common.c:264 kasan_slab_free include/linux/kasan.h:233 [inline] slab_free_hook mm/slub.c:2338 [inline] slab_free mm/slub.c:4598 [inline] kmem_cache_free+0x195/0x410 mm/slub.c:4700 put_task_struct include/linux/sched/task.h:144 [inline] delayed_put_task_struct+0x125/0x300 kernel/exit.c:227 rcu_do_batch kernel/rcu/tree.c:2567 [inline] rcu_core+0xaaa/0x17a0 kernel/rcu/tree.c:2823 handle_softirqs+0x2d4/0x9b0 kernel/softirq.c:554 run_ksoftirqd+0xca/0x130 kernel/softirq.c:943 smpboot_thread_fn+0x544/0xa30 kernel/smpboot.c:164 kthread+0x2f0/0x390 kernel/kthread.c:389 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 Last potentially related work creation: kasan_save_stack+0x3f/0x60 mm/kasan/common.c:47 __kasan_record_aux_stack+0xac/0xc0 mm/kasan/generic.c:544 __call_rcu_common kernel/rcu/tree.c:3086 [inline] call_rcu+0x167/0xa70 kernel/rcu/tree.c:3190 context_switch kernel/sched/core.c:5372 [inline] __schedule+0x1803/0x4be0 kernel/sched/core.c:6756 __schedule_loop kernel/sched/core.c:6833 [inline] schedule+0x14b/0x320 kernel/sched/core.c:6848 schedule_timeout+0xb0/0x290 kernel/time/sleep_timeout.c:75 do_wait_for_common kernel/sched/completion.c:95 [inline] __wait_for_common kernel/sched/completion.c:116 [inline] wait_for_common kernel/sched/completion.c:127 [inline] wait_for_completion+0x355/0x620 kernel/sched/completion.c:148 kthread_stop+0x19e/0x640 kernel/kthread.c:712 close_ctree+0x524/0xd60 fs/btrfs/disk-io.c:4328 generic_shutdown_super+0x139/0x2d0 fs/super.c:642 kill_anon_super+0x3b/0x70 fs/super.c:1237 btrfs_kill_super+0x41/0x50 fs/btrfs/super.c:2112 deactivate_locked_super+0xc4/0x130 fs/super.c:473 cleanup_mnt+0x41f/0x4b0 fs/namespace.c:1373 task_work_run+0x24f/0x310 kernel/task_work.c:239 ptrace_notify+0x2d2/0x380 kernel/signal.c:2503 ptrace_report_syscall include/linux/ptrace.h:415 [inline] ptrace_report_syscall_exit include/linux/ptrace.h:477 [inline] syscall_exit_work+0xc7/0x1d0 kernel/entry/common.c:173 syscall_exit_to_user_mode_prepare kernel/entry/common.c:200 [inline] __syscall_exit_to_user_mode_work kernel/entry/common.c:205 [inline] syscall_exit_to_user_mode+0x24a/0x340 kernel/entry/common.c:218 do_syscall_64+0x100/0x230 arch/x86/entry/common.c:89 entry_SYSCALL_64_after_hwframe+0x77/0x7f The buggy address belongs to the object at ffff8880259d1e00 which belongs to the cache task_struct of size 7424 The buggy address is located 2584 bytes inside of freed 7424-byte region [ffff8880259d1e00, ffff8880259d3b00) The buggy address belongs to the physical page: page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x259d0 head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 memcg:ffff88802f4b56c1 flags: 0xfff00000000040(head\|node=0\|zone=1\|lastcpupid=0x7ff) page_type: f5(slab) raw: 00fff00000000040 ffff88801bafe500 dead000000000100 dead000000000122 raw: 0000000000000000 0000000000040004 00000001f5000000 ffff88802f4b56c1 head: 00fff00000000040 ffff88801bafe500 dead000000000100 dead000000000122 head: 0000000000000000 0000000000040004 00000001f5000000 ffff88802f4b56c1 head: 00fff00000000003 ffffea0000967401 ffffffffffffffff 0000000000000000 head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected page_owner tracks the page as allocated page last allocated via order 3, migratetype Unmovable, gfp_mask 0xd20c0(__GFP_IO\|__GFP_FS\|__GFP_NOWARN\|__GFP_NORETRY\|__GFP_COMP\|__GFP_NOMEMALLOC), pid 12, tgid 12 (kworker/u8:1), ts 7328037942, free_ts 0 set_page_owner include/linux/page_owner.h:32 [inline] post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1556 prep_new_page mm/page_alloc.c:1564 [inline] get_page_from_freelist+0x3651/0x37a0 mm/page_alloc.c:3474 __alloc_pages_noprof+0x292/0x710 mm/page_alloc.c:4751 alloc_pages_mpol_noprof+0x3e8/0x680 mm/mempolicy.c:2265 alloc_slab_page+0x6a/0x140 mm/slub.c:2408 allocate_slab+0x5a/0x2f0 mm/slub.c:2574 new_slab mm/slub.c:2627 [inline] ___slab_alloc+0xcd1/0x14b0 mm/slub.c:3815 __slab_alloc+0x58/0xa0 mm/slub.c:3905 __slab_alloc_node mm/slub.c:3980 [inline] slab_alloc_node mm/slub.c:4141 [inline] kmem_cache_alloc_node_noprof+0x269/0x380 mm/slub.c:4205 alloc_task_struct_node kernel/fork.c:180 [inline] dup_task_struct+0x57/0x8c0 kernel/fork.c:1113 copy_process+0x5d1/0x3d50 kernel/fork.c:2225 kernel_clone+0x223/0x870 kernel/fork.c:2807 user_mode_thread+0x132/0x1a0 kernel/fork.c:2885 call_usermodehelper_exec_work+0x5c/0x230 kernel/umh.c:171 process_one_work kernel/workqueue.c:3229 [inline] process_scheduled_works+0xa66/0x1840 kernel/workqueue.c:3310 worker_thread+0x870/0xd30 kernel/workqueue.c:3391 page_owner free stack trace missing Memory state around the buggy address: ffff8880259d2700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8880259d2780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >ffff8880259d2800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff8880259d2880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8880259d2900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== Fix this by flushing the delalloc workers queue before stopping the cleaner kthread. Reported-by: syzbot+b7cf50a0c173770dcb14@syzkaller.appspotmail.com Link: https://lore.kernel.org/linux-btrfs/674ed7e8.050a0220.48a03.0031.GAE@google.com/ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2024-12-06 15:04:18 +01:00
Johannes Thumshirn	c7c97ceff9	btrfs: handle bio_split() errors Commit `e546fe1da9` ("block: Rework bio_split() return value") changed bio_split() so that it can return errors. Add error handling for it in btrfs_split_bio() and ultimately btrfs_submit_chunk(). As the bio is not submitted, the bio counter must be decremented to pair btrfs_bio_counter_inc_blocked(). Reviewed-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2024-12-06 15:04:13 +01:00
Qu Wenruo	c83d77eb0f	btrfs: properly wait for writeback before buffered write [BUG] Before commit `e820dbeb6a` ("btrfs: convert btrfs_buffered_write() to use folios"), function prepare_one_folio() will always wait for folio writeback to finish before returning the folio. However commit `e820dbeb6a` ("btrfs: convert btrfs_buffered_write() to use folios") changed to use FGP_STABLE to do the writeback wait, but FGP_STABLE is calling folio_wait_stable(), which only calls folio_wait_writeback() if the address space has AS_STABLE_WRITES, which is not set for btrfs inodes. This means we will not wait for the folio writeback at all. [CAUSE] The cause is FGP_STABLE is not waiting for writeback unconditionally, but only for address spaces with AS_STABLE_WRITES, normally such flag is set when the super block has SB_I_STABLE_WRITES flag. Such super block flag is set when the block device has hardware digest support or has internal checksum requirement. I'd argue btrfs should set such super block due to its default data checksum behavior, but it is not set yet, so this means FGP_STABLE flag will have no effect at all. (For NODATASUM inodes, we can skip the waiting in theory but that should be an optimization in the future.) This can lead to data checksum mismatch, as we can modify the folio while it's still under writeback, this will make the contents differ from the contents at submission and checksum calculation. [FIX] Instead of fully relying on FGP_STABLE, manually do the folio writeback waiting, until we set the address space or super flag. Fixes: `e820dbeb6a` ("btrfs: convert btrfs_buffered_write() to use folios") Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2024-12-06 15:04:07 +01:00
Michael Neuling	ea6398a5af	RISC-V: KVM: Fix csr_write -> csr_set for HVIEN PMU overflow bit This doesn't cause a problem currently as HVIEN isn't used elsewhere yet. Found by inspection. Signed-off-by: Michael Neuling <michaelneuling@tenstorrent.com> Fixes: `16b0bde9a3` ("RISC-V: KVM: Add perf sampling support for guests") Reviewed-by: Atish Patra <atishp@rivosinc.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20241127041840.419940-1-michaelneuling@tenstorrent.com Signed-off-by: Anup Patel <anup@brainfault.org>	2024-12-06 18:42:38 +05:30
Vasiliy Kovalev	1b452c2de5	ALSA: hda/realtek - Add support for ASUS Zen AIO 27 Z272SD_A272SD audio Introduces necessary quirks to enable audio functionality on the ASUS Zen AIO 27 Z272SD_A272SD: - configures verbs to activate internal speakers and headphone jack. - implements adjustments for headset microphone functionality. The speaker and jack configurations were derived from a dump of the working Windows driver, while the headset microphone functionality was fine-tuned through experimental testing. Link: https://lore.kernel.org/all/CAGGMHBOGDUnMewBTrZgoBKFk_A4sNF4fEXwfH9Ay8SNTzjy0-g@mail.gmail.com/T/ Signed-off-by: Vasiliy Kovalev <kovalev@altlinux.org> Link: https://patch.msgid.link/20241205210306.977634-1-kovalev@altlinux.org Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-12-06 13:57:25 +01:00
Hridesh MG	5a69e3d0a1	ALSA: hda/realtek: Fix headset mic on Acer Nitro 5 Add a PCI quirk to enable microphone input on the headphone jack on the Acer Nitro 5 AN515-58 laptop. Signed-off-by: Hridesh MG <hridesh699@gmail.com> Cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/20241205171843.7787-1-hridesh699@gmail.com Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-12-06 13:56:43 +01:00
Simon Trimmer	47b17ba05a	ALSA: hda: cs35l56: Remove calls to cs35l56_force_sync_asp1_registers_from_cache() Commit `5d7e328e20` ("ASoC: cs35l56: Revert support for dual-ownership of ASP registers") replaced cs35l56_force_sync_asp1_registers_from_cache() with a dummy implementation so that the HDA driver would continue to build. Remove the calls from HDA and remove the stub function. Signed-off-by: Simon Trimmer <simont@opensource.cirrus.com> Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com> Link: https://patch.msgid.link/20241206105757.718750-1-rf@opensource.cirrus.com Signed-off-by: Takashi Iwai <tiwai@suse.de>	2024-12-06 13:54:06 +01:00
Haoyu Li	f1d3334d60	wifi: cfg80211: sme: init n_channels before channels[] access With the __counted_by annocation in cfg80211_scan_request struct, the "n_channels" struct member must be set before accessing the "channels" array. Failing to do so will trigger a runtime warning when enabling CONFIG_UBSAN_BOUNDS and CONFIG_FORTIFY_SOURCE. Fixes: `e3eac9f32e` ("wifi: cfg80211: Annotate struct cfg80211_scan_request with __counted_by") Signed-off-by: Haoyu Li <lihaoyu499@gmail.com> Link: https://patch.msgid.link/20241203152049.348806-1-lihaoyu499@gmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-12-06 10:45:22 +01:00
Guenter Roeck	4e450dfd0f	tty: serial: Work around warning backtrace in serial8250_set_defaults Commit `7c7e6c8924` ("tty: serial: handle HAS_IOPORT dependencies") triggers warning backtraces on a number of platforms which don't support IO ports. WARNING: CPU: 0 PID: 0 at drivers/tty/serial/8250/8250_port.c:470 serial8250_set_defaults+0x148/0x1d8 Unsupported UART type 0 The problem is seen because serial8250_set_defaults() is called for all members of the serial8250_ports[] array even if that array is not initialized. Work around the problem by only displaying the warning if the port type is not 0 (UPIO_PORT) or if iobase is set for the port. Fixes: `7c7e6c8924` ("tty: serial: handle HAS_IOPORT dependencies") Acked-by: Arnd Bergmann <arnd@arndb.de> Cc: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Stafford Horne <shorne@gmail.com> Link: https://lore.kernel.org/r/20241205143033.2695333-1-linux@roeck-us.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-06 08:00:01 +01:00
Dan Carpenter	11776cff0b	net/mlx5: DR, prevent potential error pointer dereference The dr_domain_add_vport_cap() function generally returns NULL on error but sometimes we want it to return ERR_PTR(-EBUSY) so the caller can retry. The problem here is that "ret" can be either -EBUSY or -ENOMEM and if it's and -ENOMEM then the error pointer is propogated back and eventually dereferenced in dr_ste_v0_build_src_gvmi_qpn_tag(). Fixes: `11a45def2e` ("net/mlx5: DR, Add support for SF vports") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/07477254-e179-43e2-b1b3-3b9db4674195@stanley.mountain Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-05 18:45:09 -08:00
Eric Dumazet	b04d86fff6	tipc: fix NULL deref in cleanup_bearer() syzbot found [1] that after blamed commit, ub->ubsock->sk was NULL when attempting the atomic_dec() : atomic_dec(&tipc_net(sock_net(ub->ubsock->sk))->wq_count); Fix this by caching the tipc_net pointer. [1] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000006: 0000 [#1] PREEMPT SMP KASAN PTI KASAN: null-ptr-deref in range [0x0000000000000030-0x0000000000000037] CPU: 0 UID: 0 PID: 5896 Comm: kworker/0:3 Not tainted 6.13.0-rc1-next-20241203-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 Workqueue: events cleanup_bearer RIP: 0010:read_pnet include/net/net_namespace.h:387 [inline] RIP: 0010:sock_net include/net/sock.h:655 [inline] RIP: 0010:cleanup_bearer+0x1f7/0x280 net/tipc/udp_media.c:820 Code: 18 48 89 d8 48 c1 e8 03 42 80 3c 28 00 74 08 48 89 df e8 3c f7 99 f6 48 8b 1b 48 83 c3 30 e8 f0 e4 60 00 48 89 d8 48 c1 e8 03 <42> 80 3c 28 00 74 08 48 89 df e8 1a f7 99 f6 49 83 c7 e8 48 8b 1b RSP: 0018:ffffc9000410fb70 EFLAGS: 00010206 RAX: 0000000000000006 RBX: 0000000000000030 RCX: ffff88802fe45a00 RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffffc9000410f900 RBP: ffff88807e1f0908 R08: ffffc9000410f907 R09: 1ffff92000821f20 R10: dffffc0000000000 R11: fffff52000821f21 R12: ffff888031d19980 R13: dffffc0000000000 R14: dffffc0000000000 R15: ffff88807e1f0918 FS: 0000000000000000(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000556ca050b000 CR3: 0000000031c0c000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Fixes: `6a2fa13312` ("tipc: Fix use-after-free of kernel socket in cleanup_bearer().") Reported-by: syzbot+46aa5474f179dacd1a3b@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/67508b5f.050a0220.17bd51.0070.GAE@google.com/T/#u Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20241204170548.4152658-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-05 17:36:22 -08:00
Namhyung Kim	c33aea446b	perf tools: Fix precise_ip fallback logic Sometimes it returns other than EOPNOTSUPP for invalid precise_ip so it cannot check the error code. Let's move the fallback after the missing feature checks so that it can handle EINVAL as well. This also aligns well with the existing behavior which blindly turns off the precise_ip but we check the missing features correctly now. Fixes: `af954f76ee` ("perf tools: Check fallback error and order") Reported-by: kernel test robot <oliver.sang@intel.com> Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Closes: https://lore.kernel.org/oe-lkp/202411301431.799e5531-lkp@intel.com Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/r/Z1DV0lN8qHSysX7f@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2024-12-05 15:15:29 -08:00
Remi Pommarel	fff8f17c1a	batman-adv: Do not let TT changes list grows indefinitely When TT changes list is too big to fit in packet due to MTU size, an empty OGM is sent expected other node to send TT request to get the changes. The issue is that tt.last_changeset was not built thus the originator was responding with previous changes to those TT requests (see batadv_send_my_tt_response). Also the changes list was never cleaned up effectively never ending growing from this point onwards, repeatedly sending the same TT response changes over and over, and creating a new empty OGM every OGM interval expecting for the local changes to be purged. When there is more TT changes that can fit in packet, drop all changes, send empty OGM and wait for TT request so we can respond with a full table instead. Fixes: `e1bf0c1409` ("batman-adv: tvlv - convert tt data sent within OGMs") Signed-off-by: Remi Pommarel <repk@triplefau.lt> Acked-by: Antonio Quartulli <Antonio@mandelbit.com> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>	2024-12-05 22:38:26 +01:00
Remi Pommarel	8038806db6	batman-adv: Remove uninitialized data in full table TT response The number of entries filled by batadv_tt_tvlv_generate() can be less than initially expected in batadv_tt_prepare_tvlv_{global,local}_data() (changes can be removed by batadv_tt_local_event() in ADD+DEL sequence in the meantime as the lock held during the whole tvlv global/local data generation). Thus tvlv_len could be bigger than the actual TT entry size that need to be sent so full table TT_RESPONSE could hold invalid TT entries such as below. * 00:00:00:00:00:00 -1 [....] ( 0) 88:12:4e:ad:7e:ba (179) (0x45845380) * 00:00:00:00:78:79 4092 [.W..] ( 0) 88:12:4e:ad:7e:3c (145) (0x8ebadb8b) Remove the extra allocated space to avoid sending uninitialized entries for full table TT_RESPONSE in both batadv_send_other_tt_response() and batadv_send_my_tt_response(). Fixes: `7ea7b4a142` ("batman-adv: make the TT CRC logic VLAN specific") Signed-off-by: Remi Pommarel <repk@triplefau.lt> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>	2024-12-05 22:38:26 +01:00
Remi Pommarel	f2f7358c38	batman-adv: Do not send uninitialized TT changes The number of TT changes can be less than initially expected in batadv_tt_tvlv_container_update() (changes can be removed by batadv_tt_local_event() in ADD+DEL sequence between reading tt_diff_entries_num and actually iterating the change list under lock). Thus tt_diff_len could be bigger than the actual changes size that need to be sent. Because batadv_send_my_tt_response sends the whole packet, uninitialized data can be interpreted as TT changes on other nodes leading to weird TT global entries on those nodes such as: * 00:00:00:00:00:00 -1 [....] ( 0) 88:12:4e:ad:7e:ba (179) (0x45845380) * 00:00:00:00:78:79 4092 [.W..] ( 0) 88:12:4e:ad:7e:3c (145) (0x8ebadb8b) All of the above also applies to OGM tvlv container buffer's tvlv_len. Remove the extra allocated space to avoid sending uninitialized TT changes in batadv_send_my_tt_response() and batadv_v_ogm_send_softif(). Fixes: `e1bf0c1409` ("batman-adv: tvlv - convert tt data sent within OGMs") Signed-off-by: Remi Pommarel <repk@triplefau.lt> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>	2024-12-05 22:38:26 +01:00
Prike Liang	5c3de6b02d	drm/amdkfd: Correct the migration DMA map direction The SVM DMA device map direction should be set the same as the DMA unmap setting, otherwise the DMA core will report the following warning. Before finialize this solution, there're some discussion on the DMA mapping type(stream-based or coherent) in this KFD migration case, followed by https://lore.kernel.org/all/04d4ab32 -45a1-4b88-86ee-fb0f35a0ca40@amd.com/T/. As there's no dma_sync_single_for_*() in the DMA buffer accessed that because this migration operation should be sync properly and automatically. Give that there's might not be a performance problem in various cache sync policy of DMA sync. Therefore, in order to simplify the DMA direction setting alignment, let's set the DMA map direction as BIDIRECTIONAL. [ 150.834218] WARNING: CPU: 8 PID: 1812 at kernel/dma/debug.c:1028 check_unmap+0x1cc/0x930 [ 150.834225] Modules linked in: amdgpu(OE) amdxcp drm_exec(OE) gpu_sched drm_buddy(OE) drm_ttm_helper(OE) ttm(OE) drm_suballoc_helper(OE) drm_display_helper(OE) drm_kms_helper(OE) i2c_algo_bit rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace netfs xt_conntrack xt_MASQUERADE nf_conntrack_netlink xfrm_user xfrm_algo iptable_nat xt_addrtype iptable_filter br_netfilter nvme_fabrics overlay nfnetlink_cttimeout nfnetlink openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c bridge stp llc sch_fq_codel intel_rapl_msr amd_atl intel_rapl_common snd_hda_codec_realtek snd_hda_codec_generic snd_hda_scodec_component snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg edac_mce_amd snd_pci_acp6x snd_hda_codec snd_acp_config snd_hda_core snd_hwdep snd_soc_acpi kvm_amd sunrpc snd_pcm kvm binfmt_misc snd_seq_midi crct10dif_pclmul snd_seq_midi_event ghash_clmulni_intel sha512_ssse3 snd_rawmidi nls_iso8859_1 sha256_ssse3 sha1_ssse3 snd_seq aesni_intel snd_seq_device crypto_simd snd_timer cryptd input_leds [ 150.834310] wmi_bmof serio_raw k10temp rapl snd sp5100_tco ipmi_devintf soundcore ccp ipmi_msghandler cm32181 industrialio mac_hid msr parport_pc ppdev lp parport efi_pstore drm(OE) ip_tables x_tables pci_stub crc32_pclmul nvme ahci libahci i2c_piix4 r8169 nvme_core i2c_designware_pci realtek i2c_ccgx_ucsi video wmi hid_generic cdc_ether usbnet usbhid hid r8152 mii [ 150.834354] CPU: 8 PID: 1812 Comm: rocrtst64 Tainted: G OE 6.10.0-custom #492 [ 150.834358] Hardware name: AMD Majolica-RN/Majolica-RN, BIOS RMJ1009A 06/13/2021 [ 150.834360] RIP: 0010:check_unmap+0x1cc/0x930 [ 150.834363] Code: c0 4c 89 4d c8 e8 34 bf 86 00 4c 8b 4d c8 4c 8b 45 c0 48 8b 4d b8 48 89 c6 41 57 4c 89 ea 48 c7 c7 80 49 b4 84 e8 b4 81 f3 ff <0f> 0b 48 c7 c7 04 83 ac 84 e8 76 ba fc ff 41 8b 76 4c 49 8d 7e 50 [ 150.834365] RSP: 0018:ffffaac5023739e0 EFLAGS: 00010086 [ 150.834368] RAX: 0000000000000000 RBX: ffffffff8566a2e0 RCX: 0000000000000027 [ 150.834370] RDX: ffff8f6a8f621688 RSI: 0000000000000001 RDI: ffff8f6a8f621680 [ 150.834372] RBP: ffffaac502373a30 R08: 00000000000000c9 R09: ffffaac502373850 [ 150.834373] R10: ffffaac502373848 R11: ffffffff84f46328 R12: ffffaac502373a40 [ 150.834375] R13: ffff8f6741045330 R14: ffff8f6741a77700 R15: ffffffff84ac831b [ 150.834377] FS: 00007faf0fc94c00(0000) GS:ffff8f6a8f600000(0000) knlGS:0000000000000000 [ 150.834379] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 150.834381] CR2: 00007faf0b600020 CR3: 000000010a52e000 CR4: 0000000000350ef0 [ 150.834383] Call Trace: [ 150.834385] <TASK> [ 150.834387] ? show_regs+0x6d/0x80 [ 150.834393] ? __warn+0x8c/0x140 [ 150.834397] ? check_unmap+0x1cc/0x930 [ 150.834400] ? report_bug+0x193/0x1a0 [ 150.834406] ? handle_bug+0x46/0x80 [ 150.834410] ? exc_invalid_op+0x1d/0x80 [ 150.834413] ? asm_exc_invalid_op+0x1f/0x30 [ 150.834420] ? check_unmap+0x1cc/0x930 [ 150.834425] debug_dma_unmap_page+0x86/0x90 [ 150.834431] ? srso_return_thunk+0x5/0x5f [ 150.834435] ? rmap_walk+0x28/0x50 [ 150.834438] ? srso_return_thunk+0x5/0x5f [ 150.834441] ? remove_migration_ptes+0x79/0x80 [ 150.834445] ? srso_return_thunk+0x5/0x5f [ 150.834448] dma_unmap_page_attrs+0xfa/0x1d0 [ 150.834453] svm_range_dma_unmap_dev+0x8a/0xf0 [amdgpu] [ 150.834710] svm_migrate_ram_to_vram+0x361/0x740 [amdgpu] [ 150.834914] svm_migrate_to_vram+0xa8/0xe0 [amdgpu] [ 150.835111] svm_range_set_attr+0xff2/0x1450 [amdgpu] [ 150.835311] svm_ioctl+0x4a/0x50 [amdgpu] [ 150.835510] kfd_ioctl_svm+0x54/0x90 [amdgpu] [ 150.835701] kfd_ioctl+0x3c2/0x530 [amdgpu] [ 150.835888] ? __pfx_kfd_ioctl_svm+0x10/0x10 [amdgpu] [ 150.836075] ? srso_return_thunk+0x5/0x5f [ 150.836080] ? tomoyo_file_ioctl+0x20/0x30 [ 150.836086] __x64_sys_ioctl+0x9c/0xd0 [ 150.836091] x64_sys_call+0x1219/0x20d0 [ 150.836095] do_syscall_64+0x51/0x120 [ 150.836098] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 150.836102] RIP: 0033:0x7faf0f11a94f [ 150.836105] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <41> 89 c0 3d 00 f0 ff ff 77 1f 48 8b 44 24 18 64 48 2b 04 25 28 00 [ 150.836107] RSP: 002b:00007ffeced26bc0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 150.836110] RAX: ffffffffffffffda RBX: 000055c683528fb0 RCX: 00007faf0f11a94f [ 150.836112] RDX: 00007ffeced26c60 RSI: 00000000c0484b20 RDI: 0000000000000003 [ 150.836114] RBP: 00007ffeced26c50 R08: 0000000000000000 R09: 0000000000000001 [ 150.836115] R10: 0000000000000032 R11: 0000000000000246 R12: 000055c683528bd0 [ 150.836117] R13: 0000000000000000 R14: 0000000000000021 R15: 0000000000000000 [ 150.836122] </TASK> [ 150.836124] ---[ end trace 0000000000000000 ]--- Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-12-05 14:31:54 -05:00
Kenneth Feng	3912a78cf7	drm/amd/pm: Set SMU v13.0.7 default workload type Set the default workload type to bootup type on smu v13.0.7. This is because of the constraint on smu v13.0.7. Gfx activity has an even higher set point on 3D fullscreen mode than the one on bootup mode. This causes the 3D fullscreen mode's performance is worse than the bootup mode's performance for the lightweighted/medium workload. For the high workload, the performance is the same between 3D fullscreen mode and bootup mode. v2: set the default workload in ASIC specific file Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.11.x	2024-12-05 14:31:54 -05:00
Lijo Lazar	8eb966f240	drm/amd/pm: Initialize power profile mode Refactor such that individual SMU IP versions can choose the startup power profile mode. If no preference, then use the generic default power profile selection logic. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.11.x	2024-12-05 14:31:14 -05:00
David (Ming Qiang) Wu	47f402a3e0	amdgpu/uvd: get ring reference from rq scheduler base.sched may not be set for each instance and should not be used for cases such as non-IB tests. Fixes: `2320c9e6a7` ("drm/sched: memset() 'job' in drm_sched_job_init()") Signed-off-by: David (Ming Qiang) Wu <David.Wu3@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-12-05 14:15:22 -05:00
Christian König	12f325bcd2	drm/amdgpu: fix UVD contiguous CS mapping problem When starting the mpv player, Radeon R9 users are observing the below error in dmesg. [drm:amdgpu_uvd_cs_pass2 [amdgpu]] ERROR msg/fb buffer ff00f7c000-ff00f7e000 out of 256MB segment! The patch tries to set the TTM_PL_FLAG_CONTIGUOUS for both user flag(AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS) set and not set cases. v2: Make the TTM_PL_FLAG_CONTIGUOUS mandatory for user BO's. v3: revert back to v1, but fix the check instead (chk). Closes:https://gitlab.freedesktop.org/drm/amd/-/issues/3599 Closes:https://gitlab.freedesktop.org/drm/amd/-/issues/3501 Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.10+	2024-12-05 14:14:53 -05:00
Victor Zhao	9a4ab400f1	drm/amdgpu: use sjt mec fw on gfx943 for sriov Use second jump table in sriov for live migration or mulitple VF support so different VF can load different version of MEC as long as they support sjt Signed-off-by: Victor Zhao <Victor.Zhao@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-12-05 14:14:40 -05:00
Pratap Nirujogi	9f4ddfdc2c	Revert "drm/amdgpu: Fix ISP hw init issue" This reverts commit `274e3f4596`. Additional review comments to address. Will resubmit. Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Pratap Nirujogi <pratap.nirujogi@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-12-05 14:14:11 -05:00
Yishai Hadas	9c7c5430bc	vfio/mlx5: Align the page tracking max message size with the device capability Align the page tracking maximum message size with the device's capability instead of relying on PAGE_SIZE. This adjustment resolves a mismatch on systems where PAGE_SIZE is 64K, but the firmware only supports a maximum message size of 4K. Now that we rely on the device's capability for max_message_size, we must account for potential future increases in its value. Key considerations include: - Supporting message sizes that exceed a single system page (e.g., an 8K message on a 4K system). - Ensuring the RQ size is adjusted to accommodate at least 4 WQEs/messages, in line with the device specification. The above has been addressed as part of the patch. Fixes: `79c3cf2799` ("vfio/mlx5: Init QP based resources for dirty tracking") Reviewed-by: Cédric Le Goater <clg@redhat.com> Tested-by: Yingshun Cui <yicui@redhat.com> Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Link: https://lore.kernel.org/r/20241205122654.235619-1-yishaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2024-12-05 11:56:01 -07:00
Valentina Fernandez	48808b55b0	firmware: microchip: fix UL_IAP lock check in mpfs_auto_update_state() To verify that Auto Update is possible, the mpfs_auto_update_state() function performs a "Query Security Service Request" to the system controller. Previously, the check was performed on the first element of the response message, which was accessed using a 32-bit pointer. This caused the bitwise operation to reference incorrect data, as the response should be inspected at the byte level. Fixed this by casting the response to a u8 * pointer, ensuring the check correctly inspects the appropriate byte of the response message. Additionally, rename "UL_Auto Update" to "UL_IAP" to match the PolarFire Family System Services User Guide. Signed-off-by: Valentina Fernandez <valentina.fernandezalanis@microchip.com> Signed-off-by: Conor Dooley <conor.dooley@microchip.com>	2024-12-05 15:08:51 +00:00
Aneesh Kumar K.V (Arm)	4f776d81bf	arm64: dts: fvp: Update PCIe bus-range property These days, the Fixed Virtual Platforms(FVP) Base RevC model supports more PCI devices. Update the max bus number so that Linux can enumerate them correctly. Without this, the kernel throws the below error while booting with the default hierarchy \| pci_bus 0000:01: busn_res: [bus 01] end is updated to 01 \| pci_bus 0000:02: busn_res: can not insert [bus 02-01] under \| [bus 00-01] (conflicts with (null) [bus 00-01]) \| pci_bus 0000:02: busn_res: [bus 02-01] end is updated to 02 \| pci_bus 0000:02: busn_res: can not insert [bus 02] under \| [bus 00-01] (conflicts with (null) [bus 00-01]) \| pci_bus 0000:03: busn_res: can not insert [bus 03-01] under \| [bus 00-01] (conflicts with (null) [bus 00-01]) \| pci_bus 0000:03: busn_res: [bus 03-01] end is updated to 03 \| pci_bus 0000:03: busn_res: can not insert [bus 03] under \| [bus 00-01] (conflicts with (null) [bus 00-01]) \| pci_bus 0000:04: busn_res: can not insert [bus 04-01] under \| [bus 00-01] (conflicts with (null) [bus 00-01]) \| pci_bus 0000:04: busn_res: [bus 04-01] end is updated to 04 \| pci_bus 0000:04: busn_res: can not insert [bus 04] under \| [bus 00-01] (conflicts with (null) [bus 00-01]) \| pci 0000:00:01.0: BAR 14: assigned [mem 0x50000000-0x500fffff] \| pci-host-generic 40000000.pci: ECAM at [mem 0x40000000-0x4fffffff] \| for [bus 00-01] The change is using 0xff as max bus number because the ECAM window is 256MB in size. Below is the lspci output with and without the change: without fix =========== \| 00:00.0 Host bridge: ARM Device 00ba (rev 01) \| 00:01.0 PCI bridge: ARM Device 0def \| 00:02.0 PCI bridge: ARM Device 0def \| 00:03.0 PCI bridge: ARM Device 0def \| 00:04.0 PCI bridge: ARM Device 0def \| 00:1e.0 Unassigned class [ff00]: ARM Device ff80 \| 00:1e.1 Unassigned class [ff00]: ARM Device ff80 \| 00:1f.0 SATA controller: Device 0abc:aced (rev 01) \| 01:00.0 SATA controller: Device 0abc:aced (rev 01) with fix ======== \| 00:00.0 Host bridge: ARM Device 00ba (rev 01) \| 00:01.0 PCI bridge: ARM Device 0def \| 00:02.0 PCI bridge: ARM Device 0def \| 00:03.0 PCI bridge: ARM Device 0def \| 00:04.0 PCI bridge: ARM Device 0def \| 00:1e.0 Unassigned class [ff00]: ARM Device ff80 \| 00:1e.1 Unassigned class [ff00]: ARM Device ff80 \| 00:1f.0 SATA controller: Device 0abc:aced (rev 01) \| 01:00.0 SATA controller: Device 0abc:aced (rev 01) \| 02:00.0 Unassigned class [ff00]: ARM Device ff80 \| 02:00.4 Unassigned class [ff00]: ARM Device ff80 \| 03:00.0 PCI bridge: ARM Device 0def \| 04:00.0 PCI bridge: ARM Device 0def \| 04:01.0 PCI bridge: ARM Device 0def \| 04:02.0 PCI bridge: ARM Device 0def \| 05:00.0 SATA controller: Device 0abc:aced (rev 01) \| 06:00.0 Unassigned class [ff00]: ARM Device ff80 \| 06:00.7 Unassigned class [ff00]: ARM Device ff80 \| 07:00.0 Unassigned class [ff00]: ARM Device ff80 \| 07:00.3 Unassigned class [ff00]: ARM Device ff80 \| 08:00.0 Unassigned class [ff00]: ARM Device ff80 \| 08:00.1 Unassigned class [ff00]: ARM Device ff80 Cc: Sudeep Holla <sudeep.holla@arm.com> Cc: Lorenzo Pieralisi <lpieralisi@kernel.org> Cc: Rob Herring <robh@kernel.org> Cc: Krzysztof Kozlowski <krzk+dt@kernel.org> Cc: Conor Dooley <conor+dt@kernel.org> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Signed-off-by: Aneesh Kumar K.V (Arm) <aneesh.kumar@kernel.org> Message-Id: <20241128152543.1821878-1-aneesh.kumar@kernel.org> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>	2024-12-05 10:28:26 +00:00
Roger Quadros	140054a25f	mtd: rawnand: omap2: Fix build warnings with W=1 Add kernel-doc for functions to get rid of below warnings when built with W=1. drivers/mtd/nand/raw/omap2.c:260: warning: Function parameter or struct member 'chip' not described in 'omap_nand_data_in_pref' drivers/mtd/nand/raw/omap2.c:260: warning: Function parameter or struct member 'buf' not described in 'omap_nand_data_in_pref' drivers/mtd/nand/raw/omap2.c:260: warning: Function parameter or struct member 'len' not described in 'omap_nand_data_in_pref' drivers/mtd/nand/raw/omap2.c:260: warning: Function parameter or struct member 'force_8bit' not described in 'omap_nand_data_in_pref' drivers/mtd/nand/raw/omap2.c:304: warning: Function parameter or struct member 'chip' not described in 'omap_nand_data_out_pref' drivers/mtd/nand/raw/omap2.c:304: warning: Function parameter or struct member 'buf' not described in 'omap_nand_data_out_pref' drivers/mtd/nand/raw/omap2.c:304: warning: Function parameter or struct member 'len' not described in 'omap_nand_data_out_pref' drivers/mtd/nand/raw/omap2.c:304: warning: Function parameter or struct member 'force_8bit' not described in 'omap_nand_data_out_pref' drivers/mtd/nand/raw/omap2.c:446: warning: Function parameter or struct member 'chip' not described in 'omap_nand_data_in_dma_pref' drivers/mtd/nand/raw/omap2.c:446: warning: Function parameter or struct member 'buf' not described in 'omap_nand_data_in_dma_pref' drivers/mtd/nand/raw/omap2.c:446: warning: Function parameter or struct member 'len' not described in 'omap_nand_data_in_dma_pref' drivers/mtd/nand/raw/omap2.c:446: warning: Function parameter or struct member 'force_8bit' not described in 'omap_nand_data_in_dma_pref' drivers/mtd/nand/raw/omap2.c:467: warning: Function parameter or struct member 'chip' not described in 'omap_nand_data_out_dma_pref' drivers/mtd/nand/raw/omap2.c:467: warning: Function parameter or struct member 'buf' not described in 'omap_nand_data_out_dma_pref' drivers/mtd/nand/raw/omap2.c:467: warning: Function parameter or struct member 'len' not described in 'omap_nand_data_out_dma_pref' drivers/mtd/nand/raw/omap2.c:467: warning: Function parameter or struct member 'force_8bit' not described in 'omap_nand_data_out_dma_pref' Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202412031716.JfNIh1Uu-lkp@intel.com/ Signed-off-by: Roger Quadros <rogerq@kernel.org> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>	2024-12-05 11:15:00 +01:00
Maciej Andrzejewski	11e6831fd8	mtd: rawnand: arasan: Fix missing de-registration of NAND The NAND chip-selects are registered for the Arasan driver during initialization but are not de-registered when the driver is unloaded. As a result, if the driver is loaded again, the chip-selects remain registered and busy, making them unavailable for use. Fixes: `197b88fecc` ("mtd: rawnand: arasan: Add new Arasan NAND controller") Cc: stable@vger.kernel.org Signed-off-by: Maciej Andrzejewski ICEYE <maciej.andrzejewski@m-works.net> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>	2024-12-05 11:13:52 +01:00
Maciej Andrzejewski	b086a46dae	mtd: rawnand: arasan: Fix double assertion of chip-select When two chip-selects are configured in the device tree, and the second is a non-native GPIO, both the GPIO-based chip-select and the first native chip-select may be asserted simultaneously. This double assertion causes incorrect read and write operations. The issue occurs because when nfc->ncs <= 2, nfc->spare_cs is always initialized to 0 due to static initialization. Consequently, when the second chip-select (GPIO-based) is selected in anfc_assert_cs(), it is detected by anfc_is_gpio_cs(), and nfc->native_cs is assigned the value 0. This results in both the GPIO-based chip-select being asserted and the NAND controller register receiving 0, erroneously selecting the native chip-select. This patch resolves the issue, as confirmed by oscilloscope testing with configurations involving two or more chip-selects in the device tree. Fixes: `acbd3d0945` ("mtd: rawnand: arasan: Leverage additional GPIO CS") Cc: stable@vger.kernel.org Signed-off-by: Maciej Andrzejewski <maciej.andrzejewski@m-works.net> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>	2024-12-05 11:13:41 +01:00
Zichen Xie	9b458e8be0	mtd: diskonchip: Cast an operand to prevent potential overflow There may be a potential integer overflow issue in inftl_partscan(). parts[0].size is defined as "uint64_t" while mtd->erasesize and ip->firstUnit are defined as 32-bit unsigned integer. The result of the calculation will be limited to 32 bits without correct casting. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Zichen Xie <zichenxie0106@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>	2024-12-05 11:09:12 +01:00
Dan Carpenter	d8e4771f99	mtd: rawnand: fix double free in atmel_pmecc_create_user() The "user" pointer was converted from being allocated with kzalloc() to being allocated by devm_kzalloc(). Calling kfree(user) will lead to a double free. Fixes: `6d734f1bfc` ("mtd: rawnand: atmel: Fix possible memory leak") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>	2024-12-05 11:06:43 +01:00
Kalesh AP	d507d29bfd	RDMA/bnxt_re: Don't fail destroy QP and cleanup debugfs earlier Change bnxt_re_destroy_qp to always return 0 and don't fail in case of error during destroy. In addition, delete debugfs QP to earlier stage. Fixes: `d7d54769c0` ("RDMA/bnxt_re: Add debugfs hook in the driver") Reviewed-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/20241204075416.478431-6-kalesh-anakkur.purayil@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-12-05 03:59:44 -05:00
Kashyap Desai	064c22408a	RDMA/bnxt_re: Avoid sending the modify QP workaround for latest adapters The workaround to modify the UD QP from RTS to RTS is required only for older adapters. Issuing this for latest adapters can caus some unexpected behavior. Fix it Fixes: `1801d87b35` ("RDMA/bnxt_re: Support new 5760X P7 devices") Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/20241204075416.478431-4-kalesh-anakkur.purayil@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-12-05 03:57:52 -05:00
Selvin Xavier	5effcacc8a	RDMA/bnxt_re: Avoid initializing the software queue for user queues Software Queues to hold the WRs needs to be created for only kernel queues. Avoid allocating the unnecessary memory for user Queues. Fixes: `1ac5a40479` ("RDMA/bnxt_re: Add bnxt_re RoCE driver") Fixes: `159fb4ceac` ("RDMA/bnxt_re: introduce a function to allocate swq") Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/20241204075416.478431-3-kalesh-anakkur.purayil@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-12-05 03:57:52 -05:00
Kashyap Desai	79d330fbdf	RDMA/bnxt_re: Fix max SGEs for the Work Request Gen P7 supports up to 13 SGEs for now. WQE software structure can hold only 6 now. Since the max send sge is reported as 13, the stack can give requests up to 13 SGEs. This is causing traffic failures and system crashes. Use the define for max SGE supported for variable size. This will work for both static and variable WQEs. Fixes: `227f51743b` ("RDMA/bnxt_re: Fix the max WQE size for static WQE support") Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/20241204075416.478431-2-kalesh-anakkur.purayil@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-12-05 03:57:52 -05:00
Patrisious Haddad	e05feab22f	RDMA/mlx5: Enforce same type port association for multiport RoCE Different core device types such as PFs and VFs shouldn't be affiliated together since they have different capabilities, fix that by enforcing type check before doing the affiliation. Fixes: `32f69e4be2` ("{net, IB}/mlx5: Manage port association for multiport RoCE") Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Link: https://patch.msgid.link/88699500f690dff1c1852c1ddb71f8a1cc8b956e.1733233480.git.leonro@nvidia.com Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-12-05 03:35:11 -05:00
Namhyung Kim	968121f0a6	perf tools: Fix build error on generated/fs_at_flags_array.c It should only have generic flags in the array but the recent header sync brought a new flags to fcntl.h and caused a build error. Let's update the shell script to exclude flags specific to name_to_handle_at(). CC trace/beauty/fs_at_flags.o In file included from trace/beauty/fs_at_flags.c:21: tools/perf/trace/beauty/generated/fs_at_flags_array.c:13:30: error: initialized field overwritten [-Werror=override-init] 13 \| [ilog2(0x002) + 1] = "HANDLE_CONNECTABLE", \| ^~~~~~~~~~~~~~~~~~~~ tools/perf/trace/beauty/generated/fs_at_flags_array.c:13:30: note: (near initialization for ‘fs_at_flags[2]’) Reviewed-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20241203035349.1901262-12-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2024-12-04 14:34:50 -08:00
Namhyung Kim	c994ac74cc	tools headers: Sync uapi/linux/prctl.h with the kernel sources To pick up the changes in this cset: `09d6775f50` riscv: Add support for userspace pointer masking `91e102e797` prctl: arch-agnostic prctl for shadow stack This addresses these perf build warnings: Warning: Kernel ABI header differences: diff -u tools/perf/trace/beauty/include/uapi/linux/prctl.h include/uapi/linux/prctl.h Please see tools/include/uapi/README for further details. Reviewed-by: James Clark <james.clark@linaro.org> Cc: Mark Brown <broonie@kernel.org> Cc: Palmer Dabbelt <palmer@rivosinc.com> Link: https://lore.kernel.org/r/20241203035349.1901262-11-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2024-12-04 14:34:50 -08:00
Namhyung Kim	02116fcfd8	tools headers: Sync uapi/linux/mount.h with the kernel sources To pick up the changes in this cset: `aefff51e1c` statmount: retrieve security mount options `2f4d4503e9` statmount: add flag to retrieve unescaped options `44010543fc` fs: add the ability for statmount() to report the sb_source `ed9d95f691` fs: add the ability for statmount() to report the fs_subtype This addresses these perf build warnings: Warning: Kernel ABI header differences: diff -u tools/perf/trace/beauty/include/uapi/linux/mount.h include/uapi/linux/mount.h Please see tools/include/uapi/README for further details. Reviewed-by: James Clark <james.clark@linaro.org> Cc: Christian Brauner <brauner@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20241203035349.1901262-10-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2024-12-04 14:34:50 -08:00
Namhyung Kim	6d442c69cb	tools headers: Sync uapi/linux/fcntl.h with the kernel sources To pick up the changes in this cset: `c374196b2b` ("fs: name_to_handle_at() support for "explicit connectable" file handles") `95f567f81e` ("fs: Simplify getattr interface function checking AT_GETATTR_NOSEC flag") This addresses these perf build warnings: Warning: Kernel ABI header differences: diff -u tools/perf/trace/beauty/include/uapi/linux/fcntl.h include/uapi/linux/fcntl.h Please see tools/include/uapi/README for further details. Reviewed-by: James Clark <james.clark@linaro.org> Cc: Jeff Layton <jlayton@kernel.org> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: Alexander Aring <alex.aring@gmail.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20241203035349.1901262-9-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2024-12-04 14:34:50 -08:00
Namhyung Kim	3cef7d8b12	tools headers: Sync uapi/asm-generic/mman.h with the kernel sources To pick up the changes in this cset: `3630e82ab6` ("mman: Add map_shadow_stack() flags") This addresses these perf build warnings: Warning: Kernel ABI header differences: diff -u tools/include/uapi/asm-generic/mman.h include/uapi/asm-generic/mman.h Please see tools/include/uapi/README for further details. Reviewed-by: James Clark <james.clark@linaro.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Mark Brown <broonie@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: linux-arch@vger.kernel.org Link: https://lore.kernel.org/r/20241203035349.1901262-8-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2024-12-04 14:34:50 -08:00
Namhyung Kim	81b483f722	tools headers: Sync xattrat syscall changes with the kernel sources To pick up the changes in this cset: `6140be90ec` ("fs/xattr: add at family syscalls") This addresses these perf build warnings: Warning: Kernel ABI header differences: diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h diff -u tools/perf/arch/x86/entry/syscalls/syscall_32.tbl arch/x86/entry/syscalls/syscall_32.tbl diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl diff -u tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl arch/mips/kernel/syscalls/syscall_n64.tbl The arm64 changes are not included as it requires more changes in the tools. It'll be worked for the later cycle. Please see tools/include/uapi/README for further details. Reviewed-by: James Clark <james.clark@linaro.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Brauner <brauner@kernel.org> CC: x86@kernel.org CC: linux-mips@vger.kernel.org CC: linuxppc-dev@lists.ozlabs.org CC: linux-s390@vger.kernel.org Link: https://lore.kernel.org/r/20241203035349.1901262-7-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2024-12-04 14:34:50 -08:00
Namhyung Kim	76e2319979	tools headers: Sync arm64 kvm header with the kernel sources To pick up the changes in this cset: `97413cea1c` ("KVM: arm64: Add PSCI v1.3 SYSTEM_OFF2 function for hibernation") This addresses these perf build warnings: Warning: Kernel ABI header differences: diff -u tools/arch/arm64/include/uapi/asm/kvm.h arch/arm64/include/uapi/asm/kvm.h Please see tools/include/uapi/README for further details. Reviewed-by: James Clark <james.clark@linaro.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: kvmarm@lists.linux.dev Link: https://lore.kernel.org/r/20241203035349.1901262-6-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2024-12-04 14:34:49 -08:00
Namhyung Kim	e120829dbf	tools headers: Sync x86 kvm and cpufeature headers with the kernel To pick up the changes in this cset: `a0423af92c` ("x86: KVM: Advertise CPUIDs for new instructions in Clearwater Forest") `0c487010cb` ("x86/cpufeatures: Add X86_FEATURE_AMD_WORKLOAD_CLASS feature bit") `1ad4667066` ("x86/cpufeatures: Add X86_FEATURE_AMD_HETEROGENEOUS_CORES") `104edc6efc` ("x86/cpufeatures: Rename X86_FEATURE_FAST_CPPC to have AMD prefix") `3ea87dfa31` ("x86/cpufeatures: Add a IBPB_NO_RET BUG flag") `ff898623af` ("x86/cpufeatures: Define X86_FEATURE_AMD_IBPB_RET") `dcb988cdac` ("KVM: x86: Quirk initialization of feature MSRs to KVM's max configuration") This addresses these perf build warnings: Warning: Kernel ABI header differences: diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h Please see tools/include/uapi/README for further details. Reviewed-by: James Clark <james.clark@linaro.org> Cc: Sean Christopherson <seanjc@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: x86@kernel.org Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20241203035349.1901262-5-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2024-12-04 14:34:49 -08:00
Namhyung Kim	e2064b7c5d	tools headers: Sync uapi/linux/kvm.h with the kernel sources To pick up the changes in this cset: `e785dfacf7` ("LoongArch: KVM: Add PCHPIC device support") `2e8b9df826` ("LoongArch: KVM: Add EIOINTC device support") `c532de5a67` ("LoongArch: KVM: Add IPI device support") This addresses these perf build warnings: Warning: Kernel ABI header differences: diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h Please see tools/include/uapi/README for further details. Reviewed-by: James Clark <james.clark@linaro.org> Cc: Sean Christopherson <seanjc@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Tianrui Zhao <zhaotianrui@loongson.cn> Cc: Bibo Mao <maobibo@loongson.cn> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: kvm@vger.kernel.org Cc: loongarch@lists.linux.dev Link: https://lore.kernel.org/r/20241203035349.1901262-4-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2024-12-04 14:34:49 -08:00
Namhyung Kim	5229df8fb6	tools headers: Sync uapi/linux/perf_event.h with the kernel sources To pick up the changes in this cset: `18d92bb57c` ("perf/core: Add aux_pause, aux_resume, aux_start_paused") This addresses these perf build warnings: Warning: Kernel ABI header differences: diff -u tools/include/uapi/linux/perf_event.h include/uapi/linux/perf_event.h Please see tools/include/uapi/README for further details. Reviewed-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20241203035349.1901262-3-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2024-12-04 14:34:49 -08:00
Namhyung Kim	5fc3a088ee	tools headers: Sync uapi/drm/drm.h with the kernel sources To pick up the changes in this cset: `56c594d8df` ("drm: add DRM_SET_CLIENT_NAME ioctl") This addresses these perf build warnings: Warning: Kernel ABI header differences: diff -u tools/include/uapi/drm/drm.h include/uapi/drm/drm.h Please see tools/include/uapi/README for further details. Reviewed-by: James Clark <james.clark@linaro.org> Cc: David Airlie <airlied@gmail.com> Cc: Simona Vetter <simona@ffwll.ch> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: dri-devel@lists.freedesktop.org Link: https://lore.kernel.org/r/20241203035349.1901262-2-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2024-12-04 14:34:49 -08:00
Suraj Sonawane	265e98f72b	acpi: nfit: vmalloc-out-of-bounds Read in acpi_nfit_ctl Fix an issue detected by syzbot with KASAN: BUG: KASAN: vmalloc-out-of-bounds in cmd_to_func drivers/acpi/nfit/ core.c:416 [inline] BUG: KASAN: vmalloc-out-of-bounds in acpi_nfit_ctl+0x20e8/0x24a0 drivers/acpi/nfit/core.c:459 The issue occurs in cmd_to_func when the call_pkg->nd_reserved2 array is accessed without verifying that call_pkg points to a buffer that is appropriately sized as a struct nd_cmd_pkg. This can lead to out-of-bounds access and undefined behavior if the buffer does not have sufficient space. To address this, a check was added in acpi_nfit_ctl() to ensure that buf is not NULL and that buf_len is less than sizeof(*call_pkg) before accessing it. This ensures safe access to the members of call_pkg, including the nd_reserved2 array. Reported-by: syzbot+7534f060ebda6b8b51b3@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=7534f060ebda6b8b51b3 Tested-by: syzbot+7534f060ebda6b8b51b3@syzkaller.appspotmail.com Fixes: `ebe9f6f19d` ("acpi/nfit: Fix bus command validation") Signed-off-by: Suraj Sonawane <surajsonawane0215@gmail.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://patch.msgid.link/20241118162609.29063-1-surajsonawane0215@gmail.com Signed-off-by: Ira Weiny <ira.weiny@intel.com>	2024-12-04 14:58:44 -06:00
guanjing	f24d192985	sched_ext: fix application of sizeof to pointer sizeof when applied to a pointer typed expression gives the size of the pointer. The proper fix in this particular case is to code sizeof(*cpuset) instead of sizeof(cpuset). This issue was detected with the help of Coccinelle. Fixes: `22a920209a` ("sched_ext: Implement tickless support") Signed-off-by: guanjing <guanjing@cmss.chinamobile.com> Acked-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2024-12-04 09:47:39 -10:00
Ihor Solodrai	ef7009decc	selftests/sched_ext: fix build after renames in sched_ext API The selftests are falining to build on current tip of bpf-next and sched_ext [1]. This has broken BPF CI [2] after merge from upstream. Use appropriate function names in the selftests according to the recent changes in the sched_ext API [3]. [1] https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/commit/?id=fc39fb56917bb3cb53e99560ca3612a84456ada2 [2] https://github.com/kernel-patches/bpf/actions/runs/11959327258/job/33340923745 [3] https://lore.kernel.org/all/20241109194853.580310-1-tj@kernel.org/ Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me> Acked-by: Andrea Righi <arighi@nvidia.com> Acked-by: David Vernet <void@manifault.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2024-12-04 09:46:39 -10:00
Dave Penkler	48e8a8160d	staging: gpib: Fix i386 build issue These drivers cast resource_type_t to void * causing the build to fail. With CONFIG_X86_PAE enabled the resource_size_t type is a 64bit unsigned int which cannot be cast to a 32 bit pointer. Disable these drivers if X68_PAE is enabled Reported-by: Guenter Roeck <linux@roeck-us.net> Closes: https://lore.kernel.org/all/f10e976e-7a04-4454-b38d-39cd18f142da@roeck-us.net/ Fixes: `e9dc69956d` ("staging: gpib: Add Computer Boards GPIB driver") Fixes: `e1339245eb` ("staging: gpib: Add Computer Equipment Corporation GPIB driver") Fixes: `bb1bd92fa0` ("staging: gpib: Add ines GPIB driver") Fixes: `0cd5b05551` ("staging: gpib: Add TNT4882 chip based GPIB driver") Signed-off-by: Dave Penkler <dpenkler@gmail.com> Link: https://lore.kernel.org/r/20241204162128.25617-1-dpenkler@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-04 17:29:20 +01:00
Dave Penkler	1d8c2d4b89	staging: gpib: Fix faulty workaround for assignment in if This was detected by Coverity. Add the missing assignment in the else branch of the if Reported-by: Kees Bakker <kees@ijzerbout.nl> Fixes: `fce79512a9` ("staging: gpib: Add LPVO DIY USB GPIB driver") Signed-off-by: Dave Penkler <dpenkler@gmail.com> Link: https://lore.kernel.org/r/20241204145713.11889-5-dpenkler@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-04 16:58:33 +01:00
Dave Penkler	80242c4a9d	staging: gpib: Workaround for ppc build failure Make GPIB_FMH depend on !PPC Reported_by: Stephen Rothwell <sfr@canb.auug.org.au> Link: https://lore.kernel.org/all/20241015165538.634707e5@canb.auug.org.au/ Link: https://lore.kernel.org/r/20241204134736.6660-1-dpenkler@gmail.com Signed-off-by: Dave Penkler <dpenkler@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-04 16:56:30 +01:00
Nathan Chancellor	f580786ea9	staging: gpib: Make GPIB_NI_PCI_ISA depend on HAS_IOPORT After commit `78ecb03756` ("staging: gpib: make port I/O code conditional"), building tnt4882.ko on platforms without HAS_IOPORT (such as hexagon and s390) fails with: ERROR: modpost: "inb_wrapper" [drivers/staging/gpib/tnt4882/tnt4882.ko] undefined! ERROR: modpost: "inw_wrapper" [drivers/staging/gpib/tnt4882/tnt4882.ko] undefined! ERROR: modpost: "nec7210_locking_ioport_write_byte" [drivers/staging/gpib/tnt4882/tnt4882.ko] undefined! ERROR: modpost: "nec7210_locking_ioport_read_byte" [drivers/staging/gpib/tnt4882/tnt4882.ko] undefined! ERROR: modpost: "outb_wrapper" [drivers/staging/gpib/tnt4882/tnt4882.ko] undefined! ERROR: modpost: "outw_wrapper" [drivers/staging/gpib/tnt4882/tnt4882.ko] undefined! Only allow tnt4882.ko to be built when CONFIG_HAS_IOPORT is set to avoid this build failure, as this driver unconditionally needs it. Fixes: `78ecb03756` ("staging: gpib: make port I/O code conditional") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Link: https://lore.kernel.org/r/20241123-gpib-tnt4882-depends-on-has_ioport-v1-1-033c58b64751@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-04 16:56:21 +01:00
Claudiu Beznea	7cc0e0a43a	serial: sh-sci: Check if TX data was written to device in .tx_empty() On the Renesas RZ/G3S, when doing suspend to RAM, the uart_suspend_port() is called. The uart_suspend_port() calls 3 times the struct uart_port::ops::tx_empty() before shutting down the port. According to the documentation, the struct uart_port::ops::tx_empty() API tests whether the transmitter FIFO and shifter for the port is empty. The Renesas RZ/G3S SCIFA IP reports the number of data units stored in the transmit FIFO through the FDR (FIFO Data Count Register). The data units in the FIFOs are written in the shift register and transmitted from there. The TEND bit in the Serial Status Register reports if the data was transmitted from the shift register. In the previous code, in the tx_empty() API implemented by the sh-sci driver, it is considered that the TX is empty if the hardware reports the TEND bit set and the number of data units in the FIFO is zero. According to the HW manual, the TEND bit has the following meaning: 0: Transmission is in the waiting state or in progress. 1: Transmission is completed. It has been noticed that when opening the serial device w/o using it and then switch to a power saving mode, the tx_empty() call in the uart_port_suspend() function fails, leading to the "Unable to drain transmitter" message being printed on the console. This is because the TEND=0 if nothing has been transmitted and the FIFOs are empty. As the TEND=0 has double meaning (waiting state, in progress) we can't determined the scenario described above. Add a software workaround for this. This sets a variable if any data has been sent on the serial console (when using PIO) or if the DMA callback has been called (meaning something has been transmitted). In the tx_empty() API the status of the DMA transaction is also checked and if it is completed or in progress the code falls back in checking the hardware registers instead of relying on the software variable. Fixes: `73a19e4c03` ("serial: sh-sci: Add DMA support.") Cc: stable@vger.kernel.org Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Link: https://lore.kernel.org/r/20241125115856.513642-1-claudiu.beznea.uj@bp.renesas.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-04 16:51:25 +01:00
Lucas De Marchi	33ead7e538	usb: typec: ucsi: Fix connector status writing past buffer size Similar to commit `65c4c9447b` ("usb: typec: ucsi: Fix a missing bits to bytes conversion in ucsi_init()"), there was a missing conversion from bits to bytes. Here the outcome is worse though: since the value is lower than UCSI_MAX_DATA_LENGTH, instead of bailing out with an error, it writes past the buffer size. The error is then seen in other places like below: Oops: general protection fault, probably for non-canonical address 0x891e812cd0ed968: 0000 [#1] PREEMPT SMP NOPTI CPU: 3 UID: 110 PID: 906 Comm: prometheus-node Not tainted 6.13.0-rc1-xe #1 Hardware name: Intel Corporation Lunar Lake Client Platform/LNL-M LP5 RVP1, BIOS LNLMFWI1.R00.3222.D84.2410171025 10/17/2024 RIP: 0010:power_supply_get_property+0x3e/0xe0 Code: 85 c0 7e 4f 4c 8b 07 89 f3 49 89 d4 49 8b 48 20 48 85 c9 74 72 49 8b 70 18 31 d2 31 c0 eb 0b 83 c2 01 48 63 c2 48 39 c8 73 5d <3b> 1c 86 75 f0 49 8b 40 28 4c 89 e2 89 de ff d0 0f 1f 00 5b 41 5c RSP: 0018:ffffc900017dfa50 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000011 RCX: c963b02c06092008 RDX: 0000000000000000 RSI: 0891e812cd0ed968 RDI: ffff888121dd6800 RBP: ffffc900017dfa68 R08: ffff88810a4024b8 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffffc900017dfa78 R13: ffff888121dd6800 R14: ffff888138ad2c00 R15: ffff88810c57c528 FS: 00007713a2ffd6c0(0000) GS:ffff88846f380000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000c0004b1000 CR3: 0000000121ce8003 CR4: 0000000000f72ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> ? show_regs+0x6c/0x80 ? die_addr+0x37/0xa0 ? exc_general_protection+0x1c1/0x440 ? asm_exc_general_protection+0x27/0x30 ? power_supply_get_property+0x3e/0xe0 power_supply_hwmon_read+0x50/0xe0 hwmon_attr_show+0x46/0x170 dev_attr_show+0x1a/0x70 sysfs_kf_seq_show+0xaa/0x120 kernfs_seq_show+0x41/0x60 Just use the buffer size as argument to fix it. Fixes: `226ff2e681` ("usb: typec: ucsi: Convert connector specific commands to bitmaps") Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Thomas Weißschuh <linux@weissschuh.net> Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.com> Reported-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com> Closes: https://lore.kernel.org/all/SJ1PR11MB6129CCD82CD78D8EE6E27EF4B9362@SJ1PR11MB6129.namprd11.prod.outlook.com/ Suggested-by: Thomas Weißschuh <linux@weissschuh.net> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Tested-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com> Link: https://lore.kernel.org/r/20241203200010.2821132-1-lucas.demarchi@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-04 16:30:28 +01:00
Łukasz Bartosik	e37b383df9	usb: typec: ucsi: Fix completion notifications OPM PPM LPM \| 1.send cmd \| \| \|-------------------------->\| \| \| \|-- \| \| \| \| 2.set busy bit in CCI \| \| \|<- \| \| 3.notify the OPM \| \| \|<--------------------------\| \| \| \| 4.send cmd to be executed \| \| \|-------------------------->\| \| \| \| \| \| 5.cmd completed \| \| \|<--------------------------\| \| \| \| \| \|-- \| \| \| \| 6.set cmd completed \| \| \|<- bit in CCI \| \| \| \| \| 7.notify the OPM \| \| \|<--------------------------\| \| \| \| \| \| 8.handle notification \| \| \| from point 3, read CCI \| \| \|<--------------------------\| \| \| \| \| When the PPM receives command from the OPM (p.1) it sets the busy bit in the CCI (p.2), sends notification to the OPM (p.3) and forwards the command to be executed by the LPM (p.4). When the PPM receives command completion from the LPM (p.5) it sets command completion bit in the CCI (p.6) and sends notification to the OPM (p.7). If command execution by the LPM is fast enough then when the OPM starts handling the notification from p.3 in p.8 and reads the CCI value it will see command completion bit set and will call complete(). Then complete() might be called again when the OPM handles notification from p.7. This fix replaces test_bit() with test_and_clear_bit() in ucsi_notify_common() in order to call complete() only once per request. This fix also reinitializes completion variable in ucsi_sync_control_common() before a command is sent. Fixes: `584e8df589` ("usb: typec: ucsi: extract common code for command handling") Cc: stable@vger.kernel.org Signed-off-by: Łukasz Bartosik <ukaszb@chromium.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Reviewed-by: Benson Leung <bleung@chromium.org> Link: https://lore.kernel.org/r/20241203102318.3386345-1-ukaszb@chromium.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-04 16:30:16 +01:00
Stefan Wahren	1cf1bd88f1	usb: dwc2: Fix HCD port connection race On Raspberry Pis without onboard USB hub frequent device reconnects can trigger a interrupt storm after DWC2 entered host clock gating. This is caused by a race between _dwc2_hcd_suspend() and the port interrupt, which sets port_connect_status. The issue occurs if port_connect_status is still 1, but there is no connection anymore: usb 1-1: USB disconnect, device number 25 dwc2 3f980000.usb: _dwc2_hcd_suspend: port_connect_status: 1 dwc2 3f980000.usb: Entering host clock gating. Disabling IRQ #66 irq 66: nobody cared (try booting with the "irqpoll" option) CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.12.0-gc1bb81b13202-dirty #322 Hardware name: BCM2835 Call trace: unwind_backtrace from show_stack+0x10/0x14 show_stack from dump_stack_lvl+0x50/0x64 dump_stack_lvl from __report_bad_irq+0x38/0xc0 __report_bad_irq from note_interrupt+0x2ac/0x2f4 note_interrupt from handle_irq_event+0x88/0x8c handle_irq_event from handle_level_irq+0xb4/0x1ac handle_level_irq from generic_handle_domain_irq+0x24/0x34 generic_handle_domain_irq from bcm2836_chained_handle_irq+0x24/0x28 bcm2836_chained_handle_irq from generic_handle_domain_irq+0x24/0x34 generic_handle_domain_irq from generic_handle_arch_irq+0x34/0x44 generic_handle_arch_irq from __irq_svc+0x88/0xb0 Exception stack(0xc1d01f20 to 0xc1d01f68) 1f20: 0004ef3c 00000001 00000000 00000000 c1d09780 c1f6bb5c c1d04e54 c1c60ca8 1f40: c1d04e94 00000000 00000000 c1d092a8 c1f6af20 c1d01f70 c1211b98 c1212f40 1f60: 60000013 ffffffff __irq_svc from default_idle_call+0x1c/0xb0 default_idle_call from do_idle+0x21c/0x284 do_idle from cpu_startup_entry+0x28/0x2c cpu_startup_entry from kernel_init+0x0/0x12c handlers: [<e3a25c00>] dwc2_handle_common_intr [<58bf98a3>] usb_hcd_irq Disabling IRQ #66 So avoid this by reading the connection status directly. Fixes: `113f86d0c3` ("usb: dwc2: Update partial power down entering by system suspend") Signed-off-by: Stefan Wahren <wahrenst@gmx.net> Link: https://lore.kernel.org/r/20241202001631.75473-4-wahrenst@gmx.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-04 16:30:05 +01:00
Stefan Wahren	a8d3e4a734	usb: dwc2: hcd: Fix GetPortStatus & SetPortFeature On Rasperry Pis without onboard USB hub the power cycle during power connect init only disable the port but never enabled it again: usb usb1-port1: attempt power cycle The port relevant part in dwc2_hcd_hub_control() is skipped in case port_connect_status = 0 under the assumption the core is or will be soon in device mode. But this assumption is wrong, because after ClearPortFeature USB_PORT_FEAT_POWER the port_connect_status will also be 0 and SetPortFeature (incl. USB_PORT_FEAT_POWER) will be a no-op. Fix the behavior of dwc2_hcd_hub_control() by replacing the port_connect_status check with dwc2_is_device_mode(). Link: https://github.com/raspberrypi/linux/issues/6247 Fixes: `7359d482eb` ("staging: HCD files for the DWC2 driver") Signed-off-by: Stefan Wahren <wahrenst@gmx.net> Link: https://lore.kernel.org/r/20241202001631.75473-3-wahrenst@gmx.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-04 16:30:05 +01:00
Stefan Wahren	336f72d3cb	usb: dwc2: Fix HCD resume The Raspberry Pi can suffer on interrupt storms on HCD resume. The dwc2 driver sometimes misses to enable HCD_FLAG_HW_ACCESSIBLE before re-enabling the interrupts. This causes a situation where both handler ignore a incoming port interrupt and force the upper layers to disable the dwc2 interrupt line. This leaves the USB interface in a unusable state: irq 66: nobody cared (try booting with the "irqpoll" option) CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 6.10.0-rc3 Hardware name: BCM2835 Call trace: unwind_backtrace from show_stack+0x10/0x14 show_stack from dump_stack_lvl+0x50/0x64 dump_stack_lvl from __report_bad_irq+0x38/0xc0 __report_bad_irq from note_interrupt+0x2ac/0x2f4 note_interrupt from handle_irq_event+0x88/0x8c handle_irq_event from handle_level_irq+0xb4/0x1ac handle_level_irq from generic_handle_domain_irq+0x24/0x34 generic_handle_domain_irq from bcm2836_chained_handle_irq+0x24/0x28 bcm2836_chained_handle_irq from generic_handle_domain_irq+0x24/0x34 generic_handle_domain_irq from generic_handle_arch_irq+0x34/0x44 generic_handle_arch_irq from __irq_svc+0x88/0xb0 Exception stack(0xc1b01f20 to 0xc1b01f68) 1f20: 0005c0d4 00000001 00000000 00000000 c1b09780 c1d6b32c c1b04e54 c1a5eae8 1f40: c1b04e90 00000000 00000000 00000000 c1d6a8a0 c1b01f70 c11d2da8 c11d4160 1f60: 60000013 ffffffff __irq_svc from default_idle_call+0x1c/0xb0 default_idle_call from do_idle+0x21c/0x284 do_idle from cpu_startup_entry+0x28/0x2c cpu_startup_entry from kernel_init+0x0/0x12c handlers: [<f539e0f4>] dwc2_handle_common_intr [<75cd278b>] usb_hcd_irq Disabling IRQ #66 So enable the HCD_FLAG_HW_ACCESSIBLE flag in case there is a port connection. Fixes: `c74c26f6e3` ("usb: dwc2: Fix partial power down exiting by system resume") Closes: https://lore.kernel.org/linux-usb/3fd0c2fb-4752-45b3-94eb-42352703e1fd@gmx.net/T/ Link: https://lore.kernel.org/all/5e8cbce0-3260-2971-484f-fc73a3b2bd28@synopsys.com/ Signed-off-by: Stefan Wahren <wahrenst@gmx.net> Link: https://lore.kernel.org/r/20241202001631.75473-2-wahrenst@gmx.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-04 16:30:05 +01:00
Lianqin Hu	4cfbca86f6	usb: gadget: u_serial: Fix the issue that gs_start_io crashed due to accessing null pointer Considering that in some extreme cases, when u_serial driver is accessed by multiple threads, Thread A is executing the open operation and calling the gs_open, Thread B is executing the disconnect operation and calling the gserial_disconnect function,The port->port_usb pointer will be set to NULL. E.g. Thread A Thread B gs_open() gadget_unbind_driver() gs_start_io() composite_disconnect() gs_start_rx() gserial_disconnect() ... ... spin_unlock(&port->port_lock) status = usb_ep_queue() spin_lock(&port->port_lock) spin_lock(&port->port_lock) port->port_usb = NULL gs_free_requests(port->port_usb->in) spin_unlock(&port->port_lock) Crash This causes thread A to access a null pointer (port->port_usb is null) when calling the gs_free_requests function, causing a crash. If port_usb is NULL, the release request will be skipped as it will be done by gserial_disconnect. So add a null pointer check to gs_start_io before attempting to access the value of the pointer port->port_usb. Call trace: gs_start_io+0x164/0x25c gs_open+0x108/0x13c tty_open+0x314/0x638 chrdev_open+0x1b8/0x258 do_dentry_open+0x2c4/0x700 vfs_open+0x2c/0x3c path_openat+0xa64/0xc60 do_filp_open+0xb8/0x164 do_sys_openat2+0x84/0xf0 __arm64_sys_openat+0x70/0x9c invoke_syscall+0x58/0x114 el0_svc_common+0x80/0xe0 do_el0_svc+0x1c/0x28 el0_svc+0x38/0x68 Fixes: `c1dca562be` ("usb gadget: split out serial core") Cc: stable@vger.kernel.org Suggested-by: Prashanth K <quic_prashk@quicinc.com> Signed-off-by: Lianqin Hu <hulianqin@vivo.com> Acked-by: Prashanth K <quic_prashk@quicinc.com> Link: https://lore.kernel.org/r/TYUPR06MB62178DC3473F9E1A537DCD02D2362@TYUPR06MB6217.apcprd06.prod.outlook.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-04 16:27:23 +01:00
Radhey Shyam Pandey	ce15d6b3d5	usb: misc: onboard_usb_dev: skip suspend/resume sequence for USB5744 SMBus support USB5744 SMBus initialization is done once in probe() and doing it in resume is not supported so avoid going into suspend and reset the HUB. There is a sysfs property 'always_powered_in_suspend' to implement this feature but since default state should be set to a working configuration so override this property value. It fixes the suspend/resume testcase on Kria KR260 Robotics Starter Kit. Fixes: `6782311d04` ("usb: misc: onboard_usb_dev: add Microchip usb5744 SMBus programming support") Cc: stable@vger.kernel.org Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com> Link: https://lore.kernel.org/r/1733165302-1694891-1-git-send-email-radhey.shyam.pandey@amd.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-04 16:27:04 +01:00
Neal Frager	a48f744bef	usb: dwc3: xilinx: make sure pipe clock is deselected in usb2 only mode When the USB3 PHY is not defined in the Linux device tree, there could still be a case where there is a USB3 PHY active on the board and enabled by the first stage bootloader. If serdes clock is being used then the USB will fail to enumerate devices in 2.0 only mode. To solve this, make sure that the PIPE clock is deselected whenever the USB3 PHY is not defined and guarantees that the USB2 only mode will work in all cases. Fixes: `9678f3361a` ("usb: dwc3: xilinx: Skip resets and USB3 register settings for USB2.0 mode") Cc: stable@vger.kernel.org Signed-off-by: Neal Frager <neal.frager@amd.com> Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com> Acked-by: Peter Korsgaard <peter@korsgaard.com> Link: https://lore.kernel.org/r/1733163111-1414816-1-git-send-email-radhey.shyam.pandey@amd.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-04 16:26:54 +01:00
Xu Yang	d2ec94fbc4	usb: core: hcd: only check primary hcd skip_phy_initialization Before commit `53a2d95df8` ("usb: core: add phy notify connect and disconnect"), phy initialization will be skipped even when shared hcd doesn't set skip_phy_initialization flag. However, the situation is changed after the commit. The hcd.c will initialize phy when add shared hcd. This behavior is unexpected for some platforms which will handle phy initialization by themselves. To avoid the issue, this will only check skip_phy_initialization flag of primary hcd since shared hcd normally follow primary hcd setting. Fixes: `53a2d95df8` ("usb: core: add phy notify connect and disconnect") Cc: stable@vger.kernel.org Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Link: https://lore.kernel.org/r/20241105090120.2438366-1-xu.yang_2@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-04 16:26:44 +01:00
Takashi Iwai	8293705696	usb: gadget: midi2: Fix interpretation of is_midi1 bits The UMP Function Block info m1.0 field (represented by is_midi1 sysfs entry) is an enumeration from 0 to 2, while the midi2 gadget driver incorrectly copies it to the corresponding snd_ump_block_info.flags bits as-is. This made the wrong bit flags set when m1.0 = 2. This patch corrects the wrong interpretation of is_midi1 bits. Fixes: `29ee7a4ddd` ("usb: gadget: midi2: Add configfs support") Cc: stable@vger.kernel.org Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://lore.kernel.org/r/20241127070213.8232-1-tiwai@suse.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-04 16:26:34 +01:00
Xu Yang	a4faee0117	usb: dwc3: imx8mp: fix software node kernel dump When unbind and bind the device again, kernel will dump below warning: [ 173.972130] sysfs: cannot create duplicate filename '/devices/platform/soc/4c010010.usb/software_node' [ 173.981564] CPU: 2 UID: 0 PID: 536 Comm: sh Not tainted 6.12.0-rc6-06344-g2aed7c4a5c56 #144 [ 173.989923] Hardware name: NXP i.MX95 15X15 board (DT) [ 173.995062] Call trace: [ 173.997509] dump_backtrace+0x90/0xe8 [ 174.001196] show_stack+0x18/0x24 [ 174.004524] dump_stack_lvl+0x74/0x8c [ 174.008198] dump_stack+0x18/0x24 [ 174.011526] sysfs_warn_dup+0x64/0x80 [ 174.015201] sysfs_do_create_link_sd+0xf0/0xf8 [ 174.019656] sysfs_create_link+0x20/0x40 [ 174.023590] software_node_notify+0x90/0x100 [ 174.027872] device_create_managed_software_node+0xec/0x108 ... The '4c010010.usb' device is a platform device created during the initcall and is never removed, which causes its associated software node to persist indefinitely. The existing device_create_managed_software_node() does not provide a corresponding removal function. Replace device_create_managed_software_node() with the device_add_software_node() and device_remove_software_node() pair to ensure proper addition and removal of software nodes, addressing this issue. Fixes: `a9400f1979` ("usb: dwc3: imx8mp: add 2 software managed quirk properties for host mode") Cc: stable@vger.kernel.org Reviewed-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Link: https://lore.kernel.org/r/20241126032841.2458338-1-xu.yang_2@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-04 16:26:30 +01:00
Joe Hattori	ef42b906df	usb: typec: anx7411: fix OF node reference leaks in anx7411_typec_switch_probe() The refcounts of the OF nodes obtained by of_get_child_by_name() calls in anx7411_typec_switch_probe() are not decremented. Replace them with device_get_named_child_node() calls and store the return values to the newly created fwnode_handle fields in anx7411_data, and call fwnode_handle_put() on them in the error path and in the unregister functions. Fixes: `e45d7337dc` ("usb: typec: anx7411: Use of_get_child_by_name() instead of of_find_node_by_name()") Cc: stable@vger.kernel.org Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20241126014909.3687917-1-joe@pf.is.s.u-tokyo.ac.jp Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-04 16:25:54 +01:00
Joe Hattori	645d56e4cc	usb: typec: anx7411: fix fwnode_handle reference leak An fwnode_handle and usb_role_switch are obtained with an incremented refcount in anx7411_typec_port_probe(), however the refcounts are not decremented in the error path. The fwnode_handle is also not decremented in the .remove() function. Therefore, call fwnode_handle_put() and usb_role_switch_put() accordingly. Fixes: `fe6d8a9c8e` ("usb: typec: anx7411: Add Analogix PD ANX7411 support") Cc: stable@vger.kernel.org Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20241121023429.962848-1-joe@pf.is.s.u-tokyo.ac.jp Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-04 16:25:46 +01:00
Mark Tomlinson	0d2ada0522	usb: host: max3421-hcd: Correctly abort a USB request. If the current USB request was aborted, the spi thread would not respond to any further requests. This is because the "curr_urb" pointer would not become NULL, so no further requests would be taken off the queue. The solution here is to set the "urb_done" flag, as this will cause the correct handling of the URB. Also clear interrupts that should only be expected if an URB is in progress. Fixes: `2d53139f31` ("Add support for using a MAX3421E chip as a host driver.") Cc: stable <stable@kernel.org> Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz> Link: https://lore.kernel.org/r/20241124221430.1106080-1-mark.tomlinson@alliedtelesis.co.nz Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-04 16:25:30 +01:00
Xu Yang	282615d133	dt-bindings: phy: imx8mq-usb: correct reference to usb-switch.yaml The i.MX95 usb-phy can work with or without orientation-switch. With current setting, if usb-phy works without orientation-switch, the dt-schema check will show below error: phy@4c1f0040: 'oneOf' conditional failed, one must be fixed: 'port' is a required property 'ports' is a required property from schema $id: http://devicetree.org/schemas/phy/fsl,imx8mq-usb-phy.yaml# This will correct the behavior of referring to usb-switch.yaml. Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Acked-by: Rob Herring (Arm) <robh@kernel.org> Link: https://lore.kernel.org/r/20241119105017.917833-1-xu.yang_2@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-04 16:25:20 +01:00
Vitalii Mordan	97264eaaba	usb: ehci-hcd: fix call balance of clocks handling routines If the clocks priv->iclk and priv->fclk were not enabled in ehci_hcd_sh_probe, they should not be disabled in any path. Conversely, if they was enabled in ehci_hcd_sh_probe, they must be disabled in all error paths to ensure proper cleanup. Found by Linux Verification Center (linuxtesting.org) with Klever. Fixes: `63c8455222` ("usb: ehci-hcd: Add support for SuperH EHCI.") Cc: stable@vger.kernel.org # `ff30bd6a66`: sh: clk: Fix clk_enable() to return 0 on NULL clk Signed-off-by: Vitalii Mordan <mordan@ispras.ru> Reviewed-by: Alan Stern <stern@rowland.harvard.edu> Link: https://lore.kernel.org/r/20241121114700.2100520-1-mordan@ispras.ru Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-12-04 16:25:13 +01:00
Arnd Bergmann	2de679ecd7	phy: stm32: work around constant-value overflow assertion FIELD_PREP() checks that a constant fits into the available bitfield, but if one of the two lookup tables in stm32_impedance_tune() does not find a matching entry, the index is out of range, which gcc correctly complains about: In file included from <command-line>: In function 'stm32_impedance_tune', inlined from 'stm32_combophy_pll_init' at drivers/phy/st/phy-stm32-combophy.c:247:9: include/linux/compiler_types.h:517:38: error: call to '__compiletime_assert_447' declared with attribute error: FIELD_PREP: value too large for the field 517 \| _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) \| ^ include/linux/bitfield.h:68:3: note: in expansion of macro 'BUILD_BUG_ON_MSG' 68 \| BUILD_BUG_ON_MSG(__builtin_constant_p(_val) ? \ 115 \| __BF_FIELD_CHECK(_mask, 0ULL, _val, "FIELD_PREP: "); \ \| ^~~~~~~~~~~~~~~~ drivers/phy/st/phy-stm32-combophy.c:162:8: note: in expansion of macro 'FIELD_PREP' 162 \| FIELD_PREP(STM32MP25_PCIEPRG_IMPCTRL_VSWING, vswing_of)); \| ^~~~~~~~~~ Rework this so the field value gets set inside of the loop and otherwise set to zero. Fixes: `47e1bb6b4b` ("phy: stm32: Add support for STM32MP25 COMBOPHY.") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20241111103712.3520611-1-arnd@kernel.org Signed-off-by: Vinod Koul <vkoul@kernel.org>	2024-12-04 20:04:22 +05:30
Krishna Kurapati	8886fb3240	phy: qcom-qmp: Fix register name in RX Lane config of SC8280XP In RX Lane configuration sequence of SC8280XP, the register V5_RX_UCDR_FO_GAIN is incorrectly spelled as RX_UCDR_SO_GAIN and hence the programming sequence is wrong. Fix the register sequence accordingly to avoid any compliance failures. This has been tested on SA8775P by checking device mode enumeration in SuperSpeed. Cc: stable@vger.kernel.org Fixes: `c0c7769cda` ("phy: qcom-qmp: Add SC8280XP USB3 UNI phy") Signed-off-by: Krishna Kurapati <quic_kriskura@quicinc.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Link: https://lore.kernel.org/r/20241112092831.4110942-1-quic_kriskura@quicinc.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2024-12-04 20:02:27 +05:30
Geert Uytterhoeven	70327137eb	gpio: GPIO_MVEBU should not default to y when compile-testing Merely enabling compile-testing should not enable additional functionality. Fixes: `956ee0c5c9` ("gpio: mvebu: allow building the module with COMPILE_TEST=y") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/6b9e55dbf544297d5acf743f6fa473791ab10644.1733242798.git.geert+renesas@glider.be Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>	2024-12-04 15:27:49 +01:00
Dan Carpenter	d0257e089d	RDMA/uverbs: Prevent integer overflow issue In the expression "cmd.wqe_size * cmd.wr_count", both variables are u32 values that come from the user so the multiplication can lead to integer wrapping. Then we pass the result to uverbs_request_next_ptr() which also could potentially wrap. The "cmd.sge_count * sizeof(struct ib_uverbs_sge)" multiplication can also overflow on 32bit systems although it's fine on 64bit systems. This patch does two things. First, I've re-arranged the condition in uverbs_request_next_ptr() so that the use controlled variable "len" is on one side of the comparison by itself without any math. Then I've modified all the callers to use size_mul() for the multiplications. Fixes: `67cdb40ca4` ("[IB] uverbs: Implement more commands") Cc: stable@vger.kernel.org Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://patch.msgid.link/b8765ab3-c2da-4611-aae0-ddd6ba173d23@stanley.mountain Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-12-04 09:19:39 -05:00
Chukun Pan	fbcbffbac9	phy: rockchip: naneng-combphy: fix phy reset Currently, the USB port via combophy on the RK3528/RK3588 SoC is broken. usb usb8-port1: Cannot enable. Maybe the USB cable is bad? This is due to the combphy of RK3528/RK3588 SoC has multiple resets, but only "phy resets" need assert and deassert, "apb resets" don't need. So change the driver to only match the phy resets, which is also what the vendor kernel does. Fixes: `7160820d74` ("phy: rockchip: add naneng combo phy for RK3568") Cc: FUKAUMI Naoki <naoki@radxa.com> Cc: Michael Zimmermann <sigmaepsilon92@gmail.com> Signed-off-by: Chukun Pan <amadeus@jmu.edu.cn> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Tested-by: FUKAUMI Naoki <naoki@radxa.com> Link: https://lore.kernel.org/r/20241122073006.99309-2-amadeus@jmu.edu.cn Signed-off-by: Vinod Koul <vkoul@kernel.org>	2024-12-04 19:49:08 +05:30
Justin Chen	0a92ea87bd	phy: usb: Toggle the PHY power during init When bringing up the PHY, it might be in a bad state if left powered. One case is we lose the PLL lock if the PLL is gated while the PHY is powered. Toggle the PHY power so we can start from a known state. Fixes: `4e5b9c9a73` ("phy: usb: Add support for new Synopsys USB controller on the 7216") Signed-off-by: Justin Chen <justin.chen@broadcom.com> Acked-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://lore.kernel.org/r/20241024213540.1059412-1-justin.chen@broadcom.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2024-12-04 19:45:35 +05:30
Leon Romanovsky	eb867d797d	RDMA/bnxt_re: Remove always true dattr validity check res->dattr is always valid at this point as it was initialized during device addition in bnxt_re_add_device(). This change is fixing the following smatch error: drivers/infiniband/hw/bnxt_re/qplib_fp.c:1090 bnxt_qplib_create_qp() error: we previously assumed 'res->dattr' could be null (see line 985) Fixes: `07f830ae49` ("RDMA/bnxt_re: Adds MSN table capability for Gen P7 adapters") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/r/202411222329.YTrwonWi-lkp@intel.com/ Link: https://patch.msgid.link/be0d8836b64cba3e479fbcbca717acad04aae02e.1732626579.git.leonro@nvidia.com Acked-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>	2024-12-04 08:55:09 -05:00
Lizhi Hou	dcbef0798e	dmaengine: amd: qdma: Remove using the private get and set dma_ops APIs The get_dma_ops and set_dma_ops APIs were never for driver to use. Remove these calls from QDMA driver. Instead, pass the DMA device pointer from the qdma_platdata structure. Fixes: `73d5fc92a1` ("dmaengine: amd: qdma: Add AMD QDMA driver") Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20240918181022.2155715-1-lizhi.hou@amd.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2024-12-04 18:27:32 +05:30
Sasha Finkelstein	8d55e8a16f	dmaengine: apple-admac: Avoid accessing registers in probe The ADMAC attached to the AOP has complex power sequencing, and is power gated when the probe callback runs. Move the register reads to other functions, where we can guarantee that the hardware is switched on. Fixes: `568aa6dd64` ("dmaengine: apple-admac: Allocate cache SRAM to channels") Signed-off-by: Sasha Finkelstein <fnkl.kernel@gmail.com> Link: https://lore.kernel.org/r/20241124-admac-power-v1-1-58f2165a4d55@gmail.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2024-12-04 17:42:27 +05:30
Randy Dunlap	790fb9956e	linux/dmaengine.h: fix a few kernel-doc warnings The comment block for "Interleaved Transfer Request" should not begin with "/**" since it is not in kernel-doc format. Fix doc name for enum sum_check_flags. Fix all (4) missing struct member warnings. Use "Warning:" for one "Note:" in enum dma_desc_metadata_mode since scripts/kernel-doc does not allow more than one Note: per function or identifier description. This leaves around 49 kernel-doc warnings like: include/linux/dmaengine.h:43: warning: Enum value 'DMA_OUT_OF_ORDER' not described in enum 'dma_status' and another scripts/kernel-doc problem with it not being able to parse some typedefs. Fixes: `b14dab792d` ("DMAEngine: Define interleaved transfer request api") Fixes: `ad283ea4a3` ("async_tx: add sum check flags") Fixes: `272420214d` ("dmaengine: Add DMA_CTRL_REUSE") Fixes: `f067025bc6` ("dmaengine: add support to provide error result from a DMA transation") Fixes: `d38a8c622a` ("dmaengine: prepare for generic 'unmap' data") Fixes: `5878853fc9` ("dmaengine: Add API function dmaengine_prep_peripheral_dma_vec()") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Paul Cercueil <paul@crapouillou.net> Cc: Nuno Sa <nuno.sa@analog.com> Cc: Vinod Koul <vkoul@kernel.org> Cc: dmaengine@vger.kernel.org Link: https://lore.kernel.org/r/20241202172004.76020-1-rdunlap@infradead.org Signed-off-by: Vinod Koul <vkoul@kernel.org>	2024-12-04 17:41:25 +05:30
Christian Brauner	930e7c209b	Merge patch series "jbd2: two straightforward fixes" Two simple fixes for jdb2. * patches from https://lore.kernel.org/r/20241203014407.805916-1-yi.zhang@huaweicloud.com: jbd2: flush filesystem device before updating tail sequence jbd2: increase IO priority for writing revoke records Link: https://lore.kernel.org/r/20241203014407.805916-1-yi.zhang@huaweicloud.com Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-12-04 12:00:12 +01:00
Zhang Yi	a0851ea9cd	jbd2: flush filesystem device before updating tail sequence When committing transaction in jbd2_journal_commit_transaction(), the disk caches for the filesystem device should be flushed before updating the journal tail sequence. However, this step is missed if the journal is not located on the filesystem device. As a result, the filesystem may become inconsistent following a power failure or system crash. Fix it by ensuring that the filesystem device is flushed appropriately. Fixes: `3339578f05` ("jbd2: cleanup journal tail after transaction commit") Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Link: https://lore.kernel.org/r/20241203014407.805916-3-yi.zhang@huaweicloud.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-12-04 12:00:05 +01:00
Zhang Yi	ac1e21bd8c	jbd2: increase IO priority for writing revoke records Commit '6a3afb6ac6df ("jbd2: increase the journal IO's priority")' increases the priority of journal I/O by marking I/O with the JBD2_JOURNAL_REQ_FLAGS. However, that commit missed the revoke buffers, so also addresses that kind of I/Os. Fixes: `6a3afb6ac6` ("jbd2: increase the journal IO's priority") Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Link: https://lore.kernel.org/r/20241203014407.805916-2-yi.zhang@huaweicloud.com Reviewed-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-12-04 12:00:05 +01:00
Levi Yun	6fe437cfe2	firmware: arm_ffa: Fix the race around setting ffa_dev->properties Currently, ffa_dev->properties is set after the ffa_device_register() call return in ffa_setup_partitions(). This could potentially result in a race where the partition's properties is accessed while probing struct ffa_device before it is set. Update the ffa_device_register() to receive ffa_partition_info so all the data from the partition information received from the firmware can be updated into the struct ffa_device before the calling device_register() in ffa_device_register(). Fixes: `e781858488` ("firmware: arm_ffa: Add initial FFA bus support for device enumeration") Signed-off-by: Levi Yun <yeoreum.yun@arm.com> Message-Id: <20241203143109.1030514-2-yeoreum.yun@arm.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>	2024-12-04 09:59:54 +00:00
Keisuke Nishimura	be7e611274	KVM: arm64: vgic-its: Add error handling in vgic_its_cache_translation The return value of xa_store() needs to be checked. This fix adds an error handling path that resolves the kref inconsistency on failure. As suggested by Oliver Upton, this function does not return the error code intentionally because the translation cache is best effort. Fixes: `8201d1028c` ("KVM: arm64: vgic-its: Maintain a translation cache per ITS") Signed-off-by: Keisuke Nishimura <keisuke.nishimura@inria.fr> Suggested-by: Oliver Upton <oliver.upton@linux.dev> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20241130144952.23729-1-keisuke.nishimura@inria.fr Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-12-03 16:22:10 -08:00
Marc Zyngier	03c7527e97	KVM: arm64: Do not allow ID_AA64MMFR0_EL1.ASIDbits to be overridden Catalin reports that a hypervisor lying to a guest about the size of the ASID field may result in unexpected issues: - if the underlying HW does only supports 8 bit ASIDs, the ASID field in a TLBI VAE1* operation is only 8 bits, and the HW will ignore the other 8 bits - if on the contrary the HW is 16 bit capable, the ASID field in the same TLBI operation is always 16 bits, irrespective of the value of TCR_ELx.AS. This could lead to missed invalidations if the guest was lead to assume that the HW had 8 bit ASIDs while they really are 16 bit wide. In order to avoid any potential disaster that would be hard to debug, prenent the migration between a host with 8 bit ASIDs to one with wider ASIDs (the converse was obviously always forbidden). This is also consistent with what we already do for VMIDs. If it becomes absolutely mandatory to support such a migration path in the future, we will have to trap and emulate all TLBIs, something that nobody should look forward to. Fixes: `d5a32b60dc` ("KVM: arm64: Allow userspace to change ID_AA64MMFR{0-2}_EL1") Reported-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Oliver Upton <oliver.upton@linux.dev> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20241203190236.505759-1-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-12-03 16:21:08 -08:00
Haoyu Li	52fd1709e4	clk: en7523: Initialize num before accessing hws in en7523_register_clocks() With the new __counted_by annotation in clk_hw_onecell_data, the "num" struct member must be set before accessing the "hws" array. Failing to do so will trigger a runtime warning when enabling CONFIG_UBSAN_BOUNDS and CONFIG_FORTIFY_SOURCE. Fixes: `f316cdff8d` ("clk: Annotate struct clk_hw_onecell_data with __counted_by") Signed-off-by: Haoyu Li <lihaoyu499@gmail.com> Link: https://lore.kernel.org/r/20241203142915.345523-1-lihaoyu499@gmail.com Signed-off-by: Stephen Boyd <sboyd@kernel.org>	2024-12-03 14:54:12 -08:00
Christian Marangi	2eb75f86d5	clk: en7523: Fix wrong BUS clock for EN7581 The Documentation for EN7581 had a typo and still referenced the EN7523 BUS base source frequency. This was in conflict with a different page in the Documentration that state that the BUS runs at 300MHz (600MHz source with divisor set to 2) and the actual watchdog that tick at half the BUS clock (150MHz). This was verified with the watchdog by timing the seconds that the system takes to reboot (due too watchdog) and by operating on different values of the BUS divisor. The correct values for source of BUS clock are 600MHz and 540MHz. This was also confirmed by Airoha. Cc: stable@vger.kernel.org Fixes: `66bc47326c` ("clk: en7523: Add EN7581 support") Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Link: https://lore.kernel.org/r/20241116105710.19748-1-ansuelsmth@gmail.com Acked-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Stephen Boyd <sboyd@kernel.org>	2024-12-03 14:52:35 -08:00
Konstantin Andrikopoulos	b03917e02b	rust: add safety comment in workqueue traits Add missing safety comments for the implementation of the unsafe traits WorkItemPointer and RawWorkItem for Arc<T> in workqueue.rs Link: https://github.com/Rust-for-Linux/linux/issues/351. Co-developed-by: Vangelis Mamalakis <mamalakis@google.com> Signed-off-by: Vangelis Mamalakis <mamalakis@google.com> Suggested-by: Miguel Ojeda <ojeda@kernel.org> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Konstantin Andrikopoulos <kernel@mandragore.io> Signed-off-by: Tejun Heo <tj@kernel.org>	2024-12-03 10:47:58 -10:00
Honglei Wang	793baff3f2	sched_ext: Add __weak to fix the build errors commit `5cbb302880` ("sched_ext: Rename scx_bpf_dispatch[_vtime]_from_dsq() -> scx_bpf_dsq_move[_vtime]()") introduced several new functions which caused compilation errors when compiled with clang. Let's fix this by adding __weak markers. Signed-off-by: Honglei Wang <jameshongleiwang@126.com> Signed-off-by: Tejun Heo <tj@kernel.org> Fixes: `5cbb302880` ("sched_ext: Rename scx_bpf_dispatch[_vtime]_from_dsq() -> scx_bpf_dsq_move[_vtime]()") Acked-by: Andrii Nakryiko <andrii@kernel.org>	2024-12-03 10:27:26 -10:00
Filipe Manana	9c803c474c	btrfs: fix missing snapshot drew unlock when root is dead during swap activation When activating a swap file we acquire the root's snapshot drew lock and then check if the root is dead, failing and returning with -EPERM if it's dead but without unlocking the root's snapshot lock. Fix this by adding the missing unlock. Fixes: `60021bd754` ("btrfs: prevent subvol with swapfile from being deleted") Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2024-12-03 20:27:02 +01:00
Qu Wenruo	951a3f59d2	btrfs: fix mount failure due to remount races [BUG] The following reproducer can cause btrfs mount to fail: dev="/dev/test/scratch1" mnt1="/mnt/test" mnt2="/mnt/scratch" mkfs.btrfs -f $dev mount $dev $mnt1 btrfs subvolume create $mnt1/subvol1 btrfs subvolume create $mnt1/subvol2 umount $mnt1 mount $dev $mnt1 -o subvol=subvol1 while mount -o remount,ro $mnt1; do mount -o remount,rw $mnt1; done & bg=$! while mount $dev $mnt2 -o subvol=subvol2; do umount $mnt2; done kill $bg wait umount -R $mnt1 umount -R $mnt2 The script will fail with the following error: mount: /mnt/scratch: /dev/mapper/test-scratch1 already mounted on /mnt/test. dmesg(1) may have more information after failed mount system call. umount: /mnt/test: target is busy. umount: /mnt/scratch/: not mounted And there is no kernel error message. [CAUSE] During the btrfs mount, to support mounting different subvolumes with different RO/RW flags, we need to detect that and retry if needed: Retry with matching RO flags if the initial mount fail with -EBUSY. The problem is, during that retry we do not hold any super block lock (s_umount), this means there can be a remount process changing the RO flags of the original fs super block. If so, we can have an EBUSY error during retry. And this time we treat any failure as an error, without any retry and cause the above EBUSY mount failure. [FIX] The current retry behavior is racy because we do not have a super block thus no way to hold s_umount to prevent the race with remount. Solve the root problem by allowing fc->sb_flags to mismatch from the sb->s_flags at btrfs_get_tree_super(). Then at the re-entry point btrfs_get_tree_subvol(), manually check the fc->s_flags against sb->s_flags, if it's a RO->RW mismatch, then reconfigure with s_umount lock hold. Reported-by: Enno Gotthold <egotthold@suse.com> Reported-by: Fabian Vogt <fvogt@suse.com> [ Special thanks for the reproducer and early analysis pointing to btrfs. ] Fixes: `f044b31867` ("btrfs: handle the ro->rw transition for mounting different subvolumes") Link: https://bugzilla.suse.com/show_bug.cgi?id=1231836 Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2024-12-03 20:26:49 +01:00
Arnaldo Carvalho de Melo	88a6e2f67c	perf machine: Initialize machine->env to address a segfault Its used from trace__run(), for the 'perf trace' live mode, i.e. its strace-like, non-perf.data file processing mode, the most common one. The trace__run() function will set trace->host using machine__new_host() that is supposed to give a machine instance representing the running machine, and since we'll use perf_env__arch_strerrno() to get the right errno -> string table, we need to use machine->env, so initialize it in machine__new_host(). Before the patch: (gdb) run trace --errno-summary -a sleep 1 <SNIP> Summary of events: gvfs-afc-volume (3187), 2 events, 0.0% syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ pselect6 1 0 0.000 0.000 0.000 0.000 0.00% GUsbEventThread (3519), 2 events, 0.0% syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ poll 1 0 0.000 0.000 0.000 0.000 0.00% <SNIP> Program received signal SIGSEGV, Segmentation fault. 0x00000000005caba0 in perf_env__arch_strerrno (env=0x0, err=110) at util/env.c:478 478 if (env->arch_strerrno == NULL) (gdb) bt #0 0x00000000005caba0 in perf_env__arch_strerrno (env=0x0, err=110) at util/env.c:478 #1 0x00000000004b75d2 in thread__dump_stats (ttrace=0x14f58f0, trace=0x7fffffffa5b0, fp=0x7ffff6ff74e0 <_IO_2_1_stderr_>) at builtin-trace.c:4673 #2 0x00000000004b78bf in trace__fprintf_thread (fp=0x7ffff6ff74e0 <_IO_2_1_stderr_>, thread=0x10fa0b0, trace=0x7fffffffa5b0) at builtin-trace.c:4708 #3 0x00000000004b7ad9 in trace__fprintf_thread_summary (trace=0x7fffffffa5b0, fp=0x7ffff6ff74e0 <_IO_2_1_stderr_>) at builtin-trace.c:4747 #4 0x00000000004b656e in trace__run (trace=0x7fffffffa5b0, argc=2, argv=0x7fffffffde60) at builtin-trace.c:4456 #5 0x00000000004ba43e in cmd_trace (argc=2, argv=0x7fffffffde60) at builtin-trace.c:5487 #6 0x00000000004c0414 in run_builtin (p=0xec3068 <commands+648>, argc=5, argv=0x7fffffffde60) at perf.c:351 #7 0x00000000004c06bb in handle_internal_command (argc=5, argv=0x7fffffffde60) at perf.c:404 #8 0x00000000004c0814 in run_argv (argcp=0x7fffffffdc4c, argv=0x7fffffffdc40) at perf.c:448 #9 0x00000000004c0b5d in main (argc=5, argv=0x7fffffffde60) at perf.c:560 (gdb) After: root@number:~# perf trace -a --errno-summary sleep 1 <SNIP> pw-data-loop (2685), 1410 events, 16.0% syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ epoll_wait 188 0 983.428 0.000 5.231 15.595 8.68% ioctl 94 0 0.811 0.004 0.009 0.016 2.82% read 188 0 0.322 0.001 0.002 0.006 5.15% write 141 0 0.280 0.001 0.002 0.018 8.39% timerfd_settime 94 0 0.138 0.001 0.001 0.007 6.47% gnome-control-c (179406), 1848 events, 20.9% syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ poll 222 0 959.577 0.000 4.322 21.414 11.40% recvmsg 150 0 0.539 0.001 0.004 0.013 5.12% write 300 0 0.442 0.001 0.001 0.007 3.29% read 150 0 0.183 0.001 0.001 0.009 5.53% getpid 102 0 0.101 0.000 0.001 0.008 7.82% root@number:~# Fixes: `54373b5d53` ("perf env: Introduce perf_env__arch_strerrno()") Reported-by: Veronika Molnarova <vmolnaro@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Veronika Molnarova <vmolnaro@redhat.com> Acked-by: Michael Petlan <mpetlan@redhat.com> Tested-by: Michael Petlan <mpetlan@redhat.com> Link: https://lore.kernel.org/r/Z0XffUgNSv_9OjOi@x1 Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2024-12-03 10:07:31 -08:00
Herve Codina	d7dfa7fde6	of: Fix error path in of_parse_phandle_with_args_map() The current code uses some 'goto put;' to cancel the parsing operation and can lead to a return code value of 0 even on error cases. Indeed, some goto calls are done from a loop without setting the ret value explicitly before the goto call and so the ret value can be set to 0 due to operation done in previous loop iteration. For instance match can be set to 0 in the previous loop iteration (leading to a new iteration) but ret can also be set to 0 it the of_property_read_u32() call succeed. In that case if no match are found or if an error is detected the new iteration, the return value can be wrongly 0. Avoid those cases setting the ret value explicitly before the goto calls. Fixes: `bd6f2fd5a1` ("of: Support parsing phandle argument lists through a nexus node") Cc: stable@vger.kernel.org Signed-off-by: Herve Codina <herve.codina@bootlin.com> Link: https://lore.kernel.org/r/20241202165819.158681-1-herve.codina@bootlin.com Signed-off-by: Rob Herring (Arm) <robh@kernel.org>	2024-12-03 11:31:19 -06:00
Rob Herring (Arm)	239521712b	dt-bindings: mtd: fixed-partitions: Fix "compression" typo The example erroneously has "compress" property rather than the documented "compression" property. Acked-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20241113225632.1783241-1-robh@kernel.org Signed-off-by: Rob Herring (Arm) <robh@kernel.org>	2024-12-03 11:31:19 -06:00
Arnd Bergmann	514b2262ad	firmware: arm_scmi: Fix i.MX build dependency The newly added SCMI vendor driver references functions in the protocol driver but needs a Kconfig dependency to ensure it can link, essentially the Kconfig dependency needs to be reversed to match the link time dependency: \| arm-linux-gnueabi-ld: sound/soc/fsl/fsl_mqs.o: in function `fsl_mqs_sm_write': \| fsl_mqs.c:(.text+0x1aa): undefined reference to `scmi_imx_misc_ctrl_set' \| arm-linux-gnueabi-ld: sound/soc/fsl/fsl_mqs.o: in function `fsl_mqs_sm_read': \| fsl_mqs.c:(.text+0x1ee): undefined reference to `scmi_imx_misc_ctrl_get' This however only works after changing the dependency in the SND_SOC_FSL_MQS driver as well, which uses 'select IMX_SCMI_MISC_DRV' to turn on a driver it depends on. This is generally a bad idea, so the best solution is to change that into a dependency. To allow the ASoC driver to keep building with the SCMI support, this needs to be an optional dependency that enforces the link-time dependency if IMX_SCMI_MISC_DRV is a loadable module but not depend on it if that is disabled. Fixes: `61c9f03e22` ("firmware: arm_scmi: Add initial support for i.MX MISC protocol") Fixes: `101c902359` ("ASoC: fsl_mqs: Support accessing registers by scmi interface") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Mark Brown <broonie@kernel.org> Acked-by: Shengjiu Wang <shengjiu.wang@gmail.com> Message-Id: <20241115230555.2435004-1-arnd@kernel.org> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>	2024-12-03 15:47:11 +00:00
Benjamin Lin	819e0f1e58	wifi: mac80211: fix station NSS capability initialization order Station's spatial streaming capability should be initialized before handling VHT OMN, because the handling requires the capability information. Fixes: `a8bca3e937` ("wifi: mac80211: track capability/opmode NSS separately") Signed-off-by: Benjamin Lin <benjamin-jw.lin@mediatek.com> Link: https://patch.msgid.link/20241118080722.9603-1-benjamin-jw.lin@mediatek.com [rewrite subject] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-12-03 11:29:16 +01:00
Felix Fietkau	52cebabb12	wifi: mac80211: fix vif addr when switching from monitor to station Since adding support for opting out of virtual monitor support, a zero vif addr was used to indicate passive vs active monitor to the driver. This would break the vif->addr when changing the netdev mac address before switching the interface from monitor to sta mode. Fix the regression by adding a separate flag to indicate whether vif->addr is valid. Reported-by: syzbot+9ea265d998de25ac6a46@syzkaller.appspotmail.com Fixes: `9d40f7e327` ("wifi: mac80211: add flag to opt out of virtual monitor support") Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://patch.msgid.link/20241115115850.37449-1-nbd@nbd.name Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-12-03 11:28:59 +01:00
Emmanuel Grumbach	11ac0d7c3b	wifi: mac80211: fix a queue stall in certain cases of CSA If we got an unprotected action frame with CSA and then we heard the beacon with the CSA IE, we'll block the queues with the CSA reason twice. Since this reason is refcounted, we won't wake up the queues since we wake them up only once and the ref count will never reach 0. This led to blocked queues that prevented any activity (even disconnection wouldn't reset the queue state and the only way to recover would be to reload the kernel module. Fix this by not refcounting the CSA reason. It becomes now pointless to maintain the csa_blocked_queues state. Remove it. Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Fixes: `414e090bc4` ("wifi: mac80211: restrict public action ECSA frame handling") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219447 Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20241119173108.5ea90828c2cc.I4f89e58572fb71ae48e47a81e74595cac410fbac@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-12-03 11:28:34 +01:00
Emmanuel Grumbach	220bf00053	wifi: mac80211: wake the queues in case of failure in resume In case we fail to resume, we'll WARN with "Hardware became unavailable during restart." and we'll wait until user space does something. It'll typically bring the interface down and up to recover. This won't work though because the queues are still stopped on IEEE80211_QUEUE_STOP_REASON_SUSPEND reason. Make sure we clear that reason so that we give a chance to the recovery to succeed. Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219447 Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20241119173108.cd628f560f97.I76a15fdb92de450e5329940125f3c58916be3942@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-12-03 11:28:33 +01:00
Aditya Kumar Singh	b5c32ff6a3	wifi: cfg80211: clear link ID from bitmap during link delete after clean up Currently, during link deletion, the link ID is first removed from the valid_links bitmap before performing any clean-up operations. However, some functions require the link ID to remain in the valid_links bitmap. One such example is cfg80211_cac_event(). The flow is - nl80211_remove_link() cfg80211_remove_link() ieee80211_del_intf_link() ieee80211_vif_set_links() ieee80211_vif_update_links() ieee80211_link_stop() cfg80211_cac_event() cfg80211_cac_event() requires link ID to be present but it is cleared already in cfg80211_remove_link(). Ultimately, WARN_ON() is hit. Therefore, clear the link ID from the bitmap only after completing the link clean-up. Signed-off-by: Aditya Kumar Singh <quic_adisi@quicinc.com> Link: https://patch.msgid.link/20241121-mlo_dfs_fix-v2-1-92c3bf7ab551@quicinc.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-12-03 11:28:20 +01:00
Haoyu Li	496db69fd8	wifi: mac80211: init cnt before accessing elem in ieee80211_copy_mbssid_beacon With the new __counted_by annocation in cfg80211_mbssid_elems, the "cnt" struct member must be set before accessing the "elem" array. Failing to do so will trigger a runtime warning when enabling CONFIG_UBSAN_BOUNDS and CONFIG_FORTIFY_SOURCE. Fixes: `c14679d700` ("wifi: cfg80211: Annotate struct cfg80211_mbssid_elems with __counted_by") Signed-off-by: Haoyu Li <lihaoyu499@gmail.com> Link: https://patch.msgid.link/20241123172500.311853-1-lihaoyu499@gmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-12-03 11:27:24 +01:00
Issam Hamdi	49dba1ded8	wifi: mac80211: fix mbss changed flags corruption on 32 bit systems On 32-bit systems, the size of an unsigned long is 4 bytes, while a u64 is 8 bytes. Therefore, when using or_each_set_bit(bit, &bits, sizeof(changed) * BITS_PER_BYTE), the code is incorrectly searching for a bit in a 32-bit variable that is expected to be 64 bits in size, leading to incorrect bit finding. Solution: Ensure that the size of the bits variable is correctly adjusted for each architecture. Call Trace: ? show_regs+0x54/0x58 ? __warn+0x6b/0xd4 ? ieee80211_link_info_change_notify+0xcc/0xd4 [mac80211] ? report_bug+0x113/0x150 ? exc_overflow+0x30/0x30 ? handle_bug+0x27/0x44 ? exc_invalid_op+0x18/0x50 ? handle_exception+0xf6/0xf6 ? exc_overflow+0x30/0x30 ? ieee80211_link_info_change_notify+0xcc/0xd4 [mac80211] ? exc_overflow+0x30/0x30 ? ieee80211_link_info_change_notify+0xcc/0xd4 [mac80211] ? ieee80211_mesh_work+0xff/0x260 [mac80211] ? cfg80211_wiphy_work+0x72/0x98 [cfg80211] ? process_one_work+0xf1/0x1fc ? worker_thread+0x2c0/0x3b4 ? kthread+0xc7/0xf0 ? mod_delayed_work_on+0x4c/0x4c ? kthread_complete_and_exit+0x14/0x14 ? ret_from_fork+0x24/0x38 ? kthread_complete_and_exit+0x14/0x14 ? ret_from_fork_asm+0xf/0x14 ? entry_INT80_32+0xf0/0xf0 Signed-off-by: Issam Hamdi <ih@simonwunderlich.de> Link: https://patch.msgid.link/20241125162920.2711462-1-ih@simonwunderlich.de [restore no-op path for no changes] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-12-03 11:26:43 +01:00
Lin Ma	2e3dbf9386	wifi: nl80211: fix NL80211_ATTR_MLO_LINK_ID off-by-one Since the netlink attribute range validation provides inclusive checking, the max of attribute NL80211_ATTR_MLO_LINK_ID should be IEEE80211_MLD_MAX_NUM_LINKS - 1 otherwise causing an off-by-one. One crash stack for demonstration: ================================================================== BUG: KASAN: wild-memory-access in ieee80211_tx_control_port+0x3b6/0xca0 net/mac80211/tx.c:5939 Read of size 6 at addr 001102080000000c by task fuzzer.386/9508 CPU: 1 PID: 9508 Comm: syz.1.386 Not tainted 6.1.70 #2 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x177/0x231 lib/dump_stack.c:106 print_report+0xe0/0x750 mm/kasan/report.c:398 kasan_report+0x139/0x170 mm/kasan/report.c:495 kasan_check_range+0x287/0x290 mm/kasan/generic.c:189 memcpy+0x25/0x60 mm/kasan/shadow.c:65 ieee80211_tx_control_port+0x3b6/0xca0 net/mac80211/tx.c:5939 rdev_tx_control_port net/wireless/rdev-ops.h:761 [inline] nl80211_tx_control_port+0x7b3/0xc40 net/wireless/nl80211.c:15453 genl_family_rcv_msg_doit+0x22e/0x320 net/netlink/genetlink.c:756 genl_family_rcv_msg net/netlink/genetlink.c:833 [inline] genl_rcv_msg+0x539/0x740 net/netlink/genetlink.c:850 netlink_rcv_skb+0x1de/0x420 net/netlink/af_netlink.c:2508 genl_rcv+0x24/0x40 net/netlink/genetlink.c:861 netlink_unicast_kernel net/netlink/af_netlink.c:1326 [inline] netlink_unicast+0x74b/0x8c0 net/netlink/af_netlink.c:1352 netlink_sendmsg+0x882/0xb90 net/netlink/af_netlink.c:1874 sock_sendmsg_nosec net/socket.c:716 [inline] __sock_sendmsg net/socket.c:728 [inline] ____sys_sendmsg+0x5cc/0x8f0 net/socket.c:2499 ___sys_sendmsg+0x21c/0x290 net/socket.c:2553 __sys_sendmsg net/socket.c:2582 [inline] __do_sys_sendmsg net/socket.c:2591 [inline] __se_sys_sendmsg+0x19e/0x270 net/socket.c:2589 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x45/0x90 arch/x86/entry/common.c:81 entry_SYSCALL_64_after_hwframe+0x63/0xcd Update the policy to ensure correct validation. Fixes: `7b0a0e3c3a` ("wifi: cfg80211: do some rework towards MLO link APIs") Signed-off-by: Lin Ma <linma@zju.edu.cn> Suggested-by: Cengiz Can <cengiz.can@canonical.com> Link: https://patch.msgid.link/20241130170526.96698-1-linma@zju.edu.cn Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-12-03 11:25:41 +01:00
Fabio Estevam	b77bd3ba76	ARM: imx: Re-introduce the PINCTRL selection Since commit `17d2100189` ("ARM: imx: Allow user to disable pinctrl"), the CONFIG_PINCTRL option is no longer implicitly selected, causing several i.MX SoC pinctrl drivers no longer getting selected by default. This causes boot regressions on the ARMv4, ARMv5, ARMv6 and ARMv7 i.MX SoCs. Fix it by selecting CONFIG_PINCTRL as before. This defeats the purpose of 7d210018914 ("ARM: imx: Allow user to disable pinctrl"), but it is the less invasive fix for the boot regressions. The attempt to build Layerscape without pinctrl can still be explored later as suggested by Arnd: "Overall, my best advice here is still to not change the way i.MX pinctrl works at all, but just fix Layerscape to not depend on i.MX. The reason for the 'select' here is clearly that the i.MX machines would fail to boot without pinctrl, and changing that because of Layerscape seems backwards." Fixes: `17d2100189` ("ARM: imx: Allow user to disable pinctrl") Reported-by: Guenter Roeck <linux@roeck-us.net> Closes: https://lore.kernel.org/linux-arm-kernel/49ff070a-ce67-42d7-84ec-8b54fd7e9742@roeck-us.net/ Signed-off-by: Fabio Estevam <festevam@denx.de> Acked-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Tested-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/20241127190605.1367157-1-festevam@gmail.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>	2024-12-03 10:58:42 +01:00
Brahmajit Das	989e0cdc0f	fs/qnx6: Fix building with GCC 15 qnx6_checkroot() had been using weirdly spelled initializer - it needed to initialize 3-element arrays of char and it used NUL-padded 3-character string literals (i.e. 4-element initializers, with completely pointless zeroes at the end). That had been spotted by gcc-15[]; prior to that gcc quietly dropped the 4th element of initializers. However, none of that had been needed in the first place - all this array is used for is checking that the first directory entry in root directory is "." and the second - "..". The check had been expressed as a loop, using that match_root[] array. Since there is no chance that we ever want to extend that list of entries, the entire thing is much too fancy for its own good; what we need is just a couple of explicit memcmp() and that's it. []: fs/qnx6/inode.c: In function ‘qnx6_checkroot’: fs/qnx6/inode.c:182:41: error: initializer-string for array of ‘char’ is too long [-Werror=unterminated-string-initialization] 182 \| static char match_root[2][3] = {".\0\0", "..\0"}; \| ^~~~~~~ fs/qnx6/inode.c:182:50: error: initializer-string for array of ‘char’ is too long [-Werror=unterminated-string-initialization] 182 \| static char match_root[2][3] = {".\0\0", "..\0"}; \| ^~~~~~ Signed-off-by: Brahmajit Das <brahmajit.xyz@gmail.com> Link: https://lore.kernel.org/r/20241004195132.1393968-1-brahmajit.xyz@gmail.com Acked-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-12-03 10:40:36 +01:00
Jerome Brunet	5ae1a43486	clk: amlogic: axg-audio: revert reset implementation The audio subsystem of axg based platform is not probing anymore. This is due to the introduction of RESET_MESON_AUX and the config not being enabled with the default arm64 defconfig. This brought another discussion around proper decoupling between the clock and reset part. While this discussion gets sorted out, revert back to the initial implementation. This reverts * commit `681ed497d6` ("clk: amlogic: axg-audio: fix Kconfig dependency on RESET_MESON_AUX") * commit `664988eb47` ("clk: amlogic: axg-audio: use the auxiliary reset driver") Both are reverted with single change to avoid creating more compilation problems. Fixes: `681ed497d6` ("clk: amlogic: axg-audio: fix Kconfig dependency on RESET_MESON_AUX") Cc: Arnd Bergmann <arnd@arndb.de> Reported-by: Mark Brown <broonie@kernel.org> Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> Link: https://lore.kernel.org/r/20241128-clk-audio-fix-rst-missing-v2-1-cf437d1a73da@baylibre.com Signed-off-by: Stephen Boyd <sboyd@kernel.org>	2024-12-02 17:21:03 -08:00
Johan Hovold	06fec99d4d	Revert "clk: Fix invalid execution of clk_set_rate" This reverts commit `25f1c96a0e`. The offending commit results in errors like cpu cpu0: _opp_config_clk_single: failed to set clock rate: -22 spamming the logs on the Lenovo ThinkPad X13s and other Qualcomm machines when cpufreq tries to update the CPUFreq HW Engine clocks. As mentioned in commit `4370232c72` ("cpufreq: qcom-hw: Add CPU clock provider support"): [T]he frequency supplied by the driver is the actual frequency that comes out of the EPSS/OSM block after the DCVS operation. This frequency is not same as what the CPUFreq framework has set but it is the one that gets supplied to the CPUs after throttling by LMh. which seems to suggest that the driver relies on the previous behaviour of clk_set_rate(). Since this affects many Qualcomm machines, let's revert for now. Fixes: `25f1c96a0e` ("clk: Fix invalid execution of clk_set_rate") Reported-by: Aishwarya TCV <aishwarya.tcv@arm.com> Link: https://lore.kernel.org/all/e2d83e57-ad07-411b-99f6-a4fc3c4534fa@arm.com/ Cc: Chuan Liu <chuan.liu@amlogic.com> Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Link: https://lore.kernel.org/r/20241202100621.29209-1-johan+linaro@kernel.org Signed-off-by: Stephen Boyd <sboyd@kernel.org>	2024-12-02 17:20:12 -08:00
FUKAUMI Naoki	2ddd93481b	arm64: dts: rockchip: rename rfkill label for Radxa ROCK 5B on ROCK 5B, there is no PCIe slot, instead there is a M.2 slot. rfkill pin is not exclusive to PCIe devices, there is SDIO Wi-Fi devices. rename rfkill label from "rfkill-pcie-wlan" to "rfkill-m2-wlan", it matches with rfkill-bt. Fixes: `82d40b141a` ("arm64: dts: rockchip: add rfkill node for M.2 Key E WiFi on rock-5b") Reviewed-by: Dragan Simic <dsimic@manjaro.org> Signed-off-by: FUKAUMI Naoki <naoki@radxa.com> Fixes: `82d40b141a` ("arm64: dts: rockchip: add rfkill node for M.2 Key E WiFi on rock-5b") Link: https://lore.kernel.org/r/20241128120631.37458-1-naoki@radxa.com Signed-off-by: Heiko Stuebner <heiko@sntech.de>	2024-12-03 00:28:23 +01:00
Chukun Pan	8b9c12757f	arm64: dts: rockchip: add reset-names for combphy on rk3568 The reset-names of combphy are missing, add it. Signed-off-by: Chukun Pan <amadeus@jmu.edu.cn> Fixes: `fd3ac6e804` ("dt-bindings: phy: rockchip: rk3588 has two reset lines") Link: https://lore.kernel.org/r/20241122073006.99309-1-amadeus@jmu.edu.cn Signed-off-by: Heiko Stuebner <heiko@sntech.de>	2024-12-03 00:25:39 +01:00
James Clark	f54cd8f43f	perf test: Don't signal all processes on system when interrupting tests This signal handler loops over all tests on ctrl-C, but it's active while the test list is being constructed. process.pid is 0, then -1, then finally set to the child pid on fork. If the Ctrl-C is received during this point a kill(-1, SIGINT) can be sent which affects all processes. Make sure the child has forked first before forwarding the signal. This can be reproduced with ctrl-C immediately after launching perf test which terminates the ssh connection. Fixes: `553d5efeb3` ("perf test: Add a signal handler to kill forked child processes") Signed-off-by: James Clark <james.clark@linaro.org> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20241129151948.3199732-1-james.clark@linaro.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2024-12-02 12:36:35 -08:00
Namhyung Kim	23c44f6c83	perf tools: Fix build-id event recording The build-id events written at the end of the record session are broken due to unexpected data. The write_buildid() writes the fixed length event first and then variable length filename. But a recent change made it write more data in the padding area accidentally. So readers of the event see zero-filled data for the next entry and treat it incorrectly. This resulted in wrong kernel symbols because the kernel DSO loaded a random vmlinux image in the path as it didn't have a valid build-id. Fixes: `ae39ba1655` ("perf inject: Fix build ID injection") Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/Z0aRFFW9xMh3mqKB@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2024-12-02 12:36:20 -08:00
Binbin Zhou	4b65d5322e	dmaengine: loongson2-apb: Change GENMASK to GENMASK_ULL Fix the following smatch static checker warning: drivers/dma/loongson2-apb-dma.c:189 ls2x_dma_write_cmd() warn: was expecting a 64 bit value instead of '~(((0)) + (((~((0))) - (((1)) << (0)) + 1) & (~((0)) >> ((8 * 4) - 1 - (4)))))' The GENMASK macro used "unsigned long", which caused build issues when using a 32-bit toolchain because it would try to access bits > 31. This patch switches GENMASK to GENMASK_ULL, which uses "unsigned long long". Fixes: `71e7d3cb6e` ("dmaengine: ls2x-apb: New driver for the Loongson LS2X APB DMA controller") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/all/87cdc025-7246-4548-85ca-3d36fdc2be2d@stanley.mountain/ Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn> Link: https://lore.kernel.org/r/20241028093413.1145820-1-zhoubinbin@loongson.cn Signed-off-by: Vinod Koul <vkoul@kernel.org>	2024-12-02 22:48:57 +05:30
Andy Shevchenko	f0e870a0e9	dmaengine: dw: Select only supported masters for ACPI devices The recently submitted fix-commit revealed a problem in the iDMA 32-bit platform code. Even though the controller supported only a single master the dw_dma_acpi_filter() method hard-coded two master interfaces with IDs 0 and 1. As a result the sanity check implemented in the commit `b336268dde` ("dmaengine: dw: Add peripheral bus width verification") got incorrect interface data width and thus prevented the client drivers from configuring the DMA-channel with the EINVAL error returned. E.g., the next error was printed for the PXA2xx SPI controller driver trying to configure the requested channels: > [ 164.525604] pxa2xx_spi_pci 0000:00:07.1: DMA slave config failed > [ 164.536105] pxa2xx_spi_pci 0000:00:07.1: failed to get DMA TX descriptor > [ 164.543213] spidev spi-SPT0001:00: SPI transfer failed: -16 The problem would have been spotted much earlier if the iDMA 32-bit controller supported more than one master interfaces. But since it supports just a single master and the iDMA 32-bit specific code just ignores the master IDs in the CTLLO preparation method, the issue has been gone unnoticed so far. Fix the problem by specifying the default master ID for both memory and peripheral devices in the driver data. Thus the issue noticed for the iDMA 32-bit controllers will be eliminated and the ACPI-probed DW DMA controllers will be configured with the correct master ID by default. Cc: stable@vger.kernel.org Fixes: `b336268dde` ("dmaengine: dw: Add peripheral bus width verification") Fixes: `199244d694` ("dmaengine: dw: add support of iDMA 32-bit hardware") Reported-by: Ferry Toth <fntoth@gmail.com> Closes: https://lore.kernel.org/dmaengine/ZuXbCKUs1iOqFu51@black.fi.intel.com/ Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Closes: https://lore.kernel.org/dmaengine/ZuXgI-VcHpMgbZ91@black.fi.intel.com/ Tested-by: Ferry Toth <fntoth@gmail.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20241104095142.157925-1-andriy.shevchenko@linux.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2024-12-02 22:35:45 +05:30
Chen Ridong	c43ec96e8d	dmaengine: at_xdmac: avoid null_prt_deref in at_xdmac_prep_dma_memset The at_xdmac_memset_create_desc may return NULL, which will lead to a null pointer dereference. For example, the len input is error, or the atchan->free_descs_list is empty and memory is exhausted. Therefore, add check to avoid this. Fixes: `b206d9a23a` ("dmaengine: xdmac: Add memset support") Signed-off-by: Chen Ridong <chenridong@huawei.com> Link: https://lore.kernel.org/r/20241029082845.1185380-1-chenridong@huaweicloud.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2024-12-02 22:14:19 +05:30
Herve Codina	60bc447c85	of: Add #address-cells/#size-cells in the device-tree root empty node On systems where ACPI is enabled or when a device-tree is not passed to the kernel by the bootloader, a device-tree root empty node is created. This device-tree root empty node does not have the #address-cells and the #size-cells properties This leads to the use of the default address cells and size cells values which are defined in the code to 1 for the address cells value and 1 for the size cells value. According to the devicetree specification and the OpenFirmware standard (IEEE 1275-1994) the default value for #address-cells should be 2. Also, according to the devicetree specification, the #address-cells and the #size-cells are required properties in the root node. The device tree compiler already uses 2 as default value for address cells and 1 for size cells. The powerpc PROM code also uses 2 as default value for address cells and 1 for size cells. Modern implementation should have the #address-cells and the #size-cells properties set and should not rely on default values. On x86, this root empty node is used and the code default values are used. In preparation of the support for device-tree overlay on PCI devices feature on x86 (i.e. the creation of the PCI root bus device-tree node), the default value for #address-cells needs to be updated. Indeed, on x86_64, addresses are on 64bits and the upper part of an address is needed for correct address translations. On x86_32 having the default value updated does not lead to issues while the upper part of a 64-bit value is zero. Changing the default value for all architectures may break device-tree compatibility. Indeed, existing dts file without the #address-cells property set in the root node will not be compatible with this modification. Instead of updating default values, add both required #address-cells and #size-cells properties in the device-tree empty node. Use 2 for both properties value in order to fully support 64-bit addresses and sizes on systems using this empty root node. Signed-off-by: Herve Codina <herve.codina@bootlin.com> Link: https://lore.kernel.org/r/20241202131522.142268-6-herve.codina@bootlin.com Signed-off-by: Rob Herring (Arm) <robh@kernel.org>	2024-12-02 09:26:33 -06:00
Leo Stone	b905bafdea	hfs: Sanity check the root record In the syzbot reproducer, the hfs_cat_rec for the root dir has type HFS_CDR_FIL after being read with hfs_bnode_read() in hfs_super_fill(). This indicates it should be used as an hfs_cat_file, which is 102 bytes. Only the first 70 bytes of that struct are initialized, however, because the entrylength passed into hfs_bnode_read() is still the length of a directory record. This causes uninitialized values to be used later on, when the hfs_cat_rec union is treated as the larger hfs_cat_file struct. Add a check to make sure the retrieved record has the correct type for the root directory (HFS_CDR_DIR), and make sure we load the correct number of bytes for a directory record. Reported-by: syzbot+2db3c7526ba68f4ea776@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=2db3c7526ba68f4ea776 Tested-by: syzbot+2db3c7526ba68f4ea776@syzkaller.appspotmail.com Tested-by: Leo Stone <leocstone@gmail.com> Signed-off-by: Leo Stone <leocstone@gmail.com> Link: https://lore.kernel.org/r/20241201051420.77858-1-leocstone@gmail.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-12-02 15:32:19 +01:00
Thomas Hellström	0302d2fd6e	locking/ww_mutex: Fix ww_mutex dummy lockdep map selftest warnings The below commit introduces a dummy lockdep map, but didn't get the initialization quite right (it should mimic the initialization of the real ww_mutex lockdep maps). It also introduced a separate locking api selftest failure. Fix these. Closes: https://lore.kernel.org/lkml/Zw19sMtnKdyOVQoh@boqun-archlinux/ Fixes: `823a566221` ("locking/ww_mutex: Adjust to lockdep nest_lock requirements") Reported-by: Boqun Feng <boqun.feng@gmail.com> Suggested-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20241127085430.3045-1-thomas.hellstrom@linux.intel.com	2024-12-02 12:16:57 +01:00
Kan Liang	9f3de72a0c	perf/x86/intel/ds: Unconditionally drain PEBS DS when changing PEBS_DATA_CFG The PEBS kernel warnings can still be observed with the below case. when the below commands are running in parallel for a while. while true; do perf record --no-buildid -a --intr-regs=AX \ -e cpu/event=0xd0,umask=0x81/pp \ -c 10003 -o /dev/null ./triad; done & while true; do perf record -e 'cpu/mem-loads,ldlat=3/uP' -W -d -- ./dtlb done The commit `b752ea0c28` ("perf/x86/intel/ds: Flush PEBS DS when changing PEBS_DATA_CFG") intends to flush the entire PEBS buffer before the hardware is reprogrammed. However, it fails in the above case. The first perf command utilizes the large PEBS, while the second perf command only utilizes a single PEBS. When the second perf event is added, only the n_pebs++. The intel_pmu_pebs_enable() is invoked after intel_pmu_pebs_add(). So the cpuc->n_pebs == cpuc->n_large_pebs check in the intel_pmu_drain_large_pebs() fails. The PEBS DS is not flushed. The new PEBS event should not be taken into account when flushing the existing PEBS DS. The check is unnecessary here. Before the hardware is reprogrammed, all the stale records must be drained unconditionally. For single PEBS or PEBS-vi-pt, the DS must be empty. The drain_pebs() can handle the empty case. There is no harm to unconditionally drain the PEBS DS. Fixes: `b752ea0c28` ("perf/x86/intel/ds: Flush PEBS DS when changing PEBS_DATA_CFG") Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20241119135504.1463839-2-kan.liang@linux.intel.com	2024-12-02 12:01:33 +01:00
Kan Liang	4e54ed4963	perf/x86/intel: Add Arrow Lake U support From PMU's perspective, the new Arrow Lake U is the same as the Meteor Lake. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20241121180526.2364759-1-kan.liang@linux.intel.com	2024-12-02 12:01:33 +01:00
John Stultz	82f9cc0949	locking: rtmutex: Fix wake_q logic in task_blocks_on_rt_mutex Anders had bisected a crash using PREEMPT_RT with linux-next and isolated it down to commit `894d1b3db4` ("locking/mutex: Remove wakeups from under mutex::wait_lock"), where it seemed the wake_q structure was somehow getting corrupted causing a null pointer traversal. I was able to easily repoduce this with PREEMPT_RT and managed to isolate down that through various call stacks we were actually calling wake_up_q() twice on the same wake_q. I found that in the problematic commit, I had added the wake_up_q() call in task_blocks_on_rt_mutex() around __ww_mutex_add_waiter(), following a similar pattern in __mutex_lock_common(). However, its just wrong. We haven't dropped the lock->wait_lock, so its contrary to the point of the original patch. And it didn't match the __mutex_lock_common() logic of re-initializing the wake_q after calling it midway in the stack. Looking at it now, the wake_up_q() call is incorrect and should just be removed. So drop the erronious logic I had added. Fixes: `894d1b3db4` ("locking/mutex: Remove wakeups from under mutex::wait_lock") Closes: https://lore.kernel.org/lkml/6afb936f-17c7-43fa-90e0-b9e780866097@app.fastmail.com/ Reported-by: Anders Roxell <anders.roxell@linaro.org> Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: John Stultz <jstultz@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Juri Lelli <juri.lelli@redhat.com> Tested-by: Anders Roxell <anders.roxell@linaro.org> Tested-by: K Prateek Nayak <kprateek.nayak@amd.com> Link: https://lore.kernel.org/r/20241114190051.552665-1-jstultz@google.com	2024-12-02 12:01:29 +01:00
Wander Lairson Costa	0664e2c311	sched/deadline: Fix warning in migrate_enable for boosted tasks When running the following command: while true; do stress-ng --cyclic 30 --timeout 30s --minimize --quiet done a warning is eventually triggered: WARNING: CPU: 43 PID: 2848 at kernel/sched/deadline.c:794 setup_new_dl_entity+0x13e/0x180 ... Call Trace: <TASK> ? show_trace_log_lvl+0x1c4/0x2df ? enqueue_dl_entity+0x631/0x6e0 ? setup_new_dl_entity+0x13e/0x180 ? __warn+0x7e/0xd0 ? report_bug+0x11a/0x1a0 ? handle_bug+0x3c/0x70 ? exc_invalid_op+0x14/0x70 ? asm_exc_invalid_op+0x16/0x20 enqueue_dl_entity+0x631/0x6e0 enqueue_task_dl+0x7d/0x120 __do_set_cpus_allowed+0xe3/0x280 __set_cpus_allowed_ptr_locked+0x140/0x1d0 __set_cpus_allowed_ptr+0x54/0xa0 migrate_enable+0x7e/0x150 rt_spin_unlock+0x1c/0x90 group_send_sig_info+0xf7/0x1a0 ? kill_pid_info+0x1f/0x1d0 kill_pid_info+0x78/0x1d0 kill_proc_info+0x5b/0x110 __x64_sys_kill+0x93/0xc0 do_syscall_64+0x5c/0xf0 entry_SYSCALL_64_after_hwframe+0x6e/0x76 RIP: 0033:0x7f0dab31f92b This warning occurs because set_cpus_allowed dequeues and enqueues tasks with the ENQUEUE_RESTORE flag set. If the task is boosted, the warning is triggered. A boosted task already had its parameters set by rt_mutex_setprio, and a new call to setup_new_dl_entity is unnecessary, hence the WARN_ON call. Check if we are requeueing a boosted task and avoid calling setup_new_dl_entity if that's the case. Fixes: `295d6d5e37` ("sched/deadline: Fix switching to -deadline") Signed-off-by: Wander Lairson Costa <wander@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Juri Lelli <juri.lelli@redhat.com> Link: https://lore.kernel.org/r/20240724142253.27145-2-wander@redhat.com	2024-12-02 12:01:29 +01:00
Sebastian Andrzej Siewior	f66e4a9965	sched/core: Update kernel boot parameters for LAZY preempt. Update the documentation for the `preempt=' parameter which now also accepts `lazy'. Fixes: `7c70cb94d2` ("sched: Add Lazy preemption model") Reported-by: Shrikanth Hegde <sshegde@linux.ibm.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Shrikanth Hegde <sshegde@linux.ibm.com> Link: https://lore.kernel.org/r/20241122173557.MYOtT95Q@linutronix.de	2024-12-02 12:01:28 +01:00
K Prateek Nayak	e932c4ab38	sched/core: Prevent wakeup of ksoftirqd during idle load balance Scheduler raises a SCHED_SOFTIRQ to trigger a load balancing event on from the IPI handler on the idle CPU. If the SMP function is invoked from an idle CPU via flush_smp_call_function_queue() then the HARD-IRQ flag is not set and raise_softirq_irqoff() needlessly wakes ksoftirqd because soft interrupts are handled before ksoftirqd get on the CPU. Adding a trace_printk() in nohz_csd_func() at the spot of raising SCHED_SOFTIRQ and enabling trace events for sched_switch, sched_wakeup, and softirq_entry (for SCHED_SOFTIRQ vector alone) helps observing the current behavior: <idle>-0 [000] dN.1.: nohz_csd_func: Raising SCHED_SOFTIRQ from nohz_csd_func <idle>-0 [000] dN.4.: sched_wakeup: comm=ksoftirqd/0 pid=16 prio=120 target_cpu=000 <idle>-0 [000] .Ns1.: softirq_entry: vec=7 [action=SCHED] <idle>-0 [000] .Ns1.: softirq_exit: vec=7 [action=SCHED] <idle>-0 [000] d..2.: sched_switch: prev_comm=swapper/0 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=ksoftirqd/0 next_pid=16 next_prio=120 ksoftirqd/0-16 [000] d..2.: sched_switch: prev_comm=ksoftirqd/0 prev_pid=16 prev_prio=120 prev_state=S ==> next_comm=swapper/0 next_pid=0 next_prio=120 ... Use __raise_softirq_irqoff() to raise the softirq. The SMP function call is always invoked on the requested CPU in an interrupt handler. It is guaranteed that soft interrupts are handled at the end. Following are the observations with the changes when enabling the same set of events: <idle>-0 [000] dN.1.: nohz_csd_func: Raising SCHED_SOFTIRQ for nohz_idle_balance <idle>-0 [000] dN.1.: softirq_raise: vec=7 [action=SCHED] <idle>-0 [000] .Ns1.: softirq_entry: vec=7 [action=SCHED] No unnecessary ksoftirqd wakeups are seen from idle task's context to service the softirq. Fixes: `b2a02fc43a` ("smp: Optimize send_call_function_single_ipi()") Closes: https://lore.kernel.org/lkml/fcf823f-195e-6c9a-eac3-25f870cb35ac@inria.fr/ [1] Reported-by: Julia Lawall <julia.lawall@inria.fr> Suggested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://lore.kernel.org/r/20241119054432.6405-5-kprateek.nayak@amd.com	2024-12-02 12:01:28 +01:00
K Prateek Nayak	ff47a0acfc	sched/fair: Check idle_cpu() before need_resched() to detect ilb CPU turning busy Commit `b2a02fc43a` ("smp: Optimize send_call_function_single_ipi()") optimizes IPIs to idle CPUs in TIF_POLLING_NRFLAG mode by setting the TIF_NEED_RESCHED flag in idle task's thread info and relying on flush_smp_call_function_queue() in idle exit path to run the call-function. A softirq raised by the call-function is handled shortly after in do_softirq_post_smp_call_flush() but the TIF_NEED_RESCHED flag remains set and is only cleared later when schedule_idle() calls __schedule(). need_resched() check in _nohz_idle_balance() exists to bail out of load balancing if another task has woken up on the CPU currently in-charge of idle load balancing which is being processed in SCHED_SOFTIRQ context. Since the optimization mentioned above overloads the interpretation of TIF_NEED_RESCHED, check for idle_cpu() before going with the existing need_resched() check which can catch a genuine task wakeup on an idle CPU processing SCHED_SOFTIRQ from do_softirq_post_smp_call_flush(), as well as the case where ksoftirqd needs to be preempted as a result of new task wakeup or slice expiry. In case of PREEMPT_RT or threadirqs, although the idle load balancing may be inhibited in some cases on the ilb CPU, the fact that ksoftirqd is the only fair task going back to sleep will trigger a newidle balance on the CPU which will alleviate some imbalance if it exists if idle balance fails to do so. Fixes: `b2a02fc43a` ("smp: Optimize send_call_function_single_ipi()") Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20241119054432.6405-4-kprateek.nayak@amd.com	2024-12-02 12:01:28 +01:00
K Prateek Nayak	ea9cffc0a1	sched/core: Remove the unnecessary need_resched() check in nohz_csd_func() The need_resched() check currently in nohz_csd_func() can be tracked to have been added in scheduler_ipi() back in 2011 via commit `ca38062e57` ("sched: Use resched IPI to kick off the nohz idle balance") Since then, it has travelled quite a bit but it seems like an idle_cpu() check currently is sufficient to detect the need to bail out from an idle load balancing. To justify this removal, consider all the following case where an idle load balancing could race with a task wakeup: o Since commit `f3dd3f6745` ("sched: Remove the limitation of WF_ON_CPU on wakelist if wakee cpu is idle") a target perceived to be idle (target_rq->nr_running == 0) will return true for ttwu_queue_cond(target) which will offload the task wakeup to the idle target via an IPI. In all such cases target_rq->ttwu_pending will be set to 1 before queuing the wake function. If an idle load balance races here, following scenarios are possible: - The CPU is not in TIF_POLLING_NRFLAG mode in which case an actual IPI is sent to the CPU to wake it out of idle. If the nohz_csd_func() queues before sched_ttwu_pending(), the idle load balance will bail out since idle_cpu(target) returns 0 since target_rq->ttwu_pending is 1. If the nohz_csd_func() is queued after sched_ttwu_pending() it should see rq->nr_running to be non-zero and bail out of idle load balancing. - The CPU is in TIF_POLLING_NRFLAG mode and instead of an actual IPI, the sender will simply set TIF_NEED_RESCHED for the target to put it out of idle and flush_smp_call_function_queue() in do_idle() will execute the call function. Depending on the ordering of the queuing of nohz_csd_func() and sched_ttwu_pending(), the idle_cpu() check in nohz_csd_func() should either see target_rq->ttwu_pending = 1 or target_rq->nr_running to be non-zero if there is a genuine task wakeup racing with the idle load balance kick. o The waker CPU perceives the target CPU to be busy (targer_rq->nr_running != 0) but the CPU is in fact going idle and due to a series of unfortunate events, the system reaches a case where the waker CPU decides to perform the wakeup by itself in ttwu_queue() on the target CPU but target is concurrently selected for idle load balance (XXX: Can this happen? I'm not sure, but we'll consider the mother of all coincidences to estimate the worst case scenario). ttwu_do_activate() calls enqueue_task() which would increment "rq->nr_running" post which it calls wakeup_preempt() which is responsible for setting TIF_NEED_RESCHED (via a resched IPI or by setting TIF_NEED_RESCHED on a TIF_POLLING_NRFLAG idle CPU) The key thing to note in this case is that rq->nr_running is already non-zero in case of a wakeup before TIF_NEED_RESCHED is set which would lead to idle_cpu() check returning false. In all cases, it seems that need_resched() check is unnecessary when checking for idle_cpu() first since an impending wakeup racing with idle load balancer will either set the "rq->ttwu_pending" or indicate a newly woken task via "rq->nr_running". Chasing the reason why this check might have existed in the first place, I came across Peter's suggestion on the fist iteration of Suresh's patch from 2011 [1] where the condition to raise the SCHED_SOFTIRQ was: sched_ttwu_do_pending(list); if (unlikely((rq->idle == current) && rq->nohz_balance_kick && !need_resched())) raise_softirq_irqoff(SCHED_SOFTIRQ); Since the condition to raise the SCHED_SOFIRQ was preceded by sched_ttwu_do_pending() (which is equivalent of sched_ttwu_pending()) in the current upstream kernel, the need_resched() check was necessary to catch a newly queued task. Peter suggested modifying it to: if (idle_cpu() && rq->nohz_balance_kick && !need_resched()) raise_softirq_irqoff(SCHED_SOFTIRQ); where idle_cpu() seems to have replaced "rq->idle == current" check. Even back then, the idle_cpu() check would have been sufficient to catch a new task being enqueued. Since commit `b2a02fc43a` ("smp: Optimize send_call_function_single_ipi()") overloads the interpretation of TIF_NEED_RESCHED for TIF_POLLING_NRFLAG idling, remove the need_resched() check in nohz_csd_func() to raise SCHED_SOFTIRQ based on Peter's suggestion. Fixes: `b2a02fc43a` ("smp: Optimize send_call_function_single_ipi()") Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20241119054432.6405-3-kprateek.nayak@amd.com	2024-12-02 12:01:28 +01:00
K Prateek Nayak	6675ce2004	softirq: Allow raising SCHED_SOFTIRQ from SMP-call-function on RT kernel do_softirq_post_smp_call_flush() on PREEMPT_RT kernels carries a WARN_ON_ONCE() for any SOFTIRQ being raised from an SMP-call-function. Since do_softirq_post_smp_call_flush() is called with preempt disabled, raising a SOFTIRQ during flush_smp_call_function_queue() can lead to longer preempt disabled sections. Since commit `b2a02fc43a` ("smp: Optimize send_call_function_single_ipi()") IPIs to an idle CPU in TIF_POLLING_NRFLAG mode can be optimized out by instead setting TIF_NEED_RESCHED bit in idle task's thread_info and relying on the flush_smp_call_function_queue() in the idle-exit path to run the SMP-call-function. To trigger an idle load balancing, the scheduler queues nohz_csd_function() responsible for triggering an idle load balancing on a target nohz idle CPU and sends an IPI. Only now, this IPI is optimized out and the SMP-call-function is executed from flush_smp_call_function_queue() in do_idle() which can raise a SCHED_SOFTIRQ to trigger the balancing. So far, this went undetected since, the need_resched() check in nohz_csd_function() would make it bail out of idle load balancing early as the idle thread does not clear TIF_POLLING_NRFLAG before calling flush_smp_call_function_queue(). The need_resched() check was added with the intent to catch a new task wakeup, however, it has recently discovered to be unnecessary and will be removed in the subsequent commit after which nohz_csd_function() can raise a SCHED_SOFTIRQ from flush_smp_call_function_queue() to trigger an idle load balance on an idle target in TIF_POLLING_NRFLAG mode. nohz_csd_function() bails out early if "idle_cpu()" check for the target CPU, and does not lock the target CPU's rq until the very end, once it has found tasks to run on the CPU and will not inhibit the wakeup of, or running of a newly woken up higher priority task. Account for this and prevent a WARN_ON_ONCE() when SCHED_SOFTIRQ is raised from flush_smp_call_function_queue(). Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20241119054432.6405-2-kprateek.nayak@amd.com	2024-12-02 12:01:27 +01:00
Josh Don	70ee7947a2	sched: fix warning in sched_setaffinity Commit `8f9ea86fdf` added some logic to sched_setaffinity that included a WARN when a per-task affinity assignment races with a cpuset update. Specifically, we can have a race where a cpuset update results in the task affinity no longer being a subset of the cpuset. That's fine; we have a fallback to instead use the cpuset mask. However, we have a WARN set up that will trigger if the cpuset mask has no overlap at all with the requested task affinity. This shouldn't be a warning condition; its trivial to create this condition. Reproduced the warning by the following setup: - $PID inside a cpuset cgroup - another thread repeatedly switching the cpuset cpus from 1-2 to just 1 - another thread repeatedly setting the $PID affinity (via taskset) to 2 Fixes: `8f9ea86fdf` ("sched: Always preserve the user requested cpumask") Signed-off-by: Josh Don <joshdon@google.com> Acked-and-tested-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Waiman Long <longman@redhat.com> Tested-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com> Link: https://lkml.kernel.org/r/20241111182738.1832953-1-joshdon@google.com	2024-12-02 12:01:27 +01:00
Juri Lelli	22368fe1f9	sched/deadline: Fix replenish_dl_new_period dl_server condition The condition in replenish_dl_new_period() that checks if a reservation (dl_server) is deferred and is not handling a starvation case is obviously wrong. Fix it. Fixes: `a110a81c52` ("sched/deadline: Deferrable dl server") Signed-off-by: Juri Lelli <juri.lelli@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20241127063740.8278-1-juri.lelli@redhat.com	2024-12-02 12:01:27 +01:00
Johan Hovold	098d837403	bus: mhi: host: pci_generic: fix MHI BAR mapping A recent change converting the MHI pci_generic driver to use pcim_iomap_region() failed to update the BAR parameter which is an index rather than a mask. This specifically broke the modem on machines like the Lenovo ThinkPad X13s and x1e80100 CRD: mhi-pci-generic 0004:01:00.0: failed to map pci region: -22 mhi-pci-generic 0004:01:00.0: probe with driver mhi-pci-generic failed with error -22 Fixes: `bd23e83642` ("bus: mhi: host: pci_generic: Use pcim_iomap_region() to request and map MHI BAR") Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Cc: Mayank Rana <quic_mrana@quicinc.com> Link: https://lore.kernel.org/r/20241201171120.31616-1-johan+linaro@kernel.org	2024-12-02 15:14:33 +05:30
Manivannan Sadhasivam	e60b14f47d	arm64: dts: qcom: sa8775p: Fix the size of 'addr_space' regions For both the controller instances, size of the 'addr_space' region should be 0x1fe00000 as per the hardware memory layout. Otherwise, endpoint drivers cannot request even reasonable BAR size of 1MB. Cc: stable@vger.kernel.org # 6.11 Fixes: `c5f5de8434` ("arm64: dts: qcom: sa8775p: Add ep pcie1 controller node") Fixes: `1924f55182` ("arm64: dts: qcom: sa8775p: Add ep pcie0 controller node") Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Link: https://lore.kernel.org/r/20241128145147.145618-1-manivannan.sadhasivam@linaro.org Signed-off-by: Bjorn Andersson <andersson@kernel.org>	2024-12-01 21:58:36 -06:00
Rob Herring (Arm)	61a6ba233f	dt-bindings: Unify "fsl,liodn" type definitions The type definition of "fsl,liodn" is defined as uint32 in crypto/fsl,sec-v4.0.yaml and uint32-array in soc/fsl/fsl,bman.yaml, soc/fsl/fsl,qman-portal.yaml, and soc/fsl/fsl,qman.yaml. Unify the type to be uint32-array and constraint the single entry cases. Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20241113225614.1782862-1-robh@kernel.org Signed-off-by: Rob Herring (Arm) <robh@kernel.org>	2024-11-27 09:24:23 -06:00
Andrea della Porta	7f05e20b98	of: address: Preserve the flags portion on 1:1 dma-ranges mapping A missing or empty dma-ranges in a DT node implies a 1:1 mapping for dma translations. In this specific case, the current behaviour is to zero out the entire specifier so that the translation could be carried on as an offset from zero. This includes address specifier that has flags (e.g. PCI ranges). Once the flags portion has been zeroed, the translation chain is broken since the mapping functions will check the upcoming address specifier against mismatching flags, always failing the 1:1 mapping and its entire purpose of always succeeding. Set to zero only the address portion while passing the flags through. Fixes: `dbbdee9473` ("of/address: Merge all of the bus translation code") Cc: stable@vger.kernel.org Signed-off-by: Andrea della Porta <andrea.porta@suse.com> Tested-by: Herve Codina <herve.codina@bootlin.com> Link: https://lore.kernel.org/r/e51ae57874e58a9b349c35e2e877425ebc075d7a.1732441813.git.andrea.porta@suse.com Signed-off-by: Rob Herring (Arm) <robh@kernel.org>	2024-11-27 09:18:04 -06:00
Andrea della Porta	1a75e81baf	of/unittest: Add empty dma-ranges address translation tests Intermediate DT PCI nodes dynamically generated by enabling CONFIG_PCI_DYNAMIC_OF_NODES have empty dma-ranges property. PCI address specifiers have 3 cells and when dma-ranges is missing or empty, of_translate_one() is currently dropping the flag portion of PCI addresses which are subnodes of the aforementioned ones, failing the translation. Add new tests covering this case. With this test, we get 1 new failure which is fixed in subsequent commit: FAIL of_unittest_pci_empty_dma_ranges():1245 for_each_of_pci_range wrong CPU addr (ffffffffffffffff) on node /testcase-data/address-tests2/pcie@d1070000/pci@0,0/dev@0,0/local-bus@0 Signed-off-by: Andrea della Porta <andrea.porta@suse.com> Link: https://lore.kernel.org/r/08f8fee4fdc0379240fda2f4a0e6f11ebf9647a8.1732441813.git.andrea.porta@suse.com Signed-off-by: Rob Herring (Arm) <robh@kernel.org>	2024-11-27 09:18:04 -06:00
Marc Zyngier	6fc3a49f23	KVM: arm64: Fix S1/S2 combination when FWB==1 and S2 has Device memory type The G.a revision of the ARM ARM had it pretty clear that HCR_EL2.FWB had no influence on "The way that stage 1 memory types and attributes are combined with stage 2 Device type and attributes." (D5.5.5). However, this wording was lost in further revisions of the architecture. Restore the intended behaviour, which is to take the strongest memory type of S1 and S2 in this case, as if FWB was 0. The specification is being fixed accordingly. Fixes: `be04cebf3e` ("KVM: arm64: nv: Add emulation of AT S12E{0,1}{R,W}") Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20241125094756.609590-1-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-11-26 07:32:27 -08:00
James Clark	d798bc6f3c	arm64: Fix usage of new shifted MDCR_EL2 values Since the linked fixes commit, these masks are already shifted so remove the shifts. One issue that this fixes is SPE and TRBE not being available anymore: arm_spe_pmu arm,spe-v1: profiling buffer owned by higher exception level Fixes: `641630313e` ("arm64: sysreg: Migrate MDCR_EL2 definition to table") Signed-off-by: James Clark <james.clark@linaro.org> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20241122164636.2944180-1-james.clark@linaro.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>	2024-11-26 06:31:36 -08:00
Samuel Holland	bc7acc0bd0	of: property: fw_devlink: Do not use interrupt-parent directly commit `7f00be96f1` ("of: property: Add device link support for interrupt-parent, dmas and -gpio(s)") started adding device links for the interrupt-parent property. commit `4104ca776b` ("of: property: Add fw_devlink support for interrupts") and commit `f265f06af1` ("of: property: Fix fw_devlink handling of interrupts/interrupts-extended") later added full support for parsing the interrupts and interrupts-extended properties, which includes looking up the node of the parent domain. This made the handler for the interrupt-parent property redundant. In fact, creating device links based solely on interrupt-parent is problematic, because it can create spurious cycles. A node may have this property without itself being an interrupt controller or consumer. For example, this property is often present in the root node or a /soc bus node to set the default interrupt parent for child nodes. However, it is incorrect for the bus to depend on the interrupt controller, as some of the bus's children may not be interrupt consumers at all or may have a different interrupt parent. Resolving these spurious dependency cycles can cause an incorrect probe order for interrupt controller drivers. This was observed on a RISC-V system with both an APLIC and IMSIC under /soc, where interrupt-parent in /soc points to the APLIC, and the APLIC msi-parent points to the IMSIC. fw_devlink found three dependency cycles and attempted to probe the APLIC before the IMSIC. After applying this patch, there were no dependency cycles and the probe order was correct. Acked-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Fixes: `4104ca776b` ("of: property: Add fw_devlink support for interrupts") Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Link: https://lore.kernel.org/r/20241120233124.3649382-1-samuel.holland@sifive.com Signed-off-by: Rob Herring (Arm) <robh@kernel.org>	2024-11-25 08:24:17 -06:00
Lizhi Xu	eb09fbeb48	mac802154: check local interfaces before deleting sdata list syzkaller reported a corrupted list in ieee802154_if_remove. [1] Remove an IEEE 802.15.4 network interface after unregister an IEEE 802.15.4 hardware device from the system. CPU0 CPU1 ==== ==== genl_family_rcv_msg_doit ieee802154_unregister_hw ieee802154_del_iface ieee802154_remove_interfaces rdev_del_virtual_intf_deprecated list_del(&sdata->list) ieee802154_if_remove list_del_rcu The net device has been unregistered, since the rcu grace period, unregistration must be run before ieee802154_if_remove. To avoid this issue, add a check for local->interfaces before deleting sdata list. [1] kernel BUG at lib/list_debug.c:58! Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI CPU: 0 UID: 0 PID: 6277 Comm: syz-executor157 Not tainted 6.12.0-rc6-syzkaller-00005-g557329bcecc2 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 RIP: 0010:__list_del_entry_valid_or_report+0xf4/0x140 lib/list_debug.c:56 Code: e8 a1 7e 00 07 90 0f 0b 48 c7 c7 e0 37 60 8c 4c 89 fe e8 8f 7e 00 07 90 0f 0b 48 c7 c7 40 38 60 8c 4c 89 fe e8 7d 7e 00 07 90 <0f> 0b 48 c7 c7 a0 38 60 8c 4c 89 fe e8 6b 7e 00 07 90 0f 0b 48 c7 RSP: 0018:ffffc9000490f3d0 EFLAGS: 00010246 RAX: 000000000000004e RBX: dead000000000122 RCX: d211eee56bb28d00 RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000 RBP: ffff88805b278dd8 R08: ffffffff8174a12c R09: 1ffffffff2852f0d R10: dffffc0000000000 R11: fffffbfff2852f0e R12: dffffc0000000000 R13: dffffc0000000000 R14: dead000000000100 R15: ffff88805b278cc0 FS: 0000555572f94380(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000056262e4a3000 CR3: 0000000078496000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> __list_del_entry_valid include/linux/list.h:124 [inline] __list_del_entry include/linux/list.h:215 [inline] list_del_rcu include/linux/rculist.h:157 [inline] ieee802154_if_remove+0x86/0x1e0 net/mac802154/iface.c:687 rdev_del_virtual_intf_deprecated net/ieee802154/rdev-ops.h:24 [inline] ieee802154_del_iface+0x2c0/0x5c0 net/ieee802154/nl-phy.c:323 genl_family_rcv_msg_doit net/netlink/genetlink.c:1115 [inline] genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline] genl_rcv_msg+0xb14/0xec0 net/netlink/genetlink.c:1210 netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2551 genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219 netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline] netlink_unicast+0x7f6/0x990 net/netlink/af_netlink.c:1357 netlink_sendmsg+0x8e4/0xcb0 net/netlink/af_netlink.c:1901 sock_sendmsg_nosec net/socket.c:729 [inline] __sock_sendmsg+0x221/0x270 net/socket.c:744 ____sys_sendmsg+0x52a/0x7e0 net/socket.c:2607 ___sys_sendmsg net/socket.c:2661 [inline] __sys_sendmsg+0x292/0x380 net/socket.c:2690 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f Reported-and-tested-by: syzbot+985f827280dc3a6e7e92@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=985f827280dc3a6e7e92 Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com> Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/20241113095129.1457225-1-lizhi.xu@windriver.com Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>	2024-11-19 10:54:17 +01:00
Keisuke Nishimura	2c87309ea7	ieee802154: ca8210: Add missing check for kfifo_alloc() in ca8210_probe() ca8210_test_interface_init() returns the result of kfifo_alloc(), which can be non-zero in case of an error. The caller, ca8210_probe(), should check the return value and do error-handling if it fails. Fixes: `ded845a781` ("ieee802154: Add CA8210 IEEE 802.15.4 device driver") Signed-off-by: Keisuke Nishimura <keisuke.nishimura@inria.fr> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/20241029182712.318271-1-keisuke.nishimura@inria.fr Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>	2024-11-19 10:47:37 +01:00

1132 changed files with 16312 additions and 10027 deletions

5

.mailmap

View File

@@ -121,6 +121,8 @@ Ben Widawsky <bwidawsk@kernel.org> <benjamin.widawsky@intel.com>
 Benjamin Poirier <benjamin.poirier@gmail.com> <bpoirier@suse.de>
 Benjamin Tissoires <bentiss@kernel.org> <benjamin.tissoires@gmail.com>
 Benjamin Tissoires <bentiss@kernel.org> <benjamin.tissoires@redhat.com>
 Bingwu Zhang <xtex@aosc.io> <xtexchooser@duck.com>
 Bingwu Zhang <xtex@aosc.io> <xtex@xtexx.eu.org>
 Bjorn Andersson <andersson@kernel.org> <bjorn@kryo.se>
 Bjorn Andersson <andersson@kernel.org> <bjorn.andersson@linaro.org>
 Bjorn Andersson <andersson@kernel.org> <bjorn.andersson@sonymobile.com>
@@ -435,7 +437,7 @@ Martin Kepplinger <martink@posteo.de> <martin.kepplinger@ginzinger.com>
 Martin Kepplinger <martink@posteo.de> <martin.kepplinger@puri.sm>
 Martin Kepplinger <martink@posteo.de> <martin.kepplinger@theobroma-systems.com>
 Martyna Szapar-Mudlaw <martyna.szapar-mudlaw@linux.intel.com> <martyna.szapar-mudlaw@intel.com>
 Mathieu Othacehe <m.othacehe@gmail.com> <othacehe@gnu.org>
 Mathieu Othacehe <othacehe@gnu.org> <m.othacehe@gmail.com>
 Mat Martineau <martineau@kernel.org> <mathew.j.martineau@linux.intel.com>
 Mat Martineau <martineau@kernel.org> <mathewm@codeaurora.org>
 Matthew Wilcox <willy@infradead.org> <matthew.r.wilcox@intel.com>
@@ -735,6 +737,7 @@ Wolfram Sang <wsa@kernel.org> <w.sang@pengutronix.de>
 Wolfram Sang <wsa@kernel.org> <wsa@the-dreams.de>
 Yakir Yang <kuankuan.y@gmail.com> <ykk@rock-chips.com>
 Yanteng Si <si.yanteng@linux.dev> <siyanteng@loongson.cn>
 Ying Huang <huang.ying.caritas@gmail.com> <ying.huang@intel.com>
 Yusuke Goda <goda.yusuke@renesas.com>
 Zack Rusin <zack.rusin@broadcom.com> <zackr@vmware.com>
 Zhu Yanjun <zyjzyj2000@gmail.com> <yanjunz@nvidia.com>

12

CREDITS

View File

@@ -20,6 +20,10 @@ N: Thomas Abraham
 E: thomas.ab@samsung.com
 D: Samsung pin controller driver
 N: Jose Abreu
 E: jose.abreu@synopsys.com
 D: Synopsys DesignWare XPCS MDIO/PCS driver.
 N: Dragos Acostachioaie
 E: dragos@iname.com
 W: http://www.arbornet.org/~dragos
@@ -1428,6 +1432,10 @@ S: 8124 Constitution Apt. 7
 S: Sterling Heights, Michigan 48313
 S: USA
 N: Andy Gospodarek
 E: andy@greyhouse.net
 D: Maintenance and contributions to the network interface bonding driver.
 N: Wolfgang Grandegger
 E: wg@grandegger.com
 D: Controller Area Network (device drivers)
@@ -1812,6 +1820,10 @@ D: Author/maintainer of most DRM drivers (especially ATI, MGA)
 D: Core DRM templates, general DRM and 3D-related hacking
 S: No fixed address
 N: Woojung Huh
 E: woojung.huh@microchip.com
 D: Microchip LAN78XX USB Ethernet driver
 N: Kenn Humborg
 E: kenn@wombat.ie
 D: Mods to loop device to support sparse backing files

5

Documentation/admin-guide/kernel-parameters.txt

View File

@@ -4822,6 +4822,11 @@
 			       can be preempted anytime.  Tasks will also yield
 			       contended spinlocks (if the critical section isn't
 			       explicitly preempt disabled beyond the lock itself).
 			lazy - Scheduler controlled. Similar to full but instead
 			       of preempting the task immediately, the task gets
 			       one HZ tick time to yield itself before the
 			       preemption will be forced. One preemption is when the
 			       task returns to user space.
 	print-fatal-signals=
 			[KNL] debug: print fatal signals

									
										10

Documentation/admin-guide/laptops/thinkpad-acpi.rst
									
												View File
												
				@@ -445,8 +445,10 @@ event	code	Key		Notes

				0x1008	0x07	FN+F8		IBM: toggle screen expand

								Lenovo: configure UltraNav,

								or toggle screen expand.

								On newer platforms (2024+)

								replaced by 0x131f (see below)

								On 2024 platforms replaced by

								0x131f (see below) and on newer

								platforms (2025 +) keycode is

								replaced by 0x1401 (see below).

				0x1009	0x08	FN+F9		-

				@@ -506,9 +508,11 @@ event	code	Key		Notes

				0x1019	0x18	unknown

				0x131f	...	FN+F8	        Platform Mode change.

				0x131f	...	FN+F8		Platform Mode change (2024 systems).

								Implemented in driver.

				0x1401	...	FN+F8		Platform Mode change (2025 + systems).

								Implemented in driver.

				...	...	...

				0x1020	0x1F	unknown

									
										2

Documentation/admin-guide/mm/transhuge.rst
									
												View File
												
				@@ -436,7 +436,7 @@ AnonHugePmdMapped).

				The number of file transparent huge pages mapped to userspace is available

				by reading ShmemPmdMapped and ShmemHugePages fields in ``/proc/meminfo``.

				To identify what applications are mapping file transparent huge pages, it

				is necessary to read ``/proc/PID/smaps`` and count the FileHugeMapped fields

				is necessary to read ``/proc/PID/smaps`` and count the FilePmdMapped fields

				for each mapping.

				Note that reading the smaps file is expensive and reading it

									
										4

Documentation/admin-guide/pm/amd-pstate.rst
									
												View File
												
				@@ -251,9 +251,7 @@ performance supported in `AMD CPPC Performance Capability <perf_cap_>`_).

				In some ASICs, the highest CPPC performance is not the one in the ``_CPC``

				table, so we need to expose it to sysfs. If boost is not active, but

				still supported, this maximum frequency will be larger than the one in

				``cpuinfo``. On systems that support preferred core, the driver will have

				different values for some cores than others and this will reflect the values

				advertised by the platform at bootup.

				``cpuinfo``.

				This attribute is read-only.

				``amd_pstate_lowest_nonlinear_freq``

									
										10

Documentation/devicetree/bindings/crypto/fsl,sec-v4.0.yaml
									
												View File
												
				@@ -114,8 +114,9 @@ patternProperties:

				          table that specifies the PPID to LIODN mapping. Needed if the PAMU is

				          used.  Value is a 12 bit value where value is a LIODN ID for this JR.

				          This property is normally set by boot firmware.

				        $ref: /schemas/types.yaml#/definitions/uint32

				        maximum: 0xfff

				        $ref: /schemas/types.yaml#/definitions/uint32-array

				        items:

				          - maximum: 0xfff

				  '^rtic@[0-9a-f]+$':

				    type: object

				@@ -186,8 +187,9 @@ patternProperties:

				              Needed if the PAMU is used.  Value is a 12 bit value where value

				              is a LIODN ID for this JR. This property is normally set by boot

				              firmware.

				            $ref: /schemas/types.yaml#/definitions/uint32

				            maximum: 0xfff

				            $ref: /schemas/types.yaml#/definitions/uint32-array

				            items:

				              - maximum: 0xfff

				          fsl,rtic-region:

				            description:

									
										2

Documentation/devicetree/bindings/display/bridge/adi,adv7533.yaml
									
												View File
												
				@@ -90,7 +90,7 @@ properties:

				  adi,dsi-lanes:

				    description: Number of DSI data lanes connected to the DSI host.

				    $ref: /schemas/types.yaml#/definitions/uint32

				    enum: [ 1, 2, 3, 4 ]

				    enum: [ 2, 3, 4 ]

				  "#sound-dai-cells":

				    const: 0

									
										19

Documentation/devicetree/bindings/display/mediatek/mediatek,dp.yaml
									
												View File
												
				@@ -42,6 +42,9 @@ properties:

				  interrupts:

				    maxItems: 1

				  '#sound-dai-cells':

				    const: 0

				  ports:

				    $ref: /schemas/graph.yaml#/properties/ports

				    properties:

				@@ -85,7 +88,21 @@ required:

				  - ports

				  - max-linkrate-mhz

				additionalProperties: false

				allOf:

				  - $ref: /schemas/sound/dai-common.yaml#

				  - if:

				      not:

				        properties:

				          compatible:

				            contains:

				              enum:

				                - mediatek,mt8188-dp-tx

				                - mediatek,mt8195-dp-tx

				    then:

				      properties:

				        '#sound-dai-cells': false

				unevaluatedProperties: false

				examples:

				  - |

									
										1

Documentation/devicetree/bindings/iio/st,st-sensors.yaml
									
												View File
												
				@@ -65,6 +65,7 @@ properties:

				          - st,lsm9ds0-gyro

				      - description: STMicroelectronics Magnetometers

				        enum:

				          - st,iis2mdc

				          - st,lis2mdl

				          - st,lis3mdl-magn

				          - st,lsm303agr-magn

									
										2

Documentation/devicetree/bindings/mtd/partitions/fixed-partitions.yaml
									
												View File
												
				@@ -82,7 +82,7 @@ examples:

				        uimage@100000 {

				            reg = <0x0100000 0x200000>;

				            compress = "lzma";

				            compression = "lzma";

				        };

				    };

									
										2

Documentation/devicetree/bindings/net/pse-pd/pse-controller.yaml
									
												View File
												
				@@ -81,7 +81,7 @@ properties:

				              List of phandles, each pointing to the power supply for the

				              corresponding pairset named in 'pairset-names'. This property

				              aligns with IEEE 802.3-2022, Section 33.2.3 and 145.2.4.

				              PSE Pinout Alternatives (as per IEEE 802.3-2022 Table 145\u20133)

				              PSE Pinout Alternatives (as per IEEE 802.3-2022 Table 145-3)

				              |-----------|---------------|---------------|---------------|---------------|

				              | Conductor | Alternative A | Alternative A | Alternative B | Alternative B |

				              |           |    (MDI-X)    |     (MDI)     |      (X)      |      (S)      |

									
										7

Documentation/devicetree/bindings/phy/fsl,imx8mq-usb-phy.yaml
									
												View File
												
				@@ -113,11 +113,8 @@ allOf:

				          maxItems: 1

				  - if:

				      properties:

				        compatible:

				          contains:

				            enum:

				              - fsl,imx95-usb-phy

				      required:

				        - orientation-switch

				    then:

				      $ref: /schemas/usb/usb-switch.yaml#

									
										27

Documentation/devicetree/bindings/regulator/qcom,qca6390-pmu.yaml
									
												View File
												
				@@ -18,6 +18,7 @@ properties:

				  compatible:

				    enum:

				      - qcom,qca6390-pmu

				      - qcom,wcn6750-pmu

				      - qcom,wcn6855-pmu

				      - qcom,wcn7850-pmu

				@@ -27,6 +28,9 @@ properties:

				  vddaon-supply:

				    description: VDD_AON supply regulator handle

				  vddasd-supply:

				    description: VDD_ASD supply regulator handle

				  vdddig-supply:

				    description: VDD_DIG supply regulator handle

				@@ -42,6 +46,9 @@ properties:

				  vddio1p2-supply:

				    description: VDD_IO_1P2 supply regulator handle

				  vddrfa0p8-supply:

				    description: VDD_RFA_0P8 supply regulator handle

				  vddrfa0p95-supply:

				    description: VDD_RFA_0P95 supply regulator handle

				@@ -51,12 +58,18 @@ properties:

				  vddrfa1p3-supply:

				    description: VDD_RFA_1P3 supply regulator handle

				  vddrfa1p7-supply:

				    description: VDD_RFA_1P7 supply regulator handle

				  vddrfa1p8-supply:

				    description: VDD_RFA_1P8 supply regulator handle

				  vddrfa1p9-supply:

				    description: VDD_RFA_1P9 supply regulator handle

				  vddrfa2p2-supply:

				    description: VDD_RFA_2P2 supply regulator handle

				  vddpcie1p3-supply:

				    description: VDD_PCIE_1P3 supply regulator handle

				@@ -119,6 +132,20 @@ allOf:

				        - vddpcie1p3-supply

				        - vddpcie1p9-supply

				        - vddio-supply

				  - if:

				      properties:

				        compatible:

				          contains:

				            const: qcom,wcn6750-pmu

				    then:

				      required:

				        - vddaon-supply

				        - vddasd-supply

				        - vddpmu-supply

				        - vddrfa0p8-supply

				        - vddrfa1p2-supply

				        - vddrfa1p7-supply

				        - vddrfa2p2-supply

				  - if:

				      properties:

				        compatible:

									
										2

Documentation/devicetree/bindings/soc/fsl/fsl,qman-portal.yaml
									
												View File
												
				@@ -35,6 +35,7 @@ properties:

				  fsl,liodn:

				    $ref: /schemas/types.yaml#/definitions/uint32-array

				    maxItems: 2

				    description: See pamu.txt. Two LIODN(s). DQRR LIODN (DLIODN) and Frame LIODN

				      (FLIODN)

				@@ -69,6 +70,7 @@ patternProperties:

				    type: object

				    properties:

				      fsl,liodn:

				        $ref: /schemas/types.yaml#/definitions/uint32-array

				        description: See pamu.txt, PAMU property used for static LIODN assignment

				      fsl,iommu-parent:

									
										2

Documentation/devicetree/bindings/sound/realtek,rt5645.yaml
									
												View File
												
				@@ -51,7 +51,7 @@ properties:

				    description: Power supply for AVDD, providing 1.8V.

				  cpvdd-supply:

				    description: Power supply for CPVDD, providing 3.5V.

				    description: Power supply for CPVDD, providing 1.8V.

				  hp-detect-gpios:

				    description:

									
										850

Documentation/mm/process_addrs.rst
									
												View File
												
				@@ -3,3 +3,853 @@

				=================

				Process Addresses

				=================

				.. toctree::

				   :maxdepth: 3

				Userland memory ranges are tracked by the kernel via Virtual Memory Areas or

				'VMA's of type :c:struct:`!struct vm_area_struct`.

				Each VMA describes a virtually contiguous memory range with identical

				attributes, each described by a :c:struct:`!struct vm_area_struct`

				object. Userland access outside of VMAs is invalid except in the case where an

				adjacent stack VMA could be extended to contain the accessed address.

				All VMAs are contained within one and only one virtual address space, described

				by a :c:struct:`!struct mm_struct` object which is referenced by all tasks (that is,

				threads) which share the virtual address space. We refer to this as the

				:c:struct:`!mm`.

				Each mm object contains a maple tree data structure which describes all VMAs

				within the virtual address space.

				.. note:: An exception to this is the 'gate' VMA which is provided by

				          architectures which use :c:struct:`!vsyscall` and is a global static

				          object which does not belong to any specific mm.

				-------

				Locking

				-------

				The kernel is designed to be highly scalable against concurrent read operations

				on VMA **metadata** so a complicated set of locks are required to ensure memory

				corruption does not occur.

				.. note:: Locking VMAs for their metadata does not have any impact on the memory

				          they describe nor the page tables that map them.

				Terminology

				-----------

				* **mmap locks** - Each MM has a read/write semaphore :c:member:`!mmap_lock`

				  which locks at a process address space granularity which can be acquired via

				  :c:func:`!mmap_read_lock`, :c:func:`!mmap_write_lock` and variants.

				* **VMA locks** - The VMA lock is at VMA granularity (of course) which behaves

				  as a read/write semaphore in practice. A VMA read lock is obtained via

				  :c:func:`!lock_vma_under_rcu` (and unlocked via :c:func:`!vma_end_read`) and a

				  write lock via :c:func:`!vma_start_write` (all VMA write locks are unlocked

				  automatically when the mmap write lock is released). To take a VMA write lock

				  you **must** have already acquired an :c:func:`!mmap_write_lock`.

				* **rmap locks** - When trying to access VMAs through the reverse mapping via a

				  :c:struct:`!struct address_space` or :c:struct:`!struct anon_vma` object

				  (reachable from a folio via :c:member:`!folio->mapping`). VMAs must be stabilised via

				  :c:func:`!anon_vma_[try]lock_read` or :c:func:`!anon_vma_[try]lock_write` for

				  anonymous memory and :c:func:`!i_mmap_[try]lock_read` or

				  :c:func:`!i_mmap_[try]lock_write` for file-backed memory. We refer to these

				  locks as the reverse mapping locks, or 'rmap locks' for brevity.

				We discuss page table locks separately in the dedicated section below.

				The first thing **any** of these locks achieve is to **stabilise** the VMA

				within the MM tree. That is, guaranteeing that the VMA object will not be

				deleted from under you nor modified (except for some specific fields

				described below).

				Stabilising a VMA also keeps the address space described by it around.

				Lock usage

				----------

				If you want to **read** VMA metadata fields or just keep the VMA stable, you

				must do one of the following:

				* Obtain an mmap read lock at the MM granularity via :c:func:`!mmap_read_lock` (or a

				  suitable variant), unlocking it with a matching :c:func:`!mmap_read_unlock` when

				  you're done with the VMA, *or*

				* Try to obtain a VMA read lock via :c:func:`!lock_vma_under_rcu`. This tries to

				  acquire the lock atomically so might fail, in which case fall-back logic is

				  required to instead obtain an mmap read lock if this returns :c:macro:`!NULL`,

				  *or*

				* Acquire an rmap lock before traversing the locked interval tree (whether

				  anonymous or file-backed) to obtain the required VMA.

				If you want to **write** VMA metadata fields, then things vary depending on the

				field (we explore each VMA field in detail below). For the majority you must:

				* Obtain an mmap write lock at the MM granularity via :c:func:`!mmap_write_lock` (or a

				  suitable variant), unlocking it with a matching :c:func:`!mmap_write_unlock` when

				  you're done with the VMA, *and*

				* Obtain a VMA write lock via :c:func:`!vma_start_write` for each VMA you wish to

				  modify, which will be released automatically when :c:func:`!mmap_write_unlock` is

				  called.

				* If you want to be able to write to **any** field, you must also hide the VMA

				  from the reverse mapping by obtaining an **rmap write lock**.

				VMA locks are special in that you must obtain an mmap **write** lock **first**

				in order to obtain a VMA **write** lock. A VMA **read** lock however can be

				obtained without any other lock (:c:func:`!lock_vma_under_rcu` will acquire then

				release an RCU lock to lookup the VMA for you).

				This constrains the impact of writers on readers, as a writer can interact with

				one VMA while a reader interacts with another simultaneously.

				.. note:: The primary users of VMA read locks are page fault handlers, which

				          means that without a VMA write lock, page faults will run concurrent with

				          whatever you are doing.

				Examining all valid lock states:

				.. table::

				   ========= ======== ========= ======= ===== =========== ==========

				   mmap lock VMA lock rmap lock Stable? Read? Write most? Write all?

				   ========= ======== ========= ======= ===== =========== ==========

				   \-        \-       \-        N       N     N           N

				   \-        R        \-        Y       Y     N           N

				   \-        \-       R/W       Y       Y     N           N

				   R/W       \-/R     \-/R/W    Y       Y     N           N

				   W         W        \-/R      Y       Y     Y           N

				   W         W        W         Y       Y     Y           Y

				   ========= ======== ========= ======= ===== =========== ==========

				.. warning:: While it's possible to obtain a VMA lock while holding an mmap read lock,

				             attempting to do the reverse is invalid as it can result in deadlock - if

				             another task already holds an mmap write lock and attempts to acquire a VMA

				             write lock that will deadlock on the VMA read lock.

				All of these locks behave as read/write semaphores in practice, so you can

				obtain either a read or a write lock for each of these.

				.. note:: Generally speaking, a read/write semaphore is a class of lock which

				          permits concurrent readers. However a write lock can only be obtained

				          once all readers have left the critical region (and pending readers

				          made to wait).

				          This renders read locks on a read/write semaphore concurrent with other

				          readers and write locks exclusive against all others holding the semaphore.

				VMA fields

				^^^^^^^^^^

				We can subdivide :c:struct:`!struct vm_area_struct` fields by their purpose, which makes it

				easier to explore their locking characteristics:

				.. note:: We exclude VMA lock-specific fields here to avoid confusion, as these

				          are in effect an internal implementation detail.

				.. table:: Virtual layout fields

				   ===================== ======================================== ===========

				   Field                 Description                              Write lock

				   ===================== ======================================== ===========

				   :c:member:`!vm_start` Inclusive start virtual address of range mmap write,

				                         VMA describes.                           VMA write,

				                                                                  rmap write.

				   :c:member:`!vm_end`   Exclusive end virtual address of range   mmap write,

				                         VMA describes.                           VMA write,

				                                                                  rmap write.

				   :c:member:`!vm_pgoff` Describes the page offset into the file, mmap write,

				                         the original page offset within the      VMA write,

				                         virtual address space (prior to any      rmap write.

				                         :c:func:`!mremap`), or PFN if a PFN map

				                         and the architecture does not support

				                         :c:macro:`!CONFIG_ARCH_HAS_PTE_SPECIAL`.

				   ===================== ======================================== ===========

				These fields describes the size, start and end of the VMA, and as such cannot be

				modified without first being hidden from the reverse mapping since these fields

				are used to locate VMAs within the reverse mapping interval trees.

				.. table:: Core fields

				   ============================ ======================================== =========================

				   Field                        Description                              Write lock

				   ============================ ======================================== =========================

				   :c:member:`!vm_mm`           Containing mm_struct.                    None - written once on

				                                                                         initial map.

				   :c:member:`!vm_page_prot`    Architecture-specific page table         mmap write, VMA write.

				                                protection bits determined from VMA

				                                flags.

				   :c:member:`!vm_flags`        Read-only access to VMA flags describing N/A

				                                attributes of the VMA, in union with

				                                private writable

				                                :c:member:`!__vm_flags`.

				   :c:member:`!__vm_flags`      Private, writable access to VMA flags    mmap write, VMA write.

				                                field, updated by

				                                :c:func:`!vm_flags_*` functions.

				   :c:member:`!vm_file`         If the VMA is file-backed, points to a   None - written once on

				                                struct file object describing the        initial map.

				                                underlying file, if anonymous then

				                                :c:macro:`!NULL`.

				   :c:member:`!vm_ops`          If the VMA is file-backed, then either   None - Written once on

				                                the driver or file-system provides a     initial map by

				                                :c:struct:`!struct vm_operations_struct` :c:func:`!f_ops->mmap()`.

				                                object describing callbacks to be

				                                invoked on VMA lifetime events.

				   :c:member:`!vm_private_data` A :c:member:`!void *` field for          Handled by driver.

				                                driver-specific metadata.

				   ============================ ======================================== =========================

				These are the core fields which describe the MM the VMA belongs to and its attributes.

				.. table:: Config-specific fields

				   ================================= ===================== ======================================== ===============

				   Field                             Configuration option  Description                              Write lock

				   ================================= ===================== ======================================== ===============

				   :c:member:`!anon_name`            CONFIG_ANON_VMA_NAME  A field for storing a                    mmap write,

				                                                           :c:struct:`!struct anon_vma_name`        VMA write.

				                                                           object providing a name for anonymous

				                                                           mappings, or :c:macro:`!NULL` if none

				                                                           is set or the VMA is file-backed. The

											   underlying object is reference counted

											   and can be shared across multiple VMAs

											   for scalability.

				   :c:member:`!swap_readahead_info`  CONFIG_SWAP           Metadata used by the swap mechanism      mmap read,

				                                                           to perform readahead. This field is      swap-specific

				                                                           accessed atomically.                     lock.

				   :c:member:`!vm_policy`            CONFIG_NUMA           :c:type:`!mempolicy` object which        mmap write,

				                                                           describes the NUMA behaviour of the      VMA write.

				                                                           VMA. The underlying object is reference

											   counted.

				   :c:member:`!numab_state`          CONFIG_NUMA_BALANCING :c:type:`!vma_numab_state` object which  mmap read,

				                                                           describes the current state of           numab-specific

				                                                           NUMA balancing in relation to this VMA.  lock.

				                                                           Updated under mmap read lock by

				                                                           :c:func:`!task_numa_work`.

				   :c:member:`!vm_userfaultfd_ctx`   CONFIG_USERFAULTFD    Userfaultfd context wrapper object of    mmap write,

				                                                           type :c:type:`!vm_userfaultfd_ctx`,      VMA write.

				                                                           either of zero size if userfaultfd is

				                                                           disabled, or containing a pointer

				                                                           to an underlying

				                                                           :c:type:`!userfaultfd_ctx` object which

				                                                           describes userfaultfd metadata.

				   ================================= ===================== ======================================== ===============

				These fields are present or not depending on whether the relevant kernel

				configuration option is set.

				.. table:: Reverse mapping fields

				   =================================== ========================================= ============================

				   Field                               Description                               Write lock

				   =================================== ========================================= ============================

				   :c:member:`!shared.rb`              A red/black tree node used, if the        mmap write, VMA write,

				                                       mapping is file-backed, to place the VMA  i_mmap write.

				                                       in the

				                                       :c:member:`!struct address_space->i_mmap`

				                                       red/black interval tree.

				   :c:member:`!shared.rb_subtree_last` Metadata used for management of the       mmap write, VMA write,

				                                       interval tree if the VMA is file-backed.  i_mmap write.

				   :c:member:`!anon_vma_chain`         List of pointers to both forked/CoW’d     mmap read, anon_vma write.

				                                       :c:type:`!anon_vma` objects and

				                                       :c:member:`!vma->anon_vma` if it is

				                                       non-:c:macro:`!NULL`.

				   :c:member:`!anon_vma`               :c:type:`!anon_vma` object used by        When :c:macro:`NULL` and

				                                       anonymous folios mapped exclusively to    setting non-:c:macro:`NULL`:

				                                       this VMA. Initially set by                mmap read, page_table_lock.

				                                       :c:func:`!anon_vma_prepare` serialised

				                                       by the :c:macro:`!page_table_lock`. This  When non-:c:macro:`NULL` and

				                                       is set as soon as any page is faulted in. setting :c:macro:`NULL`:

				                                                                                 mmap write, VMA write,

				                                                                                 anon_vma write.

				   =================================== ========================================= ============================

				These fields are used to both place the VMA within the reverse mapping, and for

				anonymous mappings, to be able to access both related :c:struct:`!struct anon_vma` objects

				and the :c:struct:`!struct anon_vma` in which folios mapped exclusively to this VMA should

				reside.

				.. note:: If a file-backed mapping is mapped with :c:macro:`!MAP_PRIVATE` set

				          then it can be in both the :c:type:`!anon_vma` and :c:type:`!i_mmap`

				          trees at the same time, so all of these fields might be utilised at

				          once.

				Page tables

				-----------

				We won't speak exhaustively on the subject but broadly speaking, page tables map

				virtual addresses to physical ones through a series of page tables, each of

				which contain entries with physical addresses for the next page table level

				(along with flags), and at the leaf level the physical addresses of the

				underlying physical data pages or a special entry such as a swap entry,

				migration entry or other special marker. Offsets into these pages are provided

				by the virtual address itself.

				In Linux these are divided into five levels - PGD, P4D, PUD, PMD and PTE. Huge

				pages might eliminate one or two of these levels, but when this is the case we

				typically refer to the leaf level as the PTE level regardless.

				.. note:: In instances where the architecture supports fewer page tables than

					  five the kernel cleverly 'folds' page table levels, that is stubbing

					  out functions related to the skipped levels. This allows us to

					  conceptually act as if there were always five levels, even if the

					  compiler might, in practice, eliminate any code relating to missing

					  ones.

				There are four key operations typically performed on page tables:

				1. **Traversing** page tables - Simply reading page tables in order to traverse

				   them. This only requires that the VMA is kept stable, so a lock which

				   establishes this suffices for traversal (there are also lockless variants

				   which eliminate even this requirement, such as :c:func:`!gup_fast`).

				2. **Installing** page table mappings - Whether creating a new mapping or

				   modifying an existing one in such a way as to change its identity. This

				   requires that the VMA is kept stable via an mmap or VMA lock (explicitly not

				   rmap locks).

				3. **Zapping/unmapping** page table entries - This is what the kernel calls

				   clearing page table mappings at the leaf level only, whilst leaving all page

				   tables in place. This is a very common operation in the kernel performed on

				   file truncation, the :c:macro:`!MADV_DONTNEED` operation via

				   :c:func:`!madvise`, and others. This is performed by a number of functions

				   including :c:func:`!unmap_mapping_range` and :c:func:`!unmap_mapping_pages`.

				   The VMA need only be kept stable for this operation.

				4. **Freeing** page tables - When finally the kernel removes page tables from a

				   userland process (typically via :c:func:`!free_pgtables`) extreme care must

				   be taken to ensure this is done safely, as this logic finally frees all page

				   tables in the specified range, ignoring existing leaf entries (it assumes the

				   caller has both zapped the range and prevented any further faults or

				   modifications within it).

				.. note:: Modifying mappings for reclaim or migration is performed under rmap

				          lock as it, like zapping, does not fundamentally modify the identity

				          of what is being mapped.

				**Traversing** and **zapping** ranges can be performed holding any one of the

				locks described in the terminology section above - that is the mmap lock, the

				VMA lock or either of the reverse mapping locks.

				That is - as long as you keep the relevant VMA **stable** - you are good to go

				ahead and perform these operations on page tables (though internally, kernel

				operations that perform writes also acquire internal page table locks to

				serialise - see the page table implementation detail section for more details).

				When **installing** page table entries, the mmap or VMA lock must be held to

				keep the VMA stable. We explore why this is in the page table locking details

				section below.

				.. warning:: Page tables are normally only traversed in regions covered by VMAs.

				             If you want to traverse page tables in areas that might not be

				             covered by VMAs, heavier locking is required.

				             See :c:func:`!walk_page_range_novma` for details.

				**Freeing** page tables is an entirely internal memory management operation and

				has special requirements (see the page freeing section below for more details).

				.. warning:: When **freeing** page tables, it must not be possible for VMAs

				             containing the ranges those page tables map to be accessible via

				             the reverse mapping.

				             The :c:func:`!free_pgtables` function removes the relevant VMAs

				             from the reverse mappings, but no other VMAs can be permitted to be

				             accessible and span the specified range.

				Lock ordering

				-------------

				As we have multiple locks across the kernel which may or may not be taken at the

				same time as explicit mm or VMA locks, we have to be wary of lock inversion, and

				the **order** in which locks are acquired and released becomes very important.

				.. note:: Lock inversion occurs when two threads need to acquire multiple locks,

				   but in doing so inadvertently cause a mutual deadlock.

				   For example, consider thread 1 which holds lock A and tries to acquire lock B,

				   while thread 2 holds lock B and tries to acquire lock A.

				   Both threads are now deadlocked on each other. However, had they attempted to

				   acquire locks in the same order, one would have waited for the other to

				   complete its work and no deadlock would have occurred.

				The opening comment in :c:macro:`!mm/rmap.c` describes in detail the required

				ordering of locks within memory management code:

				.. code-block::

				  inode->i_rwsem        (while writing or truncating, not reading or faulting)

				    mm->mmap_lock

				      mapping->invalidate_lock (in filemap_fault)

				        folio_lock

				          hugetlbfs_i_mmap_rwsem_key (in huge_pmd_share, see hugetlbfs below)

				            vma_start_write

				              mapping->i_mmap_rwsem

				                anon_vma->rwsem

				                  mm->page_table_lock or pte_lock

				                    swap_lock (in swap_duplicate, swap_info_get)

				                      mmlist_lock (in mmput, drain_mmlist and others)

				                      mapping->private_lock (in block_dirty_folio)

				                          i_pages lock (widely used)

				                            lruvec->lru_lock (in folio_lruvec_lock_irq)

				                      inode->i_lock (in set_page_dirty's __mark_inode_dirty)

				                      bdi.wb->list_lock (in set_page_dirty's __mark_inode_dirty)

				                        sb_lock (within inode_lock in fs/fs-writeback.c)

				                        i_pages lock (widely used, in set_page_dirty,

				                                  in arch-dependent flush_dcache_mmap_lock,

				                                  within bdi.wb->list_lock in __sync_single_inode)

				There is also a file-system specific lock ordering comment located at the top of

				:c:macro:`!mm/filemap.c`:

				.. code-block::

				  ->i_mmap_rwsem                        (truncate_pagecache)

				    ->private_lock                      (__free_pte->block_dirty_folio)

				      ->swap_lock                       (exclusive_swap_page, others)

				        ->i_pages lock

				  ->i_rwsem

				    ->invalidate_lock                   (acquired by fs in truncate path)

				      ->i_mmap_rwsem                    (truncate->unmap_mapping_range)

				  ->mmap_lock

				    ->i_mmap_rwsem

				      ->page_table_lock or pte_lock     (various, mainly in memory.c)

				        ->i_pages lock                  (arch-dependent flush_dcache_mmap_lock)

				  ->mmap_lock

				    ->invalidate_lock                   (filemap_fault)

				      ->lock_page                       (filemap_fault, access_process_vm)

				  ->i_rwsem                             (generic_perform_write)

				    ->mmap_lock                         (fault_in_readable->do_page_fault)

				  bdi->wb.list_lock

				    sb_lock                             (fs/fs-writeback.c)

				    ->i_pages lock                      (__sync_single_inode)

				  ->i_mmap_rwsem

				    ->anon_vma.lock                     (vma_merge)

				  ->anon_vma.lock

				    ->page_table_lock or pte_lock       (anon_vma_prepare and various)

				  ->page_table_lock or pte_lock

				    ->swap_lock                         (try_to_unmap_one)

				    ->private_lock                      (try_to_unmap_one)

				    ->i_pages lock                      (try_to_unmap_one)

				    ->lruvec->lru_lock                  (follow_page_mask->mark_page_accessed)

				    ->lruvec->lru_lock                  (check_pte_range->folio_isolate_lru)

				    ->private_lock                      (folio_remove_rmap_pte->set_page_dirty)

				    ->i_pages lock                      (folio_remove_rmap_pte->set_page_dirty)

				    bdi.wb->list_lock                   (folio_remove_rmap_pte->set_page_dirty)

				    ->inode->i_lock                     (folio_remove_rmap_pte->set_page_dirty)

				    bdi.wb->list_lock                   (zap_pte_range->set_page_dirty)

				    ->inode->i_lock                     (zap_pte_range->set_page_dirty)

				    ->private_lock                      (zap_pte_range->block_dirty_folio)

				Please check the current state of these comments which may have changed since

				the time of writing of this document.

				------------------------------

				Locking Implementation Details

				------------------------------

				.. warning:: Locking rules for PTE-level page tables are very different from

				             locking rules for page tables at other levels.

				Page table locking details

				--------------------------

				In addition to the locks described in the terminology section above, we have

				additional locks dedicated to page tables:

				* **Higher level page table locks** - Higher level page tables, that is PGD, P4D

				  and PUD each make use of the process address space granularity

				  :c:member:`!mm->page_table_lock` lock when modified.

				* **Fine-grained page table locks** - PMDs and PTEs each have fine-grained locks

				  either kept within the folios describing the page tables or allocated

				  separated and pointed at by the folios if :c:macro:`!ALLOC_SPLIT_PTLOCKS` is

				  set. The PMD spin lock is obtained via :c:func:`!pmd_lock`, however PTEs are

				  mapped into higher memory (if a 32-bit system) and carefully locked via

				  :c:func:`!pte_offset_map_lock`.

				These locks represent the minimum required to interact with each page table

				level, but there are further requirements.

				Importantly, note that on a **traversal** of page tables, sometimes no such

				locks are taken. However, at the PTE level, at least concurrent page table

				deletion must be prevented (using RCU) and the page table must be mapped into

				high memory, see below.

				Whether care is taken on reading the page table entries depends on the

				architecture, see the section on atomicity below.

				Locking rules

				^^^^^^^^^^^^^

				We establish basic locking rules when interacting with page tables:

				* When changing a page table entry the page table lock for that page table

				  **must** be held, except if you can safely assume nobody can access the page

				  tables concurrently (such as on invocation of :c:func:`!free_pgtables`).

				* Reads from and writes to page table entries must be *appropriately*

				  atomic. See the section on atomicity below for details.

				* Populating previously empty entries requires that the mmap or VMA locks are

				  held (read or write), doing so with only rmap locks would be dangerous (see

				  the warning below).

				* As mentioned previously, zapping can be performed while simply keeping the VMA

				  stable, that is holding any one of the mmap, VMA or rmap locks.

				.. warning:: Populating previously empty entries is dangerous as, when unmapping

				             VMAs, :c:func:`!vms_clear_ptes` has a window of time between

				             zapping (via :c:func:`!unmap_vmas`) and freeing page tables (via

				             :c:func:`!free_pgtables`), where the VMA is still visible in the

				             rmap tree. :c:func:`!free_pgtables` assumes that the zap has

				             already been performed and removes PTEs unconditionally (along with

				             all other page tables in the freed range), so installing new PTE

				             entries could leak memory and also cause other unexpected and

				             dangerous behaviour.

				There are additional rules applicable when moving page tables, which we discuss

				in the section on this topic below.

				PTE-level page tables are different from page tables at other levels, and there

				are extra requirements for accessing them:

				* On 32-bit architectures, they may be in high memory (meaning they need to be

				  mapped into kernel memory to be accessible).

				* When empty, they can be unlinked and RCU-freed while holding an mmap lock or

				  rmap lock for reading in combination with the PTE and PMD page table locks.

				  In particular, this happens in :c:func:`!retract_page_tables` when handling

				  :c:macro:`!MADV_COLLAPSE`.

				  So accessing PTE-level page tables requires at least holding an RCU read lock;

				  but that only suffices for readers that can tolerate racing with concurrent

				  page table updates such that an empty PTE is observed (in a page table that

				  has actually already been detached and marked for RCU freeing) while another

				  new page table has been installed in the same location and filled with

				  entries. Writers normally need to take the PTE lock and revalidate that the

				  PMD entry still refers to the same PTE-level page table.

				To access PTE-level page tables, a helper like :c:func:`!pte_offset_map_lock` or

				:c:func:`!pte_offset_map` can be used depending on stability requirements.

				These map the page table into kernel memory if required, take the RCU lock, and

				depending on variant, may also look up or acquire the PTE lock.

				See the comment on :c:func:`!__pte_offset_map_lock`.

				Atomicity

				^^^^^^^^^

				Regardless of page table locks, the MMU hardware concurrently updates accessed

				and dirty bits (perhaps more, depending on architecture). Additionally, page

				table traversal operations in parallel (though holding the VMA stable) and

				functionality like GUP-fast locklessly traverses (that is reads) page tables,

				without even keeping the VMA stable at all.

				When performing a page table traversal and keeping the VMA stable, whether a

				read must be performed once and only once or not depends on the architecture

				(for instance x86-64 does not require any special precautions).

				If a write is being performed, or if a read informs whether a write takes place

				(on an installation of a page table entry say, for instance in

				:c:func:`!__pud_install`), special care must always be taken. In these cases we

				can never assume that page table locks give us entirely exclusive access, and

				must retrieve page table entries once and only once.

				If we are reading page table entries, then we need only ensure that the compiler

				does not rearrange our loads. This is achieved via :c:func:`!pXXp_get`

				functions - :c:func:`!pgdp_get`, :c:func:`!p4dp_get`, :c:func:`!pudp_get`,

				:c:func:`!pmdp_get`, and :c:func:`!ptep_get`.

				Each of these uses :c:func:`!READ_ONCE` to guarantee that the compiler reads

				the page table entry only once.

				However, if we wish to manipulate an existing page table entry and care about

				the previously stored data, we must go further and use an hardware atomic

				operation as, for example, in :c:func:`!ptep_get_and_clear`.

				Equally, operations that do not rely on the VMA being held stable, such as

				GUP-fast (see :c:func:`!gup_fast` and its various page table level handlers like

				:c:func:`!gup_fast_pte_range`), must very carefully interact with page table

				entries, using functions such as :c:func:`!ptep_get_lockless` and equivalent for

				higher level page table levels.

				Writes to page table entries must also be appropriately atomic, as established

				by :c:func:`!set_pXX` functions - :c:func:`!set_pgd`, :c:func:`!set_p4d`,

				:c:func:`!set_pud`, :c:func:`!set_pmd`, and :c:func:`!set_pte`.

				Equally functions which clear page table entries must be appropriately atomic,

				as in :c:func:`!pXX_clear` functions - :c:func:`!pgd_clear`,

				:c:func:`!p4d_clear`, :c:func:`!pud_clear`, :c:func:`!pmd_clear`, and

				:c:func:`!pte_clear`.

				Page table installation

				^^^^^^^^^^^^^^^^^^^^^^^

				Page table installation is performed with the VMA held stable explicitly by an

				mmap or VMA lock in read or write mode (see the warning in the locking rules

				section for details as to why).

				When allocating a P4D, PUD or PMD and setting the relevant entry in the above

				PGD, P4D or PUD, the :c:member:`!mm->page_table_lock` must be held. This is

				acquired in :c:func:`!__p4d_alloc`, :c:func:`!__pud_alloc` and

				:c:func:`!__pmd_alloc` respectively.

				.. note:: :c:func:`!__pmd_alloc` actually invokes :c:func:`!pud_lock` and

				   :c:func:`!pud_lockptr` in turn, however at the time of writing it ultimately

				   references the :c:member:`!mm->page_table_lock`.

				Allocating a PTE will either use the :c:member:`!mm->page_table_lock` or, if

				:c:macro:`!USE_SPLIT_PMD_PTLOCKS` is defined, a lock embedded in the PMD

				physical page metadata in the form of a :c:struct:`!struct ptdesc`, acquired by

				:c:func:`!pmd_ptdesc` called from :c:func:`!pmd_lock` and ultimately

				:c:func:`!__pte_alloc`.

				Finally, modifying the contents of the PTE requires special treatment, as the

				PTE page table lock must be acquired whenever we want stable and exclusive

				access to entries contained within a PTE, especially when we wish to modify

				them.

				This is performed via :c:func:`!pte_offset_map_lock` which carefully checks to

				ensure that the PTE hasn't changed from under us, ultimately invoking

				:c:func:`!pte_lockptr` to obtain a spin lock at PTE granularity contained within

				the :c:struct:`!struct ptdesc` associated with the physical PTE page. The lock

				must be released via :c:func:`!pte_unmap_unlock`.

				.. note:: There are some variants on this, such as

				   :c:func:`!pte_offset_map_rw_nolock` when we know we hold the PTE stable but

				   for brevity we do not explore this.  See the comment for

				   :c:func:`!__pte_offset_map_lock` for more details.

				When modifying data in ranges we typically only wish to allocate higher page

				tables as necessary, using these locks to avoid races or overwriting anything,

				and set/clear data at the PTE level as required (for instance when page faulting

				or zapping).

				A typical pattern taken when traversing page table entries to install a new

				mapping is to optimistically determine whether the page table entry in the table

				above is empty, if so, only then acquiring the page table lock and checking

				again to see if it was allocated underneath us.

				This allows for a traversal with page table locks only being taken when

				required. An example of this is :c:func:`!__pud_alloc`.

				At the leaf page table, that is the PTE, we can't entirely rely on this pattern

				as we have separate PMD and PTE locks and a THP collapse for instance might have

				eliminated the PMD entry as well as the PTE from under us.

				This is why :c:func:`!__pte_offset_map_lock` locklessly retrieves the PMD entry

				for the PTE, carefully checking it is as expected, before acquiring the

				PTE-specific lock, and then *again* checking that the PMD entry is as expected.

				If a THP collapse (or similar) were to occur then the lock on both pages would

				be acquired, so we can ensure this is prevented while the PTE lock is held.

				Installing entries this way ensures mutual exclusion on write.

				Page table freeing

				^^^^^^^^^^^^^^^^^^

				Tearing down page tables themselves is something that requires significant

				care. There must be no way that page tables designated for removal can be

				traversed or referenced by concurrent tasks.

				It is insufficient to simply hold an mmap write lock and VMA lock (which will

				prevent racing faults, and rmap operations), as a file-backed mapping can be

				truncated under the :c:struct:`!struct address_space->i_mmap_rwsem` alone.

				As a result, no VMA which can be accessed via the reverse mapping (either

				through the :c:struct:`!struct anon_vma->rb_root` or the :c:member:`!struct

				address_space->i_mmap` interval trees) can have its page tables torn down.

				The operation is typically performed via :c:func:`!free_pgtables`, which assumes

				either the mmap write lock has been taken (as specified by its

				:c:member:`!mm_wr_locked` parameter), or that the VMA is already unreachable.

				It carefully removes the VMA from all reverse mappings, however it's important

				that no new ones overlap these or any route remain to permit access to addresses

				within the range whose page tables are being torn down.

				Additionally, it assumes that a zap has already been performed and steps have

				been taken to ensure that no further page table entries can be installed between

				the zap and the invocation of :c:func:`!free_pgtables`.

				Since it is assumed that all such steps have been taken, page table entries are

				cleared without page table locks (in the :c:func:`!pgd_clear`, :c:func:`!p4d_clear`,

				:c:func:`!pud_clear`, and :c:func:`!pmd_clear` functions.

				.. note:: It is possible for leaf page tables to be torn down independent of

				          the page tables above it as is done by

				          :c:func:`!retract_page_tables`, which is performed under the i_mmap

				          read lock, PMD, and PTE page table locks, without this level of care.

				Page table moving

				^^^^^^^^^^^^^^^^^

				Some functions manipulate page table levels above PMD (that is PUD, P4D and PGD

				page tables). Most notable of these is :c:func:`!mremap`, which is capable of

				moving higher level page tables.

				In these instances, it is required that **all** locks are taken, that is

				the mmap lock, the VMA lock and the relevant rmap locks.

				You can observe this in the :c:func:`!mremap` implementation in the functions

				:c:func:`!take_rmap_locks` and :c:func:`!drop_rmap_locks` which perform the rmap

				side of lock acquisition, invoked ultimately by :c:func:`!move_page_tables`.

				VMA lock internals

				------------------

				Overview

				^^^^^^^^

				VMA read locking is entirely optimistic - if the lock is contended or a competing

				write has started, then we do not obtain a read lock.

				A VMA **read** lock is obtained by :c:func:`!lock_vma_under_rcu`, which first

				calls :c:func:`!rcu_read_lock` to ensure that the VMA is looked up in an RCU

				critical section, then attempts to VMA lock it via :c:func:`!vma_start_read`,

				before releasing the RCU lock via :c:func:`!rcu_read_unlock`.

				VMA read locks hold the read lock on the :c:member:`!vma->vm_lock` semaphore for

				their duration and the caller of :c:func:`!lock_vma_under_rcu` must release it

				via :c:func:`!vma_end_read`.

				VMA **write** locks are acquired via :c:func:`!vma_start_write` in instances where a

				VMA is about to be modified, unlike :c:func:`!vma_start_read` the lock is always

				acquired. An mmap write lock **must** be held for the duration of the VMA write

				lock, releasing or downgrading the mmap write lock also releases the VMA write

				lock so there is no :c:func:`!vma_end_write` function.

				Note that a semaphore write lock is not held across a VMA lock. Rather, a

				sequence number is used for serialisation, and the write semaphore is only

				acquired at the point of write lock to update this.

				This ensures the semantics we require - VMA write locks provide exclusive write

				access to the VMA.

				Implementation details

				^^^^^^^^^^^^^^^^^^^^^^

				The VMA lock mechanism is designed to be a lightweight means of avoiding the use

				of the heavily contended mmap lock. It is implemented using a combination of a

				read/write semaphore and sequence numbers belonging to the containing

				:c:struct:`!struct mm_struct` and the VMA.

				Read locks are acquired via :c:func:`!vma_start_read`, which is an optimistic

				operation, i.e. it tries to acquire a read lock but returns false if it is

				unable to do so. At the end of the read operation, :c:func:`!vma_end_read` is

				called to release the VMA read lock.

				Invoking :c:func:`!vma_start_read` requires that :c:func:`!rcu_read_lock` has

				been called first, establishing that we are in an RCU critical section upon VMA

				read lock acquisition. Once acquired, the RCU lock can be released as it is only

				required for lookup. This is abstracted by :c:func:`!lock_vma_under_rcu` which

				is the interface a user should use.

				Writing requires the mmap to be write-locked and the VMA lock to be acquired via

				:c:func:`!vma_start_write`, however the write lock is released by the termination or

				downgrade of the mmap write lock so no :c:func:`!vma_end_write` is required.

				All this is achieved by the use of per-mm and per-VMA sequence counts, which are

				used in order to reduce complexity, especially for operations which write-lock

				multiple VMAs at once.

				If the mm sequence count, :c:member:`!mm->mm_lock_seq` is equal to the VMA

				sequence count :c:member:`!vma->vm_lock_seq` then the VMA is write-locked. If

				they differ, then it is not.

				Each time the mmap write lock is released in :c:func:`!mmap_write_unlock` or

				:c:func:`!mmap_write_downgrade`, :c:func:`!vma_end_write_all` is invoked which

				also increments :c:member:`!mm->mm_lock_seq` via

				:c:func:`!mm_lock_seqcount_end`.

				This way, we ensure that, regardless of the VMA's sequence number, a write lock

				is never incorrectly indicated and that when we release an mmap write lock we

				efficiently release **all** VMA write locks contained within the mmap at the

				same time.

				Since the mmap write lock is exclusive against others who hold it, the automatic

				release of any VMA locks on its release makes sense, as you would never want to

				keep VMAs locked across entirely separate write operations. It also maintains

				correct lock ordering.

				Each time a VMA read lock is acquired, we acquire a read lock on the

				:c:member:`!vma->vm_lock` read/write semaphore and hold it, while checking that

				the sequence count of the VMA does not match that of the mm.

				If it does, the read lock fails. If it does not, we hold the lock, excluding

				writers, but permitting other readers, who will also obtain this lock under RCU.

				Importantly, maple tree operations performed in :c:func:`!lock_vma_under_rcu`

				are also RCU safe, so the whole read lock operation is guaranteed to function

				correctly.

				On the write side, we acquire a write lock on the :c:member:`!vma->vm_lock`

				read/write semaphore, before setting the VMA's sequence number under this lock,

				also simultaneously holding the mmap write lock.

				This way, if any read locks are in effect, :c:func:`!vma_start_write` will sleep

				until these are finished and mutual exclusion is achieved.

				After setting the VMA's sequence number, the lock is released, avoiding

				complexity with a long-term held write lock.

				This clever combination of a read/write semaphore and sequence count allows for

				fast RCU-based per-VMA lock acquisition (especially on page fault, though

				utilised elsewhere) with minimal complexity around lock ordering.

				mmap write lock downgrading

				---------------------------

				When an mmap write lock is held one has exclusive access to resources within the

				mmap (with the usual caveats about requiring VMA write locks to avoid races with

				tasks holding VMA read locks).

				It is then possible to **downgrade** from a write lock to a read lock via

				:c:func:`!mmap_write_downgrade` which, similar to :c:func:`!mmap_write_unlock`,

				implicitly terminates all VMA write locks via :c:func:`!vma_end_write_all`, but

				importantly does not relinquish the mmap lock while downgrading, therefore

				keeping the locked virtual address space stable.

				An interesting consequence of this is that downgraded locks are exclusive

				against any other task possessing a downgraded lock (since a racing task would

				have to acquire a write lock first to downgrade it, and the downgraded lock

				prevents a new write lock from being obtained until the original lock is

				released).

				For clarity, we map read (R)/downgraded write (D)/write (W) locks against one

				another showing which locks exclude the others:

				.. list-table:: Lock exclusivity

				   :widths: 5 5 5 5

				   :header-rows: 1

				   :stub-columns: 1

				   * -

				     - R

				     - D

				     - W

				   * - R

				     - N

				     - N

				     - Y

				   * - D

				     - N

				     - Y

				     - Y

				   * - W

				     - Y

				     - Y

				     - Y

				Here a Y indicates the locks in the matching row/column are mutually exclusive,

				and N indicates that they are not.

				Stack expansion

				---------------

				Stack expansion throws up additional complexities in that we cannot permit there

				to be racing page faults, as a result we invoke :c:func:`!vma_start_write` to

				prevent this in :c:func:`!expand_downwards` or :c:func:`!expand_upwards`.

									
										60

Documentation/netlink/specs/mptcp_pm.yaml
									
												View File
												
				@@ -22,65 +22,67 @@ definitions:

				      doc: unused event

				     -

				      name: created

				      doc:

				        token, family, saddr4 | saddr6, daddr4 | daddr6, sport, dport

				      doc: >-

				        A new MPTCP connection has been created. It is the good time to

				        allocate memory and send ADD_ADDR if needed. Depending on the

				        traffic-patterns it can take a long time until the

				        MPTCP_EVENT_ESTABLISHED is sent.

				        Attributes: token, family, saddr4 | saddr6, daddr4 | daddr6, sport,

				        dport, server-side.

				     -

				      name: established

				      doc:

				        token, family, saddr4 | saddr6, daddr4 | daddr6, sport, dport

				      doc: >-

				        A MPTCP connection is established (can start new subflows).

				        Attributes: token, family, saddr4 | saddr6, daddr4 | daddr6, sport,

				        dport, server-side.

				     -

				      name: closed

				      doc:

				        token

				      doc: >-

				        A MPTCP connection has stopped.

				        Attribute: token.

				     -

				      name: announced

				      value: 6

				      doc:

				        token, rem_id, family, daddr4 | daddr6 [, dport]

				      doc: >-

				        A new address has been announced by the peer.

				        Attributes: token, rem_id, family, daddr4 | daddr6 [, dport].

				     -

				      name: removed

				      doc:

				        token, rem_id

				      doc: >-

				        An address has been lost by the peer.

				        Attributes: token, rem_id.

				     -

				      name: sub-established

				      value: 10

				      doc:

				        token, family, loc_id, rem_id, saddr4 | saddr6, daddr4 | daddr6, sport,

				        dport, backup, if_idx [, error]

				      doc: >-

				        A new subflow has been established. 'error' should not be set.

				        Attributes: token, family, loc_id, rem_id, saddr4 | saddr6, daddr4 |

				        daddr6, sport, dport, backup, if_idx [, error].

				     -

				      name: sub-closed

				      doc:

				        token, family, loc_id, rem_id, saddr4 | saddr6, daddr4 | daddr6, sport,

				        dport, backup, if_idx [, error]

				      doc: >-

				        A subflow has been closed. An error (copy of sk_err) could be set if an

				        error has been detected for this subflow.

				        Attributes: token, family, loc_id, rem_id, saddr4 | saddr6, daddr4 |

				        daddr6, sport, dport, backup, if_idx [, error].

				     -

				      name: sub-priority

				      value: 13

				      doc:

				        token, family, loc_id, rem_id, saddr4 | saddr6, daddr4 | daddr6, sport,

				        dport, backup, if_idx [, error]

				      doc: >-

				        The priority of a subflow has changed. 'error' should not be set.

				        Attributes: token, family, loc_id, rem_id, saddr4 | saddr6, daddr4 |

				        daddr6, sport, dport, backup, if_idx [, error].

				     -

				      name: listener-created

				      value: 15

				      doc:

				        family, sport, saddr4 | saddr6

				      doc: >-

				        A new PM listener is created.

				        Attributes: family, sport, saddr4 | saddr6.

				     -

				      name: listener-closed

				      doc:

				        family, sport, saddr4 | saddr6

				      doc: >-

				        A PM listener is closed.

				        Attributes: family, sport, saddr4 | saddr6.

				attribute-sets:

				  -

				@@ -306,8 +308,8 @@ operations:

				         attributes:

				           - addr

				    -

				      name:  flush-addrs

				      doc: flush addresses

				      name: flush-addrs

				      doc: Flush addresses

				      attribute-set: endpoint

				      dont-validate: [ strict ]

				      flags: [ uns-admin-perm ]

				@@ -351,7 +353,7 @@ operations:

				            - addr-remote

				    -

				      name: announce

				      doc: announce new sf

				      doc: Announce new address

				      attribute-set: attr

				      dont-validate: [ strict ]

				      flags: [ uns-admin-perm ]

				@@ -362,7 +364,7 @@ operations:

				            - token

				    -

				      name: remove

				      doc: announce removal

				      doc: Announce removal

				      attribute-set: attr

				      dont-validate: [ strict ]

				      flags: [ uns-admin-perm ]

				@@ -373,7 +375,7 @@ operations:

				           - loc-id

				    -

				      name: subflow-create

				      doc: todo

				      doc: Create subflow

				      attribute-set: attr

				      dont-validate: [ strict ]

				      flags: [ uns-admin-perm ]

				@@ -385,7 +387,7 @@ operations:

				            - addr-remote

				    -

				      name: subflow-destroy

				      doc: todo

				      doc: Destroy subflow

				      attribute-set: attr

				      dont-validate: [ strict ]

				      flags: [ uns-admin-perm ]

									
										6

Documentation/networking/ip-sysctl.rst
									
												View File
												
				@@ -2170,6 +2170,12 @@ nexthop_compat_mode - BOOLEAN

					understands the new API, this sysctl can be disabled to achieve full

					performance benefits of the new API by disabling the nexthop expansion

					and extraneous notifications.

					Note that as a backward-compatible mode, dumping of modern features

					might be incomplete or wrong. For example, resilient groups will not be

					shown as such, but rather as just a list of next hops. Also weights that

					do not fit into 8 bits will show incorrectly.

					Default: true (backward compat mode)

				fib_notify_on_flag_change - INTEGER

									
										4

Documentation/power/runtime_pm.rst
									
												View File
												
				@@ -347,7 +347,9 @@ drivers/base/power/runtime.c and include/linux/pm_runtime.h:

				  `int pm_runtime_resume_and_get(struct device *dev);`

				    - run pm_runtime_resume(dev) and if successful, increment the device's

				      usage counter; return the result of pm_runtime_resume

				      usage counter; returns 0 on success (whether or not the device's

				      runtime PM status was already 'active') or the error code from

				      pm_runtime_resume() on failure.

				  `int pm_request_idle(struct device *dev);`

				    - submit a request to execute the subsystem-level idle callback for the

									
										3

Documentation/virt/kvm/api.rst
									
												View File
												
				@@ -1914,6 +1914,9 @@ No flags are specified so far, the corresponding field must be set to zero.

				  #define KVM_IRQ_ROUTING_HV_SINT 4

				  #define KVM_IRQ_ROUTING_XEN_EVTCHN 5

				On s390, adding a KVM_IRQ_ROUTING_S390_ADAPTER is rejected on ucontrol VMs with

				error -EINVAL.

				flags:

				- KVM_MSI_VALID_DEVID: used along with KVM_IRQ_ROUTING_MSI routing entry

									
										4

Documentation/virt/kvm/devices/s390_flic.rst
									
												View File
												
				@@ -58,11 +58,15 @@ Groups:

				    Enables async page faults for the guest. So in case of a major page fault

				    the host is allowed to handle this async and continues the guest.

				    -EINVAL is returned when called on the FLIC of a ucontrol VM.

				  KVM_DEV_FLIC_APF_DISABLE_WAIT

				    Disables async page faults for the guest and waits until already pending

				    async page faults are done. This is necessary to trigger a completion interrupt

				    for every init interrupt before migrating the interrupt list.

				    -EINVAL is returned when called on the FLIC of a ucontrol VM.

				  KVM_DEV_FLIC_ADAPTER_REGISTER

				    Register an I/O adapter interrupt source. Takes a kvm_s390_io_adapter

				    describing the adapter to register::

43

MAINTAINERS

View File

@@ -949,7 +949,6 @@ AMAZON ETHERNET DRIVERS
 M:	Shay Agroskin <shayagr@amazon.com>
 M:	Arthur Kiyanovski <akiyano@amazon.com>
 R:	David Arinzon <darinzon@amazon.com>
 R:	Noam Dagan <ndagan@amazon.com>
 R:	Saeed Bishara <saeedb@amazon.com>
 L:	netdev@vger.kernel.org
 S:	Supported
@@ -1797,7 +1796,6 @@ F:	include/uapi/linux/if_arcnet.h
 ARM AND ARM64 SoC SUB-ARCHITECTURES (COMMON PARTS)
 M:	Arnd Bergmann <arnd@arndb.de>
 M:	Olof Johansson <olof@lixom.net>
 L:	linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
 L:	soc@lists.linux.dev
 S:	Maintained
@@ -2691,7 +2689,6 @@ N:	at91
 N:	atmel
 ARM/Microchip Sparx5 SoC support
 M:	Lars Povlsen <lars.povlsen@microchip.com>
 M:	Steen Hegelund <Steen.Hegelund@microchip.com>
 M:	Daniel Machon <daniel.machon@microchip.com>
 M:	UNGLinuxDriver@microchip.com
@@ -3608,6 +3605,7 @@ F:	drivers/phy/qualcomm/phy-ath79-usb.c
 ATHEROS ATH GENERIC UTILITIES
 M:	Kalle Valo <kvalo@kernel.org>
 M:	Jeff Johnson <jjohnson@kernel.org>
 L:	linux-wireless@vger.kernel.org
 S:	Supported
 F:	drivers/net/wireless/ath/*
@@ -3893,7 +3891,7 @@ W:	http://www.baycom.org/~tom/ham/ham.html
 F:	drivers/net/hamradio/baycom*
 BCACHE (BLOCK LAYER CACHE)
 M:	Coly Li <colyli@suse.de>
 M:	Coly Li <colyli@kernel.org>
 M:	Kent Overstreet <kent.overstreet@linux.dev>
 L:	linux-bcache@vger.kernel.org
 S:	Maintained
@@ -4058,7 +4056,6 @@ F:	net/bluetooth/
 BONDING DRIVER
 M:	Jay Vosburgh <jv@jvosburgh.net>
 M:	Andy Gospodarek <andy@greyhouse.net>
 L:	netdev@vger.kernel.org
 S:	Maintained
 F:	Documentation/networking/bonding.rst
@@ -4131,7 +4128,6 @@ S:	Odd Fixes
 F:	drivers/net/ethernet/netronome/nfp/bpf/
 BPF JIT for POWERPC (32-BIT AND 64-BIT)
 M:	Michael Ellerman <mpe@ellerman.id.au>
 M:	Hari Bathini <hbathini@linux.ibm.com>
 M:	Christophe Leroy <christophe.leroy@csgroup.eu>
 R:	Naveen N Rao <naveen@kernel.org>
@@ -7347,7 +7343,7 @@ F:	drivers/gpu/drm/panel/panel-novatek-nt36672a.c
 DRM DRIVER FOR NVIDIA GEFORCE/QUADRO GPUS
 M:	Karol Herbst <kherbst@redhat.com>
 M:	Lyude Paul <lyude@redhat.com>
 M:	Danilo Krummrich <dakr@redhat.com>
 M:	Danilo Krummrich <dakr@kernel.org>
 L:	dri-devel@lists.freedesktop.org
 L:	nouveau@lists.freedesktop.org
 S:	Supported
@@ -8453,7 +8449,7 @@ F:	include/video/s1d13xxxfb.h
 EROFS FILE SYSTEM
 M:	Gao Xiang <xiang@kernel.org>
 M:	Chao Yu <chao@kernel.org>
 R:	Yue Hu <huyue2@coolpad.com>
 R:	Yue Hu <zbestahu@gmail.com>
 R:	Jeffle Xu <jefflexu@linux.alibaba.com>
 R:	Sandeep Dhavale <dhavale@google.com>
 L:	linux-erofs@lists.ozlabs.org
@@ -8924,7 +8920,7 @@ F:	include/linux/arm_ffa.h
 FIRMWARE LOADER (request_firmware)
 M:	Luis Chamberlain <mcgrof@kernel.org>
 M:	Russ Weight <russ.weight@linux.dev>
 M:	Danilo Krummrich <dakr@redhat.com>
 M:	Danilo Krummrich <dakr@kernel.org>
 L:	linux-kernel@vger.kernel.org
 S:	Maintained
 F:	Documentation/firmware_class/
@@ -12632,7 +12628,7 @@ F:	arch/mips/include/uapi/asm/kvm*
 F:	arch/mips/kvm/
 KERNEL VIRTUAL MACHINE FOR POWERPC (KVM/powerpc)
 M:	Michael Ellerman <mpe@ellerman.id.au>
 M:	Madhavan Srinivasan <maddy@linux.ibm.com>
 R:	Nicholas Piggin <npiggin@gmail.com>
 L:	linuxppc-dev@lists.ozlabs.org
 L:	kvm@vger.kernel.org
@@ -13211,11 +13207,11 @@ X:	drivers/macintosh/adb-iop.c
 X:	drivers/macintosh/via-macii.c
 LINUX FOR POWERPC (32-BIT AND 64-BIT)
 M:	Madhavan Srinivasan <maddy@linux.ibm.com>
 M:	Michael Ellerman <mpe@ellerman.id.au>
 R:	Nicholas Piggin <npiggin@gmail.com>
 R:	Christophe Leroy <christophe.leroy@csgroup.eu>
 R:	Naveen N Rao <naveen@kernel.org>
 M:	Madhavan Srinivasan <maddy@linux.ibm.com>
 L:	linuxppc-dev@lists.ozlabs.org
 S:	Supported
 W:	https://github.com/linuxppc/wiki/wiki
@@ -14566,7 +14562,6 @@ F:	drivers/dma/mediatek/
 MEDIATEK ETHERNET DRIVER
 M:	Felix Fietkau <nbd@nbd.name>
 M:	Sean Wang <sean.wang@mediatek.com>
 M:	Mark Lee <Mark-MC.Lee@mediatek.com>
 M:	Lorenzo Bianconi <lorenzo@kernel.org>
 L:	netdev@vger.kernel.org
 S:	Maintained
@@ -14756,7 +14751,7 @@ F:	drivers/memory/mtk-smi.c
 F:	include/soc/mediatek/smi.h
 MEDIATEK SWITCH DRIVER
 M:	Arınç ÜNAL <arinc.unal@arinc9.com>
 M:	Chester A. Unal <chester.a.unal@arinc9.com>
 M:	Daniel Golle <daniel@makrotopia.org>
 M:	DENG Qingfang <dqfext@gmail.com>
 M:	Sean Wang <sean.wang@mediatek.com>
@@ -15345,7 +15340,7 @@ M:	Daniel Machon <daniel.machon@microchip.com>
 M:	UNGLinuxDriver@microchip.com
 L:	netdev@vger.kernel.org
 S:	Maintained
 F:	drivers/net/ethernet/microchip/lan969x/*
 F:	drivers/net/ethernet/microchip/sparx5/lan969x/*
 MICROCHIP LCDFB DRIVER
 M:	Nicolas Ferre <nicolas.ferre@microchip.com>
@@ -16337,6 +16332,7 @@ F:	Documentation/networking/
 F:	Documentation/networking/net_cachelines/
 F:	Documentation/process/maintainer-netdev.rst
 F:	Documentation/userspace-api/netlink/
 F:	include/linux/ethtool.h
 F:	include/linux/framer/framer-provider.h
 F:	include/linux/framer/framer.h
 F:	include/linux/in.h
@@ -16351,6 +16347,7 @@ F:	include/linux/rtnetlink.h
 F:	include/linux/seq_file_net.h
 F:	include/linux/skbuff*
 F:	include/net/
 F:	include/uapi/linux/ethtool.h
 F:	include/uapi/linux/genetlink.h
 F:	include/uapi/linux/hsr_netlink.h
 F:	include/uapi/linux/in.h
@@ -18458,7 +18455,7 @@ F:	Documentation/devicetree/bindings/pinctrl/mediatek,mt8183-pinctrl.yaml
 F:	drivers/pinctrl/mediatek/
 PIN CONTROLLER - MEDIATEK MIPS
 M:	Arınç ÜNAL <arinc.unal@arinc9.com>
 M:	Chester A. Unal <chester.a.unal@arinc9.com>
 M:	Sergio Paracuellos <sergio.paracuellos@gmail.com>
 L:	linux-mediatek@lists.infradead.org (moderated for non-subscribers)
 L:	linux-mips@vger.kernel.org
@@ -19502,7 +19499,7 @@ S:	Maintained
 F:	arch/mips/ralink
 RALINK MT7621 MIPS ARCHITECTURE
 M:	Arınç ÜNAL <arinc.unal@arinc9.com>
 M:	Chester A. Unal <chester.a.unal@arinc9.com>
 M:	Sergio Paracuellos <sergio.paracuellos@gmail.com>
 L:	linux-mips@vger.kernel.org
 S:	Maintained
@@ -20905,6 +20902,8 @@ F:	kernel/sched/
 SCHEDULER - SCHED_EXT
 R:	Tejun Heo <tj@kernel.org>
 R:	David Vernet <void@manifault.com>
 R:	Andrea Righi <arighi@nvidia.com>
 R:	Changwoo Min <changwoo@igalia.com>
 L:	linux-kernel@vger.kernel.org
 S:	Maintained
 W:	https://github.com/sched-ext/scx
@@ -22499,11 +22498,8 @@ F:	Documentation/devicetree/bindings/phy/st,stm32mp25-combophy.yaml
 F:	drivers/phy/st/phy-stm32-combophy.c
 STMMAC ETHERNET DRIVER
 M:	Alexandre Torgue <alexandre.torgue@foss.st.com>
 M:	Jose Abreu <joabreu@synopsys.com>
 L:	netdev@vger.kernel.org
 S:	Supported
 W:	http://www.stlinux.com
 S:	Orphan
 F:	Documentation/networking/device_drivers/ethernet/stmicro/
 F:	drivers/net/ethernet/stmicro/stmmac/
@@ -22735,9 +22731,8 @@ S:	Supported
 F:	drivers/net/ethernet/synopsys/
 SYNOPSYS DESIGNWARE ETHERNET XPCS DRIVER
 M:	Jose Abreu <Jose.Abreu@synopsys.com>
 L:	netdev@vger.kernel.org
 S:	Supported
 S:	Orphan
 F:	drivers/net/pcs/pcs-xpcs.c
 F:	drivers/net/pcs/pcs-xpcs.h
 F:	include/linux/pcs/pcs-xpcs.h
@@ -23645,7 +23640,6 @@ F:	tools/testing/selftests/timers/
 TIPC NETWORK LAYER
 M:	Jon Maloy <jmaloy@redhat.com>
 M:	Ying Xue <ying.xue@windriver.com>
 L:	netdev@vger.kernel.org (core kernel code)
 L:	tipc-discussion@lists.sourceforge.net (user apps, general discussion)
 S:	Maintained
@@ -24251,7 +24245,8 @@ F:	Documentation/devicetree/bindings/usb/nxp,isp1760.yaml
 F:	drivers/usb/isp1760/*
 USB LAN78XX ETHERNET DRIVER
 M:	Woojung Huh <woojung.huh@microchip.com>
 M:	Thangaraj Samynathan <Thangaraj.S@microchip.com>
 M:	Rengarajan Sundararajan <Rengarajan.S@microchip.com>
 M:	UNGLinuxDriver@microchip.com
 L:	netdev@vger.kernel.org
 S:	Maintained

									
										2

Makefile
									
												View File
												
				@@ -2,7 +2,7 @@

				VERSION = 6

				PATCHLEVEL = 13

				SUBLEVEL = 0

				EXTRAVERSION = -rc2

				EXTRAVERSION = -rc7

				NAME = Baby Opossum Posse

				# *DOCUMENTATION*

5

arch/arc/Kconfig

View File

@@ -6,6 +6,7 @@
 config ARC
 	def_bool y
 	select ARC_TIMERS
 	select ARCH_HAS_CPU_CACHE_ALIASING
 	select ARCH_HAS_CACHE_LINE_SIZE
 	select ARCH_HAS_DEBUG_VM_PGTABLE
 	select ARCH_HAS_DMA_PREP_COHERENT
@@ -297,7 +298,6 @@ config ARC_PAGE_SIZE_16K
 config ARC_PAGE_SIZE_4K
 	bool "4KB"
 	select HAVE_PAGE_SIZE_4KB
 	depends on ARC_MMU_V3 || ARC_MMU_V4
 endchoice
@@ -474,7 +474,8 @@ config HIGHMEM
 config ARC_HAS_PAE40
 	bool "Support for the 40-bit Physical Address Extension"
 	depends on ISA_ARCV2
 	depends on ARC_MMU_V4
 	depends on !ARC_PAGE_SIZE_4K
 	select HIGHMEM
 	select PHYS_ADDR_T_64BIT
 	help

									
										2

arch/arc/Makefile
									
												View File
												
				@@ -6,7 +6,7 @@

				KBUILD_DEFCONFIG := haps_hs_smp_defconfig

				ifeq ($(CROSS_COMPILE),)

				CROSS_COMPILE := $(call cc-cross-prefix, arc-linux- arceb-linux-)

				CROSS_COMPILE := $(call cc-cross-prefix, arc-linux- arceb-linux- arc-linux-gnu-)

				endif

				cflags-y	+= -fno-common -pipe -fno-builtin -mmedium-calls -D__linux__

2

arch/arc/boot/dts/axc001.dtsi

View File

@@ -54,7 +54,7 @@
 				compatible = "snps,dw-apb-gpio-port";
 				gpio-controller;
 				#gpio-cells = <2>;
 				snps,nr-gpios = <30>;
 				ngpios = <30>;
 				reg = <0>;
 				interrupt-controller;
 				#interrupt-cells = <2>;

2

arch/arc/boot/dts/axc003.dtsi

View File

@@ -62,7 +62,7 @@
 				compatible = "snps,dw-apb-gpio-port";
 				gpio-controller;
 				#gpio-cells = <2>;
 				snps,nr-gpios = <30>;
 				ngpios = <30>;
 				reg = <0>;
 				interrupt-controller;
 				#interrupt-cells = <2>;

2

arch/arc/boot/dts/axc003_idu.dtsi

View File

@@ -69,7 +69,7 @@
 				compatible = "snps,dw-apb-gpio-port";
 				gpio-controller;
 				#gpio-cells = <2>;
 				snps,nr-gpios = <30>;
 				ngpios = <30>;
 				reg = <0>;
 				interrupt-controller;
 				#interrupt-cells = <2>;

12

arch/arc/boot/dts/axs10x_mb.dtsi

View File

@@ -250,7 +250,7 @@
 				compatible = "snps,dw-apb-gpio-port";
 				gpio-controller;
 				#gpio-cells = <2>;
 				snps,nr-gpios = <32>;
 				ngpios = <32>;
 				reg = <0>;
 			};
@@ -258,7 +258,7 @@
 				compatible = "snps,dw-apb-gpio-port";
 				gpio-controller;
 				#gpio-cells = <2>;
 				snps,nr-gpios = <8>;
 				ngpios = <8>;
 				reg = <1>;
 			};
@@ -266,7 +266,7 @@
 				compatible = "snps,dw-apb-gpio-port";
 				gpio-controller;
 				#gpio-cells = <2>;
 				snps,nr-gpios = <8>;
 				ngpios = <8>;
 				reg = <2>;
 			};
 		};
@@ -281,7 +281,7 @@
 				compatible = "snps,dw-apb-gpio-port";
 				gpio-controller;
 				#gpio-cells = <2>;
 				snps,nr-gpios = <30>;
 				ngpios = <30>;
 				reg = <0>;
 			};
@@ -289,7 +289,7 @@
 				compatible = "snps,dw-apb-gpio-port";
 				gpio-controller;
 				#gpio-cells = <2>;
 				snps,nr-gpios = <10>;
 				ngpios = <10>;
 				reg = <1>;
 			};
@@ -297,7 +297,7 @@
 				compatible = "snps,dw-apb-gpio-port";
 				gpio-controller;
 				#gpio-cells = <2>;
 				snps,nr-gpios = <8>;
 				ngpios = <8>;
 				reg = <2>;
 			};
 		};

2

arch/arc/boot/dts/hsdk.dts

View File

@@ -308,7 +308,7 @@
 				compatible = "snps,dw-apb-gpio-port";
 				gpio-controller;
 				#gpio-cells = <2>;
 				snps,nr-gpios = <24>;
 				ngpios = <24>;
 				reg = <0>;
 			};
 		};

									
										2

arch/arc/include/asm/arcregs.h
									
												View File
												
				@@ -146,7 +146,7 @@

				#ifndef __ASSEMBLY__

				#include <soc/arc/aux.h>

				#include <soc/arc/arc_aux.h>

				/* Helpers */

				#define TO_KB(bytes)		((bytes) >> 10)

									
										8

arch/arc/include/asm/cachetype.h
									
										Normal file
									
												View File
												
				@@ -0,0 +1,8 @@

				/* SPDX-License-Identifier: GPL-2.0 */

				#ifndef __ASM_ARC_CACHETYPE_H

				#define __ASM_ARC_CACHETYPE_H

				#define cpu_dcache_is_aliasing()	false

				#define cpu_icache_is_aliasing()	true

				#endif

									
										2

arch/arc/include/asm/cmpxchg.h
									
												View File
												
				@@ -48,7 +48,7 @@

													\

					switch(sizeof((_p_))) {						\

					case 1:								\

						_prev_ = (__typeof__(*(ptr)))cmpxchg_emu_u8((volatile u8 *)_p_, (uintptr_t)_o_, (uintptr_t)_n_);	\

						_prev_ = (__typeof__(*(ptr)))cmpxchg_emu_u8((volatile u8 *__force)_p_, (uintptr_t)_o_, (uintptr_t)_n_);	\

						break;							\

					case 4:								\

						_prev_ = __cmpxchg(_p_, _o_, _n_);			\

									
										2

arch/arc/include/asm/mmu-arcv2.h
									
												View File
												
				@@ -9,7 +9,7 @@

				#ifndef _ASM_ARC_MMU_ARCV2_H

				#define _ASM_ARC_MMU_ARCV2_H

				#include <soc/arc/aux.h>

				#include <soc/arc/arc_aux.h>

				/*

				 * TLB Management regs

									
										2

arch/arc/net/bpf_jit_arcv2.c
									
												View File
												
				@@ -2916,7 +2916,7 @@ bool check_jmp_32(u32 curr_off, u32 targ_off, u8 cond)

					addendum = (cond == ARC_CC_AL) ? 0 : INSN_len_normal;

					disp = get_displacement(curr_off + addendum, targ_off);

					if (ARC_CC_AL)

					if (cond == ARC_CC_AL)

						return is_valid_far_disp(disp);

					else

						return is_valid_near_disp(disp);

2

arch/arm/boot/dts/nxp/imx/imxrt1050.dtsi

View File

@@ -87,7 +87,7 @@
 			reg = <0x402c0000 0x4000>;
 			interrupts = <110>;
 			clocks = <&clks IMXRT1050_CLK_IPG_PDOF>,
 				<&clks IMXRT1050_CLK_OSC>,
 				<&clks IMXRT1050_CLK_AHB_PODF>,
 				<&clks IMXRT1050_CLK_USDHC1>;
 			clock-names = "ipg", "ahb", "per";
 			bus-width = <4>;

1

arch/arm/configs/imx_v6_v7_defconfig

View File

@@ -323,6 +323,7 @@ CONFIG_SND_SOC_IMX_SGTL5000=y
 CONFIG_SND_SOC_FSL_ASOC_CARD=y
 CONFIG_SND_SOC_AC97_CODEC=y
 CONFIG_SND_SOC_CS42XX8_I2C=y
 CONFIG_SND_SOC_SPDIF=y
 CONFIG_SND_SOC_TLV320AIC3X_I2C=y
 CONFIG_SND_SOC_WM8960=y
 CONFIG_SND_SOC_WM8962=y

1

arch/arm/mach-imx/Kconfig

View File

@@ -6,6 +6,7 @@ menuconfig ARCH_MXC
 	select CLKSRC_IMX_GPT
 	select GENERIC_IRQ_CHIP
 	select GPIOLIB
 	select PINCTRL
 	select PM_OPP if PM
 	select SOC_BUS
 	select SRAM

2

arch/arm64/boot/dts/arm/fvp-base-revc.dts

View File

@@ -233,7 +233,7 @@
 		#interrupt-cells = <0x1>;
 		compatible = "pci-host-ecam-generic";
 		device_type = "pci";
 		bus-range = <0x0 0x1>;
 		bus-range = <0x0 0xff>;
 		reg = <0x0 0x40000000 0x0 0x10000000>;
 		ranges = <0x2000000 0x0 0x50000000 0x0 0x50000000 0x0 0x10000000>;
 		interrupt-map = <0 0 0 1 &gic 0 0 GIC_SPI 168 IRQ_TYPE_LEVEL_HIGH>,

8

arch/arm64/boot/dts/broadcom/bcm2712.dtsi

View File

@@ -67,7 +67,7 @@
 			l2_cache_l0: l2-cache-l0 {
 				compatible = "cache";
 				cache-size = <0x80000>;
 				cache-line-size = <128>;
 				cache-line-size = <64>;
 				cache-sets = <1024>; //512KiB(size)/64(line-size)=8192ways/8-way set
 				cache-level = <2>;
 				cache-unified;
@@ -91,7 +91,7 @@
 			l2_cache_l1: l2-cache-l1 {
 				compatible = "cache";
 				cache-size = <0x80000>;
 				cache-line-size = <128>;
 				cache-line-size = <64>;
 				cache-sets = <1024>; //512KiB(size)/64(line-size)=8192ways/8-way set
 				cache-level = <2>;
 				cache-unified;
@@ -115,7 +115,7 @@
 			l2_cache_l2: l2-cache-l2 {
 				compatible = "cache";
 				cache-size = <0x80000>;
 				cache-line-size = <128>;
 				cache-line-size = <64>;
 				cache-sets = <1024>; //512KiB(size)/64(line-size)=8192ways/8-way set
 				cache-level = <2>;
 				cache-unified;
@@ -139,7 +139,7 @@
 			l2_cache_l3: l2-cache-l3 {
 				compatible = "cache";
 				cache-size = <0x80000>;
 				cache-line-size = <128>;
 				cache-line-size = <64>;
 				cache-sets = <1024>; //512KiB(size)/64(line-size)=8192ways/8-way set
 				cache-level = <2>;
 				cache-unified;

2

arch/arm64/boot/dts/freescale/imx8-ss-audio.dtsi

View File

@@ -165,7 +165,7 @@ audio_subsys: bus@59000000 {
 	};
 	esai0: esai@59010000 {
 		compatible = "fsl,imx8qm-esai";
 		compatible = "fsl,imx8qm-esai", "fsl,imx6ull-esai";
 		reg = <0x59010000 0x10000>;
 		interrupts = <GIC_SPI 409 IRQ_TYPE_LEVEL_HIGH>;
 		clocks = <&esai0_lpcg IMX_LPCG_CLK_4>,

2

arch/arm64/boot/dts/freescale/imx8qm-ss-audio.dtsi

View File

@@ -134,7 +134,7 @@
 	};
 	esai1: esai@59810000 {
 		compatible = "fsl,imx8qm-esai";
 		compatible = "fsl,imx8qm-esai", "fsl,imx6ull-esai";
 		reg = <0x59810000 0x10000>;
 		interrupts = <GIC_SPI 411 IRQ_TYPE_LEVEL_HIGH>;
 		clocks = <&esai1_lpcg IMX_LPCG_CLK_0>,

2

arch/arm64/boot/dts/freescale/imx95.dtsi

View File

@@ -1673,7 +1673,7 @@
 		netcmix_blk_ctrl: syscon@4c810000 {
 			compatible = "nxp,imx95-netcmix-blk-ctrl", "syscon";
 			reg = <0x0 0x4c810000 0x0 0x10000>;
 			reg = <0x0 0x4c810000 0x0 0x8>;
 			#clock-cells = <1>;
 			clocks = <&scmi_clk IMX95_CLK_BUSNETCMIX>;
 			assigned-clocks = <&scmi_clk IMX95_CLK_BUSNETCMIX>;

5

arch/arm64/boot/dts/qcom/sa8775p.dtsi

View File

@@ -2440,6 +2440,7 @@
 			qcom,cmb-element-bits = <32>;
 			qcom,cmb-msrs-num = <32>;
 			status = "disabled";
 			out-ports {
 				port {
@@ -6092,7 +6093,7 @@
 		      <0x0 0x40000000 0x0 0xf20>,
 		      <0x0 0x40000f20 0x0 0xa8>,
 		      <0x0 0x40001000 0x0 0x4000>,
 		      <0x0 0x40200000 0x0 0x100000>,
 		      <0x0 0x40200000 0x0 0x1fe00000>,
 		      <0x0 0x01c03000 0x0 0x1000>,
 		      <0x0 0x40005000 0x0 0x2000>;
 		reg-names = "parf", "dbi", "elbi", "atu", "addr_space",
@@ -6250,7 +6251,7 @@
 		      <0x0 0x60000000 0x0 0xf20>,
 		      <0x0 0x60000f20 0x0 0xa8>,
 		      <0x0 0x60001000 0x0 0x4000>,
 		      <0x0 0x60200000 0x0 0x100000>,
 		      <0x0 0x60200000 0x0 0x1fe00000>,
 		      <0x0 0x01c13000 0x0 0x1000>,
 		      <0x0 0x60005000 0x0 0x2000>;
 		reg-names = "parf", "dbi", "elbi", "atu", "addr_space",

8

arch/arm64/boot/dts/qcom/x1e78100-lenovo-thinkpad-t14s.dts

View File

@@ -773,6 +773,10 @@
 	status = "okay";
 };
 &usb_1_ss0_dwc3 {
 	dr_mode = "host";
 };
 &usb_1_ss0_dwc3_hs {
 	remote-endpoint = <&pmic_glink_ss0_hs_in>;
 };
@@ -801,6 +805,10 @@
 	status = "okay";
 };
 &usb_1_ss1_dwc3 {
 	dr_mode = "host";
 };
 &usb_1_ss1_dwc3_hs {
 	remote-endpoint = <&pmic_glink_ss1_hs_in>;
 };

12

arch/arm64/boot/dts/qcom/x1e80100-crd.dts

View File

@@ -1197,6 +1197,10 @@
 	status = "okay";
 };
 &usb_1_ss0_dwc3 {
 	dr_mode = "host";
 };
 &usb_1_ss0_dwc3_hs {
 	remote-endpoint = <&pmic_glink_ss0_hs_in>;
 };
@@ -1225,6 +1229,10 @@
 	status = "okay";
 };
 &usb_1_ss1_dwc3 {
 	dr_mode = "host";
 };
 &usb_1_ss1_dwc3_hs {
 	remote-endpoint = <&pmic_glink_ss1_hs_in>;
 };
@@ -1253,6 +1261,10 @@
 	status = "okay";
 };
 &usb_1_ss2_dwc3 {
 	dr_mode = "host";
 };
 &usb_1_ss2_dwc3_hs {
 	remote-endpoint = <&pmic_glink_ss2_hs_in>;
 };

8

arch/arm64/boot/dts/qcom/x1e80100.dtsi

View File

@@ -2924,7 +2924,7 @@
 			#address-cells = <3>;
 			#size-cells = <2>;
 			ranges = <0x01000000 0x0 0x00000000 0x0 0x70200000 0x0 0x100000>,
 				 <0x02000000 0x0 0x70300000 0x0 0x70300000 0x0 0x1d00000>;
 				 <0x02000000 0x0 0x70300000 0x0 0x70300000 0x0 0x3d00000>;
 			bus-range = <0x00 0xff>;
 			dma-coherent;
@@ -4066,8 +4066,6 @@
 				dma-coherent;
 				usb-role-switch;
 				ports {
 					#address-cells = <1>;
 					#size-cells = <0>;
@@ -4321,8 +4319,6 @@
 				dma-coherent;
 				usb-role-switch;
 				ports {
 					#address-cells = <1>;
 					#size-cells = <0>;
@@ -4421,8 +4417,6 @@
 				dma-coherent;
 				usb-role-switch;
 				ports {
 					#address-cells = <1>;
 					#size-cells = <0>;

1

arch/arm64/boot/dts/rockchip/rk3328.dtsi

View File

@@ -333,6 +333,7 @@
 			power-domain@RK3328_PD_HEVC {
 				reg = <RK3328_PD_HEVC>;
 				clocks = <&cru SCLK_VENC_CORE>;
 				#power-domain-cells = <0>;
 			};
 			power-domain@RK3328_PD_VIDEO {

1

arch/arm64/boot/dts/rockchip/rk3568.dtsi

View File

@@ -350,6 +350,7 @@
 		assigned-clocks = <&pmucru CLK_PCIEPHY0_REF>;
 		assigned-clock-rates = <100000000>;
 		resets = <&cru SRST_PIPEPHY0>;
 		reset-names = "phy";
 		rockchip,pipe-grf = <&pipegrf>;
 		rockchip,pipe-phy-grf = <&pipe_phy_grf0>;
 		#phy-cells = <1>;

2

arch/arm64/boot/dts/rockchip/rk356x-base.dtsi

View File

@@ -1681,6 +1681,7 @@
 		assigned-clocks = <&pmucru CLK_PCIEPHY1_REF>;
 		assigned-clock-rates = <100000000>;
 		resets = <&cru SRST_PIPEPHY1>;
 		reset-names = "phy";
 		rockchip,pipe-grf = <&pipegrf>;
 		rockchip,pipe-phy-grf = <&pipe_phy_grf1>;
 		#phy-cells = <1>;
@@ -1697,6 +1698,7 @@
 		assigned-clocks = <&pmucru CLK_PCIEPHY2_REF>;
 		assigned-clock-rates = <100000000>;
 		resets = <&cru SRST_PIPEPHY2>;
 		reset-names = "phy";
 		rockchip,pipe-grf = <&pipegrf>;
 		rockchip,pipe-phy-grf = <&pipe_phy_grf2>;
 		#phy-cells = <1>;

2

arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts

View File

@@ -72,7 +72,7 @@
 	rfkill {
 		compatible = "rfkill-gpio";
 		label = "rfkill-pcie-wlan";
 		label = "rfkill-m2-wlan";
 		radio-type = "wlan";
 		shutdown-gpios = <&gpio4 RK_PA2 GPIO_ACTIVE_HIGH>;
 	};

1

arch/arm64/boot/dts/rockchip/rk3588s-nanopi-r6.dtsi

View File

@@ -434,6 +434,7 @@
 &sdmmc {
 	bus-width = <4>;
 	cap-sd-highspeed;
 	cd-gpios = <&gpio0 RK_PA4 GPIO_ACTIVE_LOW>;
 	disable-wp;
 	max-frequency = <150000000>;
 	no-mmc;

									
										4

arch/arm64/include/asm/el2_setup.h
									
												View File
												
				@@ -87,7 +87,7 @@

						      1 << PMSCR_EL2_PA_SHIFT)

					msr_s	SYS_PMSCR_EL2, x0		// addresses and physical counter

				.Lskip_spe_el2_\@:

					mov	x0, #(MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT)

					mov	x0, #MDCR_EL2_E2PB_MASK

					orr	x2, x2, x0			// If we don't have VHE, then

										// use EL1&0 translation.

				@@ -100,7 +100,7 @@

					and	x0, x0, TRBIDR_EL1_P

					cbnz	x0, .Lskip_trace_\@		// If TRBE is available at EL2

					mov	x0, #(MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT)

					mov	x0, #MDCR_EL2_E2TB_MASK

					orr	x2, x2, x0			// allow the EL1&0 translation

										// to own it.

									
										4

arch/arm64/kernel/hyp-stub.S
									
												View File
												
				@@ -114,8 +114,8 @@ SYM_CODE_START_LOCAL(__finalise_el2)

					// Use EL2 translations for SPE & TRBE and disable access from EL1

					mrs	x0, mdcr_el2

					bic	x0, x0, #(MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT)

					bic	x0, x0, #(MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT)

					bic	x0, x0, #MDCR_EL2_E2PB_MASK

					bic	x0, x0, #MDCR_EL2_E2TB_MASK

					msr	mdcr_el2, x0

					// Transfer the MM state from EL1 to EL2

									
										83

arch/arm64/kernel/signal.c
									
												View File
												
				@@ -36,15 +36,8 @@

				#include <asm/traps.h>

				#include <asm/vdso.h>

				#ifdef CONFIG_ARM64_GCS

				#define GCS_SIGNAL_CAP(addr) (((unsigned long)addr) & GCS_CAP_ADDR_MASK)

				static bool gcs_signal_cap_valid(u64 addr, u64 val)

				{

					return val == GCS_SIGNAL_CAP(addr);

				}

				#endif

				/*

				 * Do a signal return; undo the signal stack. These are aligned to 128-bit.

				 */

				@@ -1062,8 +1055,7 @@ static int restore_sigframe(struct pt_regs *regs,

				#ifdef CONFIG_ARM64_GCS

				static int gcs_restore_signal(void)

				{

					unsigned long __user *gcspr_el0;

					u64 cap;

					u64 gcspr_el0, cap;

					int ret;

					if (!system_supports_gcs())

				@@ -1072,7 +1064,7 @@ static int gcs_restore_signal(void)

					if (!(current->thread.gcs_el0_mode & PR_SHADOW_STACK_ENABLE))

						return 0;

					gcspr_el0 = (unsigned long __user *)read_sysreg_s(SYS_GCSPR_EL0);

					gcspr_el0 = read_sysreg_s(SYS_GCSPR_EL0);

					/*

					 * Ensure that any changes to the GCS done via GCS operations

				@@ -1087,22 +1079,23 @@ static int gcs_restore_signal(void)

					 * then faults will be generated on GCS operations - the main

					 * concern is to protect GCS pages.

					 */

					ret = copy_from_user(&cap, gcspr_el0, sizeof(cap));

					ret = copy_from_user(&cap, (unsigned long __user *)gcspr_el0,

							     sizeof(cap));

					if (ret)

						return -EFAULT;

					/*

					 * Check that the cap is the actual GCS before replacing it.

					 */

					if (!gcs_signal_cap_valid((u64)gcspr_el0, cap))

					if (cap != GCS_SIGNAL_CAP(gcspr_el0))

						return -EINVAL;

					/* Invalidate the token to prevent reuse */

					put_user_gcs(0, (__user void*)gcspr_el0, &ret);

					put_user_gcs(0, (unsigned long __user *)gcspr_el0, &ret);

					if (ret != 0)

						return -EFAULT;

					write_sysreg_s(gcspr_el0 + 1, SYS_GCSPR_EL0);

					write_sysreg_s(gcspr_el0 + 8, SYS_GCSPR_EL0);

					return 0;

				}

				@@ -1421,7 +1414,7 @@ static int get_sigframe(struct rt_sigframe_user_layout *user,

				static int gcs_signal_entry(__sigrestore_t sigtramp, struct ksignal *ksig)

				{

					unsigned long __user *gcspr_el0;

					u64 gcspr_el0;

					int ret = 0;

					if (!system_supports_gcs())

				@@ -1434,18 +1427,20 @@ static int gcs_signal_entry(__sigrestore_t sigtramp, struct ksignal *ksig)

					 * We are entering a signal handler, current register state is

					 * active.

					 */

					gcspr_el0 = (unsigned long __user *)read_sysreg_s(SYS_GCSPR_EL0);

					gcspr_el0 = read_sysreg_s(SYS_GCSPR_EL0);

					/*

					 * Push a cap and the GCS entry for the trampoline onto the GCS.

					 */

					put_user_gcs((unsigned long)sigtramp, gcspr_el0 - 2, &ret);

					put_user_gcs(GCS_SIGNAL_CAP(gcspr_el0 - 1), gcspr_el0 - 1, &ret);

					put_user_gcs((unsigned long)sigtramp,

						     (unsigned long __user *)(gcspr_el0 - 16), &ret);

					put_user_gcs(GCS_SIGNAL_CAP(gcspr_el0 - 8),

						     (unsigned long __user *)(gcspr_el0 - 8), &ret);

					if (ret != 0)

						return ret;

					gcspr_el0 -= 2;

					write_sysreg_s((unsigned long)gcspr_el0, SYS_GCSPR_EL0);

					gcspr_el0 -= 16;

					write_sysreg_s(gcspr_el0, SYS_GCSPR_EL0);

					return 0;

				}

				@@ -1462,10 +1457,33 @@ static int setup_return(struct pt_regs *regs, struct ksignal *ksig,

							 struct rt_sigframe_user_layout *user, int usig)

				{

					__sigrestore_t sigtramp;

					int err;

					if (ksig->ka.sa.sa_flags & SA_RESTORER)

						sigtramp = ksig->ka.sa.sa_restorer;

					else

						sigtramp = VDSO_SYMBOL(current->mm->context.vdso, sigtramp);

					err = gcs_signal_entry(sigtramp, ksig);

					if (err)

						return err;

					/*

					 * We must not fail from this point onwards. We are going to update

					 * registers, including SP, in order to invoke the signal handler. If

					 * we failed and attempted to deliver a nested SIGSEGV to a handler

					 * after that point, the subsequent sigreturn would end up restoring

					 * the (partial) state for the original signal handler.

					 */

					regs->regs[0] = usig;

					if (ksig->ka.sa.sa_flags & SA_SIGINFO) {

						regs->regs[1] = (unsigned long)&user->sigframe->info;

						regs->regs[2] = (unsigned long)&user->sigframe->uc;

					}

					regs->sp = (unsigned long)user->sigframe;

					regs->regs[29] = (unsigned long)&user->next_frame->fp;

					regs->regs[30] = (unsigned long)sigtramp;

					regs->pc = (unsigned long)ksig->ka.sa.sa_handler;

					/*

				@@ -1506,14 +1524,7 @@ static int setup_return(struct pt_regs *regs, struct ksignal *ksig,

						sme_smstop();

					}

					if (ksig->ka.sa.sa_flags & SA_RESTORER)

						sigtramp = ksig->ka.sa.sa_restorer;

					else

						sigtramp = VDSO_SYMBOL(current->mm->context.vdso, sigtramp);

					regs->regs[30] = (unsigned long)sigtramp;

					return gcs_signal_entry(sigtramp, ksig);

					return 0;

				}

				static int setup_rt_frame(int usig, struct ksignal *ksig, sigset_t *set,

				@@ -1537,14 +1548,16 @@ static int setup_rt_frame(int usig, struct ksignal *ksig, sigset_t *set,

					err |= __save_altstack(&frame->uc.uc_stack, regs->sp);

					err |= setup_sigframe(&user, regs, set, &ua_state);

					if (err == 0) {

					if (ksig->ka.sa.sa_flags & SA_SIGINFO)

						err |= copy_siginfo_to_user(&frame->info, &ksig->info);

					if (err == 0)

						err = setup_return(regs, ksig, &user, usig);

						if (ksig->ka.sa.sa_flags & SA_SIGINFO) {

							err |= copy_siginfo_to_user(&frame->info, &ksig->info);

							regs->regs[1] = (unsigned long)&frame->info;

							regs->regs[2] = (unsigned long)&frame->uc;

						}

					}

					/*

					 * We must not fail if setup_return() succeeded - see comment at the

					 * beginning of setup_return().

					 */

					if (err == 0)

						set_handler_user_access_state();

									
										32

arch/arm64/kernel/stacktrace.c
									
												View File
												
				@@ -26,7 +26,6 @@ enum kunwind_source {

					KUNWIND_SOURCE_CALLER,

					KUNWIND_SOURCE_TASK,

					KUNWIND_SOURCE_REGS_PC,

					KUNWIND_SOURCE_REGS_LR,

				};

				union unwind_flags {

				@@ -138,8 +137,10 @@ kunwind_recover_return_address(struct kunwind_state *state)

						orig_pc = ftrace_graph_ret_addr(state->task, &state->graph_idx,

										state->common.pc,

										(void *)state->common.fp);

						if (WARN_ON_ONCE(state->common.pc == orig_pc))

						if (state->common.pc == orig_pc) {

							WARN_ON_ONCE(state->task == current);

							return -EINVAL;

						}

						state->common.pc = orig_pc;

						state->flags.fgraph = 1;

					}

				@@ -178,23 +179,8 @@ int kunwind_next_regs_pc(struct kunwind_state *state)

					state->regs = regs;

					state->common.pc = regs->pc;

					state->common.fp = regs->regs[29];

					state->source = KUNWIND_SOURCE_REGS_PC;

					return 0;

				}

				static __always_inline int

				kunwind_next_regs_lr(struct kunwind_state *state)

				{

					/*

					 * The stack for the regs was consumed by kunwind_next_regs_pc(), so we

					 * cannot consume that again here, but we know the regs are safe to

					 * access.

					 */

					state->common.pc = state->regs->regs[30];

					state->common.fp = state->regs->regs[29];

					state->regs = NULL;

					state->source = KUNWIND_SOURCE_REGS_LR;

					state->source = KUNWIND_SOURCE_REGS_PC;

					return 0;

				}

				@@ -215,12 +201,12 @@ kunwind_next_frame_record_meta(struct kunwind_state *state)

					case FRAME_META_TYPE_FINAL:

						if (meta == &task_pt_regs(tsk)->stackframe)

							return -ENOENT;

						WARN_ON_ONCE(1);

						WARN_ON_ONCE(tsk == current);

						return -EINVAL;

					case FRAME_META_TYPE_PT_REGS:

						return kunwind_next_regs_pc(state);

					default:

						WARN_ON_ONCE(1);

						WARN_ON_ONCE(tsk == current);

						return -EINVAL;

					}

				}

				@@ -274,11 +260,8 @@ kunwind_next(struct kunwind_state *state)

					case KUNWIND_SOURCE_FRAME:

					case KUNWIND_SOURCE_CALLER:

					case KUNWIND_SOURCE_TASK:

					case KUNWIND_SOURCE_REGS_LR:

						err = kunwind_next_frame_record(state);

						break;

					case KUNWIND_SOURCE_REGS_PC:

						err = kunwind_next_regs_lr(state);

						err = kunwind_next_frame_record(state);

						break;

					default:

						err = -EINVAL;

				@@ -436,7 +419,6 @@ static const char *state_source_string(const struct kunwind_state *state)

					case KUNWIND_SOURCE_CALLER:	return "C";

					case KUNWIND_SOURCE_TASK:	return "T";

					case KUNWIND_SOURCE_REGS_PC:	return "P";

					case KUNWIND_SOURCE_REGS_LR:	return "L";

					default:			return "U";

					}

				}

									
										11

arch/arm64/kvm/at.c
									
												View File
												
				@@ -739,8 +739,15 @@ static u64 compute_par_s12(struct kvm_vcpu *vcpu, u64 s1_par,

							final_attr = s1_parattr;

							break;

						default:

							/* MemAttr[2]=0, Device from S2 */

							final_attr = s2_memattr & GENMASK(1,0) << 2;

							/*

							 * MemAttr[2]=0, Device from S2.

							 *

							 * FWB does not influence the way that stage 1

							 * memory types and attributes are combined

							 * with stage 2 Device type and attributes.

							 */

							final_attr = min(s2_memattr_to_attr(s2_memattr),

									 s1_parattr);

						}

					} else {

						/* Combination of R_HMNDG, R_TNHFM and R_GQFSF */

									
										3

arch/arm64/kvm/hyp/nvhe/mem_protect.c
									
												View File
												
				@@ -783,9 +783,6 @@ static int hyp_ack_unshare(u64 addr, const struct pkvm_mem_transition *tx)

					if (tx->initiator.id == PKVM_ID_HOST && hyp_page_count((void *)addr))

						return -EBUSY;

					if (__hyp_ack_skip_pgtable_check(tx))

						return 0;

					return __hyp_check_page_state_range(addr, size,

									    PKVM_PAGE_SHARED_BORROWED);

				}

									
										4

arch/arm64/kvm/hyp/nvhe/pkvm.c
									
												View File
												
				@@ -126,7 +126,7 @@ static void pvm_init_traps_aa64dfr0(struct kvm_vcpu *vcpu)

					/* Trap SPE */

					if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMSVer), feature_ids)) {

						mdcr_set |= MDCR_EL2_TPMS;

						mdcr_clear |= MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT;

						mdcr_clear |= MDCR_EL2_E2PB_MASK;

					}

					/* Trap Trace Filter */

				@@ -143,7 +143,7 @@ static void pvm_init_traps_aa64dfr0(struct kvm_vcpu *vcpu)

					/* Trap External Trace */

					if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_ExtTrcBuff), feature_ids))

						mdcr_clear |= MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT;

						mdcr_clear |= MDCR_EL2_E2TB_MASK;

					vcpu->arch.mdcr_el2 |= mdcr_set;

					vcpu->arch.mdcr_el2 &= ~mdcr_clear;

									
										91

arch/arm64/kvm/pmu-emul.c
									
												View File
												
				@@ -24,6 +24,7 @@ static DEFINE_MUTEX(arm_pmus_lock);

				static void kvm_pmu_create_perf_event(struct kvm_pmc *pmc);

				static void kvm_pmu_release_perf_event(struct kvm_pmc *pmc);

				static bool kvm_pmu_counter_is_enabled(struct kvm_pmc *pmc);

				static struct kvm_vcpu *kvm_pmc_to_vcpu(const struct kvm_pmc *pmc)

				{

				@@ -327,48 +328,25 @@ u64 kvm_pmu_implemented_counter_mask(struct kvm_vcpu *vcpu)

						return GENMASK(val - 1, 0) | BIT(ARMV8_PMU_CYCLE_IDX);

				}

				/**

				 * kvm_pmu_enable_counter_mask - enable selected PMU counters

				 * @vcpu: The vcpu pointer

				 * @val: the value guest writes to PMCNTENSET register

				 *

				 * Call perf_event_enable to start counting the perf event

				 */

				void kvm_pmu_enable_counter_mask(struct kvm_vcpu *vcpu, u64 val)

				static void kvm_pmc_enable_perf_event(struct kvm_pmc *pmc)

				{

					int i;

					if (!kvm_vcpu_has_pmu(vcpu))

					if (!pmc->perf_event) {

						kvm_pmu_create_perf_event(pmc);

						return;

					if (!(kvm_vcpu_read_pmcr(vcpu) & ARMV8_PMU_PMCR_E) || !val)

						return;

					for (i = 0; i < KVM_ARMV8_PMU_MAX_COUNTERS; i++) {

						struct kvm_pmc *pmc;

						if (!(val & BIT(i)))

							continue;

						pmc = kvm_vcpu_idx_to_pmc(vcpu, i);

						if (!pmc->perf_event) {

							kvm_pmu_create_perf_event(pmc);

						} else {

							perf_event_enable(pmc->perf_event);

							if (pmc->perf_event->state != PERF_EVENT_STATE_ACTIVE)

								kvm_debug("fail to enable perf event\n");

						}

					}

					perf_event_enable(pmc->perf_event);

					if (pmc->perf_event->state != PERF_EVENT_STATE_ACTIVE)

						kvm_debug("fail to enable perf event\n");

				}

				/**

				 * kvm_pmu_disable_counter_mask - disable selected PMU counters

				 * @vcpu: The vcpu pointer

				 * @val: the value guest writes to PMCNTENCLR register

				 *

				 * Call perf_event_disable to stop counting the perf event

				 */

				void kvm_pmu_disable_counter_mask(struct kvm_vcpu *vcpu, u64 val)

				static void kvm_pmc_disable_perf_event(struct kvm_pmc *pmc)

				{

					if (pmc->perf_event)

						perf_event_disable(pmc->perf_event);

				}

				void kvm_pmu_reprogram_counter_mask(struct kvm_vcpu *vcpu, u64 val)

				{

					int i;

				@@ -376,16 +354,18 @@ void kvm_pmu_disable_counter_mask(struct kvm_vcpu *vcpu, u64 val)

						return;

					for (i = 0; i < KVM_ARMV8_PMU_MAX_COUNTERS; i++) {

						struct kvm_pmc *pmc;

						struct kvm_pmc *pmc = kvm_vcpu_idx_to_pmc(vcpu, i);

						if (!(val & BIT(i)))

							continue;

						pmc = kvm_vcpu_idx_to_pmc(vcpu, i);

						if (pmc->perf_event)

							perf_event_disable(pmc->perf_event);

						if (kvm_pmu_counter_is_enabled(pmc))

							kvm_pmc_enable_perf_event(pmc);

						else

							kvm_pmc_disable_perf_event(pmc);

					}

					kvm_vcpu_pmu_restore_guest(vcpu);

				}

				/*

				@@ -626,27 +606,28 @@ void kvm_pmu_handle_pmcr(struct kvm_vcpu *vcpu, u64 val)

					if (!kvm_has_feat(vcpu->kvm, ID_AA64DFR0_EL1, PMUVer, V3P5))

						val &= ~ARMV8_PMU_PMCR_LP;

					/* Request a reload of the PMU to enable/disable affected counters */

					if ((__vcpu_sys_reg(vcpu, PMCR_EL0) ^ val) & ARMV8_PMU_PMCR_E)

						kvm_make_request(KVM_REQ_RELOAD_PMU, vcpu);

					/* The reset bits don't indicate any state, and shouldn't be saved. */

					__vcpu_sys_reg(vcpu, PMCR_EL0) = val & ~(ARMV8_PMU_PMCR_C | ARMV8_PMU_PMCR_P);

					if (val & ARMV8_PMU_PMCR_E) {

						kvm_pmu_enable_counter_mask(vcpu,

						       __vcpu_sys_reg(vcpu, PMCNTENSET_EL0));

					} else {

						kvm_pmu_disable_counter_mask(vcpu,

						       __vcpu_sys_reg(vcpu, PMCNTENSET_EL0));

					}

					if (val & ARMV8_PMU_PMCR_C)

						kvm_pmu_set_counter_value(vcpu, ARMV8_PMU_CYCLE_IDX, 0);

					if (val & ARMV8_PMU_PMCR_P) {

						unsigned long mask = kvm_pmu_accessible_counter_mask(vcpu);

						mask &= ~BIT(ARMV8_PMU_CYCLE_IDX);

						/*

						 * Unlike other PMU sysregs, the controls in PMCR_EL0 always apply

						 * to the 'guest' range of counters and never the 'hyp' range.

						 */

						unsigned long mask = kvm_pmu_implemented_counter_mask(vcpu) &

								     ~kvm_pmu_hyp_counter_mask(vcpu) &

								     ~BIT(ARMV8_PMU_CYCLE_IDX);

						for_each_set_bit(i, &mask, 32)

							kvm_pmu_set_pmc_value(kvm_vcpu_idx_to_pmc(vcpu, i), 0, true);

					}

					kvm_vcpu_pmu_restore_guest(vcpu);

				}

				static bool kvm_pmu_counter_is_enabled(struct kvm_pmc *pmc)

				@@ -910,11 +891,11 @@ void kvm_vcpu_reload_pmu(struct kvm_vcpu *vcpu)

				{

					u64 mask = kvm_pmu_implemented_counter_mask(vcpu);

					kvm_pmu_handle_pmcr(vcpu, kvm_vcpu_read_pmcr(vcpu));

					__vcpu_sys_reg(vcpu, PMOVSSET_EL0) &= mask;

					__vcpu_sys_reg(vcpu, PMINTENSET_EL1) &= mask;

					__vcpu_sys_reg(vcpu, PMCNTENSET_EL0) &= mask;

					kvm_pmu_reprogram_counter_mask(vcpu, mask);

				}

				int kvm_arm_pmu_v3_enable(struct kvm_vcpu *vcpu)

									
										35

arch/arm64/kvm/sys_regs.c
									
												View File
												
				@@ -1208,16 +1208,14 @@ static bool access_pmcnten(struct kvm_vcpu *vcpu, struct sys_reg_params *p,

					mask = kvm_pmu_accessible_counter_mask(vcpu);

					if (p->is_write) {

						val = p->regval & mask;

						if (r->Op2 & 0x1) {

						if (r->Op2 & 0x1)

							/* accessing PMCNTENSET_EL0 */

							__vcpu_sys_reg(vcpu, PMCNTENSET_EL0) |= val;

							kvm_pmu_enable_counter_mask(vcpu, val);

							kvm_vcpu_pmu_restore_guest(vcpu);

						} else {

						else

							/* accessing PMCNTENCLR_EL0 */

							__vcpu_sys_reg(vcpu, PMCNTENSET_EL0) &= ~val;

							kvm_pmu_disable_counter_mask(vcpu, val);

						}

						kvm_pmu_reprogram_counter_mask(vcpu, val);

					} else {

						p->regval = __vcpu_sys_reg(vcpu, PMCNTENSET_EL0);

					}

				@@ -2450,6 +2448,26 @@ static unsigned int s1pie_el2_visibility(const struct kvm_vcpu *vcpu,

					return __el2_visibility(vcpu, rd, s1pie_visibility);

				}

				static bool access_mdcr(struct kvm_vcpu *vcpu,

							struct sys_reg_params *p,

							const struct sys_reg_desc *r)

				{

					u64 old = __vcpu_sys_reg(vcpu, MDCR_EL2);

					if (!access_rw(vcpu, p, r))

						return false;

					/*

					 * Request a reload of the PMU to enable/disable the counters affected

					 * by HPME.

					 */

					if ((old ^ __vcpu_sys_reg(vcpu, MDCR_EL2)) & MDCR_EL2_HPME)

						kvm_make_request(KVM_REQ_RELOAD_PMU, vcpu);

					return true;

				}

				/*

				 * Architected system registers.

				 * Important: Must be sorted ascending by Op0, Op1, CRn, CRm, Op2

				@@ -2618,7 +2636,8 @@ static const struct sys_reg_desc sys_reg_descs[] = {

					ID_WRITABLE(ID_AA64MMFR0_EL1, ~(ID_AA64MMFR0_EL1_RES0 |

									ID_AA64MMFR0_EL1_TGRAN4_2 |

									ID_AA64MMFR0_EL1_TGRAN64_2 |

									ID_AA64MMFR0_EL1_TGRAN16_2)),

									ID_AA64MMFR0_EL1_TGRAN16_2 |

									ID_AA64MMFR0_EL1_ASIDBITS)),

					ID_WRITABLE(ID_AA64MMFR1_EL1, ~(ID_AA64MMFR1_EL1_RES0 |

									ID_AA64MMFR1_EL1_HCX |

									ID_AA64MMFR1_EL1_TWED |

				@@ -2982,7 +3001,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {

					EL2_REG(SCTLR_EL2, access_rw, reset_val, SCTLR_EL2_RES1),

					EL2_REG(ACTLR_EL2, access_rw, reset_val, 0),

					EL2_REG_VNCR(HCR_EL2, reset_hcr, 0),

					EL2_REG(MDCR_EL2, access_rw, reset_val, 0),

					EL2_REG(MDCR_EL2, access_mdcr, reset_val, 0),

					EL2_REG(CPTR_EL2, access_rw, reset_val, CPTR_NVHE_EL2_RES1),

					EL2_REG_VNCR(HSTR_EL2, reset_val, 0),

					EL2_REG_VNCR(HFGRTR_EL2, reset_val, 0),

									
										12

arch/arm64/kvm/vgic/vgic-its.c
									
												View File
												
				@@ -608,12 +608,22 @@ static void vgic_its_cache_translation(struct kvm *kvm, struct vgic_its *its,

					lockdep_assert_held(&its->its_lock);

					vgic_get_irq_kref(irq);

					old = xa_store(&its->translation_cache, cache_key, irq, GFP_KERNEL_ACCOUNT);

					/*

					 * Put the reference taken on @irq if the store fails. Intentionally do

					 * not return the error as the translation cache is best effort.

					 */

					if (xa_is_err(old)) {

						vgic_put_irq(kvm, irq);

						return;

					}

					/*

					 * We could have raced with another CPU caching the same

					 * translation behind our back, ensure we don't leak a

					 * reference if that is the case.

					 */

					old = xa_store(&its->translation_cache, cache_key, irq, GFP_KERNEL_ACCOUNT);

					if (old)

						vgic_put_irq(kvm, old);

				}

									
										6

arch/hexagon/Makefile
									
												View File
												
				@@ -32,3 +32,9 @@ KBUILD_LDFLAGS += $(ldflags-y)

				TIR_NAME := r19

				KBUILD_CFLAGS += -ffixed-$(TIR_NAME) -DTHREADINFO_REG=$(TIR_NAME) -D__linux__

				KBUILD_AFLAGS += -DTHREADINFO_REG=$(TIR_NAME)

				# Disable HexagonConstExtenders pass for LLVM versions prior to 19.1.0

				# https://github.com/llvm/llvm-project/issues/99714

				ifneq ($(call clang-min-version, 190100),y)

				KBUILD_CFLAGS += -mllvm -hexagon-cext=false

				endif

									
										10

arch/nios2/kernel/cpuinfo.c
									
												View File
												
				@@ -143,11 +143,11 @@ static int show_cpuinfo(struct seq_file *m, void *v)

						   " DIV:\t\t%s\n"

						   " BMX:\t\t%s\n"

						   " CDX:\t\t%s\n",

						   cpuinfo.has_mul ? "yes" : "no",

						   cpuinfo.has_mulx ? "yes" : "no",

						   cpuinfo.has_div ? "yes" : "no",

						   cpuinfo.has_bmx ? "yes" : "no",

						   cpuinfo.has_cdx ? "yes" : "no");

						   str_yes_no(cpuinfo.has_mul),

						   str_yes_no(cpuinfo.has_mulx),

						   str_yes_no(cpuinfo.has_div),

						   str_yes_no(cpuinfo.has_bmx),

						   str_yes_no(cpuinfo.has_cdx));

					seq_printf(m,

						   "Icache:\t\t%ukB, line length: %u\n",

									
										2

arch/openrisc/kernel/entry.S
									
												View File
												
				@@ -239,6 +239,8 @@ handler:							;\

				/* =====================================================[ exceptions] === */

					__REF

				/* ---[ 0x100: RESET exception ]----------------------------------------- */

				EXCEPTION_ENTRY(_tng_kernel_start)

									
										32

arch/openrisc/kernel/head.S
									
												View File
												
				@@ -26,15 +26,15 @@

				#include <asm/asm-offsets.h>

				#include <linux/of_fdt.h>

				#define tophys(rd,rs)				\

					l.movhi	rd,hi(-KERNELBASE)		;\

				#define tophys(rd,rs)						\

					l.movhi	rd,hi(-KERNELBASE)				;\

					l.add	rd,rd,rs

				#define CLEAR_GPR(gpr)				\

				#define CLEAR_GPR(gpr)						\

					l.movhi	gpr,0x0

				#define LOAD_SYMBOL_2_GPR(gpr,symbol)		\

					l.movhi gpr,hi(symbol)			;\

				#define LOAD_SYMBOL_2_GPR(gpr,symbol)				\

					l.movhi gpr,hi(symbol)					;\

					l.ori   gpr,gpr,lo(symbol)

				@@ -326,21 +326,21 @@

					l.addi  r1,r1,-(INT_FRAME_SIZE)				;\

					/* r1 is KSP, r30 is __pa(KSP) */			;\

					tophys  (r30,r1)					;\

					l.sw    PT_GPR12(r30),r12					;\

					l.sw    PT_GPR12(r30),r12				;\

					l.mfspr r12,r0,SPR_EPCR_BASE				;\

					l.sw    PT_PC(r30),r12					;\

					l.mfspr r12,r0,SPR_ESR_BASE				;\

					l.sw    PT_SR(r30),r12					;\

					/* save r31 */						;\

					EXCEPTION_T_LOAD_GPR30(r12)				;\

					l.sw	PT_GPR30(r30),r12					;\

					l.sw	PT_GPR30(r30),r12				;\

					/* save r10 as was prior to exception */		;\

					EXCEPTION_T_LOAD_GPR10(r12)				;\

					l.sw	PT_GPR10(r30),r12					;\

					/* save PT_SP as was prior to exception */			;\

					l.sw	PT_GPR10(r30),r12				;\

					/* save PT_SP as was prior to exception */		;\

					EXCEPTION_T_LOAD_SP(r12)				;\

					l.sw	PT_SP(r30),r12					;\

					l.sw    PT_GPR13(r30),r13					;\

					l.sw    PT_GPR13(r30),r13				;\

					/* --> */						;\

					/* save exception r4, set r4 = EA */			;\

					l.sw	PT_GPR4(r30),r4					;\

				@@ -357,6 +357,8 @@

				/* =====================================================[ exceptions] === */

					__HEAD

				/* ---[ 0x100: RESET exception ]----------------------------------------- */

				    .org 0x100

					/* Jump to .init code at _start which lives in the .head section

				@@ -394,7 +396,7 @@ _dispatch_do_ipage_fault:

				    .org 0x500

					EXCEPTION_HANDLE(_timer_handler)

				/* ---[ 0x600: Alignment exception ]-------------------------------------- */

				/* ---[ 0x600: Alignment exception ]------------------------------------- */

				    .org 0x600

					EXCEPTION_HANDLE(_alignment_handler)

				@@ -424,7 +426,7 @@ _dispatch_do_ipage_fault:

				    .org 0xc00

					EXCEPTION_HANDLE(_sys_call_handler)

				/* ---[ 0xd00: Floating point exception ]--------------------------------- */

				/* ---[ 0xd00: Floating point exception ]-------------------------------- */

				    .org 0xd00

					EXCEPTION_HANDLE(_fpe_trap_handler)

				@@ -506,10 +508,10 @@ _dispatch_do_ipage_fault:

				/*    .text*/

				/* This early stuff belongs in HEAD, but some of the functions below definitely

				/* This early stuff belongs in the .init.text section, but some of the functions below definitely

				 * don't... */

					__HEAD

					__INIT

					.global _start

				_start:

					/* Init r0 to zero as per spec */

				@@ -816,7 +818,7 @@ secondary_start:

				#endif

				/* ========================================[ cache ]=== */

				/* ==========================================================[ cache ]=== */

					/* alignment here so we don't change memory offsets with

					 * memory controller defined

									
										3

arch/openrisc/kernel/vmlinux.lds.S
									
												View File
												
				@@ -50,6 +50,7 @@ SECTIONS

				        .text                   : AT(ADDR(.text) - LOAD_OFFSET)

					{

				          _stext = .;

					  HEAD_TEXT

					  TEXT_TEXT

					  SCHED_TEXT

					  LOCK_TEXT

				@@ -83,8 +84,6 @@ SECTIONS

					. = ALIGN(PAGE_SIZE);

					__init_begin = .;

					HEAD_TEXT_SECTION

					/* Page aligned */

					INIT_TEXT_SECTION(PAGE_SIZE)

1

arch/powerpc/configs/pmac32_defconfig

View File

@@ -208,6 +208,7 @@ CONFIG_FB_ATY=y
 CONFIG_FB_ATY_CT=y
 CONFIG_FB_ATY_GX=y
 CONFIG_FB_3DFX=y
 CONFIG_BACKLIGHT_CLASS_DEVICE=y
 # CONFIG_VGA_CONSOLE is not set
 CONFIG_FRAMEBUFFER_CONSOLE=y
 CONFIG_LOGO=y

1

arch/powerpc/configs/ppc6xx_defconfig

View File

@@ -716,6 +716,7 @@ CONFIG_FB_TRIDENT=m
 CONFIG_FB_SM501=m
 CONFIG_FB_IBM_GXT4500=y
 CONFIG_LCD_PLATFORM=m
 CONFIG_BACKLIGHT_CLASS_DEVICE=y
 CONFIG_FRAMEBUFFER_CONSOLE=y
 CONFIG_FRAMEBUFFER_CONSOLE_ROTATION=y
 CONFIG_LOGO=y

									
										2

arch/powerpc/kvm/e500.h
									
												View File
												
				@@ -34,6 +34,8 @@ enum vcpu_ftr {

				#define E500_TLB_BITMAP		(1 << 30)

				/* TLB1 entry is mapped by host TLB0 */

				#define E500_TLB_TLB0		(1 << 29)

				/* entry is writable on the host */

				#define E500_TLB_WRITABLE	(1 << 28)

				/* bits [6-5] MAS2_X1 and MAS2_X0 and [4-0] bits for WIMGE */

				#define E500_TLB_MAS2_ATTR	(0x7f)

									
										199

arch/powerpc/kvm/e500_mmu_host.c
									
												View File
												
				@@ -45,11 +45,14 @@ static inline unsigned int tlb1_max_shadow_size(void)

					return host_tlb_params[1].entries - tlbcam_index - 1;

				}

				static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode)

				static inline u32 e500_shadow_mas3_attrib(u32 mas3, bool writable, int usermode)

				{

					/* Mask off reserved bits. */

					mas3 &= MAS3_ATTRIB_MASK;

					if (!writable)

						mas3 &= ~(MAS3_UW|MAS3_SW);

				#ifndef CONFIG_KVM_BOOKE_HV

					if (!usermode) {

						/* Guest is in supervisor mode,

				@@ -242,17 +245,18 @@ static inline int tlbe_is_writable(struct kvm_book3e_206_tlb_entry *tlbe)

					return tlbe->mas7_3 & (MAS3_SW|MAS3_UW);

				}

				static inline bool kvmppc_e500_ref_setup(struct tlbe_ref *ref,

				static inline void kvmppc_e500_ref_setup(struct tlbe_ref *ref,

									 struct kvm_book3e_206_tlb_entry *gtlbe,

									 kvm_pfn_t pfn, unsigned int wimg)

									 kvm_pfn_t pfn, unsigned int wimg,

									 bool writable)

				{

					ref->pfn = pfn;

					ref->flags = E500_TLB_VALID;

					if (writable)

						ref->flags |= E500_TLB_WRITABLE;

					/* Use guest supplied MAS2_G and MAS2_E */

					ref->flags |= (gtlbe->mas2 & MAS2_ATTRIB_MASK) | wimg;

					return tlbe_is_writable(gtlbe);

				}

				static inline void kvmppc_e500_ref_release(struct tlbe_ref *ref)

				@@ -305,6 +309,7 @@ static void kvmppc_e500_setup_stlbe(

				{

					kvm_pfn_t pfn = ref->pfn;

					u32 pr = vcpu->arch.shared->msr & MSR_PR;

					bool writable = !!(ref->flags & E500_TLB_WRITABLE);

					BUG_ON(!(ref->flags & E500_TLB_VALID));

				@@ -312,7 +317,7 @@ static void kvmppc_e500_setup_stlbe(

					stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID;

					stlbe->mas2 = (gvaddr & MAS2_EPN) | (ref->flags & E500_TLB_MAS2_ATTR);

					stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) |

							e500_shadow_mas3_attrib(gtlbe->mas7_3, pr);

							e500_shadow_mas3_attrib(gtlbe->mas7_3, writable, pr);

				}

				static inline int kvmppc_e500_shadow_map(struct kvmppc_vcpu_e500 *vcpu_e500,

				@@ -321,15 +326,14 @@ static inline int kvmppc_e500_shadow_map(struct kvmppc_vcpu_e500 *vcpu_e500,

					struct tlbe_ref *ref)

				{

					struct kvm_memory_slot *slot;

					unsigned long pfn = 0; /* silence GCC warning */

					unsigned int psize;

					unsigned long pfn;

					struct page *page = NULL;

					unsigned long hva;

					int pfnmap = 0;

					int tsize = BOOK3E_PAGESZ_4K;

					int ret = 0;

					unsigned long mmu_seq;

					struct kvm *kvm = vcpu_e500->vcpu.kvm;

					unsigned long tsize_pages = 0;

					pte_t *ptep;

					unsigned int wimg = 0;

					pgd_t *pgdir;

				@@ -351,110 +355,12 @@ static inline int kvmppc_e500_shadow_map(struct kvmppc_vcpu_e500 *vcpu_e500,

					slot = gfn_to_memslot(vcpu_e500->vcpu.kvm, gfn);

					hva = gfn_to_hva_memslot(slot, gfn);

					if (tlbsel == 1) {

						struct vm_area_struct *vma;

						mmap_read_lock(kvm->mm);

						vma = find_vma(kvm->mm, hva);

						if (vma && hva >= vma->vm_start &&

						    (vma->vm_flags & VM_PFNMAP)) {

							/*

							 * This VMA is a physically contiguous region (e.g.

							 * /dev/mem) that bypasses normal Linux page

							 * management.  Find the overlap between the

							 * vma and the memslot.

							 */

							unsigned long start, end;

							unsigned long slot_start, slot_end;

							pfnmap = 1;

							start = vma->vm_pgoff;

							end = start +

							      vma_pages(vma);

							pfn = start + ((hva - vma->vm_start) >> PAGE_SHIFT);

							slot_start = pfn - (gfn - slot->base_gfn);

							slot_end = slot_start + slot->npages;

							if (start < slot_start)

								start = slot_start;

							if (end > slot_end)

								end = slot_end;

							tsize = (gtlbe->mas1 & MAS1_TSIZE_MASK) >>

								MAS1_TSIZE_SHIFT;

							/*

							 * e500 doesn't implement the lowest tsize bit,

							 * or 1K pages.

							 */

							tsize = max(BOOK3E_PAGESZ_4K, tsize & ~1);

							/*

							 * Now find the largest tsize (up to what the guest

							 * requested) that will cover gfn, stay within the

							 * range, and for which gfn and pfn are mutually

							 * aligned.

							 */

							for (; tsize > BOOK3E_PAGESZ_4K; tsize -= 2) {

								unsigned long gfn_start, gfn_end;

								tsize_pages = 1UL << (tsize - 2);

								gfn_start = gfn & ~(tsize_pages - 1);

								gfn_end = gfn_start + tsize_pages;

								if (gfn_start + pfn - gfn < start)

									continue;

								if (gfn_end + pfn - gfn > end)

									continue;

								if ((gfn & (tsize_pages - 1)) !=

								    (pfn & (tsize_pages - 1)))

									continue;

								gvaddr &= ~((tsize_pages << PAGE_SHIFT) - 1);

								pfn &= ~(tsize_pages - 1);

								break;

							}

						} else if (vma && hva >= vma->vm_start &&

							   is_vm_hugetlb_page(vma)) {

							unsigned long psize = vma_kernel_pagesize(vma);

							tsize = (gtlbe->mas1 & MAS1_TSIZE_MASK) >>

								MAS1_TSIZE_SHIFT;

							/*

							 * Take the largest page size that satisfies both host

							 * and guest mapping

							 */

							tsize = min(__ilog2(psize) - 10, tsize);

							/*

							 * e500 doesn't implement the lowest tsize bit,

							 * or 1K pages.

							 */

							tsize = max(BOOK3E_PAGESZ_4K, tsize & ~1);

						}

						mmap_read_unlock(kvm->mm);

					}

					if (likely(!pfnmap)) {

						tsize_pages = 1UL << (tsize + 10 - PAGE_SHIFT);

						pfn = __kvm_faultin_pfn(slot, gfn, FOLL_WRITE, NULL, &page);

						if (is_error_noslot_pfn(pfn)) {

							if (printk_ratelimit())

								pr_err("%s: real page not found for gfn %lx\n",

								       __func__, (long)gfn);

							return -EINVAL;

						}

						/* Align guest and physical address to page map boundaries */

						pfn &= ~(tsize_pages - 1);

						gvaddr &= ~((tsize_pages << PAGE_SHIFT) - 1);

					pfn = __kvm_faultin_pfn(slot, gfn, FOLL_WRITE, &writable, &page);

					if (is_error_noslot_pfn(pfn)) {

						if (printk_ratelimit())

							pr_err("%s: real page not found for gfn %lx\n",

							       __func__, (long)gfn);

						return -EINVAL;

					}

					spin_lock(&kvm->mmu_lock);

				@@ -472,14 +378,13 @@ static inline int kvmppc_e500_shadow_map(struct kvmppc_vcpu_e500 *vcpu_e500,

					 * can't run hence pfn won't change.

					 */

					local_irq_save(flags);

					ptep = find_linux_pte(pgdir, hva, NULL, NULL);

					ptep = find_linux_pte(pgdir, hva, NULL, &psize);

					if (ptep) {

						pte_t pte = READ_ONCE(*ptep);

						if (pte_present(pte)) {

							wimg = (pte_val(pte) >> PTE_WIMGE_SHIFT) &

								MAS2_WIMGE_MASK;

							local_irq_restore(flags);

						} else {

							local_irq_restore(flags);

							pr_err_ratelimited("%s: pte not present: gfn %lx,pfn %lx\n",

				@@ -488,10 +393,72 @@ static inline int kvmppc_e500_shadow_map(struct kvmppc_vcpu_e500 *vcpu_e500,

							goto out;

						}

					}

					writable = kvmppc_e500_ref_setup(ref, gtlbe, pfn, wimg);

					local_irq_restore(flags);

					if (psize && tlbsel == 1) {

						unsigned long psize_pages, tsize_pages;

						unsigned long start, end;

						unsigned long slot_start, slot_end;

						psize_pages = 1UL << (psize - PAGE_SHIFT);

						start = pfn & ~(psize_pages - 1);

						end = start + psize_pages;

						slot_start = pfn - (gfn - slot->base_gfn);

						slot_end = slot_start + slot->npages;

						if (start < slot_start)

							start = slot_start;

						if (end > slot_end)

							end = slot_end;

						tsize = (gtlbe->mas1 & MAS1_TSIZE_MASK) >>

							MAS1_TSIZE_SHIFT;

						/*

						 * Any page size that doesn't satisfy the host mapping

						 * will fail the start and end tests.

						 */

						tsize = min(psize - PAGE_SHIFT + BOOK3E_PAGESZ_4K, tsize);

						/*

						 * e500 doesn't implement the lowest tsize bit,

						 * or 1K pages.

						 */

						tsize = max(BOOK3E_PAGESZ_4K, tsize & ~1);

						/*

						 * Now find the largest tsize (up to what the guest

						 * requested) that will cover gfn, stay within the

						 * range, and for which gfn and pfn are mutually

						 * aligned.

						 */

						for (; tsize > BOOK3E_PAGESZ_4K; tsize -= 2) {

							unsigned long gfn_start, gfn_end;

							tsize_pages = 1UL << (tsize - 2);

							gfn_start = gfn & ~(tsize_pages - 1);

							gfn_end = gfn_start + tsize_pages;

							if (gfn_start + pfn - gfn < start)

								continue;

							if (gfn_end + pfn - gfn > end)

								continue;

							if ((gfn & (tsize_pages - 1)) !=

							    (pfn & (tsize_pages - 1)))

								continue;

							gvaddr &= ~((tsize_pages << PAGE_SHIFT) - 1);

							pfn &= ~(tsize_pages - 1);

							break;

						}

					}

					kvmppc_e500_ref_setup(ref, gtlbe, pfn, wimg, writable);

					kvmppc_e500_setup_stlbe(&vcpu_e500->vcpu, gtlbe, tsize,

								ref, gvaddr, stlbe);

					writable = tlbe_is_writable(stlbe);

					/* Clear i-cache for new pages */

					kvmppc_mmu_flush_icache(pfn);

									
										36

arch/powerpc/platforms/book3s/vas-api.c
									
												View File
												
				@@ -464,7 +464,43 @@ static vm_fault_t vas_mmap_fault(struct vm_fault *vmf)

					return VM_FAULT_SIGBUS;

				}

				/*

				 * During mmap() paste address, mapping VMA is saved in VAS window

				 * struct which is used to unmap during migration if the window is

				 * still open. But the user space can remove this mapping with

				 * munmap() before closing the window and the VMA address will

				 * be invalid. Set VAS window VMA to NULL in this function which

				 * is called before VMA free.

				 */

				static void vas_mmap_close(struct vm_area_struct *vma)

				{

					struct file *fp = vma->vm_file;

					struct coproc_instance *cp_inst = fp->private_data;

					struct vas_window *txwin;

					/* Should not happen */

					if (!cp_inst || !cp_inst->txwin) {

						pr_err("No attached VAS window for the paste address mmap\n");

						return;

					}

					txwin = cp_inst->txwin;

					/*

					 * task_ref.vma is set in coproc_mmap() during mmap paste

					 * address. So it has to be the same VMA that is getting freed.

					 */

					if (WARN_ON(txwin->task_ref.vma != vma)) {

						pr_err("Invalid paste address mmaping\n");

						return;

					}

					mutex_lock(&txwin->task_ref.mmap_mutex);

					txwin->task_ref.vma = NULL;

					mutex_unlock(&txwin->task_ref.mmap_mutex);

				}

				static const struct vm_operations_struct vas_vm_ops = {

					.close = vas_mmap_close,

					.fault = vas_mmap_fault,

				};

									
										4

arch/riscv/include/asm/kfence.h
									
												View File
												
				@@ -22,7 +22,9 @@ static inline bool kfence_protect_page(unsigned long addr, bool protect)

					else

						set_pte(pte, __pte(pte_val(ptep_get(pte)) | _PAGE_PRESENT));

					flush_tlb_kernel_range(addr, addr + PAGE_SIZE);

					preempt_disable();

					local_flush_tlb_kernel_range(addr, addr + PAGE_SIZE);

					preempt_enable();

					return true;

				}

									
										1

arch/riscv/include/asm/page.h
									
												View File
												
				@@ -122,6 +122,7 @@ struct kernel_mapping {

				extern struct kernel_mapping kernel_map;

				extern phys_addr_t phys_ram_base;

				extern unsigned long vmemmap_start_pfn;

				#define is_kernel_mapping(x)	\

					((x) >= kernel_map.virt_addr && (x) < (kernel_map.virt_addr + kernel_map.size))

									
										2

arch/riscv/include/asm/pgtable.h
									
												View File
												
				@@ -87,7 +87,7 @@

				 * Define vmemmap for pfn_to_page & page_to_pfn calls. Needed if kernel

				 * is configured with CONFIG_SPARSEMEM_VMEMMAP enabled.

				 */

				#define vmemmap		((struct page *)VMEMMAP_START - (phys_ram_base >> PAGE_SHIFT))

				#define vmemmap		((struct page *)VMEMMAP_START - vmemmap_start_pfn)

				#define PCI_IO_SIZE      SZ_16M

				#define PCI_IO_END       VMEMMAP_START

									
										1

arch/riscv/include/asm/sbi.h
									
												View File
												
				@@ -159,6 +159,7 @@ struct riscv_pmu_snapshot_data {

				};

				#define RISCV_PMU_RAW_EVENT_MASK GENMASK_ULL(47, 0)

				#define RISCV_PMU_PLAT_FW_EVENT_MASK GENMASK_ULL(61, 0)

				#define RISCV_PMU_RAW_EVENT_IDX 0x20000

				#define RISCV_PLAT_FW_EVENT	0xFFFF

									
										5

arch/riscv/include/asm/spinlock.h
									
												View File
												
				@@ -3,8 +3,11 @@

				#ifndef __ASM_RISCV_SPINLOCK_H

				#define __ASM_RISCV_SPINLOCK_H

				#ifdef CONFIG_RISCV_COMBO_SPINLOCKS

				#ifdef CONFIG_QUEUED_SPINLOCKS

				#define _Q_PENDING_LOOPS	(1 << 9)

				#endif

				#ifdef CONFIG_RISCV_COMBO_SPINLOCKS

				#define __no_arch_spinlock_redefine

				#include <asm/ticket_spinlock.h>

									
										21

arch/riscv/kernel/entry.S
									
												View File
												
				@@ -23,21 +23,21 @@

					REG_S 	a0, TASK_TI_A0(tp)

					csrr 	a0, CSR_CAUSE

					/* Exclude IRQs */

					blt  	a0, zero, _new_vmalloc_restore_context_a0

					blt  	a0, zero, .Lnew_vmalloc_restore_context_a0

					REG_S 	a1, TASK_TI_A1(tp)

					/* Only check new_vmalloc if we are in page/protection fault */

					li   	a1, EXC_LOAD_PAGE_FAULT

					beq  	a0, a1, _new_vmalloc_kernel_address

					beq  	a0, a1, .Lnew_vmalloc_kernel_address

					li   	a1, EXC_STORE_PAGE_FAULT

					beq  	a0, a1, _new_vmalloc_kernel_address

					beq  	a0, a1, .Lnew_vmalloc_kernel_address

					li   	a1, EXC_INST_PAGE_FAULT

					bne  	a0, a1, _new_vmalloc_restore_context_a1

					bne  	a0, a1, .Lnew_vmalloc_restore_context_a1

				_new_vmalloc_kernel_address:

				.Lnew_vmalloc_kernel_address:

					/* Is it a kernel address? */

					csrr 	a0, CSR_TVAL

					bge 	a0, zero, _new_vmalloc_restore_context_a1

					bge 	a0, zero, .Lnew_vmalloc_restore_context_a1

					/* Check if a new vmalloc mapping appeared that could explain the trap */

					REG_S	a2, TASK_TI_A2(tp)

				@@ -69,7 +69,7 @@ _new_vmalloc_kernel_address:

					/* Check the value of new_vmalloc for this cpu */

					REG_L	a2, 0(a0)

					and	a2, a2, a1

					beq	a2, zero, _new_vmalloc_restore_context

					beq	a2, zero, .Lnew_vmalloc_restore_context

					/* Atomically reset the current cpu bit in new_vmalloc */

					amoxor.d	a0, a1, (a0)

				@@ -83,11 +83,11 @@ _new_vmalloc_kernel_address:

					csrw	CSR_SCRATCH, x0

					sret

				_new_vmalloc_restore_context:

				.Lnew_vmalloc_restore_context:

					REG_L 	a2, TASK_TI_A2(tp)

				_new_vmalloc_restore_context_a1:

				.Lnew_vmalloc_restore_context_a1:

					REG_L 	a1, TASK_TI_A1(tp)

				_new_vmalloc_restore_context_a0:

				.Lnew_vmalloc_restore_context_a0:

					REG_L	a0, TASK_TI_A0(tp)

				.endm

				@@ -278,6 +278,7 @@ SYM_CODE_START_NOALIGN(ret_from_exception)

				#else

					sret

				#endif

				SYM_INNER_LABEL(ret_from_exception_end, SYM_L_GLOBAL)

				SYM_CODE_END(ret_from_exception)

				ASM_NOKPROBE(ret_from_exception)

									
										12

arch/riscv/kernel/jump_label.c
									
												View File
												
				@@ -36,9 +36,15 @@ bool arch_jump_label_transform_queue(struct jump_entry *entry,

						insn = RISCV_INSN_NOP;

					}

					mutex_lock(&text_mutex);

					patch_insn_write(addr, &insn, sizeof(insn));

					mutex_unlock(&text_mutex);

					if (early_boot_irqs_disabled) {

						riscv_patch_in_stop_machine = 1;

						patch_insn_write(addr, &insn, sizeof(insn));

						riscv_patch_in_stop_machine = 0;

					} else {

						mutex_lock(&text_mutex);

						patch_insn_write(addr, &insn, sizeof(insn));

						mutex_unlock(&text_mutex);

					}

					return true;

				}

									
										18

arch/riscv/kernel/module.c
									
												View File
												
				@@ -23,7 +23,7 @@ struct used_bucket {

				struct relocation_head {

					struct hlist_node node;

					struct list_head *rel_entry;

					struct list_head rel_entry;

					void *location;

				};

				@@ -634,7 +634,7 @@ process_accumulated_relocations(struct module *me,

							location = rel_head_iter->location;

							list_for_each_entry_safe(rel_entry_iter,

										 rel_entry_iter_tmp,

										 rel_head_iter->rel_entry,

										 &rel_head_iter->rel_entry,

										 head) {

								curr_type = rel_entry_iter->type;

								reloc_handlers[curr_type].reloc_handler(

				@@ -704,16 +704,7 @@ static int add_relocation_to_accumulate(struct module *me, int type,

							return -ENOMEM;

						}

						rel_head->rel_entry =

							kmalloc(sizeof(struct list_head), GFP_KERNEL);

						if (!rel_head->rel_entry) {

							kfree(entry);

							kfree(rel_head);

							return -ENOMEM;

						}

						INIT_LIST_HEAD(rel_head->rel_entry);

						INIT_LIST_HEAD(&rel_head->rel_entry);

						rel_head->location = location;

						INIT_HLIST_NODE(&rel_head->node);

						if (!current_head->first) {

				@@ -722,7 +713,6 @@ static int add_relocation_to_accumulate(struct module *me, int type,

							if (!bucket) {

								kfree(entry);

								kfree(rel_head->rel_entry);

								kfree(rel_head);

								return -ENOMEM;

							}

				@@ -735,7 +725,7 @@ static int add_relocation_to_accumulate(struct module *me, int type,

					}

					/* Add relocation to head of discovered rel_head */

					list_add_tail(&entry->head, rel_head->rel_entry);

					list_add_tail(&entry->head, &rel_head->rel_entry);

					return 0;

				}

									
										2

arch/riscv/kernel/probes/kprobes.c
									
												View File
												
				@@ -30,7 +30,7 @@ static void __kprobes arch_prepare_ss_slot(struct kprobe *p)

					p->ainsn.api.restore = (unsigned long)p->addr + len;

					patch_text_nosync(p->ainsn.api.insn, &p->opcode, len);

					patch_text_nosync(p->ainsn.api.insn + len, &insn, GET_INSN_LENGTH(insn));

					patch_text_nosync((void *)p->ainsn.api.insn + len, &insn, GET_INSN_LENGTH(insn));

				}

				static void __kprobes arch_prepare_simulate(struct kprobe *p)

									
										2

arch/riscv/kernel/setup.c
									
												View File
												
				@@ -227,7 +227,7 @@ static void __init init_resources(void)

				static void __init parse_dtb(void)

				{

					/* Early scan of device tree from init memory */

					if (early_init_dt_scan(dtb_early_va, __pa(dtb_early_va))) {

					if (early_init_dt_scan(dtb_early_va, dtb_early_pa)) {

						const char *name = of_flat_dt_get_machine_name();

						if (name) {

									
										4

arch/riscv/kernel/stacktrace.c
									
												View File
												
				@@ -17,6 +17,7 @@

				#ifdef CONFIG_FRAME_POINTER

				extern asmlinkage void handle_exception(void);

				extern unsigned long ret_from_exception_end;

				static inline int fp_is_valid(unsigned long fp, unsigned long sp)

				{

				@@ -71,7 +72,8 @@ void notrace walk_stackframe(struct task_struct *task, struct pt_regs *regs,

							fp = frame->fp;

							pc = ftrace_graph_ret_addr(current, &graph_idx, frame->ra,

										   &frame->ra);

							if (pc == (unsigned long)handle_exception) {

							if (pc >= (unsigned long)handle_exception &&

							    pc < (unsigned long)&ret_from_exception_end) {

								if (unlikely(!__kernel_text_address(pc) || !fn(arg, pc)))

									break;

									
										6

arch/riscv/kernel/traps.c
									
												View File
												
				@@ -35,7 +35,7 @@

				int show_unhandled_signals = 1;

				static DEFINE_SPINLOCK(die_lock);

				static DEFINE_RAW_SPINLOCK(die_lock);

				static int copy_code(struct pt_regs *regs, u16 *val, const u16 *insns)

				{

				@@ -81,7 +81,7 @@ void die(struct pt_regs *regs, const char *str)

					oops_enter();

					spin_lock_irqsave(&die_lock, flags);

					raw_spin_lock_irqsave(&die_lock, flags);

					console_verbose();

					bust_spinlocks(1);

				@@ -100,7 +100,7 @@ void die(struct pt_regs *regs, const char *str)

					bust_spinlocks(0);

					add_taint(TAINT_DIE, LOCKDEP_NOW_UNRELIABLE);

					spin_unlock_irqrestore(&die_lock, flags);

					raw_spin_unlock_irqrestore(&die_lock, flags);

					oops_exit();

					if (in_interrupt())

									
										2

arch/riscv/kvm/aia.c
									
												View File
												
				@@ -590,7 +590,7 @@ void kvm_riscv_aia_enable(void)

					csr_set(CSR_HIE, BIT(IRQ_S_GEXT));

					/* Enable IRQ filtering for overflow interrupt only if sscofpmf is present */

					if (__riscv_isa_extension_available(NULL, RISCV_ISA_EXT_SSCOFPMF))

						csr_write(CSR_HVIEN, BIT(IRQ_PMU_OVF));

						csr_set(CSR_HVIEN, BIT(IRQ_PMU_OVF));

				}

				void kvm_riscv_aia_disable(void)

									
										24

arch/riscv/mm/init.c
									
												View File
												
				@@ -33,6 +33,7 @@

				#include <asm/pgtable.h>

				#include <asm/sections.h>

				#include <asm/soc.h>

				#include <asm/sparsemem.h>

				#include <asm/tlbflush.h>

				#include "../kernel/head.h"

				@@ -62,6 +63,13 @@ EXPORT_SYMBOL(pgtable_l5_enabled);

				phys_addr_t phys_ram_base __ro_after_init;

				EXPORT_SYMBOL(phys_ram_base);

				#ifdef CONFIG_SPARSEMEM_VMEMMAP

				#define VMEMMAP_ADDR_ALIGN	(1ULL << SECTION_SIZE_BITS)

				unsigned long vmemmap_start_pfn __ro_after_init;

				EXPORT_SYMBOL(vmemmap_start_pfn);

				#endif

				unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)]

											__page_aligned_bss;

				EXPORT_SYMBOL(empty_zero_page);

				@@ -240,8 +248,12 @@ static void __init setup_bootmem(void)

					 * Make sure we align the start of the memory on a PMD boundary so that

					 * at worst, we map the linear mapping with PMD mappings.

					 */

					if (!IS_ENABLED(CONFIG_XIP_KERNEL))

					if (!IS_ENABLED(CONFIG_XIP_KERNEL)) {

						phys_ram_base = memblock_start_of_DRAM() & PMD_MASK;

				#ifdef CONFIG_SPARSEMEM_VMEMMAP

						vmemmap_start_pfn = round_down(phys_ram_base, VMEMMAP_ADDR_ALIGN) >> PAGE_SHIFT;

				#endif

					}

					/*

					 * In 64-bit, any use of __va/__pa before this point is wrong as we

				@@ -1101,6 +1113,9 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)

					kernel_map.xiprom_sz = (uintptr_t)(&_exiprom) - (uintptr_t)(&_xiprom);

					phys_ram_base = CONFIG_PHYS_RAM_BASE;

				#ifdef CONFIG_SPARSEMEM_VMEMMAP

					vmemmap_start_pfn = round_down(phys_ram_base, VMEMMAP_ADDR_ALIGN) >> PAGE_SHIFT;

				#endif

					kernel_map.phys_addr = (uintptr_t)CONFIG_PHYS_RAM_BASE;

					kernel_map.size = (uintptr_t)(&_end) - (uintptr_t)(&_start);

				@@ -1566,7 +1581,7 @@ static void __meminit free_pte_table(pte_t *pte_start, pmd_t *pmd)

					pmd_clear(pmd);

				}

				static void __meminit free_pmd_table(pmd_t *pmd_start, pud_t *pud)

				static void __meminit free_pmd_table(pmd_t *pmd_start, pud_t *pud, bool is_vmemmap)

				{

					struct page *page = pud_page(*pud);

					struct ptdesc *ptdesc = page_ptdesc(page);

				@@ -1579,7 +1594,8 @@ static void __meminit free_pmd_table(pmd_t *pmd_start, pud_t *pud)

							return;

					}

					pagetable_pmd_dtor(ptdesc);

					if (!is_vmemmap)

						pagetable_pmd_dtor(ptdesc);

					if (PageReserved(page))

						free_reserved_page(page);

					else

				@@ -1703,7 +1719,7 @@ static void __meminit remove_pud_mapping(pud_t *pud_base, unsigned long addr, un

						remove_pmd_mapping(pmd_base, addr, next, is_vmemmap, altmap);

						if (pgtable_l4_enabled)

							free_pmd_table(pmd_base, pudp);

							free_pmd_table(pmd_base, pudp, is_vmemmap);

					}

				}

									
										2

arch/s390/boot/startup.c
									
												View File
												
				@@ -234,6 +234,8 @@ static unsigned long get_vmem_size(unsigned long identity_size,

					vsize = round_up(SZ_2G + max_mappable, rte_size) +

						round_up(vmemmap_size, rte_size) +

						FIXMAP_SIZE + MODULES_LEN + KASLR_LEN;

					if (IS_ENABLED(CONFIG_KMSAN))

						vsize += MODULES_LEN * 2;

					return size_add(vsize, vmalloc_size);

				}

									
										6

arch/s390/boot/vmem.c
									
												View File
												
				@@ -306,7 +306,7 @@ static void pgtable_pte_populate(pmd_t *pmd, unsigned long addr, unsigned long e

							pages++;

						}

					}

					if (mode == POPULATE_DIRECT)

					if (mode == POPULATE_IDENTITY)

						update_page_count(PG_DIRECT_MAP_4K, pages);

				}

				@@ -339,7 +339,7 @@ static void pgtable_pmd_populate(pud_t *pud, unsigned long addr, unsigned long e

						}

						pgtable_pte_populate(pmd, addr, next, mode);

					}

					if (mode == POPULATE_DIRECT)

					if (mode == POPULATE_IDENTITY)

						update_page_count(PG_DIRECT_MAP_1M, pages);

				}

				@@ -372,7 +372,7 @@ static void pgtable_pud_populate(p4d_t *p4d, unsigned long addr, unsigned long e

						}

						pgtable_pmd_populate(pud, addr, next, mode);

					}

					if (mode == POPULATE_DIRECT)

					if (mode == POPULATE_IDENTITY)

						update_page_count(PG_DIRECT_MAP_2G, pages);

				}

									
										2

arch/s390/kernel/ipl.c
									
												View File
												
				@@ -270,7 +270,7 @@ static ssize_t sys_##_prefix##_##_name##_store(struct kobject *kobj,	\

					if (len >= sizeof(_value))					\

						return -E2BIG;						\

					len = strscpy(_value, buf, sizeof(_value));			\

					if (len < 0)							\

					if ((ssize_t)len < 0)						\

						return len;						\

					strim(_value);							\

					return len;							\

									
										6

arch/s390/kvm/interrupt.c
									
												View File
												
				@@ -2678,9 +2678,13 @@ static int flic_set_attr(struct kvm_device *dev, struct kvm_device_attr *attr)

						kvm_s390_clear_float_irqs(dev->kvm);

						break;

					case KVM_DEV_FLIC_APF_ENABLE:

						if (kvm_is_ucontrol(dev->kvm))

							return -EINVAL;

						dev->kvm->arch.gmap->pfault_enabled = 1;

						break;

					case KVM_DEV_FLIC_APF_DISABLE_WAIT:

						if (kvm_is_ucontrol(dev->kvm))

							return -EINVAL;

						dev->kvm->arch.gmap->pfault_enabled = 0;

						/*

						 * Make sure no async faults are in transition when

				@@ -2894,6 +2898,8 @@ int kvm_set_routing_entry(struct kvm *kvm,

					switch (ue->type) {

					/* we store the userspace addresses instead of the guest addresses */

					case KVM_IRQ_ROUTING_S390_ADAPTER:

						if (kvm_is_ucontrol(kvm))

							return -EINVAL;

						e->set = set_adapter_int;

						uaddr =  gmap_translate(kvm->arch.gmap, ue->u.adapter.summary_addr);

						if (uaddr == -EFAULT)

									
										2

arch/s390/kvm/vsie.c
									
												View File
												
				@@ -854,7 +854,7 @@ unpin:

				static void unpin_scb(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page,

						      gpa_t gpa)

				{

					hpa_t hpa = (hpa_t) vsie_page->scb_o;

					hpa_t hpa = virt_to_phys(vsie_page->scb_o);

					if (hpa)

						unpin_guest_page(vcpu->kvm, gpa, hpa);

									
										13

arch/x86/events/intel/core.c
									
												View File
												
				@@ -429,6 +429,16 @@ static struct event_constraint intel_lnc_event_constraints[] = {

					EVENT_CONSTRAINT_END

				};

				static struct extra_reg intel_lnc_extra_regs[] __read_mostly = {

					INTEL_UEVENT_EXTRA_REG(0x012a, MSR_OFFCORE_RSP_0, 0xfffffffffffull, RSP_0),

					INTEL_UEVENT_EXTRA_REG(0x012b, MSR_OFFCORE_RSP_1, 0xfffffffffffull, RSP_1),

					INTEL_UEVENT_PEBS_LDLAT_EXTRA_REG(0x01cd),

					INTEL_UEVENT_EXTRA_REG(0x02c6, MSR_PEBS_FRONTEND, 0x9, FE),

					INTEL_UEVENT_EXTRA_REG(0x03c6, MSR_PEBS_FRONTEND, 0x7fff1f, FE),

					INTEL_UEVENT_EXTRA_REG(0x40ad, MSR_PEBS_FRONTEND, 0xf, FE),

					INTEL_UEVENT_EXTRA_REG(0x04c2, MSR_PEBS_FRONTEND, 0x8, FE),

					EVENT_EXTRA_END

				};

				EVENT_ATTR_STR(mem-loads,	mem_ld_nhm,	"event=0x0b,umask=0x10,ldlat=3");

				EVENT_ATTR_STR(mem-loads,	mem_ld_snb,	"event=0xcd,umask=0x1,ldlat=3");

				@@ -6422,7 +6432,7 @@ static __always_inline void intel_pmu_init_lnc(struct pmu *pmu)

					intel_pmu_init_glc(pmu);

					hybrid(pmu, event_constraints) = intel_lnc_event_constraints;

					hybrid(pmu, pebs_constraints) = intel_lnc_pebs_event_constraints;

					hybrid(pmu, extra_regs) = intel_rwc_extra_regs;

					hybrid(pmu, extra_regs) = intel_lnc_extra_regs;

				}

				static __always_inline void intel_pmu_init_skt(struct pmu *pmu)

				@@ -7135,6 +7145,7 @@ __init int intel_pmu_init(void)

					case INTEL_METEORLAKE:

					case INTEL_METEORLAKE_L:

					case INTEL_ARROWLAKE_U:

						intel_pmu_init_hybrid(hybrid_big_small);

						x86_pmu.pebs_latency_data = cmt_latency_data;

									
										3

arch/x86/events/intel/ds.c
									
												View File
												
				@@ -1489,7 +1489,7 @@ void intel_pmu_pebs_enable(struct perf_event *event)

							 * hence we need to drain when changing said

							 * size.

							 */

							intel_pmu_drain_large_pebs(cpuc);

							intel_pmu_drain_pebs_buffer();

							adaptive_pebs_record_size_update();

							wrmsrl(MSR_PEBS_DATA_CFG, pebs_data_cfg);

							cpuc->active_pebs_data_cfg = pebs_data_cfg;

				@@ -2517,6 +2517,7 @@ void __init intel_ds_init(void)

							x86_pmu.large_pebs_flags |= PERF_SAMPLE_TIME;

							break;

						case 6:

						case 5:

							x86_pmu.pebs_ept = 1;

							fallthrough;

									
										1

arch/x86/events/intel/uncore.c
									
												View File
												
				@@ -1910,6 +1910,7 @@ static const struct x86_cpu_id intel_uncore_match[] __initconst = {

					X86_MATCH_VFM(INTEL_ATOM_GRACEMONT,	&adl_uncore_init),

					X86_MATCH_VFM(INTEL_ATOM_CRESTMONT_X,	&gnr_uncore_init),

					X86_MATCH_VFM(INTEL_ATOM_CRESTMONT,	&gnr_uncore_init),

					X86_MATCH_VFM(INTEL_ATOM_DARKMONT_X,	&gnr_uncore_init),

					{},

				};

				MODULE_DEVICE_TABLE(x86cpu, intel_uncore_match);

									
										1

arch/x86/include/asm/cpufeatures.h
									
												View File
												
				@@ -452,6 +452,7 @@

				#define X86_FEATURE_SME_COHERENT	(19*32+10) /* AMD hardware-enforced cache coherency */

				#define X86_FEATURE_DEBUG_SWAP		(19*32+14) /* "debug_swap" AMD SEV-ES full debug state swap support */

				#define X86_FEATURE_SVSM		(19*32+28) /* "svsm" SVSM present */

				#define X86_FEATURE_HV_INUSE_WR_ALLOWED	(19*32+30) /* Allow Write to in-use hypervisor-owned pages */

				/* AMD-defined Extended Feature 2 EAX, CPUID level 0x80000021 (EAX), word 20 */

				#define X86_FEATURE_NO_NESTED_DATA_BP	(20*32+ 0) /* No Nested Data Breakpoints */

									
										2

arch/x86/include/asm/processor.h
									
												View File
												
				@@ -230,6 +230,8 @@ static inline unsigned long long l1tf_pfn_limit(void)

					return BIT_ULL(boot_cpu_data.x86_cache_bits - 1 - PAGE_SHIFT);

				}

				void init_cpu_devs(void);

				void get_cpu_vendor(struct cpuinfo_x86 *c);

				extern void early_cpu_init(void);

				extern void identify_secondary_cpu(struct cpuinfo_x86 *);

				extern void print_cpu_info(struct cpuinfo_x86 *);

									
										15

arch/x86/include/asm/static_call.h
									
												View File
												
				@@ -65,4 +65,19 @@

				extern bool __static_call_fixup(void *tramp, u8 op, void *dest);

				extern void __static_call_update_early(void *tramp, void *func);

				#define static_call_update_early(name, _func)				\

				({									\

					typeof(&STATIC_CALL_TRAMP(name)) __F = (_func);			\

					if (static_call_initialized) {					\

						__static_call_update(&STATIC_CALL_KEY(name),		\

								     STATIC_CALL_TRAMP_ADDR(name), __F);\

					} else {							\

						WRITE_ONCE(STATIC_CALL_KEY(name).func, _func);		\

						__static_call_update_early(STATIC_CALL_TRAMP_ADDR(name),\

									   __F);			\

					}								\

				})

				#endif /* _ASM_STATIC_CALL_H */

									
										6

arch/x86/include/asm/sync_core.h
									
												View File
												
				@@ -8,7 +8,7 @@

				#include <asm/special_insns.h>

				#ifdef CONFIG_X86_32

				static inline void iret_to_self(void)

				static __always_inline void iret_to_self(void)

				{

					asm volatile (

						"pushfl\n\t"

				@@ -19,7 +19,7 @@ static inline void iret_to_self(void)

						: ASM_CALL_CONSTRAINT : : "memory");

				}

				#else

				static inline void iret_to_self(void)

				static __always_inline void iret_to_self(void)

				{

					unsigned int tmp;

				@@ -55,7 +55,7 @@ static inline void iret_to_self(void)

				 * Like all of Linux's memory ordering operations, this is a

				 * compiler barrier as well.

				 */

				static inline void sync_core(void)

				static __always_inline void sync_core(void)

				{

					/*

					 * The SERIALIZE instruction is the most straightforward way to

									
										36

arch/x86/include/asm/xen/hypercall.h
									
												View File
												
				@@ -39,9 +39,11 @@

				#include <linux/string.h>

				#include <linux/types.h>

				#include <linux/pgtable.h>

				#include <linux/instrumentation.h>

				#include <trace/events/xen.h>

				#include <asm/alternative.h>

				#include <asm/page.h>

				#include <asm/smap.h>

				#include <asm/nospec-branch.h>

				@@ -86,11 +88,20 @@ struct xen_dm_op_buf;

				 * there aren't more than 5 arguments...)

				 */

				extern struct { char _entry[32]; } hypercall_page[];

				void xen_hypercall_func(void);

				DECLARE_STATIC_CALL(xen_hypercall, xen_hypercall_func);

				#define __HYPERCALL		"call hypercall_page+%c[offset]"

				#define __HYPERCALL_ENTRY(x)						\

					[offset] "i" (__HYPERVISOR_##x * sizeof(hypercall_page[0]))

				#ifdef MODULE

				#define __ADDRESSABLE_xen_hypercall

				#else

				#define __ADDRESSABLE_xen_hypercall __ADDRESSABLE_ASM_STR(__SCK__xen_hypercall)

				#endif

				#define __HYPERCALL					\

					__ADDRESSABLE_xen_hypercall			\

					"call __SCT__xen_hypercall"

				#define __HYPERCALL_ENTRY(x)	"a" (x)

				#ifdef CONFIG_X86_32

				#define __HYPERCALL_RETREG	"eax"

				@@ -148,7 +159,7 @@ extern struct { char _entry[32]; } hypercall_page[];

					__HYPERCALL_0ARG();						\

					asm volatile (__HYPERCALL					\

						      : __HYPERCALL_0PARAM				\

						      : __HYPERCALL_ENTRY(name)				\

						      : __HYPERCALL_ENTRY(__HYPERVISOR_ ## name)	\

						      : __HYPERCALL_CLOBBER0);				\

					(type)__res;							\

				})

				@@ -159,7 +170,7 @@ extern struct { char _entry[32]; } hypercall_page[];

					__HYPERCALL_1ARG(a1);						\

					asm volatile (__HYPERCALL					\

						      : __HYPERCALL_1PARAM				\

						      : __HYPERCALL_ENTRY(name)				\

						      : __HYPERCALL_ENTRY(__HYPERVISOR_ ## name)	\

						      : __HYPERCALL_CLOBBER1);				\

					(type)__res;							\

				})

				@@ -170,7 +181,7 @@ extern struct { char _entry[32]; } hypercall_page[];

					__HYPERCALL_2ARG(a1, a2);					\

					asm volatile (__HYPERCALL					\

						      : __HYPERCALL_2PARAM				\

						      : __HYPERCALL_ENTRY(name)				\

						      : __HYPERCALL_ENTRY(__HYPERVISOR_ ## name)	\

						      : __HYPERCALL_CLOBBER2);				\

					(type)__res;							\

				})

				@@ -181,7 +192,7 @@ extern struct { char _entry[32]; } hypercall_page[];

					__HYPERCALL_3ARG(a1, a2, a3);					\

					asm volatile (__HYPERCALL					\

						      : __HYPERCALL_3PARAM				\

						      : __HYPERCALL_ENTRY(name)				\

						      : __HYPERCALL_ENTRY(__HYPERVISOR_ ## name)	\

						      : __HYPERCALL_CLOBBER3);				\

					(type)__res;							\

				})

				@@ -192,7 +203,7 @@ extern struct { char _entry[32]; } hypercall_page[];

					__HYPERCALL_4ARG(a1, a2, a3, a4);				\

					asm volatile (__HYPERCALL					\

						      : __HYPERCALL_4PARAM				\

						      : __HYPERCALL_ENTRY(name)				\

						      : __HYPERCALL_ENTRY(__HYPERVISOR_ ## name)	\

						      : __HYPERCALL_CLOBBER4);				\

					(type)__res;							\

				})

				@@ -206,12 +217,9 @@ xen_single_call(unsigned int call,

					__HYPERCALL_DECLS;

					__HYPERCALL_5ARG(a1, a2, a3, a4, a5);

					if (call >= PAGE_SIZE / sizeof(hypercall_page[0]))

						return -EINVAL;

					asm volatile(CALL_NOSPEC

					asm volatile(__HYPERCALL

						     : __HYPERCALL_5PARAM

						     : [thunk_target] "a" (&hypercall_page[call])

						     : __HYPERCALL_ENTRY(call)

						     : __HYPERCALL_CLOBBER5);

					return (long)__res;

Compare commits

1392 Commits v6.13-rc2 ... v6.13-rc7

5 .mailmap Unescape Escape View File

12 CREDITS Unescape Escape View File

5 Documentation/admin-guide/kernel-parameters.txt Unescape Escape View File

10 Documentation/admin-guide/laptops/thinkpad-acpi.rst Unescape Escape View File

2 Documentation/admin-guide/mm/transhuge.rst Unescape Escape View File

4 Documentation/admin-guide/pm/amd-pstate.rst Unescape Escape View File

10 Documentation/devicetree/bindings/crypto/fsl,sec-v4.0.yaml Unescape Escape View File

2 Documentation/devicetree/bindings/display/bridge/adi,adv7533.yaml Unescape Escape View File

19 Documentation/devicetree/bindings/display/mediatek/mediatek,dp.yaml Unescape Escape View File

1 Documentation/devicetree/bindings/iio/st,st-sensors.yaml Unescape Escape View File

2 Documentation/devicetree/bindings/mtd/partitions/fixed-partitions.yaml Unescape Escape View File

2 Documentation/devicetree/bindings/net/pse-pd/pse-controller.yaml Unescape Escape View File

7 Documentation/devicetree/bindings/phy/fsl,imx8mq-usb-phy.yaml Unescape Escape View File

27 Documentation/devicetree/bindings/regulator/qcom,qca6390-pmu.yaml Unescape Escape View File

2 Documentation/devicetree/bindings/soc/fsl/fsl,qman-portal.yaml Unescape Escape View File

2 Documentation/devicetree/bindings/sound/realtek,rt5645.yaml Unescape Escape View File

850 Documentation/mm/process_addrs.rst Unescape Escape View File

60 Documentation/netlink/specs/mptcp_pm.yaml Unescape Escape View File

6 Documentation/networking/ip-sysctl.rst Unescape Escape View File

4 Documentation/power/runtime_pm.rst Unescape Escape View File

3 Documentation/virt/kvm/api.rst Unescape Escape View File

4 Documentation/virt/kvm/devices/s390_flic.rst Unescape Escape View File

43 MAINTAINERS Unescape Escape View File

2 Makefile Unescape Escape View File

5 arch/arc/Kconfig Unescape Escape View File

2 arch/arc/Makefile Unescape Escape View File

2 arch/arc/boot/dts/axc001.dtsi Unescape Escape View File

2 arch/arc/boot/dts/axc003.dtsi Unescape Escape View File

2 arch/arc/boot/dts/axc003_idu.dtsi Unescape Escape View File

12 arch/arc/boot/dts/axs10x_mb.dtsi Unescape Escape View File

2 arch/arc/boot/dts/hsdk.dts Unescape Escape View File

2 arch/arc/include/asm/arcregs.h Unescape Escape View File

8 arch/arc/include/asm/cachetype.h Normal file Unescape Escape View File

2 arch/arc/include/asm/cmpxchg.h Unescape Escape View File

2 arch/arc/include/asm/mmu-arcv2.h Unescape Escape View File

2 arch/arc/net/bpf_jit_arcv2.c Unescape Escape View File

2 arch/arm/boot/dts/nxp/imx/imxrt1050.dtsi Unescape Escape View File

1 arch/arm/configs/imx_v6_v7_defconfig Unescape Escape View File

1 arch/arm/mach-imx/Kconfig Unescape Escape View File

2 arch/arm64/boot/dts/arm/fvp-base-revc.dts Unescape Escape View File

8 arch/arm64/boot/dts/broadcom/bcm2712.dtsi Unescape Escape View File

2 arch/arm64/boot/dts/freescale/imx8-ss-audio.dtsi Unescape Escape View File

2 arch/arm64/boot/dts/freescale/imx8qm-ss-audio.dtsi Unescape Escape View File

2 arch/arm64/boot/dts/freescale/imx95.dtsi Unescape Escape View File

5 arch/arm64/boot/dts/qcom/sa8775p.dtsi Unescape Escape View File

8 arch/arm64/boot/dts/qcom/x1e78100-lenovo-thinkpad-t14s.dts Unescape Escape View File

12 arch/arm64/boot/dts/qcom/x1e80100-crd.dts Unescape Escape View File

8 arch/arm64/boot/dts/qcom/x1e80100.dtsi Unescape Escape View File

1 arch/arm64/boot/dts/rockchip/rk3328.dtsi Unescape Escape View File

1 arch/arm64/boot/dts/rockchip/rk3568.dtsi Unescape Escape View File

2 arch/arm64/boot/dts/rockchip/rk356x-base.dtsi Unescape Escape View File

2 arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts Unescape Escape View File

1 arch/arm64/boot/dts/rockchip/rk3588s-nanopi-r6.dtsi Unescape Escape View File

4 arch/arm64/include/asm/el2_setup.h Unescape Escape View File

4 arch/arm64/kernel/hyp-stub.S Unescape Escape View File

83 arch/arm64/kernel/signal.c Unescape Escape View File

32 arch/arm64/kernel/stacktrace.c Unescape Escape View File

11 arch/arm64/kvm/at.c Unescape Escape View File

3 arch/arm64/kvm/hyp/nvhe/mem_protect.c Unescape Escape View File

4 arch/arm64/kvm/hyp/nvhe/pkvm.c Unescape Escape View File

91 arch/arm64/kvm/pmu-emul.c Unescape Escape View File

35 arch/arm64/kvm/sys_regs.c Unescape Escape View File

12 arch/arm64/kvm/vgic/vgic-its.c Unescape Escape View File

6 arch/hexagon/Makefile Unescape Escape View File

10 arch/nios2/kernel/cpuinfo.c Unescape Escape View File

2 arch/openrisc/kernel/entry.S Unescape Escape View File

32 arch/openrisc/kernel/head.S Unescape Escape View File

3 arch/openrisc/kernel/vmlinux.lds.S Unescape Escape View File

1 arch/powerpc/configs/pmac32_defconfig Unescape Escape View File

1 arch/powerpc/configs/ppc6xx_defconfig Unescape Escape View File

2 arch/powerpc/kvm/e500.h Unescape Escape View File

199 arch/powerpc/kvm/e500_mmu_host.c Unescape Escape View File

36 arch/powerpc/platforms/book3s/vas-api.c Unescape Escape View File

4 arch/riscv/include/asm/kfence.h Unescape Escape View File

1 arch/riscv/include/asm/page.h Unescape Escape View File

2 arch/riscv/include/asm/pgtable.h Unescape Escape View File

1 arch/riscv/include/asm/sbi.h Unescape Escape View File

5 arch/riscv/include/asm/spinlock.h Unescape Escape View File

1392 Commits

v6.13-rc2 ... v6.13-rc7

5

.mailmap

View File

12

CREDITS

View File

5

Documentation/admin-guide/kernel-parameters.txt

View File

10

Documentation/admin-guide/laptops/thinkpad-acpi.rst

View File

2

Documentation/admin-guide/mm/transhuge.rst

View File

4

Documentation/admin-guide/pm/amd-pstate.rst

View File

10

Documentation/devicetree/bindings/crypto/fsl,sec-v4.0.yaml

View File

2

Documentation/devicetree/bindings/display/bridge/adi,adv7533.yaml

View File

19

Documentation/devicetree/bindings/display/mediatek/mediatek,dp.yaml

View File

1

Documentation/devicetree/bindings/iio/st,st-sensors.yaml

View File

2

Documentation/devicetree/bindings/mtd/partitions/fixed-partitions.yaml

View File

2

Documentation/devicetree/bindings/net/pse-pd/pse-controller.yaml

View File

7

Documentation/devicetree/bindings/phy/fsl,imx8mq-usb-phy.yaml

View File

27

Documentation/devicetree/bindings/regulator/qcom,qca6390-pmu.yaml

View File

2

Documentation/devicetree/bindings/soc/fsl/fsl,qman-portal.yaml

View File

2

Documentation/devicetree/bindings/sound/realtek,rt5645.yaml

View File

850

Documentation/mm/process_addrs.rst

View File

60

Documentation/netlink/specs/mptcp_pm.yaml

View File

6

Documentation/networking/ip-sysctl.rst

View File

4

Documentation/power/runtime_pm.rst

View File

3

Documentation/virt/kvm/api.rst

View File

4

Documentation/virt/kvm/devices/s390_flic.rst

View File

43

MAINTAINERS

View File

2

Makefile

View File

5

arch/arc/Kconfig

View File

2

arch/arc/Makefile

View File

2

arch/arc/boot/dts/axc001.dtsi

View File

2

arch/arc/boot/dts/axc003.dtsi

View File

2

arch/arc/boot/dts/axc003_idu.dtsi

View File

12

arch/arc/boot/dts/axs10x_mb.dtsi

View File

2

arch/arc/boot/dts/hsdk.dts

View File

2

arch/arc/include/asm/arcregs.h

View File

8

arch/arc/include/asm/cachetype.h Normal file

View File

2

arch/arc/include/asm/cmpxchg.h

View File

2

arch/arc/include/asm/mmu-arcv2.h

View File

2

arch/arc/net/bpf_jit_arcv2.c

View File

2

arch/arm/boot/dts/nxp/imx/imxrt1050.dtsi

View File

1

arch/arm/configs/imx_v6_v7_defconfig

View File

1

arch/arm/mach-imx/Kconfig

View File

2

arch/arm64/boot/dts/arm/fvp-base-revc.dts

View File

8

arch/arm64/boot/dts/broadcom/bcm2712.dtsi

View File

2

arch/arm64/boot/dts/freescale/imx8-ss-audio.dtsi

View File

2

arch/arm64/boot/dts/freescale/imx8qm-ss-audio.dtsi

View File

2

arch/arm64/boot/dts/freescale/imx95.dtsi

View File

5

arch/arm64/boot/dts/qcom/sa8775p.dtsi

View File

8

arch/arm64/boot/dts/qcom/x1e78100-lenovo-thinkpad-t14s.dts

View File

12

arch/arm64/boot/dts/qcom/x1e80100-crd.dts

View File

8

arch/arm64/boot/dts/qcom/x1e80100.dtsi

View File

1

arch/arm64/boot/dts/rockchip/rk3328.dtsi

View File

1

arch/arm64/boot/dts/rockchip/rk3568.dtsi

View File

2

arch/arm64/boot/dts/rockchip/rk356x-base.dtsi

View File

2

arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts

View File

1

arch/arm64/boot/dts/rockchip/rk3588s-nanopi-r6.dtsi

View File

4

arch/arm64/include/asm/el2_setup.h

View File

4

arch/arm64/kernel/hyp-stub.S

View File

83

arch/arm64/kernel/signal.c

View File

32

arch/arm64/kernel/stacktrace.c

View File

11

arch/arm64/kvm/at.c

View File

3

arch/arm64/kvm/hyp/nvhe/mem_protect.c

View File

4

arch/arm64/kvm/hyp/nvhe/pkvm.c

View File

91

arch/arm64/kvm/pmu-emul.c

View File

35

arch/arm64/kvm/sys_regs.c

View File

12

arch/arm64/kvm/vgic/vgic-its.c

View File

6

arch/hexagon/Makefile

View File

10

arch/nios2/kernel/cpuinfo.c

View File

2

arch/openrisc/kernel/entry.S

View File

32

arch/openrisc/kernel/head.S

View File

3

arch/openrisc/kernel/vmlinux.lds.S

View File

1

arch/powerpc/configs/pmac32_defconfig

View File

1

arch/powerpc/configs/ppc6xx_defconfig

View File

2

arch/powerpc/kvm/e500.h

View File

199

arch/powerpc/kvm/e500_mmu_host.c

View File

36

arch/powerpc/platforms/book3s/vas-api.c

View File

4

arch/riscv/include/asm/kfence.h

View File

1

arch/riscv/include/asm/page.h

View File

2

arch/riscv/include/asm/pgtable.h

View File

1

arch/riscv/include/asm/sbi.h

View File

5

arch/riscv/include/asm/spinlock.h

View File

21

arch/riscv/kernel/entry.S

View File