torvalds-linux

mirror of https://kernel.googlesource.com/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-01-22 00:09:11 +03:00

Author	SHA1	Message	Date
Linus Torvalds	62de6e1685	Scheduler enhancements for v6.14: - Fair scheduler (SCHED_FAIR) enhancements: - Behavioral improvements: - Untangle NEXT_BUDDY and pick_next_task() (Peter Zijlstra) - Delayed-dequeue enhancements & fixes: (Vincent Guittot) - Rename h_nr_running into h_nr_queued - Add new cfs_rq.h_nr_runnable - Use the new cfs_rq.h_nr_runnable - Removed unsued cfs_rq.h_nr_delayed - Rename cfs_rq.idle_h_nr_running into h_nr_idle - Remove unused cfs_rq.idle_nr_running - Rename cfs_rq.nr_running into nr_queued - Do not try to migrate delayed dequeue task - Fix variable declaration position - Encapsulate set custom slice in a __setparam_fair() function - Fixes: - Fix race between yield_to() and try_to_wake_up() (Tianchen Ding) - Fix CPU bandwidth limit bypass during CPU hotplug (Vishal Chourasia) - Cleanups: - Clean up in migrate_degrades_locality() to improve readability (Peter Zijlstra) - Mark m_vruntime() with __maybe_unused (Andy Shevchenko) - Update comments after sched_tick() rename (Sebastian Andrzej Siewior) - Remove CONFIG_CFS_BANDWIDTH=n definition of cfs_bandwidth_used() (Valentin Schneider) - Deadline scheduler (SCHED_DL) enhancements: - Restore dl_server bandwidth on non-destructive root domain changes (Juri Lelli) - Correctly account for allocated bandwidth during hotplug (Juri Lelli) - Check bandwidth overflow earlier for hotplug (Juri Lelli) - Clean up goto label in pick_earliest_pushable_dl_task() (John Stultz) - Consolidate timer cancellation (Wander Lairson Costa) - Load-balancer enhancements: - Improve performance by prioritizing migrating eligible tasks in sched_balance_rq() (Hao Jia) - Do not compute NUMA Balancing stats unnecessarily during load-balancing (K Prateek Nayak) - Do not compute overloaded status unnecessarily during load-balancing (K Prateek Nayak) - Generic scheduling code enhancements: - Use READ_ONCE() in task_on_rq_queued(), to consistently use the WRITE_ONCE() updated ->on_rq field (Harshit Agarwal) - Isolated CPUs support enhancements: (Waiman Long) - Make "isolcpus=nohz" equivalent to "nohz_full" - Consolidate housekeeping cpumasks that are always identical - Remove HK_TYPE_SCHED - Unify HK_TYPE_{TIMER\|TICK\|MISC} to HK_TYPE_KERNEL_NOISE - RSEQ enhancements: - Validate read-only fields under DEBUG_RSEQ config (Mathieu Desnoyers) - PSI enhancements: - Fix race when task wakes up before psi_sched_switch() adjusts flags (Chengming Zhou) - IRQ time accounting performance enhancements: (Yafang Shao) - Define sched_clock_irqtime as static key - Don't account irq time if sched_clock_irqtime is disabled - Virtual machine scheduling enhancements: - Don't try to catch up excess steal time (Suleiman Souhlal) - Heterogenous x86 CPU scheduling enhancements: (K Prateek Nayak) - Convert "sysctl_sched_itmt_enabled" to boolean - Use guard() for itmt_update_mutex - Move the "sched_itmt_enabled" sysctl to debugfs - Remove x86_smt_flags and use cpu_smt_flags directly - Use x86_sched_itmt_flags for PKG domain unconditionally - Debugging code & instrumentation enhancements: - Change need_resched warnings to pr_err() (David Rientjes) - Print domain name in /proc/schedstat (K Prateek Nayak) - Fix value reported by hot tasks pulled in /proc/schedstat (Peter Zijlstra) - Report the different kinds of imbalances in /proc/schedstat (Swapnil Sapkal) - Move sched domain name out of CONFIG_SCHED_DEBUG (Swapnil Sapkal) - Update Schedstat version to 17 (Swapnil Sapkal) Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmePSRcRHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1hrdBAAjYiLl5Q8SHM0xnl+kbvuUkCTgEB/gSgA mfrZtHRUgRZuA89NZ9NljlCkQSlsLTOjnpNuaeFzs529GMg9iemc99dbnz3BP5F3 V5qpYvWe7yIkJ3hd0TOGLmYEPMNQaAW57YBOrxcPjWNLJ4cr9iMdccVA1OQtcmqD ZUh3nibv81QI8HDmT2G+figxEIqH3yBV1+SmEIxbrdkQpIJ5702Ng6+0KQK5TShN xwjFELWZUl2TfkoCc4nkIpkImV6cI1DvXSw1xK6gbb1xEVOrsmFW3TYFw4trKHBu 2RBG4wtmzNjh+12GmSdIBJHogPNcay+JIJW9EG/unT7jirqzkkeP1X2eJEbh+X1L CMa7GsD9Vy72jCzeJDMuiy7bKfG/MiKUtDXrAZQDo2atbw7H88QOzMuTE5a5WSV+ tRxXGI/dgFVOk+JQUfctfJbYeXjmG8GAflawvXtGDAfDZsja6M+65fH8p0AOgW1E HHmXUzAe2E2xQBiSok/DYHPQeCDBAjoJvU93YhGiXv8UScb2UaD4BAfzfmc8P+Zs Eox6444ah5U0jiXmZ3HU707n1zO+Ql4qKoyyMJzSyP+oYHE/Do7NYTElw2QovVdN FX/9Uae8T4ttA/5lFe7FNoXgKvSxXDKYyKLZcysjVrWJF866Ui/TWtmxA6w8Osn7 sfucuLawLPM= =5ZNW -----END PGP SIGNATURE----- Merge tag 'sched-core-2025-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "Fair scheduler (SCHED_FAIR) enhancements: - Behavioral improvements: - Untangle NEXT_BUDDY and pick_next_task() (Peter Zijlstra) - Delayed-dequeue enhancements & fixes: (Vincent Guittot) - Rename h_nr_running into h_nr_queued - Add new cfs_rq.h_nr_runnable - Use the new cfs_rq.h_nr_runnable - Removed unsued cfs_rq.h_nr_delayed - Rename cfs_rq.idle_h_nr_running into h_nr_idle - Remove unused cfs_rq.idle_nr_running - Rename cfs_rq.nr_running into nr_queued - Do not try to migrate delayed dequeue task - Fix variable declaration position - Encapsulate set custom slice in a __setparam_fair() function - Fixes: - Fix race between yield_to() and try_to_wake_up() (Tianchen Ding) - Fix CPU bandwidth limit bypass during CPU hotplug (Vishal Chourasia) - Cleanups: - Clean up in migrate_degrades_locality() to improve readability (Peter Zijlstra) - Mark m_vruntime() with __maybe_unused (Andy Shevchenko) - Update comments after sched_tick() rename (Sebastian Andrzej Siewior) - Remove CONFIG_CFS_BANDWIDTH=n definition of cfs_bandwidth_used() (Valentin Schneider) Deadline scheduler (SCHED_DL) enhancements: - Restore dl_server bandwidth on non-destructive root domain changes (Juri Lelli) - Correctly account for allocated bandwidth during hotplug (Juri Lelli) - Check bandwidth overflow earlier for hotplug (Juri Lelli) - Clean up goto label in pick_earliest_pushable_dl_task() (John Stultz) - Consolidate timer cancellation (Wander Lairson Costa) Load-balancer enhancements: - Improve performance by prioritizing migrating eligible tasks in sched_balance_rq() (Hao Jia) - Do not compute NUMA Balancing stats unnecessarily during load-balancing (K Prateek Nayak) - Do not compute overloaded status unnecessarily during load-balancing (K Prateek Nayak) Generic scheduling code enhancements: - Use READ_ONCE() in task_on_rq_queued(), to consistently use the WRITE_ONCE() updated ->on_rq field (Harshit Agarwal) Isolated CPUs support enhancements: (Waiman Long) - Make "isolcpus=nohz" equivalent to "nohz_full" - Consolidate housekeeping cpumasks that are always identical - Remove HK_TYPE_SCHED - Unify HK_TYPE_{TIMER\|TICK\|MISC} to HK_TYPE_KERNEL_NOISE RSEQ enhancements: - Validate read-only fields under DEBUG_RSEQ config (Mathieu Desnoyers) PSI enhancements: - Fix race when task wakes up before psi_sched_switch() adjusts flags (Chengming Zhou) IRQ time accounting performance enhancements: (Yafang Shao) - Define sched_clock_irqtime as static key - Don't account irq time if sched_clock_irqtime is disabled Virtual machine scheduling enhancements: - Don't try to catch up excess steal time (Suleiman Souhlal) Heterogenous x86 CPU scheduling enhancements: (K Prateek Nayak) - Convert "sysctl_sched_itmt_enabled" to boolean - Use guard() for itmt_update_mutex - Move the "sched_itmt_enabled" sysctl to debugfs - Remove x86_smt_flags and use cpu_smt_flags directly - Use x86_sched_itmt_flags for PKG domain unconditionally Debugging code & instrumentation enhancements: - Change need_resched warnings to pr_err() (David Rientjes) - Print domain name in /proc/schedstat (K Prateek Nayak) - Fix value reported by hot tasks pulled in /proc/schedstat (Peter Zijlstra) - Report the different kinds of imbalances in /proc/schedstat (Swapnil Sapkal) - Move sched domain name out of CONFIG_SCHED_DEBUG (Swapnil Sapkal) - Update Schedstat version to 17 (Swapnil Sapkal)" * tag 'sched-core-2025-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (48 commits) rseq: Fix rseq unregistration regression psi: Fix race when task wakes up before psi_sched_switch() adjusts flags sched, psi: Don't account irq time if sched_clock_irqtime is disabled sched: Don't account irq time if sched_clock_irqtime is disabled sched: Define sched_clock_irqtime as static key sched/fair: Do not compute overloaded status unnecessarily during lb sched/fair: Do not compute NUMA Balancing stats unnecessarily during lb x86/topology: Use x86_sched_itmt_flags for PKG domain unconditionally x86/topology: Remove x86_smt_flags and use cpu_smt_flags directly x86/itmt: Move the "sched_itmt_enabled" sysctl to debugfs x86/itmt: Use guard() for itmt_update_mutex x86/itmt: Convert "sysctl_sched_itmt_enabled" to boolean sched/core: Prioritize migrating eligible tasks in sched_balance_rq() sched/debug: Change need_resched warnings to pr_err sched/fair: Encapsulate set custom slice in a __setparam_fair() function sched: Fix race between yield_to() and try_to_wake_up() docs: Update Schedstat version to 17 sched/stats: Print domain name in /proc/schedstat sched: Move sched domain name out of CONFIG_SCHED_DEBUG sched: Report the different kinds of imbalances in /proc/schedstat ...	2025-01-21 11:32:36 -08:00
Linus Torvalds	858df1de21	Miscellaneous x86 cleanups and typo fixes, and also the removal of the "disablelapic" boot parameter. Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmePTD8RHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1jf5g//Wo1WKUXukRrBANr2nIlx9B7xJliRmUxv mJ0VKo49YPl6C34fjSHhBs3+nPbYD+CyWVKAz5PqkfkFRGBgpQi26EnyKaIhLVFW HWhW5vQm/FJfzBIrfFg7g/H1PK+rEYa4mv8JF9vhwp7BOfuqx4ABGKWQnrvOGg2B VivE5k7/kxWRPTg45Kgb1iwlS2gcfWCRi9qdCzdJgY/4XYE6k6hKeV0PgTT3Vojf pZKsgZRq8tzMaX75obtyyrX3TWj0nkRec0XbgyXBFvlFh/l3e0RswxzGGAjrC1XP R+qmscdCkczUwRGc1mGj9MoCqMRRffU6/hTNsjqu8o7Q2gzZzXWHcUc+X7UwOeKZ 2guxOj4iagdn7+mIso6uAjY+OOdFVw7/C8ysbCmwo3MiaDsfaK2NkdBoT2xDWuIw NP/45RMpTIsgL0wG6upzXXApKgYxfWhNSq+oHDF4/TjWY4i779hjMghvtX1BI7yb LXIh2SsRcnmEPl42UGaz6xmdmkulWZPPxI5rghixU48Eazkngfp7ZTHYpm5NFoRP Qc3JNcKo7rGmkoo/sA7uwawjnaTz/H77SDNjfAufzjVAKidvUqW6xaK/8JM1fq0n du+9sQN5MrAqdKx5Lu624s/7ektwkDeUdQFGazqS9y0GBT25T9Rw+LQDuec7BG3p v8sok4IaPA0= =Hzj3 -----END PGP SIGNATURE----- Merge tag 'x86-cleanups-2025-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Ingo Molnar: "Miscellaneous x86 cleanups and typo fixes, and also the removal of the 'disablelapic' boot parameter" * tag 'x86-cleanups-2025-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/ioapic: Remove a stray tab in the IO-APIC type string x86/cpufeatures: Remove "AMD" from the comments to the AMD-specific leaf Documentation/kernel-parameters: Fix a typo in kvm.enable_virt_at_load text x86/cpu: Fix typo in x86_match_cpu()'s doc x86/apic: Remove "disablelapic" cmdline option Documentation: Merge x86-specific boot options doc into kernel-parameters.txt x86/ioremap: Remove unused size parameter in remapping functions x86/ioremap: Simplify setup_data mapping variants x86/boot/compressed: Remove unused header includes from kaslr.c	2025-01-21 11:15:29 -08:00
Linus Torvalds	6c4aa896eb	Performance events changes for v6.14: - Seqlock optimizations that arose in a perf context and were merged into the perf tree: - seqlock: Add raw_seqcount_try_begin (Suren Baghdasaryan) - mm: Convert mm_lock_seq to a proper seqcount ((Suren Baghdasaryan) - mm: Introduce mmap_lock_speculate_{try_begin\|retry} (Suren Baghdasaryan) - mm/gup: Use raw_seqcount_try_begin() (Peter Zijlstra) - Core perf enhancements: - Reduce 'struct page' footprint of perf by mapping pages in advance (Lorenzo Stoakes) - Save raw sample data conditionally based on sample type (Yabin Cui) - Reduce sampling overhead by checking sample_type in perf_sample_save_callchain() and perf_sample_save_brstack() (Yabin Cui) - Export perf_exclude_event() (Namhyung Kim) - Uprobes scalability enhancements: (Andrii Nakryiko) - Simplify find_active_uprobe_rcu() VMA checks - Add speculative lockless VMA-to-inode-to-uprobe resolution - Simplify session consumer tracking - Decouple return_instance list traversal and freeing - Ensure return_instance is detached from the list before freeing - Reuse return_instances between multiple uretprobes within task - Guard against kmemdup() failing in dup_return_instance() - AMD core PMU driver enhancements: - Relax privilege filter restriction on AMD IBS (Namhyung Kim) - AMD RAPL energy counters support: (Dhananjay Ugwekar) - Introduce topology_logical_core_id() (K Prateek Nayak) - Remove the unused get_rapl_pmu_cpumask() function - Remove the cpu_to_rapl_pmu() function - Rename rapl_pmu variables - Make rapl_model struct global - Add arguments to the init and cleanup functions - Modify the generic variable names to _pkg - Remove the global variable rapl_msrs - Move the cntr_mask to rapl_pmus struct - Add core energy counter support for AMD CPUs - Intel core PMU driver enhancements: - Support RDPMC 'metrics clear mode' feature (Kan Liang) - Clarify adaptive PEBS processing (Kan Liang) - Factor out functions for PEBS records processing (Kan Liang) - Simplify the PEBS records processing for adaptive PEBS (Kan Liang) - Intel uncore driver enhancements: (Kan Liang) - Convert buggy pmu->func_id use to pmu->registered - Support more units on Granite Rapids Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmeOJdQRHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1i2yQ/+MXl7yfJOgdbwjBpgGGzH4burEO7ppak+ ktzz+YjpNgjODe/xMAJGjjblouuYArCnRolc1UPvPm6M7jSY76wi42Y6c4dRtFoB 2ReSrRqnreLOcrRS9nsTjvWRHfJHqJDVSd9TfHX6ILfzbaizCZOGYk558ZxAKRqu Lw7FOvLEe/Y3tg4z8dDg083jsasalKySP9wIPc0BkSqQTOfusd3KXju/Fux/9wkn hZcUgF4ds+0bH7xtO1/G9ILqGyeq97X1McIR9bAjln5Mxykclen4hSjRaWWHHo9O mzBKmd/blIATisfuuW+QLDQow3M1k3688cz7e9QOeWHHd/dJiMb9RLV90jdND/T/ uLINC5vNemzyWEfnNiYQ31LjhG3SeuDiKWzRp36MbQcCh6EBdRXWLBgtmxq1L/3o ZCaCdtFu5+6epycdyOVZEpWDnjdx4GmLXMZi5WJfZ7fZ/IFjNkjk4OdzI1iRQ+i3 Sbi75ep59ayTUhm5AB7gCJsP3R7EsZsiPHUenQdA2n9Sj6xE+IuhlS/QDQ9g5mdY Ijs0jHeVCGmhYoOD1xWnCZSzlnkEVU3zwfypAK+MC7pgtFMwDy5/Bu1USGxXXDy+ aKsrJRSgHbtZ1gwoHstqkV+DeCTfElCLYkvigzI5Nmyib5Zp4vkwy2ZLWQjaNjm7 mqRI7PugUkU= =c8XB -----END PGP SIGNATURE----- Merge tag 'perf-core-2025-01-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull performance events updates from Ingo Molnar: "Seqlock optimizations that arose in a perf context and were merged into the perf tree: - seqlock: Add raw_seqcount_try_begin (Suren Baghdasaryan) - mm: Convert mm_lock_seq to a proper seqcount (Suren Baghdasaryan) - mm: Introduce mmap_lock_speculate_{try_begin\|retry} (Suren Baghdasaryan) - mm/gup: Use raw_seqcount_try_begin() (Peter Zijlstra) Core perf enhancements: - Reduce 'struct page' footprint of perf by mapping pages in advance (Lorenzo Stoakes) - Save raw sample data conditionally based on sample type (Yabin Cui) - Reduce sampling overhead by checking sample_type in perf_sample_save_callchain() and perf_sample_save_brstack() (Yabin Cui) - Export perf_exclude_event() (Namhyung Kim) Uprobes scalability enhancements: (Andrii Nakryiko) - Simplify find_active_uprobe_rcu() VMA checks - Add speculative lockless VMA-to-inode-to-uprobe resolution - Simplify session consumer tracking - Decouple return_instance list traversal and freeing - Ensure return_instance is detached from the list before freeing - Reuse return_instances between multiple uretprobes within task - Guard against kmemdup() failing in dup_return_instance() AMD core PMU driver enhancements: - Relax privilege filter restriction on AMD IBS (Namhyung Kim) AMD RAPL energy counters support: (Dhananjay Ugwekar) - Introduce topology_logical_core_id() (K Prateek Nayak) - Remove the unused get_rapl_pmu_cpumask() function - Remove the cpu_to_rapl_pmu() function - Rename rapl_pmu variables - Make rapl_model struct global - Add arguments to the init and cleanup functions - Modify the generic variable names to _pkg - Remove the global variable rapl_msrs - Move the cntr_mask to rapl_pmus struct - Add core energy counter support for AMD CPUs Intel core PMU driver enhancements: - Support RDPMC 'metrics clear mode' feature (Kan Liang) - Clarify adaptive PEBS processing (Kan Liang) - Factor out functions for PEBS records processing (Kan Liang) - Simplify the PEBS records processing for adaptive PEBS (Kan Liang) Intel uncore driver enhancements: (Kan Liang) - Convert buggy pmu->func_id use to pmu->registered - Support more units on Granite Rapids" * tag 'perf-core-2025-01-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits) perf: map pages in advance perf/x86/intel/uncore: Support more units on Granite Rapids perf/x86/intel/uncore: Clean up func_id perf/x86/intel: Support RDPMC metrics clear mode uprobes: Guard against kmemdup() failing in dup_return_instance() perf/x86: Relax privilege filter restriction on AMD IBS perf/core: Export perf_exclude_event() uprobes: Reuse return_instances between multiple uretprobes within task uprobes: Ensure return_instance is detached from the list before freeing uprobes: Decouple return_instance list traversal and freeing uprobes: Simplify session consumer tracking uprobes: add speculative lockless VMA-to-inode-to-uprobe resolution uprobes: simplify find_active_uprobe_rcu() VMA checks mm: introduce mmap_lock_speculate_{try_begin\|retry} mm: convert mm_lock_seq to a proper seqcount mm/gup: Use raw_seqcount_try_begin() seqlock: add raw_seqcount_try_begin perf/x86/rapl: Add core energy counter support for AMD CPUs perf/x86/rapl: Move the cntr_mask to rapl_pmus struct perf/x86/rapl: Remove the global variable rapl_msrs ...	2025-01-21 10:52:03 -08:00
Linus Torvalds	a6640c8c2f	Objtool changes for v6.14: - Introduce the generic section-based annotation infrastructure a.k.a. ASM_ANNOTATE/ANNOTATE (Peter Zijlstra) - Convert various facilities to ASM_ANNOTATE/ANNOTATE: (Peter Zijlstra) - ANNOTATE_NOENDBR - ANNOTATE_RETPOLINE_SAFE - instrumentation_{begin,end}() - VALIDATE_UNRET_BEGIN - ANNOTATE_IGNORE_ALTERNATIVE - ANNOTATE_INTRA_FUNCTION_CALL - {.UN}REACHABLE - Optimize the annotation-sections parsing code (Peter Zijlstra) - Centralize annotation definitions in <linux/objtool.h> - Unify & simplify the barrier_before_unreachable()/unreachable() definitions (Peter Zijlstra) - Convert unreachable() calls to BUG() in x86 code, as unreachable() has unreliable code generation (Peter Zijlstra) - Remove annotate_reachable() and annotate_unreachable(), as it's unreliable against compiler optimizations (Peter Zijlstra) - Fix non-standard ANNOTATE_REACHABLE annotation order (Peter Zijlstra) - Robustify the annotation code by warning about unknown annotation types (Peter Zijlstra) - Allow arch code to discover jump table size, in preparation of annotated jump table support (Ard Biesheuvel) Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmeOHiARHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1gATw/7Bn4A+Isqk9bKo6QgYEnKRoyf760ALQl6 av/toEy1qCHT/CXCiEn1Hut1JEy4YyD6lIarC1scRl5xy7amRDEcCL0i2CKz3orn pf6Fk8/Pi68G2K50o4LTiq8t3uPBJXPlGyDlngh2hFTYRfPRT4m+cig784hmJEXG Xq2YzzUNG++U/4Uwe3JH7bX/vcZTYkZfM62FWfp3I4V0OqKU4c+Pkiv4u3Rs7L7b c3xk5/PktKZWV5TDsz0wU4SAGxYFGV47hhYM6cxdSYD3la7RVO+qZcqxsJByjpcL bvOmGKQ1SAXr08rV7TB+Fh8icaNE8Rbbmxf6slB0hdXBQb8STAZ810mZJFey6pnm kXgfhhfBOK5Sq+UbTfzF2JgquCGAbKK75bmNGgf2HaLnVLkFIw3AyMsuFqnxhI4X vXRHGnHCYpYUHTxzRYTFYR8XL8twA2kgjWkSe7hYrX/RQZV3XfyKOc2jyoJFMXeX LecfGJCE/pziZyj60SXT9WaUTvKc8gjWOEuAnW1pJQRM0zJqB9kjLh1cDYUseuwv gGkH59KEu0kcfOb5t/jWoqW3PTENJjEAhOmjun6Jv8wgbOxU88TMmSCWppj54O2X c2ibO407535u1SKBWZuaKFBLYftS2GM4WaGsdyTyh+ta48C8An90HMfYNKTHM9Nz F61Q7Zbn65E= =9nGt -----END PGP SIGNATURE----- Merge tag 'objtool-core-2025-01-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool updates from Ingo Molnar: - Introduce the generic section-based annotation infrastructure a.k.a. ASM_ANNOTATE/ANNOTATE (Peter Zijlstra) - Convert various facilities to ASM_ANNOTATE/ANNOTATE: (Peter Zijlstra) - ANNOTATE_NOENDBR - ANNOTATE_RETPOLINE_SAFE - instrumentation_{begin,end}() - VALIDATE_UNRET_BEGIN - ANNOTATE_IGNORE_ALTERNATIVE - ANNOTATE_INTRA_FUNCTION_CALL - {.UN}REACHABLE - Optimize the annotation-sections parsing code (Peter Zijlstra) - Centralize annotation definitions in <linux/objtool.h> - Unify & simplify the barrier_before_unreachable()/unreachable() definitions (Peter Zijlstra) - Convert unreachable() calls to BUG() in x86 code, as unreachable() has unreliable code generation (Peter Zijlstra) - Remove annotate_reachable() and annotate_unreachable(), as it's unreliable against compiler optimizations (Peter Zijlstra) - Fix non-standard ANNOTATE_REACHABLE annotation order (Peter Zijlstra) - Robustify the annotation code by warning about unknown annotation types (Peter Zijlstra) - Allow arch code to discover jump table size, in preparation of annotated jump table support (Ard Biesheuvel) * tag 'objtool-core-2025-01-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm: Convert unreachable() to BUG() objtool: Allow arch code to discover jump table size objtool: Warn about unknown annotation types objtool: Fix ANNOTATE_REACHABLE to be a normal annotation objtool: Convert {.UN}REACHABLE to ANNOTATE objtool: Remove annotate_{,un}reachable() loongarch: Use ASM_REACHABLE x86: Convert unreachable() to BUG() unreachable: Unify objtool: Collect more annotations in objtool.h objtool: Collapse annotate sequences objtool: Convert ANNOTATE_INTRA_FUNCTION_CALL to ANNOTATE objtool: Convert ANNOTATE_IGNORE_ALTERNATIVE to ANNOTATE objtool: Convert VALIDATE_UNRET_BEGIN to ANNOTATE objtool: Convert instrumentation_{begin,end}() to ANNOTATE objtool: Convert ANNOTATE_RETPOLINE_SAFE to ANNOTATE objtool: Convert ANNOTATE_NOENDBR to ANNOTATE objtool: Generic annotation infrastructure	2025-01-21 10:13:11 -08:00
Linus Torvalds	8838a1a2d2	Locking changes for v6.14: - Lockdep: - Improve and fix lockdep bitsize limits, clarify the Kconfig documentation (Carlos Llamas) - Fix lockdep build warning on Clang related to chain_hlock_class_idx() inlining (Andy Shevchenko) - Relax the requirements of PROVE_RAW_LOCK_NESTING arch support by not tying it to ARCH_SUPPORTS_RT unnecessarily (Waiman Long) - Rust integration: - Support lock pointers managed by the C side (Lyude Paul) - Support guard types (Lyude Paul) - Update MAINTAINERS file filters to include the Rust locking code (Boqun Feng) - Wake-queues: - Add raw_spin_wake() helpers to simplify locking code (John Stultz) - SMP cross-calls: - Fix potential data update race by evaluating the local cond_func() before IPI side-effects (Mathieu Desnoyers) - Guard primitives: - Ease [c]tags based searches by including the cleanup/guard type primitives (Peter Zijlstra) - ww_mutexes: - Simplify the ww_mutex self-test code via swap() (Thorsten Blum) - Static calls: - Update the static calls MAINTAINERS file-pattern (Jiri Slaby) Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmeOCPcRHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1g1Eg/9HapUGZyNSN9Se8F5zWCzIS+85hPgZIAd 0FLm2uMc+zsGGQZ7pxsqofuLlTVLBigfq1TJnddEIFg3dYGG8hUUNav3eS1NWIlW SZIsp6qZwDSwfrHMg5rs1e7ACYRmKlMRKkWHuSYnwwN6XmJfGmWXd9XW4Aokrqou 1+t5Zhv3eieo7Fk+nfmuVK/T8VfWWBD8gbTqI15KTrdnxIqcfDy5Dq+7urk/OkSD IgMf3sHvNEj3lUPFQK+emp2TVC158Yi2awj8ZbzsECmQRUY0hh9/K/yoU5TY2S/O EJGaF253/Tc1k48vz1cB3Lqrl4ZqPNsDu0vYEvMS2L78E1904qfq1EvICJj98Q/c wmPmPyUTKl7+00o8btpMz++5Ro0qxnhN7Phhxfbc6iNIo3wVueoAzQM/bWWCVQ6E Lar9QXQsawBUA3tplrX7JBRAk/qVoz+9pxp0J7AKavCWar3XseKRCpbpn7HNV57B mhkg5zxJMpAaKbyMgrOPpsNovq39rbw0gSAt5o3yxqZAoCJ6x5ol5y0MhPwzymIz TAjdvzo/DHAaDSuBq4BnuffkZpYYgEOTdaO3z+aVR0hsZJ0VQP2AUA7Mv293EOZd I+U6XRd4jUKm1C+5S5XuNMjQG7iX45mPTIs3R6qnatqHuPXrKZobeRHSdK6aX9ZO HuD5iZSq1vg= =yCzP -----END PGP SIGNATURE----- Merge tag 'locking-core-2025-01-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "Lockdep: - Improve and fix lockdep bitsize limits, clarify the Kconfig documentation (Carlos Llamas) - Fix lockdep build warning on Clang related to chain_hlock_class_idx() inlining (Andy Shevchenko) - Relax the requirements of PROVE_RAW_LOCK_NESTING arch support by not tying it to ARCH_SUPPORTS_RT unnecessarily (Waiman Long) Rust integration: - Support lock pointers managed by the C side (Lyude Paul) - Support guard types (Lyude Paul) - Update MAINTAINERS file filters to include the Rust locking code (Boqun Feng) Wake-queues: - Add raw_spin_wake() helpers to simplify locking code (John Stultz) SMP cross-calls: - Fix potential data update race by evaluating the local cond_func() before IPI side-effects (Mathieu Desnoyers) Guard primitives: - Ease [c]tags based searches by including the cleanup/guard type primitives (Peter Zijlstra) ww_mutexes: - Simplify the ww_mutex self-test code via swap() (Thorsten Blum) Static calls: - Update the static calls MAINTAINERS file-pattern (Jiri Slaby)" * tag 'locking-core-2025-01-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: MAINTAINERS: Add static_call_inline.c to STATIC BRANCH/CALL cleanup, tags: Create tags for the cleanup primitives sched/wake_q: Add helper to call wake_up_q after unlock with preemption disabled rust: sync: Add lock::Backend::assert_is_held() rust: sync: Add SpinLockGuard type alias rust: sync: Add MutexGuard type alias rust: sync: Make Guard::new() public rust: sync: Add Lock::from_raw() for Lock<(), B> locking: MAINTAINERS: Start watching Rust locking primitives lockdep: Move lockdep_assert_locked() under #ifdef CONFIG_PROVE_LOCKING lockdep: Mark chain_hlock_class_idx() with __maybe_unused lockdep: Document MAX_LOCKDEP_CHAIN_HLOCKS calculation lockdep: Clarify size for LOCKDEP__BITS configs lockdep: Fix upper limit for LOCKDEP__BITS configs locking/ww_mutex/test: Use swap() macro smp/scf: Evaluate local cond_func() before IPI side-effects locking/lockdep: Enforce PROVE_RAW_LOCK_NESTING only if ARCH_SUPPORTS_RT	2025-01-21 10:10:24 -08:00
Linus Torvalds	b9d8a295ed	- The first part of a restructuring of AMD's representation of a northbridge which is legacy now, and the creation of the new AMD node concept which represents the Zen architecture of having a collection of I/O devices within an SoC. Those nodes comprise the so-called data fabric on Zen. This has at least one practical advantage of not having to add a PCI ID each time a new data fabric PCI device releases. Eventually, the lot more uniform provider of data fabric functionality amd_node.c will be used by all the drivers which need it - Smaller cleanups -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmePuPIACgkQEsHwGGHe VUpU6Q//S9j9+YC9EpredFoJ5W0BfERR5XOum7YjlLxq2mVTStrf9Q1ecrwmS4Q6 4mAydIDfhqNlouUjMBgNNFJcvm8lat+/pjY78oT8ZdjumslMbMxo81VmQ3fX+6fE izMrL81DG4j8zeleUyz5ecJEK/KPw1s3SkY736511PeJSalOU4hLYmU819imfAk/ 5c9os2GNhszIROE1YUYZQ3zXne1t2PNXKvctzVrJYjyKpIDgFNzTj6gXhePzXBNO iFdApqSgKdnnsD6VsfxYVnOKP+cSIl27Tbge6dm7DHQbSs00aVL64JPcX8/hWtp6 ExrwBYiFk6yafwsNUu7/PmqbZNKYxDgvXFq8jSOFfioh6Km/QZYs8y1/qXN3qmSU 78Ah5jyO+U+++FsSa2o9eRpU2l84UIQqvp84PeSLylzh7iLFyFCWsMfreNeIsF9v Jsost58JQOCufRK3qfMiDO88QUZRKyCfFymDAVcvPoBwp5nK9R1ohlbxgXrCPsE7 Bd7J6jrlpcoRyYc8vhshkrnK2Sk6pP77OZOh5AZ9AybnALH0afUNLzk6sBtaObkZ xIJcSIBkKz3P4zWFKsXmqGYHWp1IsKsYRsNjCt5FExWOF+uKKKBjynHmlKeS0l/b J6bwDUPVW/gfkBqDV8bILultj9Gm8L5Z8SwvD1ww69OYN+c7oVk= =ZAjD -----END PGP SIGNATURE----- Merge tag 'x86_misc_for_v6.14_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull misc x86 updates from Borislav Petkov: - The first part of a restructuring of AMD's representation of a northbridge which is legacy now, and the creation of the new AMD node concept which represents the Zen architecture of having a collection of I/O devices within an SoC. Those nodes comprise the so-called data fabric on Zen. This has at least one practical advantage of not having to add a PCI ID each time a new data fabric PCI device releases. Eventually, the lot more uniform provider of data fabric functionality amd_node.c will be used by all the drivers which need it - Smaller cleanups * tag 'x86_misc_for_v6.14_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/amd_node: Use defines for SMN register offsets x86/amd_node: Remove dependency on AMD_NB x86/amd_node: Update __amd_smn_rw() error paths x86/amd_nb: Move SMN access code to a new amd_node driver x86/amd_nb, hwmon: (k10temp): Simplify amd_pci_dev_to_node_id() x86/amd_nb: Simplify function 3 search x86/amd_nb: Use topology info to get AMD node count x86/amd_nb: Simplify root device search x86/amd_nb: Simplify function 4 search x86: Start moving AMD node functionality out of AMD_NB x86/amd_nb: Clean up early_is_amd_nb() x86/amd_nb: Restrict init function to AMD-based systems x86/mtrr: Rename mtrr_overwrite_state() to guest_force_mtrr_state()	2025-01-21 09:38:52 -08:00
Linus Torvalds	48795f90cb	- Remove the less generic CPU matching infra around struct x86_cpu_desc and use the generic struct x86_cpu_id thing - Remove magic naked numbers for CPUID functions and use proper defines of the prefix CPUID_LEAF_. Consolidate some of the crazy use around the tree - Smaller cleanups and improvements -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmePjeIACgkQEsHwGGHe VUqRBA//TinKFcWagaQB3lsnoBRwqyg6JJZIBNMF9sBMDD9HnvEZ/JduC+3+g1rx iztuCmRSgQsi/QvRaEFNuDMOgk6gACyXxi7Uf6eXsQkSlsZFViaqbXsy9kqslRbl 7QP1NS1sfdSd42JPp2UZT/lg9kluuVnn5b40zZIwy2AAzwrNFfZAS4Yg7Qe4XQDF xBcHi8MAF+LTm5Tv0hLmx2UcfZLhi7hXy8mTAIFS0Liww+Y5qaam33xw9KxNU5lZ tVepzY5my43pRs4MB1CvaQCiZ84GxvAVqz3JYsg5YhVp45xh7P2WtjBeeOqLljaW MkWnDLOmlaD4Y0kL4QA3ReyBVux54RbDGKC0E/t5fwYlk3dQ7gYwSEvh5358R+0z kwxw3NdnNngoLRXAX45EonSxj36jb6KCBHAGqXSfL73OOt30RWCqknEnixcOp/BP chNxCiIx7qko+rAYOD62QkguEEPFdb8roeayhIKtiKL5zUwQAr+jt/pKVx2htWLi xxqSaVoCFu4edWpsEJnanqhS0Es0v7YiBU3jDC37rZJ+dtzf0C2ewD7Nb1g+wUTn NzDkmt58hQW4jBxoxHBIclLfhEETISTEGAAObTa5I5r8IDb7Dv+ZnSv7RfjoR9fL RWMz1bJ1Scem+Fx7fc/IRJFSElC41giSwFlhThHdAzI1m95zJN8= =9Hdg -----END PGP SIGNATURE----- Merge tag 'x86_cpu_for_v6.14_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cpuid updates from Borislav Petkov: - Remove the less generic CPU matching infra around struct x86_cpu_desc and use the generic struct x86_cpu_id thing - Remove magic naked numbers for CPUID functions and use proper defines of the prefix CPUID_LEAF_. Consolidate some of the crazy use around the tree - Smaller cleanups and improvements * tag 'x86_cpu_for_v6.14_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpu: Make all all CPUID leaf names consistent x86/fpu: Remove unnecessary CPUID level check x86/fpu: Move CPUID leaf definitions to common code x86/tsc: Remove CPUID "frequency" leaf magic numbers. x86/tsc: Move away from TSC leaf magic numbers x86/cpu: Move TSC CPUID leaf definition x86/cpu: Refresh DCA leaf reading code x86/cpu: Remove unnecessary MwAIT leaf checks x86/cpu: Use MWAIT leaf definition x86/cpu: Move MWAIT leaf definition to common header x86/cpu: Remove 'x86_cpu_desc' infrastructure x86/cpu: Move AMD erratum 1386 table over to 'x86_cpu_id' x86/cpu: Replace PEBS use of 'x86_cpu_desc' use with 'x86_cpu_id' x86/cpu: Expose only stepping min/max interface x86/cpu: Introduce new microcode matching helper x86/cpufeature: Document cpu_feature_enabled() as the default to use x86/paravirt: Remove the WBINVD callback x86/cpufeatures: Free up unused feature bits	2025-01-21 09:30:59 -08:00
Linus Torvalds	13b6931c44	- A segmented Reverse Map table (RMP) is a across-nodes distributed table of sorts which contains per-node descriptors of each node-local 4K page, denoting its ownership (hypervisor, guest, etc) in the realm of confidential computing. Add support for such a table in order to improve referential locality when accessing or modifying RMP table entries - Add support for reading the TSC in SNP guests by removing any interference or influence the hypervisor might have, with the goal of making a confidential guest even more independent from the hypervisor -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmeOYLsACgkQEsHwGGHe VUrywg//WBuywe3+TNPwF0Iw8becqtD7lKMftmUoqpcf20JhiHSCexb+3/r7U2Kb WL1/T5cxX1rA45HzkwovUljlvin8B9bdpY40dUqrKFPMnWLfs4ru0HPA6UxPBsAq r/8XrXuRrI22MLbrAeQ2xSt8dqw3DpbJyUcyr0qOb6OsbtAy05uElYCzMSyzT06F QsTmenosuJqSo1gIGTxfU4nKyd1o8EJ5b1ThK11hvZaIOffgLjEU6g39cG9AeF4X TOkh9CdIlQc3ot14rJeWMy15YEW+xBdXdMEv0ZPOSZiKzTHA7wwdl0VmPm1EK57f BQkZikuoJezJA0r5wSwVgslTaYO0GTXNewwL5jxK1mqRgoK06IgC6xAkX8N7NTYL K6DX+tfaKjSJGY1z9TYOzs+wGV4MBAXmbLwnuhcPumkTYXPFbRFZqx6ec2BLIU+Y bZfwhlr3q+bfFeBYMzyWPHJ87JinOjwu4Ah0uLVmkoRtgb0S3pIdlyRYZAcEl6fn Tgfu0/RNLGGsH/a3BF7AQdt+hOv1ms5hEMYXg++30uC59LR8XbuKnLdUPRi0nVeD e9xyxFybu5ySesnnXabtaO9bSUF+8HV4nkclKglFvuHpLMQ5GlPxTnBj1V1podYR l12G2htXKsSV5JJK4x+WfYBe6Nn3tbcpgZD8M8g0lso8kejqMjs= =hh1m -----END PGP SIGNATURE----- Merge tag 'x86_sev_for_v6.14_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 SEV updates from Borislav Petkov: - A segmented Reverse Map table (RMP) is a across-nodes distributed table of sorts which contains per-node descriptors of each node-local 4K page, denoting its ownership (hypervisor, guest, etc) in the realm of confidential computing. Add support for such a table in order to improve referential locality when accessing or modifying RMP table entries - Add support for reading the TSC in SNP guests by removing any interference or influence the hypervisor might have, with the goal of making a confidential guest even more independent from the hypervisor * tag 'x86_sev_for_v6.14_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/sev: Add the Secure TSC feature for SNP guests x86/tsc: Init the TSC for Secure TSC guests x86/sev: Mark the TSC in a secure TSC guest as reliable x86/sev: Prevent RDTSC/RDTSCP interception for Secure TSC enabled guests x86/sev: Prevent GUEST_TSC_FREQ MSR interception for Secure TSC enabled guests x86/sev: Change TSC MSR behavior for Secure TSC enabled guests x86/sev: Add Secure TSC support for SNP guests x86/sev: Relocate SNP guest messaging routines to common code x86/sev: Carve out and export SNP guest messaging init routines virt: sev-guest: Replace GFP_KERNEL_ACCOUNT with GFP_KERNEL virt: sev-guest: Remove is_vmpck_empty() helper x86/sev/docs: Document the SNP Reverse Map Table (RMP) x86/sev: Add full support for a segmented RMP table x86/sev: Treat the contiguous RMP table as a single RMP segment x86/sev: Map only the RMP table entries instead of the full RMP range x86/sev: Move the SNP probe routine out of the way x86/sev: Require the RMPREAD instruction after Zen4 x86/sev: Add support for the RMPREAD instruction x86/sev: Prepare for using the RMPREAD instruction to access the RMP	2025-01-21 09:00:31 -08:00
Linus Torvalds	254d763310	- A bunch of minor cleanups -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmeOWnYACgkQEsHwGGHe VUpEfQ/+MaFbHwEyewHWxGqhe9bOcoFsKUxQvK1jO99KbcPYjg1sEZ1DDeLydtea vwFoNNMnZDmKYA/+yAeshUIkcxyKYa3Iy6Tc8MXw1Yx1Z+AduKwDm47rvxcn47ig oQt1Zj0zMzON6aF7Xx7xlF5wDDBPMuRLKdKBSuggiio/IJz2G+m2CNI4KTUGAN9S aHE+TIE64MHftcabuoKdRtZ44FUZyb3BRvtpj3+G1BpRR6e/HFl9ghM3emvOv4BE e9QUvMFdVHnXQyib+JV+lOyu/2MpTVuhZqS51e8vSaNToJ0KfOc+j/Sl8uRCSciL qXALGIB5uEHlotKIVk9p9frGeUruc8H5YvdOUt7VEmSofuqbfFaEMSdhBfSxfq+S 1LGBl3hXtVaspgQfzmPpEN5pMQAIMF6xlMiFeZSwBnUhb6n4QVlj38lodBbjct2j mzjBaYtu9NiVNi1H6dr1s/16Vl69tqAeRncGFdzvbJV0ZzY6KN/yFWy7g6PCPwHx gXdXFuOiwsoXQSsQn1l3nsxuXq1zp+TjUFrREPnmYv7t/GyEdieL5LfPnAwXB9x5 vBAPDLa0c5XZRn8QyaeH1gZVX3HMi88cjXBEDI9auejkGr3e1VfdDSZkQpYMIK9b oIWwD2Wr2oU6uz5ye6M0cZpoZBVsbuluE5/kLJ70aj8QEJT3OsM= =xrfO -----END PGP SIGNATURE----- Merge tag 'x86_microcode_for_v6.14_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 microcode loader updates from Borislav Petkov: - A bunch of minor cleanups * tag 'x86_microcode_for_v6.14_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/microcode/AMD: Remove ret local var in early_apply_microcode() x86/microcode/AMD: Have __apply_microcode_amd() return bool x86/microcode/AMD: Make __verify_patch_size() return bool x86/microcode/AMD: Remove bogus comment from parse_container() x86/microcode/AMD: Return bool from find_blobs_in_containers()	2025-01-21 08:33:10 -08:00
Linus Torvalds	3357d1d1f9	- Extend resctrl with the capability of total memory bandwidth monitoring, thus accomodating systems which support only total but not local memory bandwidth monitoring. Add the respective new mount options - The usual cleanups -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmeOU2QACgkQEsHwGGHe VUoqTw/+LOt/36tCMbqUEIhUIvPciqdhAk+gBzP9XzGEWRJDcYLmgmBvlrFcYgIL lZVZ/tBsYqljycMFaJ4K4vishpkJr9s8GtDCOItkMA4cYOJcXSaGSpNojctC2Qs6 42J+qABENRZYWFmWWcCkf8jTG0QWebmVRJwV9Q/4uYRicLFW/B+Em0sOTFn9n7OX 39ZyCXKlmu2wsTW/DrQ8wX0KkW5hevLkJk/NFN3CDuWg7LOs09IotALs/U7ayj+C BIU5ZEAvan8LAPLX4Nrhb5HArsP6eCmB0Kbr+Mm+X9lRBLSbMmThXy58WlRDT0v6 LEx+IbKUnEVoYJnC2QiqxPB4FghhZs9RE6cDnCQivwkigFhIau9krPZUc2klnVtv AmjUx8BporU3rcvrtKycInQQCVdzi9Q8ai7WRrpCqNEItHSOtNk9BipnFGqcnFDr va0SCZ3N1vIiVnZkTZ0CObA2F8RRLMiBCAB9XNfD0TzY+K0p2KCNqS6UnxmrOtR+ 8Fbk+C624u7LhmjbrPr7Hj30wIu2b/CqncIDHrh8O+jK2awvTEl6T0Tryd+jXxR1 ERy0OSUhHEcnW73ckdfdn3/6fsL/9bnBZrTufRWxPzaM+3E9YR9KGmyc7hf5fkSX WE/ZRMse4QwELmSajdv9NK7oO48VDOije+cgUzPgme0i4Lsg4c0= =i4Yw -----END PGP SIGNATURE----- Merge tag 'x86_cache_for_v6.14_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 resource control updates from Borislav Petkov: - Extend resctrl with the capability of total memory bandwidth monitoring, thus accomodating systems which support only total but not local memory bandwidth monitoring. Add the respective new mount options - The usual cleanups * tag 'x86_cache_for_v6.14_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/resctrl: Document the new "mba_MBps_event" file x86/resctrl: Add write option to "mba_MBps_event" file x86/resctrl: Add "mba_MBps_event" file to CTRL_MON directories x86/resctrl: Make mba_sc use total bandwidth if local is not supported x86/resctrl: Compute memory bandwidth for all supported events x86/resctrl: Modify update_mba_bw() to use per CTRL_MON group event x86/resctrl: Prepare for per-CTRL_MON group mba_MBps control x86/resctrl: Introduce resctrl_file_fflags_init() to initialize fflags x86/resctrl: Use kthread_run_on_cpu()	2025-01-21 08:31:04 -08:00
Linus Torvalds	d80825ee4a	- Add support for AMD hardware which is not affected by SRSO on the user/kernel attack vector and advertise it to guest userspace -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmeOTMwACgkQEsHwGGHe VUoMKhAAjMp7tYNmh8687oz8A7ujXDYvbaIh8d3zRnOKq2cEpsGKSOgkw50tbs/I LE5o5k2NJ6evIYEkqZZH0WvksealwzoTY1LWGqHj2zotbyP6ypZn+GKORH+MsNNL fUaoj6DLELqPbLrr48GJG2uabtwmPOgiElZ6bqKrFnGDPI2LSLkrY7fugM3aU4h7 VXDUAz2N2kIRKXFedVTArZtYiVO+O4/fM1VxjIRv/KrQt0lTatsjUYc6jei/7Rqa xPCmw6WsYfPPY8FjsgR3oaGfUQPzs8nv96Vh9lnIFw5/ajkDbwtvRuPEwSYe9MBZ mE+oOqdPz4of12Mv++/BkQL/tKuVPG/e38aeZUQPo/hj2LOWdUdwdAuZuslfrqaA 9xKZgslhPBKr0yRAku60hRpbqnp07cEHuM6JMpmFoDqN1ESnWlDapWKQj+jOpGyz /w0Gp00R03TVhF9QTV7KUyj/U1ykhWG+4q843G5acrgh0geWzy+fYL+jPHgtBbWp E+NFKmnCg9YNbTiB6y9xIcEU9siq6iMXyhp3iv0qlpwhF5WueCvc3BiUwavgpoM6 IpVqrrJspLy6/K7tMKNVKDCIkbHvJ6vKxSM9o3yzqMTL7B3ISlG9o3MSTKQVjytR qEnIQAwwfsWfmeWGEDun+hh83b+HsZ+tyLyrFNleGoe4yJosZtc= =bWI/ -----END PGP SIGNATURE----- Merge tag 'x86_bugs_for_v6.14_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 CPU speculation update from Borislav Petkov: - Add support for AMD hardware which is not affected by SRSO on the user/kernel attack vector and advertise it to guest userspace * tag 'x86_bugs_for_v6.14_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: KVM: x86: Advertise SRSO_USER_KERNEL_NO to userspace x86/bugs: Add SRSO_USER_KERNEL_NO support	2025-01-21 08:22:40 -08:00
Linus Torvalds	0763dd8928	- Remove the EDAC PowerPC Cell driver due to the removal of the IBM Cell blades support - Add a new EDAC driver for Loongson SoCs which reports single-bit correctable errors - Extend the SKX and i10NM EDAC drivers to support UV systems which can have more than 8 nodes - Add Intel Clearwater Forest server support to i10nm_edac - Minor fix -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmeLi5QACgkQEsHwGGHe VUpZEhAAsbvNhKhq84FUCoQP+OWOdQBv0s3WNGherWPAxC3bGY7xDKcjjdf6ebYG 4Rk0+nTEK5kefA6PsyiWtxWQZiYUzERDrNpdjWq5RxaO6WtREDiczgJm4NaCCTBr F59eHImjW/ajBsU8FAcWnVZLo7KqZvtF19vQFL1TXAKJO6Zpb0ybLIts9BVSwwyV c1xYyjMFh1p1H8n7MOsF11u2QUpCcc/SMDesCGWSVAJ2QnB7Ox5NfUaI97lQSiow gnS8vTWTIM6e6rZk2HtkSage0Wt7UHCkIsza5DEdW3xQG10eZUE6o33kerxNMezd lMWNzavR26CYhkO6/McvhsClOHwcAZZVd4PUTXnNvlNTSV+EEbEbD5JWryQvmvkV gazOlPHwd0pj1MkAZBUTdCnR6/DCpqsu68sGMjAiPvR7pb3sBLyRF/DGg9p5Rz0a s3Q6SuTAg/ZQWNqgRNcnjh2SZbzqZ+GGD4blDTv1pNjWemTDYpxj73Wl5JxKTdtr 6n7/ariQTaiMTpQ8ZhDkDP5eHWx5fyTY7P5MzPEuOmNzW/gLz0V5goSqXUUYhQjm YAKy5PDS1QwTBKzEOQ8dE3RcOhZtd1X2Vj04CcDgHrjhZBDderPkGs42R3B74cF0 JB4k5QJFJ97Cxe9xYIFHju7Es6+z+j8kLzWGTydfHGMZfHMHCHM= =nwgH -----END PGP SIGNATURE----- Merge tag 'edac_updates_for_v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull EDAC updates from Borislav Petkov: - Remove the EDAC PowerPC Cell driver due to the removal of the IBM Cell blades support - Add a new EDAC driver for Loongson SoCs which reports single-bit correctable errors - Extend the SKX and i10NM EDAC drivers to support UV systems which can have more than 8 nodes - Add Intel Clearwater Forest server support to i10nm_edac - Minor fix * tag 'edac_updates_for_v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: EDAC/cell: Remove powerpc Cell driver EDAC: Add an EDAC driver for the Loongson memory controller EDAC: Fix typos in comments EDAC/{i10nm,skx,skx_common}: Support UV systems EDAC/i10nm: Add Intel Clearwater Forest server support	2025-01-21 08:21:12 -08:00
Linus Torvalds	d3504411a4	- Remove the shared threshold bank hack on AMD and streamline and simplify it - Cleanup and sanitize MCA code -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmeOR1oACgkQEsHwGGHe VUo06hAAlUk3F5sp+53djAKInHdbUoHZdGX4vy5dppjnDfu5JQcg/OwO0l6RbfhA YUTytWHT0SdkyoJG7CnxQOteOkmimeHMfVTZw9LWm9xLMtulNMxsxJnKHlpvBbfR P/1eot/R+wLMzZoqqf6O8MfF1Gs/S3WjO2+T3wbdgwW7YXprbTSwW51FXhLOaObR PR+PfDXqcu4u4+b+bC0HSo3dN0Sc4J71cdb0tt7VIeQwVUAcfEZgdM1opXSxtQJJ G/Ekbjg5dJo4ZRFXXrxVNWxOXJsKbuubc6mw0C+cgCcbDklcF1gmQYvL9+NSExeP vDyhmMhuEDbtvUBJPQFnFywqYH/a1neo00RJUqw6xVXsn+ebBHVGLik8mgbQOaHt fh8bATsQ1aETAk6nx3RMPk9saiqFHk8t4qIV9FwjskXzuKDh5LzM1rGuiFLl5py/ 5hazmwn7/jYTxYJyG2ZEHD1ro2jcZFevu9dPTOSaJL3ODtlH4fQUugBcoukq6re3 OEf/v+J8LcX+fvo8ylJYyXXT9ZDTpckjTNipU8JiEjVcro0MrxEzTnTWma4tcn+w Hp7lZ+/AEmHwKQcNab7frKhTPdxLFRbJyYIGiRAt9mwxXz49IBTDpoVFpXUf56zV Djcd6wmG1gKvM+27or/tuuiDyCZlK0+s7twRYxP50cMuvFcvmNk= =hSqE -----END PGP SIGNATURE----- Merge tag 'ras_core_for_v6.14_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 RAS updates from Borislav Petkov: - Remove the shared threshold bank hack on AMD and streamline and simplify it - Cleanup and sanitize MCA code * tag 'ras_core_for_v6.14_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce/amd: Remove shared threshold bank plumbing x86/mce: Remove the redundant mce_hygon_feature_init() x86/mce: Convert family/model mixed checks to VFM-based checks x86/mce: Break up __mcheck_cpu_apply_quirks() x86/mce: Make four functions return bool x86/mce/threshold: Remove the redundant this_cpu_dec_return() x86/mce: Make several functions return bool	2025-01-21 08:16:24 -08:00
Mathieu Desnoyers	40724ecafc	rseq: Fix rseq unregistration regression A logic inversion in rseq_reset_rseq_cpu_node_id() causes the rseq unregistration to fail when rseq_validate_ro_fields() succeeds rather than the opposite. This affects both CONFIG_DEBUG_RSEQ=y and CONFIG_DEBUG_RSEQ=n. Fixes: `7d5265ffcd` ("rseq: Validate read-only fields under DEBUG_RSEQ config") Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250116205956.836074-1-mathieu.desnoyers@efficios.com	2025-01-21 08:10:51 +01:00
Linus Torvalds	95ec54a420	powerpc updates for 6.14 - Add preempt lazy support - Deprecate cxl and cxl flash driver - Fix a possible IOMMU related OOPS at boot on pSeries - Optimize sched_clock() in ppc32 by replacing mulhdu() by mul_u64_u64_shr() Thanks to: Andrew Donnellan, Andy Shevchenko, Ankur Arora, Christophe Leroy, Frederic Barrat, Gaurav Batra, Luis Felipe Hernandez, Michael Ellerman, Nilay Shroff, Ricardo B. Marliere, Ritesh Harjani (IBM), Sebastian Andrzej Siewior, Shrikanth Hegde, Sourabh Jain, Thorsten Blum, Zhu Jun. -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEqX2DNAOgU8sBX3pRpnEsdPSHZJQFAmePIeIACgkQpnEsdPSH ZJTLRxAAmtarhPItiCQxwi0uyQpuzBoypcVuX8M9qpAUr1cQJv1swPlJI0tFW2xV QDK37FlCytYib1oMJpwyhg5DA8kdg08OuWtvGRVxGu4O+vh2v0aehewAfPsBKBwq JTOhjSlAeDPgsYQQlK6baSlfjb4kYlAFr2mh/oJIfXi2BFV1MB7rQmCXq2sPnfKS 9cFFgsZ74fFhbYOn9qFsldnzb9TPxR0/UcTOETqRcGOjiExv4aYlmWtKGMY/nLkN k5go3xoB5WP7z11clmg0pp+RIoYKR41kR58CtGdcCEEXJJ6WBGPhPLQzT5cLBkMi ppZieQNKrZK7J/udrdKP0+2cTmBTbCpjxHicLf7BhzsWwVxHCnyjrJIzUPuLcDUi Ym9AXsmzBsqMudqnR0lslsY2mUvZOJPYh4ZCKTA5S0TDYWGy/HlAlL7sMs2uCzaM 4g8MVpEJLVo4GAoZM96x4RMcPi4RlHYXbYqNpENRkxiZu2fDoRz9WStPCdda59/D 3rQNaSDT1vBpue9ac6EIMeGgNh+f6q6WKh/PA48QBYDTp/IVbfShD+xiXtaa72cZ W+JmWUwBRyM4HOP0C5yhXBXwL6a5sHj+d6R4gng4UUww7VppJmkZpBhXZsN4VS55 Xos+2Q75FBSQkAZa84yK6dXvFW3v/upIdSXuWTkSgoKs+4Z7dG8= =ctY4 -----END PGP SIGNATURE----- Merge tag 'powerpc-6.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Madhavan Srinivasan: - Add preempt lazy support - Deprecate cxl and cxl flash driver - Fix a possible IOMMU related OOPS at boot on pSeries - Optimize sched_clock() in ppc32 by replacing mulhdu() by mul_u64_u64_shr() Thanks to Andrew Donnellan, Andy Shevchenko, Ankur Arora, Christophe Leroy, Frederic Barrat, Gaurav Batra, Luis Felipe Hernandez, Michael Ellerman, Nilay Shroff, Ricardo B. Marliere, Ritesh Harjani (IBM), Sebastian Andrzej Siewior, Shrikanth Hegde, Sourabh Jain, Thorsten Blum, and Zhu Jun. * tag 'powerpc-6.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: selftests/powerpc: Fix argument order to timer_sub() powerpc/prom_init: Use IS_ENABLED() powerpc/pseries/iommu: IOMMU incorrectly marks MMIO range in DDW powerpc: Use str_on_off() helper in check_cache_coherency() powerpc: Large user copy aware of full:rt:lazy preemption powerpc: Add preempt lazy support powerpc/book3s64/hugetlb: Fix disabling hugetlb when fadump is active powerpc/vdso: Mark the vDSO code read-only after init powerpc/64: Use get_user() in start_thread() macintosh: declare ctl_table as const selftest/powerpc/ptrace: Cleanup duplicate macro definitions selftest/powerpc/ptrace/ptrace-pkey: Remove duplicate macros selftest/powerpc/ptrace/core-pkey: Remove duplicate macros powerpc/8xx: Drop legacy-of-mm-gpiochip.h header scsi/cxlflash: Deprecate driver cxl: Deprecate driver selftests/powerpc: Fix typo in test-vphn.c powerpc/xmon: Use str_yes_no() helper in dump_one_paca() powerpc/32: Replace mulhdu() by mul_u64_u64_shr()	2025-01-20 21:40:19 -08:00
Linus Torvalds	9ad09c4f28	arm64 updates for 6.14 Confidential Computing: * Register a platform device when running in CCA realm mode to enable automatic loading of dependent modules. CPU Features: * Update a bunch of system register definitions to pick up new field encodings from the architectural documentation. * Add hwcaps and selftests for the new (2024) dpISA extensions. Documentation: * Update EL3 (firmware) requirements for booting Linux on modern arm64 designs. * Remove stale information about the kernel virtual memory map. Miscellaneous: * Minor cleanups and typo fixes. Memory management: * Fix vmemmap_check_pmd() to look at the PMD type bits * LPA2 (52-bit physical addressing) cleanups and minor fixes. * Adjust physical address space depending upon whether or not LPA2 is enabled. Perf and PMUs: * Add port filtering support for NVIDIA's NVLINK-C2C Coresight PMU * Extend AXI filtering support for the DDR PMU on NXP IMX SoCs * Fix Designware PCIe PMU event numbering. * Add generic branch events for the Apple M1 CPU PMU. * Add support for Marvell Odyssey DDR and LLC-TAD PMUs. * Cleanups to the Hisilicon DDRC and Uncore PMU code. * Advertise discard mode for the SPE PMU. * Add the perf users mailing list to our MAINTAINERS entry. -----BEGIN PGP SIGNATURE----- iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmeKZLcQHHdpbGxAa2Vy bmVsLm9yZwAKCRC3rHDchMFjNEQzB/0X2U89ZiqxIkTPQvfFrjN/uUGybkq59rEL DfeoGukTgJIwc3GHWXXtQ//wuuYKdTeCXaIz5NFK3+7/wmKSLvjkexmue8pta6EY 5rx9bAPr/D8lAUvhKIN2l3pF/ygoRwDz+nT2yVQ1xlZxYJWX7ZIsMj7W7ceb5kdx HRrTSQuhEEPREAWWO4oCMWl5SQZSrIflSE3Be/PsP0OhW6k//ZmWbcJTgUcHbKam o2WtNjITyGzxMpRCcrGEZKoe9YcwSxiut/PoD7JuoB4C/rbsf1cdJ6uLmtvGJcZj qsdRHhVfBzP1+ahONrDbiT3C2+s1UZySKdCDIxiYy6lB39wpP0dd =E7Mf -----END PGP SIGNATURE----- Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: "We've got a little less than normal thanks to the holidays in December, but there's the usual summary below. The highlight is probably the 52-bit physical addressing (LPA2) clean-up from Ard. Confidential Computing: - Register a platform device when running in CCA realm mode to enable automatic loading of dependent modules CPU Features: - Update a bunch of system register definitions to pick up new field encodings from the architectural documentation - Add hwcaps and selftests for the new (2024) dpISA extensions Documentation: - Update EL3 (firmware) requirements for booting Linux on modern arm64 designs - Remove stale information about the kernel virtual memory map Miscellaneous: - Minor cleanups and typo fixes Memory management: - Fix vmemmap_check_pmd() to look at the PMD type bits - LPA2 (52-bit physical addressing) cleanups and minor fixes - Adjust physical address space depending upon whether or not LPA2 is enabled Perf and PMUs: - Add port filtering support for NVIDIA's NVLINK-C2C Coresight PMU - Extend AXI filtering support for the DDR PMU on NXP IMX SoCs - Fix Designware PCIe PMU event numbering - Add generic branch events for the Apple M1 CPU PMU - Add support for Marvell Odyssey DDR and LLC-TAD PMUs - Cleanups to the Hisilicon DDRC and Uncore PMU code - Advertise discard mode for the SPE PMU - Add the perf users mailing list to our MAINTAINERS entry" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (64 commits) Documentation: arm64: Remove stale and redundant virtual memory diagrams perf docs: arm_spe: Document new discard mode perf: arm_spe: Add format option for discard mode MAINTAINERS: Add perf list for drivers/perf/ arm64: Remove duplicate included header drivers/perf: apple_m1: Map generic branch events arm64: rsi: Add automatic arm-cca-guest module loading kselftest/arm64: Add 2024 dpISA extensions to hwcap test KVM: arm64: Allow control of dpISA extensions in ID_AA64ISAR3_EL1 arm64/hwcap: Describe 2024 dpISA extensions to userspace arm64/sysreg: Update ID_AA64SMFR0_EL1 to DDI0601 2024-12 arm64: Filter out SVE hwcaps when FEAT_SVE isn't implemented drivers/perf: hisi: Set correct IRQ affinity for PMUs with no association arm64/sme: Move storage of reg_smidr to __cpuinfo_store_cpu() arm64: mm: Test for pmd_sect() in vmemmap_check_pmd() arm64/mm: Replace open encodings with PXD_TABLE_BIT arm64/mm: Rename pte_mkpresent() as pte_mkvalid() arm64/sysreg: Update ID_AA64ISAR2_EL1 to DDI0601 2024-09 arm64/sysreg: Update ID_AA64ZFR0_EL1 to DDI0601 2024-09 arm64/sysreg: Update ID_AA64FPFR0_EL1 to DDI0601 2024-09 ...	2025-01-20 21:21:49 -08:00
Linus Torvalds	e7244cc382	m68k updates for v6.14 - Use the generic muldi3 libgcc function, - Miscellaneous fixes and improvements. -----BEGIN PGP SIGNATURE----- iIsEABYKADMWIQQ9qaHoIs/1I4cXmEiKwlD9ZEnxcAUCZ45X7xUcZ2VlcnRAbGlu dXgtbTY4ay5vcmcACgkQisJQ/WRJ8XB4tQD/WKgZmWHvfi/9Tk7+c8WD/e2ApdQE ZhL9q0AEUzumpH0A/2ROwftS1g39MXdITUfts2g5j2wQy2ePnRDOZmTtP9kA =ZK3B -----END PGP SIGNATURE----- Merge tag 'm68k-for-v6.14-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k Pull m68k updates from Geert Uytterhoeven: - Use the generic muldi3 libgcc function - Miscellaneous fixes and improvements * tag 'm68k-for-v6.14-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k: m68k: libgcc: Fix lvalue abuse in umul_ppmm() m68k: vga: Fix I/O defines zorro: Constify 'struct bin_attribute' m68k: atari: Use str_on_off() helper in atari_nvram_proc_read() m68k: Use kernel's generic muldi3 libgcc function	2025-01-20 21:18:36 -08:00
Linus Torvalds	4f42d0bf72	s390 updates for 6.14 merge window - Select config option KASAN_VMALLOC if KASAN is enabled - Select config option VMAP_STACK unconditionally - Implement arch_atomic_inc() / arch_atomic_dec() functions which result in a single instruction if compiled for z196 or newer architectures - Make layering between atomic.h and atomic_ops.h consistent - Comment s390 preempt_count implementation - Remove pre MARCH_HAS_Z196_FEATURES preempt count implementation - GCC uses the number of lines of an inline assembly to calculate number of instructions and decide on inlining. Therefore remove superfluous new lines from a couple of inline assemblies. - Provide arch_atomic__and_test() implementations that allow the compiler to generate slightly better code. - Optimize __preempt_count_dec_and_test() - Remove __bootdata annotations from declarations in header files - Add missing include of <linux/smp.h> in abs_lowcore.h to provide declarations for get_cpu() and put_cpu() used in the code - Fix suboptimal kernel image base when running make kasan.config - Remove huge_pte_none() and huge_pte_none_mostly() as are identical to the generic variants - Remove unused PAGE_KERNEL_EXEC, SEGMENT_KERNEL_EXEC, and REGION3_KERNEL_EXEC defines - Simplify noexec page protection handling and change the page, segment and region3 protection definitions automatically if the instruction execution-protection facility is not available - Save one instruction and prefer EXRL instruction over EX in string, xor_(), amode31 and other functions - Create /dev/diag misc device to fetch diagnose specific information from the kernel and provide it to userspace - Retrieve electrical power readings using DIAGNOSE 0x324 ioctl - Make ccw_device_get_ciw() consistent and use array indices instead of pointer arithmetic * s390/qdio: Move memory alloc/pointer arithmetic for slib and sl into one place - The sysfs core now allows instances of 'struct bin_attribute' to be moved into read-only memory. Make use of that in s390 code - Add missing TLB range adjustment in pud_free_tlb() - Improve topology setup by adding early polarization detection - Fix length checks in codepage_convert() function - The generic bitops implementation is nearly identical to the s390 one. Switch to the generic variant and decrease a bit the kernel image size - Provide an optimized arch_test_bit() implementation which makes use of flag output constraint. This generates slightly better code - Provide memory topology information obtanied with DIAGNOSE 0x310 using ioctl. - Various other small improvements, fixes, and cleanups These changes were added with a merge of 'pci-device-recovery' branch - Add PCI error recovery status mechanism - Simplify and document debug_next_entry() logic - Split private data allocation and freeing out of debug file open() and close() operations - Add debug_dump() function that gets a textual representation of a debug info (e.g. PCI recovery hardware error logs) - Add formatted content of pci_debug_msg_id to the PCI report -----BEGIN PGP SIGNATURE----- iI0EABYKADUWIQQrtrZiYVkVzKQcYivNdxKlNrRb8AUCZ4pkVxccYWdvcmRlZXZA bGludXguaWJtLmNvbQAKCRDNdxKlNrRb8GulAQDg/7pCj1fXH5XKN9W16972OYQD pNwfCekw8suO8HUBCgEAkzdgTJC6/thifrnUt+Gj8HqASh//Qzw/6Q2Jk6595gk= =lE3V -----END PGP SIGNATURE----- Merge tag 's390-6.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Alexander Gordeev: - Select config option KASAN_VMALLOC if KASAN is enabled - Select config option VMAP_STACK unconditionally - Implement arch_atomic_inc() / arch_atomic_dec() functions which result in a single instruction if compiled for z196 or newer architectures - Make layering between atomic.h and atomic_ops.h consistent - Comment s390 preempt_count implementation - Remove pre MARCH_HAS_Z196_FEATURES preempt count implementation - GCC uses the number of lines of an inline assembly to calculate number of instructions and decide on inlining. Therefore remove superfluous new lines from a couple of inline assemblies. - Provide arch_atomic__and_test() implementations that allow the compiler to generate slightly better code. - Optimize __preempt_count_dec_and_test() - Remove __bootdata annotations from declarations in header files - Add missing include of <linux/smp.h> in abs_lowcore.h to provide declarations for get_cpu() and put_cpu() used in the code - Fix suboptimal kernel image base when running make kasan.config - Remove huge_pte_none() and huge_pte_none_mostly() as are identical to the generic variants - Remove unused PAGE_KERNEL_EXEC, SEGMENT_KERNEL_EXEC, and REGION3_KERNEL_EXEC defines - Simplify noexec page protection handling and change the page, segment and region3 protection definitions automatically if the instruction execution-protection facility is not available - Save one instruction and prefer EXRL instruction over EX in string, xor_(), amode31 and other functions - Create /dev/diag misc device to fetch diagnose specific information from the kernel and provide it to userspace - Retrieve electrical power readings using DIAGNOSE 0x324 ioctl - Make ccw_device_get_ciw() consistent and use array indices instead of pointer arithmetic - s390/qdio: Move memory alloc/pointer arithmetic for slib and sl into one place - The sysfs core now allows instances of 'struct bin_attribute' to be moved into read-only memory. Make use of that in s390 code - Add missing TLB range adjustment in pud_free_tlb() - Improve topology setup by adding early polarization detection - Fix length checks in codepage_convert() function - The generic bitops implementation is nearly identical to the s390 one. Switch to the generic variant and decrease a bit the kernel image size - Provide an optimized arch_test_bit() implementation which makes use of flag output constraint. This generates slightly better code - Provide memory topology information obtanied with DIAGNOSE 0x310 using ioctl. - Various other small improvements, fixes, and cleanups Also, some changes came in through a merge of 'pci-device-recovery' branch: - Add PCI error recovery status mechanism - Simplify and document debug_next_entry() logic - Split private data allocation and freeing out of debug file open() and close() operations - Add debug_dump() function that gets a textual representation of a debug info (e.g. PCI recovery hardware error logs) - Add formatted content of pci_debug_msg_id to the PCI report * tag 's390-6.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (48 commits) s390/futex: Fix FUTEX_OP_ANDN implementation s390/diag: Add memory topology information via diag310 s390/bitops: Provide optimized arch_test_bit() s390/bitops: Switch to generic bitops s390/ebcdic: Fix length decrement in codepage_convert() s390/ebcdic: Fix length check in codepage_convert() s390/ebcdic: Use exrl instead of ex s390/amode31: Use exrl instead of ex s390/stackleak: Use exrl instead of ex in __stackleak_poison() s390/lib: Use exrl instead of ex in xor functions s390/topology: Improve topology detection s390/tlb: Add missing TLB range adjustment s390/pkey: Constify 'struct bin_attribute' s390/sclp: Constify 'struct bin_attribute' s390/pci: Constify 'struct bin_attribute' s390/ipl: Constify 'struct bin_attribute' s390/crypto/cpacf: Constify 'struct bin_attribute' s390/qdio: Move memory alloc/pointer arithmetic for slib and sl into one place s390/cio: Use array indices instead of pointer arithmetic s390/qdio: Rename feature flag aif_osa to aif_qdio ...	2025-01-20 21:14:49 -08:00
Linus Torvalds	a312e1706c	for-6.14/io_uring-20250119 -----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmeNDEUQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgpl5hD/4t7kWWNQDeQG9CiA3QStMJ5Yow2AgYtK8f sJBr5/6PGEsbTreX//Kh8DtPZPRGcjG9elCo58QxWaPZ2mg3fTOR3/QYLMlaGXU2 hSht58lj32utpuzMjMo9bG3aesi03bLf+buaq7V1FaMlcTV8rXqK1s/HGtphDBRo 8tNLEk3JDJDs3vlWbNp/5Hqh9+Ro6DU8df1zWWH4Vbu8RXaGIPyJyjKvvcbfuuCf k7Ay45XNAmTZg+rSNGv1H3Yn1LNzPMVFLWBfzRahPCzlKy2+mJMWz1PWu9naaUK+ WTM+kgiBLF24k59G/9xuxC5bYtsTjTbr4GsEE5ZvFBnhKPzLzzaJj7iQHRj83vtv tqxNmAbA3wJoNk48Zr8+cYbfDX9Q9Pl32wIaS/LxRgF9MT4lem6pyKY7Skd12oK3 rnQ8moGtnOBxp3QUU6BZ7IX3ipb+Bgw7FhZbtVYJdlqKeKyi1QO0MuITwGXpMwk/ EWDDTsspIf+QaTu+fmO8byJavugKljW8t7hM1JpvlfOLl+rsh6/+AYz42fCvcaA0 Tu4bpUk8SuwALvZfU2R6bLkorGG6MFuGI8g3eixOcGir3YAcHBMfdg6ItpZi5qVt ToM87BMaezOZZvSwX1JBaQ0AR5HBQYmHaiLWgPsORf3PjJ0kz+u21SK9D+yJkUtU rT6+HvoVXA== =ufpE -----END PGP SIGNATURE----- Merge tag 'for-6.14/io_uring-20250119' of git://git.kernel.dk/linux Pull io_uring updates from Jens Axboe: "Not a lot in terms of features this time around, mostly just cleanups and code consolidation: - Support for PI meta data read/write via io_uring, with NVMe and SCSI covered - Cleanup the per-op structure caching, making it consistent across various command types - Consolidate the various user mapped features into a concept called regions, making the various users of that consistent - Various cleanups and fixes" * tag 'for-6.14/io_uring-20250119' of git://git.kernel.dk/linux: (56 commits) io_uring/fdinfo: fix io_uring_show_fdinfo() misuse of ->d_iname io_uring: reuse io_should_terminate_tw() for cmds io_uring: Factor out a function to parse restrictions io_uring/rsrc: require cloned buffers to share accounting contexts io_uring: simplify the SQPOLL thread check when cancelling requests io_uring: expose read/write attribute capability io_uring/rw: don't gate retry on completion context io_uring/rw: handle -EAGAIN retry at IO completion time io_uring/rw: use io_rw_recycle() from cleanup path io_uring/rsrc: simplify the bvec iter count calculation io_uring: ensure io_queue_deferred() is out-of-line io_uring/rw: always clear ->bytes_done on io_async_rw setup io_uring/rw: use NULL for rw->free_iovec assigment io_uring/rw: don't mask in f_iocb_flags io_uring/msg_ring: Drop custom destructor io_uring: Move old async data allocation helper to header io_uring/rw: Allocate async data through helper io_uring/net: Allocate msghdr async data through helper io_uring/uring_cmd: Allocate async data through generic helper io_uring/poll: Allocate apoll with generic alloc_cache helper ...	2025-01-20 20:27:33 -08:00
Linus Torvalds	1cbfb828e0	for-6.14/block-20250118 -----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmeL6hoQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgppw2EADQV8nDgLRggZR+il4U03yKHXcQEdAX1GrB Erowx+dasIJuh6kp3n6qRe9QD/pRqt1DKyLvXoWF8Qfuwq85j7oDnDDYxutNYT27 hDgrLJriJ3VeKYtTu+andHWt8P29b5h57UayInDOUJurEPA6rXyFZ5YVIti8n21K uDOrQXiACG3qRWS2+p2f3UNhX0MkFNFdN/lxi13WMIJtRWF5bXAP+JOgIWCID4Ze QuSY6rQD4dp4Q6M2erpX6tn0YZb7Hvw3rPjsd91n6jvYfTUVLH375zg8jCBpi6Wi Syufbb8xcTtriVPTDRNu0ekjebkc8wD8ax/h86g0z9v3Ua4DlNmsx9eXrtv6r5nu YXqDODOad6stI0+owFquW2vas0gHmfNSfyfGdlk2g24PMtP5Yx0V6FIEvwIeqnje ghgxQvBuKUsdhqakByfNnc+XvXi3+RUJek8kvMeUSUQWT1IyMQqPOOk0yp9WdyWD bY1f2ECP5BR1b37zYOyawewsI5xTupHUswn5a4r4qtGn3O15rGDkX98Nab5aLCnR rW/DvX7+wT6gW9EwrRHiwjwfNDZbsJ9Ggu3lMhtUl5GUWdk58yTiVgKaHJLnlX9/ CKFKfyyIR1Vl8+gYIpemyFhhcoN+dCSf06ISkrg0jeS0/tYwydaAaCBPL5J4kxZA h3Rtbh+Pgg== =EXYs -----END PGP SIGNATURE----- Merge tag 'for-6.14/block-20250118' of git://git.kernel.dk/linux Pull block updates from Jens Axboe: - NVMe pull requests via Keith: - Target support for PCI-Endpoint transport (Damien) - TCP IO queue spreading fixes (Sagi, Chaitanya) - Target handling for "limited retry" flags (Guixen) - Poll type fix (Yongsoo) - Xarray storage error handling (Keisuke) - Host memory buffer free size fix on error (Francis) - MD pull requests via Song: - Reintroduce md-linear (Yu Kuai) - md-bitmap refactor and fix (Yu Kuai) - Replace kmap_atomic with kmap_local_page (David Reaver) - Quite a few queue freeze and debugfs deadlock fixes Ming introduced lockdep support for this in the 6.13 kernel, and it has (unsurprisingly) uncovered quite a few issues - Use const attributes for IO schedulers - Remove bio ioprio wrappers - Fixes for stacked device atomic write support - Refactor queue affinity helpers, in preparation for better supporting isolated CPUs - Cleanups of loop O_DIRECT handling - Cleanup of BLK_MQ_F_* flags - Add rotational support for null_blk - Various fixes and cleanups * tag 'for-6.14/block-20250118' of git://git.kernel.dk/linux: (106 commits) block: Don't trim an atomic write block: Add common atomic writes enable flag md/md-linear: Fix a NULL vs IS_ERR() bug in linear_add() block: limit disk max sectors to (LLONG_MAX >> 9) block: Change blk_stack_atomic_writes_limits() unit_min check block: Ensure start sector is aligned for stacking atomic writes blk-mq: Move more error handling into blk_mq_submit_bio() block: Reorder the request allocation code in blk_mq_submit_bio() nvme: fix bogus kzalloc() return check in nvme_init_effects_log() md/md-bitmap: move bitmap_{start, end}write to md upper layer md/raid5: implement pers->bitmap_sector() md: add a new callback pers->bitmap_sector() md/md-bitmap: remove the last parameter for bimtap_ops->endwrite() md/md-bitmap: factor behind write counters out from bitmap_{start/end}write() md: Replace deprecated kmap_atomic() with kmap_local_page() md: reintroduce md-linear partitions: ldm: remove the initial kernel-doc notation blk-cgroup: rwstat: fix kernel-doc warnings in header file blk-cgroup: fix kernel-doc warnings in header file nbd: fix partial sending ...	2025-01-20 19:38:46 -08:00
Linus Torvalds	3d3a9c8b89	dlm for 6.14 - Fix a case where the new scanning code missed removing an unused rsb. - Fix the error when removing a configfs entry for an invalid node id. -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEcGkeEvkvjdvlR90nOBtzx/yAaaoFAmeOi6EACgkQOBtzx/yA aaqLPRAApqsWLMdaOejBY3B2KfFeSGqi+hcYO5fpjRgFocRCnG3q2aY1+lNEemDd 8EEvFGDzqCvFKbS+VGWjQ+ABiA8Zro4nqjuc6vW/EHivNsWiAgSqeSwQSG81v7x1 Ht2EkVg9UK2rRYb0Y4Y46XIiGU7Yd9q+bpv1nLtjLsFM+u7j3hC1IrK5Rl71JSYE ozhHIVkg5VxxNHjr3isc7kChIdYdRIX+xZm+YfAfC3/Z9YHcJAQ436RNvW/rDjJX iR/td4Z04tACwZVu46TDjHaLS5jQ/Lk/7Vk+FrliuXjTTcNbxM+MTX0uoxKUDJ5D JD1bMaiFsIvvd146wGk022iRTSUE27KnFJaknvV1njuvY3+jHhV9uDehl8Vurv7b GS2ZNajUmU/5Mv9MtfxfZNsH5cKQPMKhyKugt5gZhPFLnhf6APEz6htyZ+Sbmueb 8LMycO9SiIDiwOowS8leR0qmfI9k/11vwmO2vi0fbDDCSPTL8wq12JWmg/S+YXAp HuKNfpnCE6s+c+EB3y50C1jOvbHQ1u96FpdHyUzv1hDrGG9/w5JG95codZcXQ314 uA4uEQBpan7TDLaSlSccSXUcRilrYZ3eY94wKHlRuBhLswsAGAZDz4HmuMm7VPeR etiZRhehQYdHZs/+Zr5k5sn+AI8yDZKzl+mw55SGeij7ti0sXzY= =fSW0 -----END PGP SIGNATURE----- Merge tag 'dlm-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm Pull dlm updates from David Teigland: - Fix a case where the new scanning code missed removing an unused rsb - Fix the error when removing a configfs entry for an invalid node id * tag 'dlm-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm: dlm: return -ENOENT if no comm was found dlm: fix srcu_read_lock() return type to int dlm: fix removal of rsb struct that is master and dir record	2025-01-20 14:26:59 -08:00
Linus Torvalds	2622f29041	bcachefs updates for 6.14-rc1 Lots of scalability work, another big on disk format change. On disk format version goes from 1.13 to 1.20. Like 6.11, this is another big and expensive automatic/required on disk format upgrade. This is planned to be the last big on disk format upgrade before the experimental label comes off. There will be one more minor on disk format update for a few things that couldn't make this release. Headline improvements: - Fix mount time regression that some users encountered post the 6.11 disk accounting rewrite. Accounting keys were encoded little endian (typetag in the low bits) - which didn't anticipate adding accounting keys for every inode, which aren't stored in memory and we don't want to scan at mount time. - fsck time on large filesystems is improved by multiple orders of magnitude. Previously, 100TB was about the practical max filesystem size, where users were reporting fsck times of a day+. With the new changes (which nearly eliminate backpointers fsck overhead), we fsck'd a filesystem with 10PB of data in 1.5 hours. The problematic fsck passes were walking every extent and checking for missing backpointers, and walking every backpointer to check for dangling backpointers. As we've been adding more and more runtime self healing there was no reason to keep around the backpointers -> extents pass; dangling backpointers are just deleted, and we can do that when using them - thus, backpointers -> extents is now only run in debug mode. extents -> backpointers does need to exist, since missing backpointers would mean we can't find data to move it (for e.g. copygc, device evacuate, scrub). But the new on disk format version makes possible a new strategy where we sum up backpointers within a bucket and check it against the bucket sector counts, and then only scan for missing backpointers if the counts are off (and then, only for specific buckets). Full list of on disk format changes: - 1.14: backpointer_bucket_gen Backpointers now have a field for the bucket generation number, replacing the obsolete bucket_offset field. This is needed for the new "sum up backpointers within a bucket" code, since backpointers use the btree write buffer - meaning we will see stale reads, and this runs online, with the filesystem in full rw mode. - 1.15: disk_accounting_big_endian As previously described, fix the endianness of accounting keys so that accounting keys with the same typetag sort together, and accounting read can skip types it's not interested in. - 1.16: reflink_p_may_update_opts: This version indicates that a new reflink pointer field is understood and may be used; the field indicates whether the reflink pointer has permissions to update IO path options (e.g. compression, replicas) may be updated on the indirect extent it points to. This completes the rebalance/reflink data path option handling from the 6.13 pull request. - 1.17: inode_depth Add a new inode field, bi_depth, to accelerate the check_directory_structure fsck path, which checks for loops in the filesystem heirarchy. check_inodes and check_dirents check connectivity, so check_directory_structure only has to check for loops - by walking back up to the root from every directory. But a path can't be a loop if it has a counter that increases monotonically from root to leaf - adding a depth counter means that we can check for loops with only local (parent -> child) checks. We might need to occasionally renumber the depth field in fsck if directories have been moved around, but then future fsck runs will be much faster. - 1.18: persistent_inode_cursors Previously, the cursor used for inode allocation was only kept in memory, which meant that users with large filesystems and lots of files were reporting that the first create after mounting would take awhile - since it had to scan from the start. Inode allocation cursors are now persistent, and also include a generation field (incremented on wraparound, which will only happen if inode allocation is restricted to 32 bit inodes), so that we don't have to leave inode_generation keys around after a delete. The option for 32 bit inode numbers may now also be set on individual directories, and non-32 bit inode allocations are disallowed from allocating from the 32 bit part of the inode number space. - 1.19: autofix_errors Runtime self healing is now the default.o - 1.20: directory size (from Hongbo) directory i_size is now meaningful, and not 0. Release notes from the previous 6.13 pull request: - Self healing work: Allocator and reflink now run the exact same check/repair code that fsck does at runtime, where applicable. The long term goal here is to remove inconsistent() errors (that cause us to go emergency read only) by lifting fsck code up to normal runtime paths; we should only go emergency read-only if we detect an inconsistency that was due to a runtime bug - or truly catastrophic damage (corrupted btree roots/interior nodes). - Reflink repair no longer deletes reflink pointers: instead we flip an error bit and log the error, and they can still be deleted by file deletion. This means a temporary failure to find an indirect extent (perhaps repaired later by btree node scan) won't result in unnecessary data loss - Improvements to rebalance data path option handling: we can now correctly apply changed filesystem-level io path options to pending rebalance work, and soon we'll be able to apply file-level io path option changes to indirect extents. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmeOiboACgkQE6szbY3K bnY8zQ//Yoy+5ZA07tQV+Fi0JV0DZ6w3xotxNhAUeaKgCKHgp37gcKa47TFir4pd 6ha7PQV3GimFwHoIUfOY5X4Y+bEm16XblyfK3VU6IgGiE3cUg+1q8b8WrD2eHmLJ qIT8DWWpAM2AcZ/f5G37hH8pxn2t0TUuzJ1Sz7wEhJUNZEP+z+qaacnGhwuc8yQ3 Srj7Cc/NSd9T+6G2yKhERFITUrXmqVGgGihhVZqs0hCAPt8bwn5K8d1H2IKoj1N6 jJ3MQfmPIzUk0mfIjHrBlqrA+3tjtt5LGU+QpOWs8g509xHCP0BfGGOXQhjMjHVI JVSqAuIENK4V1ubz7BZcSoPAVncPeFl8Ly5Qdw5FlDBux9kKsch8wJPjn1A1gkPt Fb9VBTRkCK7WqUzkmbQh152SNC/0plb/8qFjywHNkvYyGMMlJME8zDIg40RN+0Ql ckXjlvdVGm0GbyM2GLth4gbOSXDzKrq12i3rWROnOLZ0Q2SBKfJe5K0UdRat1/nu 2sWWJNJqDzaaP1Gd/qk3Yht06GWnhI/17Bl/Znt5M8rxtSBbbxO58vi3gxasbccS l3qozuNouvAMNRBqE4ayVtjV+Aj69j1IBJnAfCareDDDf6ugjooLqu27BQkLOPg7 wswq633T6WG+UfQ44GvseiCaDW5MMh0aq7vxzjnBUoTz5usMfxg= =d0Zb -----END PGP SIGNATURE----- Merge tag 'bcachefs-2025-01-20.2' of git://evilpiepirate.org/bcachefs Pull bcachefs updates from Kent Overstreet: "Lots of scalability work, another big on-disk format change. On-disk format version goes from 1.13 to 1.20. Like 6.11, this is another big and expensive automatic/required on disk format upgrade. This is planned to be the last big on disk format upgrade before the experimental label comes off. There will be one more minor on disk format update for a few things that couldn't make this release. Headline improvements: - Self healing work: Allocator and reflink now run the exact same check/repair code that fsck does at runtime, where applicable. The long term goal here is to remove inconsistent() errors (that cause us to go emergency read only) by lifting fsck code up to normal runtime paths; we should only go emergency read-only if we detect an inconsistency that was due to a runtime bug - or truly catastrophic damage (corrupted btree roots/interior nodes). - Reflink repair no longer deletes reflink pointers: Instead we flip an error bit and log the error, and they can still be deleted by file deletion. This means a temporary failure to find an indirect extent (perhaps repaired later by btree node scan) won't result in unnecessary data loss - Improvements to rebalance data path option handling: We can now correctly apply changed filesystem-level io path options to pending rebalance work, and soon we'll be able to apply file-level io path option changes to indirect extents - Fix mount time regression that some users encountered post the 6.11 disk accounting rewrite. Accounting keys were encoded little endian (typetag in the low bits) - which didn't anticipate adding accounting keys for every inode, which aren't stored in memory and we don't want to scan at mount time. - fsck time on large filesystems is improved by multiple orders of magnitude. Previously, 100TB was about the practical max filesystem size, where users were reporting fsck times of a day+. With the new changes (which nearly eliminate backpointers fsck overhead), we fsck'd a filesystem with 10PB of data in 1.5 hours. The problematic fsck passes were walking every extent and checking for missing backpointers, and walking every backpointer to check for dangling backpointers. As we've been adding more and more runtime self healing there was no reason to keep around the backpointers -> extents pass; dangling backpointers are just deleted, and we can do that when using them - thus, backpointers -> extents is now only run in debug mode. extents -> backpointers does need to exist, since missing backpointers would mean we can't find data to move it (for e.g. copygc, device evacuate, scrub). But the new on disk format version makes possible a new strategy where we sum up backpointers within a bucket and check it against the bucket sector counts, and then only scan for missing backpointers if the counts are off (and then, only for specific buckets). Full list of on disk format changes: - 1.14: backpointer_bucket_gen Backpointers now have a field for the bucket generation number, replacing the obsolete bucket_offset field. This is needed for the new "sum up backpointers within a bucket" code, since backpointers use the btree write buffer - meaning we will see stale reads, and this runs online, with the filesystem in full rw mode. - 1.15: disk_accounting_big_endian As previously described, fix the endianness of accounting keys so that accounting keys with the same typetag sort together, and accounting read can skip types it's not interested in. - 1.16: reflink_p_may_update_opts: This version indicates that a new reflink pointer field is understood and may be used; the field indicates whether the reflink pointer has permissions to update IO path options (e.g. compression, replicas) may be updated on the indirect extent it points to. This completes the rebalance/reflink data path option handling from the 6.13 pull request. - 1.17: inode_depth Add a new inode field, bi_depth, to accelerate the check_directory_structure fsck path, which checks for loops in the filesystem heirarchy. check_inodes and check_dirents check connectivity, so check_directory_structure only has to check for loops - by walking back up to the root from every directory. But a path can't be a loop if it has a counter that increases monotonically from root to leaf - adding a depth counter means that we can check for loops with only local (parent -> child) checks. We might need to occasionally renumber the depth field in fsck if directories have been moved around, but then future fsck runs will be much faster. - 1.18: persistent_inode_cursors Previously, the cursor used for inode allocation was only kept in memory, which meant that users with large filesystems and lots of files were reporting that the first create after mounting would take awhile - since it had to scan from the start. Inode allocation cursors are now persistent, and also include a generation field (incremented on wraparound, which will only happen if inode allocation is restricted to 32 bit inodes), so that we don't have to leave inode_generation keys around after a delete. The option for 32 bit inode numbers may now also be set on individual directories, and non-32 bit inode allocations are disallowed from allocating from the 32 bit part of the inode number space. - 1.19: autofix_errors Runtime self healing is now the default.o - 1.20: directory size (from Hongbo) directory i_size is now meaningful, and not 0" * tag 'bcachefs-2025-01-20.2' of git://evilpiepirate.org/bcachefs: (268 commits) bcachefs: Fix check_inode_hash_info_matches_root() bcachefs: Document issue with bch_stripe layout bcachefs: Fix self healing on read error bcachefs: Pop all the transactions from the abort one bcachefs: Only abort the transactions in the cycle bcachefs: Introduce lock_graph_pop_from bcachefs: Convert open-coded lock_graph_pop_all to helper bcachefs: Do not allow no fail lock request to fail bcachefs: Merge the condition to avoid additional invocation Revert "bcachefs: Fix bch2_btree_node_upgrade()" bcachefs: bcachefs_metadata_version_directory_size bcachefs: make directory i_size meaningful bcachefs: check_unreachable_inodes is not actually PASS_ONLINE yet bcachefs: Don't use BTREE_ITER_cached when walking alloc btree during fsck bcachefs: Check for dirents to overwritten inodes bcachefs: bch2_btree_iter_peek_slot() handles navigating to nonexistent depth bcachefs: Don't set btree_path to updtodate if we don't fill bcachefs: __bch2_btree_pos_to_text() bcachefs: printbuf_reset() handles tabstops bcachefs: Silence read-only errors when deleting snapshots ...	2025-01-20 13:55:19 -08:00
Linus Torvalds	5d8a4bd6b2	pstore updates for v6.14-rc1 - pstore/blk: trivial typo fixes (Eugen Hristev) - pstore/zone: reject zero-sized allocations (Eugen Hristev) -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCZ4hRFwAKCRA2KwveOeQk u9nXAQCieFhSxc1peuUzbN2HENIIu2S/agsE9CIOQrnZEFVixAD/bIj5EMfw4sF0 T7v140vGsk7Bg9apbWhnGdoI/vzzcQg= =j5zd -----END PGP SIGNATURE----- Merge tag 'pstore-v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull pstore updates from Kees Cook: - pstore/blk: trivial typo fixes (Eugen Hristev) - pstore/zone: reject zero-sized allocations (Eugen Hristev) * tag 'pstore-v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: pstore/zone: avoid dereferencing zero sized ptr after init zones pstore/blk: trivial typo fixes	2025-01-20 13:37:14 -08:00
Linus Torvalds	fadc3ed9ce	execve updates for v6.14-rc1 - exec: fix up /proc/pid/comm in the execveat(AT_EMPTY_PATH) case (Tycho Andersen, Kees Cook) - binfmt_misc: Fix comment typos (Christophe JAILLET) - exec: move empty argv[0] warning closer to actual logic (Nir Lichtman) - exec: remove legacy custom binfmt modules autoloading (Nir Lichtman) - binfmt_flat: Fix integer overflow bug on 32 bit systems (Dan Carpenter) - exec: Make sure set_task_comm() always NUL-terminates - coredump: Do not lock when copying "comm" - MAINTAINERS: add auxvec.h and set myself as maintainer -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCZ4hNmQAKCRA2KwveOeQk u0/nAQCTGU0zqhdO6t7ABsL3p9kJ2jVRA5njAoX7A/9jGPSWEQD/boRMqZuUpthV nMevcQ2F4u0A7kJJBMK05YdXWHkYqgk= =49Di -----END PGP SIGNATURE----- Merge tag 'execve-v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull execve updates from Kees Cook: - fix up /proc/pid/comm in the execveat(AT_EMPTY_PATH) case (Tycho Andersen, Kees Cook) - binfmt_misc: Fix comment typos (Christophe JAILLET) - move empty argv[0] warning closer to actual logic (Nir Lichtman) - remove legacy custom binfmt modules autoloading (Nir Lichtman) - Make sure set_task_comm() always NUL-terminates - binfmt_flat: Fix integer overflow bug on 32 bit systems (Dan Carpenter) - coredump: Do not lock when copying "comm" - MAINTAINERS: add auxvec.h and set myself as maintainer * tag 'execve-v6.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: binfmt_flat: Fix integer overflow bug on 32 bit systems selftests/exec: add a test for execveat()'s comm exec: fix up /proc/pid/comm in the execveat(AT_EMPTY_PATH) case exec: Make sure task->comm is always NUL-terminated exec: remove legacy custom binfmt modules autoloading exec: move warning of null argv to be next to the relevant code fs: binfmt: Fix a typo MAINTAINERS: exec: Mark Kees as maintainer MAINTAINERS: exec: Add auxvec.h UAPI coredump: Do not lock during 'comm' reporting	2025-01-20 13:27:58 -08:00
Linus Torvalds	0eb4aaa230	for-6.14-tag -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmeHvVQACgkQxWXV+ddt WDsJ6w//cPqI8tf3kMxurZcG7clJRIIISotPrC6hm3UDNpJLa7HDaVJ50FAoIhMV sB4RQNZky4mfB6ypXxmETzV3ZHvP0+oFgRs72Ommi0ZbdnBgxhaUTrDXLKl52o4r UoeqvRKReEYOesN09rPXYPwytUOkxHU/GjNzv7bC/Tzvq/xKaIN5qMYZwkHtJ8PK JtCFypfbmDPNDJz37l0BhRya2oMtpcUtxM9uP8RWVuQtaELgjcy56W/+osoyJTy9 FSKaoWUPsDVDufnILlGR8Kub2Z5mcISVqyARUdr/q3j5CDfyTdQvahmUy7sHgUAe HGh5QBdRJu1QTvdZw+nK4YCaYpK6Nj4liDtO1cwVitde5RXsJrt6kYBLlY/kU2Qr KODOloM/zVKxULR0ARl11NULZquUsczP6Wxfn+dtyDJ3JGlY9OcuESmorHoUtkMX 75Tj1AtRMNcfZAE2HquL1Oz3bIMcg4btDJsC+9Yp5K11SP12XpOwC42k/9Bx3iBe Iki0BSuppFqX5MMY3OEWzD1pz2vOGYR8ISD6EIsjpjl2vBeRwydaCCZfuszSC7gl Y4goSdwFMPVlqllL1h27XUjKVXvttCqqdB6P28MbvZKnFAPlm189BJQZC5cbHAJU ceBww5PvI9QxnJnFG5iOLcnko6liUWPP9l2c5LLtUsJIi8B5Hu0= =SXLv -----END PGP SIGNATURE----- Merge tag 'for-6.14-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs updates from David Sterba: "User visible changes, features: - rebuilding of the free space tree at mount time is done in more transactions, fix potential hangs when the transaction thread is blocked due to large amount of block groups - more read IO balancing strategies (experimental config), add two new ways how to select a device for read if the profiles allow that (all RAID1), the current default selects the device by pid which is good on average but less performant for single reader workloads - select preferred device for all reads (namely for testing) - round-robin, balance reads across devices relevant for the requested IO range - add encoded write ioctl support to io_uring (read was added in 6.12), basis for writing send stream using that instead of syscalls, non-blocking mode is not yet implemented - support FS_IOC_READ_VERITY_METADATA, applications can use the metadata to do their own verification - pass inode's i_write_hint to bios, for parity with other filesystems, ioctls F_GET_RW_HINT/F_SET_RW_HINT Core: - in zoned mode: allow to directly reclaim a block group by simply resetting it, then it can be reused and another block group does not need to be allocated - super block validation now also does more comprehensive sys array validation, adding it to the points where superblock is validated (post-read, pre-write) - subpage mode fixes: - fix double accounting of blocks due to some races - improved or fixed error handling in a few cases (compression, delalloc) - raid stripe tree: - fix various cases with extent range splitting or deleting - implement hole punching to extent range - reduce number of stripe tree lookups during bio submission - more self-tests - updated self-tests (delayed refs) - error handling improvements - cleanups, refactoring - remove rest of backref caching infrastructure from relocation, not needed anymore - error message updates - remove unnecessary calls when extent buffer was marked dirty - unused parameter removal - code moved to new files Other code changes: add rb_find_add_cached() to the rb-tree API" tag 'for-6.14-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (127 commits) btrfs: selftests: add a selftest for deleting two out of three extents btrfs: selftests: add test for punching a hole into 3 RAID stripe-extents btrfs: selftests: add selftest for punching holes into the RAID stripe extents btrfs: selftests: test RAID stripe-tree deletion spanning two items btrfs: selftests: don't split RAID extents in half btrfs: selftests: check for correct return value of failed lookup btrfs: don't use btrfs_set_item_key_safe on RAID stripe-extents btrfs: implement hole punching for RAID stripe extents btrfs: fix deletion of a range spanning parts two RAID stripe extents btrfs: fix tail delete of RAID stripe-extents btrfs: fix front delete range calculation for RAID stripe extents btrfs: assert RAID stripe-extent length is always greater than 0 btrfs: don't try to delete RAID stripe-extents if we don't need to btrfs: selftests: correct RAID stripe-tree feature flag setting btrfs: add io_uring interface for encoded writes btrfs: remove the unused locked_folio parameter from btrfs_cleanup_ordered_extents() btrfs: add extra error messages for delalloc range related errors btrfs: subpage: dump the involved bitmap when ASSERT() failed btrfs: subpage: fix the bitmap dump of the locked flags btrfs: do proper folio cleanup when run_delalloc_nocow() failed ...	2025-01-20 13:09:30 -08:00
Linus Torvalds	1851bccf60	gfs2 changes - In the quota code, to avoid spurious audit messages, don't call capable() when quotas are off. - When changing the 'j' flag of an inode, truncate the inode address space to avoid mixing "buffer head" and "iomap" pages. -----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEEJZs3krPW0xkhLMTc1b+f6wMTZToFAmeOWTkUHGFncnVlbmJh QHJlZGhhdC5jb20ACgkQ1b+f6wMTZTqMEBAAh5bu0h7611UZTsYkDvGClnxc7oBo j/0JCMyeHDvIRaGBFnzFFx1QNXW9ChXy/FgocQED13LPaiZ7kuFjwQPJoW9TH440 QrWa8xpxiiSz232maP0wQ7Y6K7EW7GW9tvCrqXj64PGb56TWQu+vEcACUtL6sN3V +Xli2TljZNDwGsBQygiGvUR2ICfipFNUoV4Yrxv15WdOM1cQmF9F5P7SwFe9mCBh tjF/D/vxxmQwR5njalnF0oTFSNlQmYpuaLPzZMdOnsFbwyFda/DncdLbbl9LsmMF +C7zzAC3gY8Iq6K4azDE1SI6Gh5JGxA8lcIHtDVbUoFJwHVV/Jg8kWGPRcTuvKs+ LL8moxut6Id6HmPDmJA2tjjpKYfbnGstNdUIbozNhV5A634AmlbaqA+PwFxDcNs4 JZdbK4tPSVV7fzodQHZg0vewB+E49yBsCtZ+ows27MzgQFYWKrcngkf/Twn+e5F+ s59cFi31KzgaLMMCelkDlFwg5Dp8QDYAKZ0UYknU/rVoHFERGCrBn1QwDCk8aMa3 /IeCQGDu3ry7kpMXGWXIRu4Bmf/k1J89H2UjMdBMTquqzU6QYcKdzvE6JLt8GHTK buE/D1y9t2wxERKBmjHb6KGUcso7RCVwT48cvEPmj+7OdynBPufDGPP666gpsA46 qZNhpfMnS+f1zrQ= =0Cc4 -----END PGP SIGNATURE----- Merge tag 'gfs2-for-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 updates from Andreas Gruenbacher: - In the quota code, to avoid spurious audit messages, don't call capable() when quotas are off - When changing the 'j' flag of an inode, truncate the inode address space to avoid mixing "buffer head" and "iomap" pages * tag 'gfs2-for-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: gfs2: Truncate address space when flipping GFS2_DIF_JDATA flag gfs2: reorder capability check last	2025-01-20 13:06:28 -08:00
Linus Torvalds	b971424b6e	vfs-6.14-rc1.afs -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZ4pS3AAKCRCRxhvAZXjc okSwAPkB8Ra+oTplB/yzmab5kFB0+IUSHAiBfG6TCYb45op7wgEAs4+ignZkb+Bi PsrfV7soiTGNUYSDVKOw7LS6PJEzkgA= =3mcq -----END PGP SIGNATURE----- Merge tag 'vfs-6.14-rc1.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull afs updates from Christian Brauner: "Dynamic root improvements: - Create an /afs/.<cell> mountpoint to match the /afs/<cell> mountpoint when a cell is created - Add some more checks on cell names proposed by the user to prevent dodgy symlink bodies from being created. Also prevent rootcell from being altered once set to simplify the locking - Change the handling of /afs/@cell from being a dentry name substitution at lookup time to making it a symlink to the current cell name and also provide a /afs/.@cell symlink to point to the dotted cell mountpoint Fixes: - Fix the abort code check in the fallback handling for the YFS.RemoveFile2 RPC call - Use call->op->server() for oridnary filesystem RPC calls that have an operation descriptor instead of call->server()" * tag 'vfs-6.14-rc1.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: afs: Fix the fallback handling for the YFS.RemoveFile2 RPC call afs: Make /afs/@cell and /afs/.@cell symlinks afs: Add rootcell checks afs: Make /afs/.<cell> as well as /afs/<cell> mountpoints	2025-01-20 11:40:48 -08:00
Linus Torvalds	47c9f2b3c8	vfs-6.14-rc1.statx.dio -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZ4pSTwAKCRCRxhvAZXjc oiSbAQCIWp8Jm2FX9Mv+eX8erLFlyQSDQauAnqPtW/SvMbpgFgEAuZOTSRU3GSqI NiowpYms9OckO638GlNHlSTUTcV4YwU= =obHT -----END PGP SIGNATURE----- Merge tag 'vfs-6.14-rc1.statx.dio' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs direct-io updates from Christian Brauner: "File systems that write out of place usually require different alignment for direct I/O writes than what they can do for reads. Add a separate dio read align field to statx, as many out of place write file systems can easily do reads aligned to the device sector size, but require bigger alignment for writes. This is usually papered over by falling back to buffered I/O for smaller writes and doing read-modify-write cycles, but performance for this sucks, so applications benefit from knowing the actual write alignment" * tag 'vfs-6.14-rc1.statx.dio' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: xfs: report larger dio alignment for COW inodes xfs: report the correct read/write dio alignment for reflinked inodes xfs: cleanup xfs_vn_getattr fs: add STATX_DIO_READ_ALIGN fs: reformat the statx definition	2025-01-20 11:16:50 -08:00
Linus Torvalds	7e587c20ad	vfs-6.14-rc1.libfs -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZ4pSLQAKCRCRxhvAZXjc oq92AP4qTO8+FFRok2nhHlK4YNPhiqni1KabYXuHakL1ESw8OQD+O1wLgw8FUkgv jxi+KmxMz9Asg2wdnLrSGEZJ709eOgc= =6dn7 -----END PGP SIGNATURE----- Merge tag 'vfs-6.14-rc1.libfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs libfs updates from Christian Brauner: "This improves the stable directory offset behavior in various ways. Stable offsets are needed so that NFS can reliably read directories on filesystems such as tmpfs: - Improve the end-of-directory detection According to getdents(3), the d_off field in each returned directory entry points to the next entry in the directory. The d_off field in the last returned entry in the readdir buffer must contain a valid offset value, but if it points to an actual directory entry, then readdir/getdents can loop. Introduce a specific fixed offset value that is placed in the d_off field of the last entry in a directory. Some user space applications assume that the EOD offset value is larger than the offsets of real directory entries, so the largest valid offset value is reserved for this purpose. This new value is never allocated by simple_offset_add(). When ->iterate_dir() returns, getdents{64} inserts the ctx->pos value into the d_off field of the last valid entry in the readdir buffer. When it hits EOD, offset_readdir() sets ctx->pos to the EOD offset value so the last entry is updated to point to the EOD marker. When trying to read the entry at the EOD offset, offset_readdir() terminates immediately. - Rely on d_children to iterate stable offset directories Instead of using the mtree to emit entries in the order of their offset values, use it only to map incoming ctx->pos to a starting entry. Then use the directory's d_children list, which is already maintained properly by the dcache, to find the next child to emit. - Narrow the range of directory offset values returned by simple_offset_add() to 3 .. (S32_MAX - 1) on all platforms. This means the allocation behavior is identical on 32-bit systems, 64-bit systems, and 32-bit user space on 64-bit kernels. The new range still permits over 2 billion concurrent entries per directory. - Return ENOSPC when the directory offset range is exhausted. Hitting this error is almost impossible though. - Remove the simple_offset_empty() helper" * tag 'vfs-6.14-rc1.libfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: libfs: Use d_children list to iterate simple_offset directories libfs: Replace simple_offset end-of-directory detection Revert "libfs: fix infinite directory reads for offset dir" Revert "libfs: Add simple_offset_empty()" libfs: Return ENOSPC when the directory offset range is exhausted	2025-01-20 11:00:53 -08:00
Linus Torvalds	100ceb4817	vfs-6.14-rc1.mount.v2 -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZ44+LwAKCRCRxhvAZXjc orNaAQCGDqtxgqgGLsdx9dw7yTxOm9opYBaG5qN7KiThLAz2PwD+MsHNNlLVEOKU IQo9pa23UFUhTipFSeszOWza5SGlxg4= =hdst -----END PGP SIGNATURE----- Merge tag 'vfs-6.14-rc1.mount.v2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs mount updates from Christian Brauner: - Add a mountinfo program to demonstrate statmount()/listmount() Add a new "mountinfo" sample userland program that demonstrates how to use statmount() and listmount() to get at the same info that /proc/pid/mountinfo provides - Remove pointless nospec.h include - Prepend statmount.mnt_opts string with security_sb_mnt_opts() Currently these mount options aren't accessible via statmount() - Add new mount namespaces to mount namespace rbtree outside of the namespace semaphore - Lockless mount namespace lookup Currently we take the read lock when looking for a mount namespace to list mounts in. We can make this lockless. The simple search case can just use a sequence counter to detect concurrent changes to the rbtree For walking the list of mount namespaces sequentially via nsfs we keep a separate rcu list as rb_prev() and rb_next() aren't usable safely with rcu. Currently there is no primitive for retrieving the previous list member. To do this we need a new deletion primitive that doesn't poison the prev pointer and a corresponding retrieval helper Since creating mount namespaces is a relatively rare event compared with querying mounts in a foreign mount namespace this is worth it. Once libmount and systemd pick up this mechanism to list mounts in foreign mount namespaces this will be used very frequently - Add extended selftests for lockless mount namespace iteration - Add a sample program to list all mounts on the system, i.e., in all mount namespaces - Improve mount namespace iteration performance Make finding the last or first mount to start iterating the mount namespace from an O(1) operation and add selftests for iterating the mount table starting from the first and last mount - Use an xarray for the old mount id While the ida does use the xarray internally we can use it explicitly which allows us to increment the unique mount id under the xa lock. This allows us to remove the atomic as we're now allocating both ids in one go - Use a shared header for vfs sample programs - Fix build warnings for new sample program to list all mounts * tag 'vfs-6.14-rc1.mount.v2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: samples/vfs: fix build warnings samples/vfs: use shared header samples/vfs/mountinfo: Use __u64 instead of uint64_t fs: remove useless lockdep assertion fs: use xarray for old mount id selftests: add listmount() iteration tests fs: cache first and last mount samples: add test-list-all-mounts selftests: remove unneeded include selftests: add tests for mntns iteration seltests: move nsfs into filesystems subfolder fs: simplify rwlock to spinlock fs: lockless mntns lookup for nsfs rculist: add list_bidir_{del,prev}_rcu() fs: lockless mntns rbtree lookup fs: add mount namespace to rbtree late fs: prepend statmount.mnt_opts string with security_sb_mnt_opts() mount: remove inlude/nospec.h include samples: add a mountinfo program to demonstrate statmount()/listmount()	2025-01-20 10:44:51 -08:00
Linus Torvalds	1a89a6924b	kernel-6.14-rc1.pid -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZ4pR0wAKCRCRxhvAZXjc ojb2AQD5QfpTEX/ju1TkenTvoNl+JfnIjaVSY40Lm9DWYzmCMAEAuRvf5WRIV713 00/RVOrUvsLobzhmnk0yw53EQ5A+pA0= =2NDA -----END PGP SIGNATURE----- Merge tag 'kernel-6.14-rc1.pid' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull pid_max namespacing update from Christian Brauner: "The pid_max sysctl is a global value. For a long time the default value has been 65535 and during the pidfd dicussions Linus proposed to bump pid_max by default. Based on this discussion systemd started bumping pid_max to 2^22. So all new systems now run with a very high pid_max limit with some distros having also backported that change. The decision to bump pid_max is obviously correct. It just doesn't make a lot of sense nowadays to enforce such a low pid number. There's sufficient tooling to make selecting specific processes without typing really large pid numbers available. In any case, there are workloads that have expections about how large pid numbers they accept. Either for historical reasons or architectural reasons. One concreate example is the 32-bit version of Android's bionic libc which requires pid numbers less than 65536. There are workloads where it is run in a 32-bit container on a 64-bit kernel. If the host has a pid_max value greater than 65535 the libc will abort thread creation because of size assumptions of pthread_mutex_t. That's a fairly specific use-case however, in general specific workloads that are moved into containers running on a host with a new kernel and a new systemd can run into issues with large pid_max values. Obviously making assumptions about the size of the allocated pid is suboptimal but we have userspace that does it. Of course, giving containers the ability to restrict the number of processes in their respective pid namespace indepent of the global limit through pid_max is something desirable in itself and comes in handy in general. Independent of motivating use-cases the existence of pid namespaces makes this also a good semantical extension and there have been prior proposals pushing in a similar direction. The trick here is to minimize the risk of regressions which I think is doable. The fact that pid namespaces are hierarchical will help us here. What we mostly care about is that when the host sets a low pid_max limit, say (crazy number) 100 that no descendant pid namespace can allocate a higher pid number in its namespace. Since pid allocation is hierarchial this can be ensured by checking each pid allocation against the pid namespace's pid_max limit. This means if the allocation in the descendant pid namespace succeeds, the ancestor pid namespace can reject it. If the ancestor pid namespace has a higher limit than the descendant pid namespace the descendant pid namespace will reject the pid allocation. The ancestor pid namespace will obviously not care about this. All in all this means pid_max continues to enforce a system wide limit on the number of processes but allows pid namespaces sufficient leeway in handling workloads with assumptions about pid values and allows containers to restrict the number of processes in a pid namespace through the pid_max interface" * tag 'kernel-6.14-rc1.pid' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: tests/pid_namespace: add pid_max tests pid: allow pid_max to be set per pid namespace	2025-01-20 10:29:11 -08:00
Linus Torvalds	37c12fcb3c	kernel-6.14-rc1.cred -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZ4pRuAAKCRCRxhvAZXjc okEiAP4wZOkUGX+d3FUXxM1DJfCsBssoYh01S4LE+s+hkq81vgD8D7PRZk7d12Jw zaS6/cLt12UDz1v6Ez103S9AQ5E6ywg= =Sknj -----END PGP SIGNATURE----- Merge tag 'kernel-6.14-rc1.cred' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull cred refcount updates from Christian Brauner: "For the v6.13 cycle we switched overlayfs to a variant of override_creds() that doesn't take an extra reference. To this end the {override,revert}_creds_light() helpers were introduced. This generalizes the idea behind {override,revert}_creds_light() to the {override,revert}_creds() helpers. Afterwards overriding and reverting credentials is reference count free unless the caller explicitly takes a reference. All callers have been appropriately ported" * tag 'kernel-6.14-rc1.cred' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (30 commits) cred: fold get_new_cred_many() into get_cred_many() cred: remove unused get_new_cred() nfsd: avoid pointless cred reference count bump cachefiles: avoid pointless cred reference count bump dns_resolver: avoid pointless cred reference count bump trace: avoid pointless cred reference count bump cgroup: avoid pointless cred reference count bump acct: avoid pointless reference count bump io_uring: avoid pointless cred reference count bump smb: avoid pointless cred reference count bump cifs: avoid pointless cred reference count bump cifs: avoid pointless cred reference count bump ovl: avoid pointless cred reference count bump open: avoid pointless cred reference count bump nfsfh: avoid pointless cred reference count bump nfs/nfs4recover: avoid pointless cred reference count bump nfs/nfs4idmap: avoid pointless reference count bump nfs/localio: avoid pointless cred reference count bumps coredump: avoid pointless cred reference count bump binfmt_misc: avoid pointless cred reference count bump ...	2025-01-20 10:13:06 -08:00
Linus Torvalds	5f85bd6aec	vfs-6.14-rc1.pidfs -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZ4pRdwAKCRCRxhvAZXjc otQjAP9ooUH2d/jHZ49Rw4q/3BkhX8R2fFEZgj2PMvtYlr0jQwD/d8Ji0k4jINTL AIFRfPdRwrD+X35IUK3WPO42YFZ4rAg= =5wgo -----END PGP SIGNATURE----- Merge tag 'vfs-6.14-rc1.pidfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull pidfs updates from Christian Brauner: - Rework inode number allocation Recently we received a patchset that aims to enable file handle encoding and decoding via name_to_handle_at(2) and open_by_handle_at(2). A crucical step in the patch series is how to go from inode number to struct pid without leaking information into unprivileged contexts. The issue is that in order to find a struct pid the pid number in the initial pid namespace must be encoded into the file handle via name_to_handle_at(2). This can be used by containers using a separate pid namespace to learn what the pid number of a given process in the initial pid namespace is. While this is a weak information leak it could be used in various exploits and in general is an ugly wart in the design. To solve this problem a new way is needed to lookup a struct pid based on the inode number allocated for that struct pid. The other part is to remove the custom inode number allocation on 32bit systems that is also an ugly wart that should go away. Allocate unique identifiers for struct pid by simply incrementing a 64 bit counter and insert each struct pid into the rbtree so it can be looked up to decode file handles avoiding to leak actual pids across pid namespaces in file handles. On both 64 bit and 32 bit the same 64 bit identifier is used to lookup struct pid in the rbtree. On 64 bit the unique identifier for struct pid simply becomes the inode number. Comparing two pidfds continues to be as simple as comparing inode numbers. On 32 bit the 64 bit number assigned to struct pid is split into two 32 bit numbers. The lower 32 bits are used as the inode number and the upper 32 bits are used as the inode generation number. Whenever a wraparound happens on 32 bit the 64 bit number will be incremented by 2 so inode numbering starts at 2 again. When a wraparound happens on 32 bit multiple pidfds with the same inode number are likely to exist. This isn't a problem since before pidfs pidfds used the anonymous inode meaning all pidfds had the same inode number. On 32 bit sserspace can thus reconstruct the 64 bit identifier by retrieving both the inode number and the inode generation number to compare, or use file handles. This gives the same guarantees on both 32 bit and 64 bit. - Implement file handle support This is based on custom export operation methods which allows pidfs to implement permission checking and opening of pidfs file handles cleanly without hacking around in the core file handle code too much. - Support bind-mounts Allow bind-mounting pidfds. Similar to nsfs let's allow bind-mounts for pidfds. This allows pidfds to be safely recovered and checked for process recycling. Instead of checking d_ops for both nsfs and pidfs we could in a follow-up patch add a flag argument to struct dentry_operations that functions similar to file_operations->fop_flags. * tag 'vfs-6.14-rc1.pidfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: selftests: add pidfd bind-mount tests pidfs: allow bind-mounts pidfs: lookup pid through rbtree selftests/pidfd: add pidfs file handle selftests pidfs: check for valid ioctl commands pidfs: implement file handle support exportfs: add permission method fhandle: pull CAP_DAC_READ_SEARCH check into may_decode_fh() exportfs: add open method fhandle: simplify error handling pseudofs: add support for export_ops pidfs: support FS_IOC_GETVERSION pidfs: remove 32bit inode number handling pidfs: rework inode number allocation	2025-01-20 09:59:00 -08:00
Linus Torvalds	4b84a4c8d4	vfs-6.14-rc1.misc -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZ4pRjQAKCRCRxhvAZXjc omUyAP9k31Qr7RY1zNtmpPfejqc+3Xx+xXD7NwHr+tONWtUQiQEA/F94qU2U3ivS AzyDABWrEQ5ZNsm+Rq2Y3zyoH7of3ww= =s3Bu -----END PGP SIGNATURE----- Merge tag 'vfs-6.14-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull misc vfs updates from Christian Brauner: "Features: - Support caching symlink lengths in inodes The size is stored in a new union utilizing the same space as i_devices, thus avoiding growing the struct or taking up any more space When utilized it dodges strlen() in vfs_readlink(), giving about 1.5% speed up when issuing readlink on /initrd.img on ext4 - Add RWF_DONTCACHE iocb and FOP_DONTCACHE file_operations flag If a file system supports uncached buffered IO, it may set FOP_DONTCACHE and enable support for RWF_DONTCACHE. If RWF_DONTCACHE is attempted without the file system supporting it, it'll get errored with -EOPNOTSUPP - Enable VBOXGUEST and VBOXSF_FS on ARM64 Now that VirtualBox is able to run as a host on arm64 (e.g. the Apple M3 processors) we can enable VBOXSF_FS (and in turn VBOXGUEST) for this architecture. Tested with various runs of bonnie++ and dbench on an Apple MacBook Pro with the latest Virtualbox 7.1.4 r165100 installed Cleanups: - Delay sysctl_nr_open check in expand_files() - Use kernel-doc includes in fiemap docbook - Use page->private instead of page->index in watch_queue - Use a consume fence in mnt_idmap() as it's heavily used in link_path_walk() - Replace magic number 7 with ARRAY_SIZE() in fc_log - Sort out a stale comment about races between fd alloc and dup2() - Fix return type of do_mount() from long to int - Various cosmetic cleanups for the lockref code Fixes: - Annotate spinning as unlikely() in __read_seqcount_begin The annotation already used to be there, but got lost in commit `52ac39e5db` ("seqlock: seqcount_t: Implement all read APIs as statement expressions") - Fix proc_handler for sysctl_nr_open - Flush delayed work in delayed fput() - Fix grammar and spelling in propagate_umount() - Fix ESP not readable during coredump In /proc/PID/stat, there is the kstkesp field which is the stack pointer of a thread. While the thread is active, this field reads zero. But during a coredump, it should have a valid value However, at the moment, kstkesp is zero even during coredump - Don't wake up the writer if the pipe is still full - Fix unbalanced user_access_end() in select code" * tag 'vfs-6.14-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (28 commits) gfs2: use lockref_init for qd_lockref erofs: use lockref_init for pcl->lockref dcache: use lockref_init for d_lockref lockref: add a lockref_init helper lockref: drop superfluous externs lockref: use bool for false/true returns lockref: improve the lockref_get_not_zero description lockref: remove lockref_put_not_zero fs: Fix return type of do_mount() from long to int select: Fix unbalanced user_access_end() vbox: Enable VBOXGUEST and VBOXSF_FS on ARM64 pipe_read: don't wake up the writer if the pipe is still full selftests: coredump: Add stackdump test fs/proc: do_task_stat: Fix ESP not readable during coredump fs: add RWF_DONTCACHE iocb and FOP_DONTCACHE file_operations flag fs: sort out a stale comment about races between fd alloc and dup2 fs: Fix grammar and spelling in propagate_umount() fs: fc_log replace magic number 7 with ARRAY_SIZE() fs: use a consume fence in mnt_idmap() file: flush delayed work in delayed fput() ...	2025-01-20 09:40:49 -08:00
Linus Torvalds	d582952424	vfs-6.14-rc1.kcore -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZ4pRDgAKCRCRxhvAZXjc ojt3AQCY/X9EHTeiJ/1eBZd/mopcu6ftyjcVpiCIZzqXnr6DKAD+Odb/8C7/Axlg A/ne6RjV4+DXOz8qJpaRAu4aV2zyMAs= =xDe5 -----END PGP SIGNATURE----- Merge tag 'vfs-6.14-rc1.kcore' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull /proc/kcore updates from Christian Brauner: "The performance of /proc/kcore reads has been showing up as a bottleneck for the drgn debugger. drgn scripts often spend ~25% of their time in the kernel reading from /proc/kcore. A lot of this overhead comes from silly inefficiencies. This pull request contains fixes for the low-hanging fruit. The fixes are all fairly small and straightforward. The result is a 25% improvement in read latency in micro-benchmarks (from ~235 nanoseconds to ~175) and a 15% improvement in execution time for real-world drgn scripts: - Make /proc/kcore entry permanent - Avoid walking the list on every read - Use percpu_rw_semaphore for kclist_lock - Make Omar Sandoval the official maintainer for /proc/kcore" * tag 'vfs-6.14-rc1.kcore' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: MAINTAINERS: add me as /proc/kcore maintainer proc/kcore: use percpu_rw_semaphore for kclist_lock proc/kcore: don't walk list on every read proc/kcore: mark proc entry as permanent	2025-01-20 09:36:55 -08:00
Linus Torvalds	ca56a74a31	vfs-6.14-rc1.netfs -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZ4pRKQAKCRCRxhvAZXjc ov2dAQCULWjTBWdF8Ro2bfNeXzWvUUnSPjoLJ9B4xlrOB9c2MAEAiwkKHkzAxUco hCvaRJc3H2ze2wrgbIABPKB2noQVVwk= =4ojv -----END PGP SIGNATURE----- Merge tag 'vfs-6.14-rc1.netfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs netfs updates from Christian Brauner: "This contains read performance improvements and support for monolithic single-blob objects that have to be read/written as such (e.g. AFS directory contents). The implementation of the two parts is interwoven as each makes the other possible. - Read performance improvements The read performance improvements are intended to speed up some loss of performance detected in cifs and to a lesser extend in afs. The problem is that we queue too many work items during the collection of read results: each individual subrequest is collected by its own work item, and then they have to interact with each other when a series of subrequests don't exactly align with the pattern of folios that are being read by the overall request. Whilst the processing of the pages covered by individual subrequests as they complete potentially allows folios to be woken in parallel and with minimum delay, it can shuffle wakeups for sequential reads out of order - and that is the most common I/O pattern. The final assessment and cleanup of an operation is then held up until the last I/O completes - and for a synchronous sequential operation, this means the bouncing around of work items just adds latency. Two changes have been made to make this work: (1) All collection is now done in a single "work item" that works progressively through the subrequests as they complete (and also dispatches retries as necessary). (2) For readahead and AIO, this work item be done on a workqueue and can run in parallel with the ultimate consumer of the data; for synchronous direct or unbuffered reads, the collection is run in the application thread and not offloaded. Functions such as smb2_readv_callback() then just tell netfslib that the subrequest has terminated; netfslib does a minimal bit of processing on the spot - stat counting and tracing mostly - and then queues/wakes up the worker. This simplifies the logic as the collector just walks sequentially through the subrequests as they complete and walks through the folios, if buffered, unlocking them as it goes. It also keeps to a minimum the amount of latency injected into the filesystem's low-level I/O handling The way netfs supports filesystems using the deprecated PG_private_2 flag is changed: folios are flagged and added to a write request as they complete and that takes care of scheduling the writes to the cache. The originating read request can then just unlock the pages whatever happens. - Single-blob object support Single-blob objects are files for which the content of the file must be read from or written to the server in a single operation because reading them in parts may yield inconsistent results. AFS directories are an example of this as there exists the possibility that the contents are generated on the fly and would differ between reads or might change due to third party interference. Such objects will be written to and retrieved from the cache if one is present, though we allow/may need to propose multiple subrequests to do so. The important part is that read from/write to the server is monolithic. Single blob reading is, for the moment, fully synchronous and does result collection in the application thread and, also for the moment, the API is supplied the buffer in the form of a folio_queue chain rather than using the pagecache. - Related afs changes This series makes a number of changes to the kafs filesystem, primarily in the area of directory handling: - AFS's FetchData RPC reply processing is made partially asynchronous which allows the netfs_io_request's outstanding operation counter to be removed as part of reducing the collection to a single work item. - Directory and symlink reading are plumbed through netfslib using the single-blob object API and are now cacheable with fscache. This also allows the afs_read struct to be eliminated and netfs_io_subrequest to be used directly instead. - Directory and symlink content are now stored in a folio_queue buffer rather than in the pagecache. This means we don't require the RCU read lock and xarray iteration to access it, and folios won't randomly disappear under us because the VM wants them back. - The vnode operation lock is changed from a mutex struct to a private lock implementation. The problem is that the lock now needs to be dropped in a separate thread and mutexes don't permit that. - When a new directory or symlink is created, we now initialise it locally and mark it valid rather than downloading it (we know what it's likely to look like). - We now use the in-directory hashtable to reduce the number of entries we need to scan when doing a lookup. The edit routines have to maintain the hash chains. - Cancellation (e.g. by signal) of an async call after the rxrpc_call has been set up is now offloaded to the worker thread as there will be a notification from rxrpc upon completion. This avoids a double cleanup. - A "rolling buffer" implementation is created to abstract out the two separate folio_queue chaining implementations I had (one for read and one for write). - Functions are provided to create/extend a buffer in a folio_queue chain and tear it down again. This is used to handle AFS directories, but could also be used to create bounce buffers for content crypto and transport crypto. - The was_async argument is dropped from netfs_read_subreq_terminated() Instead we wake the read collection work item by either queuing it or waking up the app thread. - We don't need to use BH-excluding locks when communicating between the issuing thread and the collection thread as neither of them now run in BH context. - Also included are a number of new tracepoints; a split of the netfslib write collection code to put retrying into its own file (it gets more complicated with content encryption). - There are also some minor fixes AFS included, including fixing the AFS directory format struct layout, reducing some directory over-invalidation and making afs_mkdir() translate EEXIST to ENOTEMPY (which is not available on all systems the servers support). - Finally, there's a patch to try and detect entry into the folio unlock function with no folio_queue structs in the buffer (which isn't allowed in the cases that can get there). This is a debugging patch, but should be minimal overhead" * tag 'vfs-6.14-rc1.netfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (31 commits) netfs: Report on NULL folioq in netfs_writeback_unlock_folios() afs: Add a tracepoint for afs_read_receive() afs: Locally initialise the contents of a new symlink on creation afs: Use the contained hashtable to search a directory afs: Make afs_mkdir() locally initialise a new directory's content netfs: Change the read result collector to only use one work item afs: Make {Y,}FS.FetchData an asynchronous operation afs: Fix cleanup of immediately failed async calls afs: Eliminate afs_read afs: Use netfslib for symlinks, allowing them to be cached afs: Use netfslib for directories afs: Make afs_init_request() get a key if not given a file netfs: Add support for caching single monolithic objects such as AFS dirs netfs: Add functions to build/clean a buffer in a folio_queue afs: Add more tracepoints to do with tracking validity cachefiles: Add auxiliary data trace cachefiles: Add some subrequest tracepoints netfs: Remove some extraneous directory invalidations afs: Fix directory format encoding struct afs: Fix EEXIST error returned from afs_rmdir() to be ENOTEMPTY ...	2025-01-20 09:29:11 -08:00
Linus Torvalds	91309a7082	x86: use cmov for user address masking This was a suggestion by David Laight, and while I was slightly worried that some micro-architecture would predict cmov like a conditional branch, there is little reason to actually believe any core would be that broken. Intel documents that their existing cores treat CMOVcc as a data dependency that will constrain speculation in their "Speculative Execution Side Channel Mitigations" whitepaper: "Other instructions such as CMOVcc, AND, ADC, SBB and SETcc can also be used to prevent bounds check bypass by constraining speculative execution on current family 6 processors (Intel® Core™, Intel® Atom™, Intel® Xeon® and Intel® Xeon Phi™ processors)" and while that leaves the future uarch issues open, that's certainly true of our traditional SBB usage too. Any core that predicts CMOV will be unusable for various crypto algorithms that need data-independent timing stability, so let's just treat CMOV as the safe choice that simplifies the address masking by avoiding an extra instruction and doesn't need a temporary register. Suggested-by: David Laight <David.Laight@aculab.com> Link: https://www.intel.com/content/dam/develop/external/us/en/documents/336996-speculative-execution-side-channel-mitigations.pdf Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2025-01-20 08:51:44 -08:00
Linus Torvalds	027ea4f5f2	x86: use proper 'clac' and 'stac' opcode names Back when we added SMAP support, all versions of binutils didn't necessarily understand the 'clac' and 'stac' instructions. So we implemented those instructions manually as ".byte" sequences. But we've since upgraded the minimum version of binutils to version 2.25, and that included proper support for the SMAP instructions, and there's no reason for us to use some line noise to express them any more. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2025-01-20 08:39:29 -08:00
Christian Brauner	68e6b7d98b	samples/vfs: fix build warnings Fix build warnings reported from linux-next. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Link: https://lore.kernel.org/r/20250120192504.4a1965a0@canb.auug.org.au Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-01-20 13:14:21 +01:00
Christian Brauner	f9d94f78a8	samples/vfs: use shared header Share some infrastructure between sample programs and fix a build failure that was reported. Reported-by: Sasha Levin <sashal@kernel.org> Link: https://lore.kernel.org/r/Z42UkSXx0MS9qZ9w@lappy Link: https://qa-reports.linaro.org/lkft/sashal-linus-next/build/v6.13-rc7-511-g109a8e0fa9d6/testrun/26809210/suite/build/test/gcc-8-allyesconfig/log Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-01-20 12:50:35 +01:00
Linus Torvalds	5293b5f97e	Merge branch 'vsnprintf' This merges the vsnprintf internal cleanups I did, which were triggered by a combination of performance issues (see for example commit `f9ed1f7c2e`: "genirq/proc: Use seq_put_decimal_ull_width() for decimal values") and discussion about tracing abusing the vsnprintf code in odd ways. The intent was to improve code generation, but also to possibly eventually expose the cleaned-up printf format decoding state machine. It certainly didn't get to the point where we'd want to expose the format decoding to external users, but it's an improvement over what we used to have. Several of the complex case statements have been simplified, or removed entirely to be replaced by simple table lookups. * branch 'vsnprintf': vsnprintf: fix the number base for non-numeric formats vsnprintf: fix up kerneldoc for argument name changes vsprintf: don't make the 'binary' version pack small integer arguments vsnprintf: collapse the number format state into one single state vsnprintf: mark the indirect width and precision cases unlikely vsnprintf: inline skip_atoi() again vsprintf: deal with format specifiers with a lookup table vsprintf: deal with format flags with a simple lookup table vsprintf: associate the format state with the format pointer vsprintf: fix calling convention for format_decode() vsprintf: avoid nested switch statement on same variable vsprintf: simplify number handling	2025-01-19 21:28:57 -08:00
Linus Torvalds	ffd294d346	Linux 6.13	2025-01-19 15:51:45 -08:00
Linus Torvalds	9528d418de	- Mark serialize() noinstr so that it can be used from instrumentation-free code - Make sure FRED's RSP0 MSR is synchronized with its corresponding per-CPU value in order to avoid double faults in hotplug scenarios - Disable EXECMEM_ROX on x86 for now because it didn't receive proper x86 maintainers review, went in and broke a bunch of things -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmeM5PkACgkQEsHwGGHe VUrrsRAAkQLbaww4SwB2CAS0VARt/ethSuna+uElVfm74IoQqblBCzLAe+Gd6aWR ANSpsPkKumfsFu6j8+KCDZf86SnToxEbKv8+xV0ZIohFYL7DDFkuhDsyuWp2nBTe E9FCOY2J6w4JTik1QuhCnyMjNg3UFE/I9nAa3gwqOhLxZiyuSdzCZ1Vs3NX2zcmv x8zht+ZY+y/qgn6yuVcPoCwx6QlkoWxoJA6cOIwNdJ6mVOruYb7Aozmix6bL6oDS 5qPoz8qQyAQbPS++ezb80facZqGx2kvw8Alz5msxkRBYvjnckjEd08+QS7y/7fet 0XmJh+RPpTL8krG2na5JE5RtMTTHuj0oEu/vIUOJ42PGfhaebaYYXwbI0Nub4hDv dCddIq+WZwk+Jn9jTGnkW2894ozmwE/l+GWbJ7MhWH02Mbbg/vMrebj4VIkGTFL0 xlU7KGiRPVgZa3S9A3zhh7nY9XEK5PFUMpSFFY5G657C+HP+wSyEZNqEjdsxV8nJ SZBOuzWUoWbhIeXNf+zkf2x2zrrkFwqIZG1H5fSqQF1tDaZVkBgkWsWBP4zuVnXo rS/9dBJeyzWLXMY0ZQdO5Tk32/ICkgqlsQFPiQL4shn2+1OnLRfUqzF8MCQZWypt KU8xkRGzu3vGHmVeW4Le39z/vooLbO+0GfBcj+iYMHuhCnQFCEA= =b6qH -----END PGP SIGNATURE----- Merge tag 'x86_urgent_for_v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Mark serialize() noinstr so that it can be used from instrumentation- free code - Make sure FRED's RSP0 MSR is synchronized with its corresponding per-CPU value in order to avoid double faults in hotplug scenarios - Disable EXECMEM_ROX on x86 for now because it didn't receive proper x86 maintainers review, went in and broke a bunch of things * tag 'x86_urgent_for_v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/asm: Make serialize() always_inline x86/fred: Fix the FRED RSP0 MSR out of sync with its per-CPU cache x86: Disable EXECMEM_ROX support	2025-01-19 09:33:40 -08:00
Linus Torvalds	25144ea31b	- Reset hrtimers correctly when a CPU hotplug state traversal happens "half-ways" and leaves hrtimers not (re-)initialized properly - Annotate accesses to a timer group's ignore flag to prevent KCSAN from raising data_race warnings - Make sure timer group initialization is visible to timer tree walkers and avoid a hypothetical race - Fix another race between CPU hotplug and idle entry/exit where timers on a fully idle system are getting ignored - Fix a case where an ignored signal is still being handled which it shouldn't be -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmeM4vgACgkQEsHwGGHe VUpw3RAAnKu9B1bizT4lAfK4dsh9I/beDmeCaLZ8kWWq6Ot4LJry8qsAnr8Pami9 WsFxn/LSCvqsTOZ5aSuYNyWpeUJBrHzCvaTrdpOfgpgq1eZzS/iuuZBXJtV018dp DKVnjHSjCseXW9EfquXHcHIaHswzvdAsep/y8qTRiJlJ13lmihGOxbQDjQTWP9mm geYDXKaih25dMeON7RJ+ND6Lnai8lfxyncv7HPE5wAo4Exr5jCDPIBHSXhE4Hgde f4SnBnvevJCDKwbrvpN4ecgFLBj4FIe80TiKFpGDk1rFG0myLQrDps6RMfZ6XzB4 duEu5chfsECbp3ioq7nC/6hRuJCIXJrN20Ij0MQAGFGPKO+uKlSfWVY4Zqc3lbxZ xROHxTeYYram8RIzclUnwhprZug5SN3PUyyvAYgswLxc4tKJ39elHIkh5c+TReHK qFG+//Xnyd9dmYNb+SwbDRps17F50bJLDhOS+7FwkYRkCG8NoxlR2EAJeI0RGBzL 1B9EE5Nht9awwzWtVCbJizyfbkRZSok5lToyqPIyjsrmueiz+qiZDnAdqhiArfJ7 ThcN21BKSvMBJ5/tEyAPfE7dK08Sxq6oBuP8mtscv0CK9cU1sis99/0COwfclhlm 8EfAyHpIz7cMsmnENAURMAyeB3L4HRIc8FJUw60IO9HRe6RH1Hs= =Xfav -----END PGP SIGNATURE----- Merge tag 'timers_urgent_for_v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Borislav Petkov: - Reset hrtimers correctly when a CPU hotplug state traversal happens "half-ways" and leaves hrtimers not (re-)initialized properly - Annotate accesses to a timer group's ignore flag to prevent KCSAN from raising data_race warnings - Make sure timer group initialization is visible to timer tree walkers and avoid a hypothetical race - Fix another race between CPU hotplug and idle entry/exit where timers on a fully idle system are getting ignored - Fix a case where an ignored signal is still being handled which it shouldn't be * tag 'timers_urgent_for_v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: hrtimers: Handle CPU state correctly on hotplug timers/migration: Annotate accesses to ignore flag timers/migration: Enforce group initialization visibility to tree walkers timers/migration: Fix another race between hotplug and idle entry/exit signal/posixtimers: Handle ignore/blocked sequences correctly	2025-01-19 09:09:07 -08:00
Linus Torvalds	b031457ab1	- Fix an OF node leak in irqchip init's error handling path - Fix sunxi systems to wake up from suspend with an NMI by pressing the power button - Do not spuriously enable interrupts in gic-v3 in a nested interrupts-off section - Make sure gic-v3 handles properly a failure to enter a low power state -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmeM3eIACgkQEsHwGGHe VUqbEw/7BfNk7aiL9OrHg9J5IPE51hSzctr+MrlqRPkBBdpQF5Q4HC4R7ZlSBy44 k8zR96jO6qCgo+rObj1BmtYk6Yndxf1WDauVovaK774IMJVyhtJ4hGvPCYt1jOYt BZ/rF3mkPxC8ui/e2HsZVwMXjmuPaf6tZ8Bvsof0qrxy6ayh+L4b8KPBYXtXdoYr GxwuIoK7VHDj+0HTcBtzCAB3CLhYAeknwVGeGMP9xnE4DkqoYurfsmY60OnIXz0n j2oTTeQaLHFcAXcw5B7jABYeCAh3UfrKg9Og108z8aQ3hvTcGvM6ou/bBdVUVDSY E2bCW8DC2hmHUqEUQ2zW5CueVFCBkyi/JYbkuXG56gGH/qZ2moIcSm01ELNxiX4X jh6StDOKhqczSJb0rbzNK8BT8ZfiJK4OQSJOvIWyKEb3gEJxW3pEuFHiKcUECGZe EAXo81UF7R5eMFmJubAa1LYw60LSD83/1SKbeXHT5GnMWA51YIL16xYUsthpQnpV xZ/XVtPU1oM2cl+03q70VSVPZw0Eisd5i+5TSX9D0hdXfAD4g070oaOEZEc+ZFEj 7VO68AmGgAb1ZoKFNxDt/+tM/r4GJxiKTqMgjYX/dLN5jhtTbHi8Nui9h4m8WGqK 58WjqDuba7ajrYpt8ah8uWd8lU2zFvtFNXTQl6YS8oVPbJM2CxY= =Hjnc -----END PGP SIGNATURE----- Merge tag 'irq_urgent_for_v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Borislav Petkov: - Fix an OF node leak in irqchip init's error handling path - Fix sunxi systems to wake up from suspend with an NMI by pressing the power button - Do not spuriously enable interrupts in gic-v3 in a nested interrupts-off section - Make sure gic-v3 handles properly a failure to enter a low power state * tag 'irq_urgent_for_v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip: Plug a OF node reference leak in platform_irqchip_probe() irqchip/sunxi-nmi: Add missing SKIP_WAKE flag irqchip/gic-v3-its: Don't enable interrupts in its_irq_set_vcpu_affinity() irqchip/gic-v3: Handle CPU_PM_ENTER_FAILED correctly	2025-01-19 09:04:33 -08:00
Linus Torvalds	8ff6d472ab	- Do not adjust the weight of empty group entities and avoid scheduling artifacts - Avoid scheduling lag by computing lag properly and thus address an EEVDF entity placement issue -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmeM2+8ACgkQEsHwGGHe VUrqHA/+PyPsILNfDFnaXD/Shv7kCgk3hH2YVrCnyIQn8fhTCL+K0luXRRmAPhl1 pEbRyEtVRXZM9WyHJ1r7VmnIHVanUKWcuZzAx8HT9b7Tkrqy/xSriH/jAV7FYG9h gIhWA4vXT34HG2MQHTecJ/4gvkeI5tV93nkZ6vwEwwN4pGxSvijPR3yr3yl286Nt BlHP9cXBWWsxTq4CYb2j5q9nmymKuiDmhD8jc1fVjDKxjmrNHEwX5mtvtRLGI1dR O4pjUEqYQ+TWIkX8ZIRBPzLd9b6750ncO/9yb24cY7Z3RMKov4wz8b819Dt77cTp XF+bT+8fsaKNRjkzPEseZU4OL16ZO+33Kcn+JoNPWvbONgFKusZ4qkCCwvWpiQlV 0Eddh1XjKBoXA1tR6VrREcUKQ2zWoXrn8tlHUTMfPuPq1jlbQNtzN7OWcR+WBy7L FiClWzWLv0fCeTXaAEcO9+/MN5z2lDQsJRiLtAlGh0JJU6/1H6IVX2tSZllkpR/R 5pyHCpYsh5cucx6cQPk+rp9O6PDYu2frMuJ8itWlQLHeSzjzOoeE6EwwZcMQG4xb UG/azMbmqWrtmZqCgNCBtq7RAkVRe0IuxWuCtV1VkatU+2RXdtV85tVKn/JG2KLW 05c8GTdQZgnoY65/TZHwudhkzytclynKrrvhfKFllAWnb8D8gyQ= =fZWm -----END PGP SIGNATURE----- Merge tag 'sched_urgent_for_v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Borislav Petkov: - Do not adjust the weight of empty group entities and avoid scheduling artifacts - Avoid scheduling lag by computing lag properly and thus address an EEVDF entity placement issue * tag 'sched_urgent_for_v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/fair: Fix update_cfs_group() vs DELAY_DEQUEUE sched/fair: Fix EEVDF entity placement bug causing scheduling lag	2025-01-19 09:01:17 -08:00
Al Viro	561e3a0c40	io_uring/fdinfo: fix io_uring_show_fdinfo() misuse of ->d_iname Output of io_uring_show_fdinfo() has several problems: * racy use of ->d_iname * junk if the name is long - in that case it's not stored in ->d_iname at all * lack of quoting (names can contain newlines, etc. - or be equal to "<none>", for that matter). * lines for empty slots are pointless noise - we already have the total amount, so having just the non-empty ones would carry the same information. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2025-01-19 07:28:37 -07:00
Linus Torvalds	fda5e3f284	Fix regression in GFP output in trace events It was reported that the GFP flags in trace events went from human readable to just their hex values: gfp_flags=GFP_HIGHUSER_MOVABLE\|__GFP_COMP to gfp_flags=0x140cca This was caused by a change that added the use of enums in calculating the GFP flags. As defines get translated into their values in the trace event format files, the user space tooling could easily convert the GFP flags into their symbols via the __print_flags() helper macro. The problem is that enums do not get converted, and the names of the enums show up in the format files and user space tooling cannot translate them. Add TRACE_DEFINE_ENUM() around the enums used for GFP flags which is the tracing infrastructure macro that informs the tracing subsystem what the values for enums and it can then expose that to user space. -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZ4u7AxQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qgIkAP0VVW80Ck5K9hpDJ3SvSgaGDntSegY7 lI0ExVqGsJz8GQEAzkaRjgGXuXfzGzA9K7ZUe9X4R8W0Xkl9GisvqqEU1Ak= =rzFM -----END PGP SIGNATURE----- Merge tag 'trace-v6.13-rc7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fix from Steven Rostedt: "Fix regression in GFP output in trace events It was reported that the GFP flags in trace events went from human readable to just their hex values: gfp_flags=GFP_HIGHUSER_MOVABLE\|__GFP_COMP to gfp_flags=0x140cca This was caused by a change that added the use of enums in calculating the GFP flags. As defines get translated into their values in the trace event format files, the user space tooling could easily convert the GFP flags into their symbols via the __print_flags() helper macro. The problem is that enums do not get converted, and the names of the enums show up in the format files and user space tooling cannot translate them. Add TRACE_DEFINE_ENUM() around the enums used for GFP flags which is the tracing infrastructure macro that informs the tracing subsystem what the values for enums and it can then expose that to user space" * tag 'trace-v6.13-rc7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: gfp: Fix the GFP enum values shown for user space tracing tools	2025-01-18 13:22:53 -08:00
Linus Torvalds	595523945b	Devicetree fixes for 6.13, part 2: Another fix and testcase to avoid the newly added WARN in the case of non-translatable addresses. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEktVUI4SxYhzZyEuo+vtdtY28YcMFAmeK0gAACgkQ+vtdtY28 YcNanRAAjtXeeRK8NZJ9JfjTDaNMxQJ89WVSKmbJBNH8M/hWFK9eq2fJvNwnPJJT 4xHfWvNr7znyC2kLm6k/LKGmyV/mQqiJ8vCaBKQ1KssjQ1hXR7JiXbniw78Rc0LD zfY9nDxMRm8jJhz5T0TNfhWn1bBAUA9dibew8etdo2KqCXGrFnGIZoeFU/ro8Tzy /bW90QhlUxze9V4bH/4UBvTmfK8WmSTobG9r7Z2/YXcbTxB1uDGKZhV4jhJ4Issa qrue201VlReHdRmiYWeDz57m/aXBudOvLaDJVymzoYJU5FypNH/757cvmysiQxzq tRRLgZLhAye5qB3HCX5xa6v3CvAvuf01EzTMMLKuKA4IkkaoO6CjjG7D/ip4lvdc Ekt4OFGYdxGQaJNeXmaLioJzhrsS0NQe4ZFUQF1YnMdOfI3vneVsRcG6CHgKyfft pnoBiakIiapNj4S/UTn+A2bI+/rG0P9rbMwhpb3ojJBlp5IpC/8vqKzeA8Hzy19I 3Yi3lfX4sscSvQJk6izkTxwDMfx/FsrQfkf6PS8eYfhAlyMcc1JKuxN15sIoEth7 TapRH0hR5aCe64jYWpIEIwaVb+d2izdEPf+9Dd7v/LBRe44mvHnLAMs2936UxvUg U/dKP9iR/pWaibPpIGuK6RCRY4uO6qJfzwvSRPrJNohZUk8RNhM= =Uw0f -----END PGP SIGNATURE----- Merge tag 'devicetree-fixes-for-6.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull devicetree fixes from Rob Herring: "Another fix and testcase to avoid the newly added WARN in the case of non-translatable addresses" * tag 'devicetree-fixes-for-6.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: of/address: Fix WARN when attempting translating non-translatable addresses of/unittest: Add test that of_address_to_resource() fails on non-translatable address	2025-01-17 15:01:24 -08:00
Linus Torvalds	ed9add2b32	soc: fixes for 6.13, part 4 Two last minute fixes: one build issue on TI soc drivers, and a regression in the renesas reset controller driver. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmeKhqUACgkQYKtH/8kJ Uie9ARAAxtghxRPixgucuNr2qgLB/zaCJ8r78DrS70Duc6meyH82uOfv0zBgKHbi y/a09FsLbJUG6jC2Ipd2gxXJ3U2mTHisIYEcCw6bmppLHcByteI7bkB5xqv8Sw4V p0n4eqohQ8FruzCdHQgKA8P3C1Bw3t4m/ODzEc/gs6RFyfQdfRHHGtFuN7wzjcpu pzL4q2n2wuNv6ys5o/QLPlP8zrskSemhJvNxBozq/Lfkh1eCPPo4xnKguCXPnip3 VyJYT85UpqyyB7ngKoSw4CaL+eMHSMvT19AkzDRffay47dAhOvgKzh3MROWHFVp2 9NR5XFTQShmSobxevM8arLJUbZt0lyRxZoB+/708WMHWNCy3OPWWRAaZNr7uNyAk 8J7gCWN34nOX657jlBgYYZmp27yspgyGTe7MOWr65qiBVWFBKfZvppyyBvbfgXuG A4xcGPpa20lucx2npbEzJ4TrRD3CMSfMFMK7xqBdBYdGMJELihRVYg8Frq4GYWNE SfBFsLMTM+rq2jPg1t65Y8aYK1Y7xABos1OT7Gai8+6nLlOgAt7lk2SlIqzNemG9 0ME4DAp1SG/bgM30kVraU6/L1uZmmoHhJEeTS2GDFC5XHdfz79gUzU97l/tPaa+w T3C6y42Ehl+LZnP+GAdIdvecQv8O+zTIlRVqkE35Cx7Igt5mAdE= =UIAA -----END PGP SIGNATURE----- Merge tag 'soc-fixes-6.13-4' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull SoC fixes from Arnd Bergmann: "Two last minute fixes: one build issue on TI soc drivers, and a regression in the renesas reset controller driver" * tag 'soc-fixes-6.13-4' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: soc: ti: pruss: Fix pruss APIs reset: rzg2l-usbphy-ctrl: Assign proper of node to the allocated device	2025-01-17 14:49:53 -08:00

1 2 3 4 5 ...

1326565 Commits