Merge mainline v6.4 #530

Merged
merged 10,000 commits into from
May 12, 2024

@ddiss ddiss commented Sep 8, 2023

Only a few minor changes needed for the merge

chenhuacai and others added 30 commits June 15, 2023 14:35
LoongArch PMCFG has 10bit event id rather than 8 bit, so fix it.

Cc: [email protected]
Signed-off-by: Jun Yi <[email protected]>
Signed-off-by: Huacai Chen <[email protected]>
The hardware monitoring points for instruction fetching and load/store
operations need to align 4 bytes and 1/2/4/8 bytes respectively.

Reported-by: Colin King <[email protected]>
Signed-off-by: Qing Zhang <[email protected]>
Signed-off-by: Huacai Chen <[email protected]>
The debugfs_create_dir() returns ERR_PTR in case of an error and the
correct way of checking it is using the IS_ERR_OR_NULL inline function
rather than a simple NULL comparison. This patch fixes the issue.

Cc: [email protected]
Suggested-By: Ivan Orlov <[email protected]>
Signed-off-by: Immad Mir <[email protected]>
Signed-off-by: Huacai Chen <[email protected]>
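
The debugfs commit above turns on the difference between a NULL pointer and an error-encoded pointer. A minimal userspace sketch of that difference follows, assuming stand-in versions of the kernel's ERR_PTR helpers; fake_create_dir() is a hypothetical substitute for debugfs_create_dir() and is not part of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's <linux/err.h> helpers (assumption). */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline int IS_ERR(const void *ptr)
{
	/* Error pointers occupy the top MAX_ERRNO addresses. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline int IS_ERR_OR_NULL(const void *ptr)
{
	return ptr == NULL || IS_ERR(ptr);
}

/* Hypothetical stand-in for debugfs_create_dir(): fails with an encoded errno. */
static void *fake_create_dir(int fail)
{
	static int dir;

	return fail ? ERR_PTR(-ENOMEM) : (void *)&dir;
}

/* Returns 1 when a plain NULL comparison would miss the failure. */
static int null_check_misses_error(void)
{
	void *d = fake_create_dir(1);

	return d != NULL && IS_ERR_OR_NULL(d);
}
```

In this sketch the failed call returns a non-NULL pointer, so only the IS_ERR-style check notices the error.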
…it()

This patch prevents the system from crashing when unloading the ISM module.

How to reproduce: Attach an ISM device and execute 'rmmod ism'.

Error-Log:
- Trying to free already-free IRQ 0
- WARNING: CPU: 1 PID: 966 at kernel/irq/manage.c:1890 free_irq+0x140/0x540

After calling ism_dev_exit() for each ISM device in the exit routine,
pci_unregister_driver() will execute ism_remove() for each ISM device.
Because ism_remove() also calls ism_dev_exit(),
free_irq(pci_irq_vector(pdev, 0), ism) is called twice for each ISM
device. This results in a crash with the error
'Trying to free already-free IRQ'.

In the exit routine, it is enough to call pci_unregister_driver()
because it ensures that ism_dev_exit() is called once per
ISM device.

Cc: <[email protected]> # 6.3+
Fixes: 89e7d2b ("net/ism: Add new API for client registration")
Reviewed-by: Niklas Schnelle <[email protected]>
Signed-off-by: Julian Ruess <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
adding three people from Alibaba as reviewers for SMC.
They are currently working on improving SMC on other architectures than
s390 and help with reviewing patches on top.

Thank you D. Wythe, Tony Lu and Wen Gu for your contributions and
collaboration and welcome on board as reviewers!

Reviewed-by: Wenjia Zhang <[email protected]>
Signed-off-by: Jan Karcher <[email protected]>
Acked-by: Tony Lu <[email protected]>
Acked-by: Wen Gu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
It probably makes no sense to support arbitrary network devices
for lapbether.

syzbot reported:

skbuff: skb_under_panic: text:ffff80008934c100 len:44 put:40 head:ffff0000d18dd200 data:ffff0000d18dd1ea tail:0x16 end:0x140 dev:bond1
kernel BUG at net/core/skbuff.c:200 !
Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
Modules linked in:
CPU: 0 PID: 5643 Comm: dhcpcd Not tainted 6.4.0-rc5-syzkaller-g4641cff8e810 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/25/2023
pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : skb_panic net/core/skbuff.c:196 [inline]
pc : skb_under_panic+0x13c/0x140 net/core/skbuff.c:210
lr : skb_panic net/core/skbuff.c:196 [inline]
lr : skb_under_panic+0x13c/0x140 net/core/skbuff.c:210
sp : ffff8000973b7260
x29: ffff8000973b7270 x28: ffff8000973b7360 x27: dfff800000000000
x26: ffff0000d85d8150 x25: 0000000000000016 x24: ffff0000d18dd1ea
x23: ffff0000d18dd200 x22: 000000000000002c x21: 0000000000000140
x20: 0000000000000028 x19: ffff80008934c100 x18: ffff8000973b68a0
x17: 0000000000000000 x16: ffff80008a43bfbc x15: 0000000000000202
x14: 0000000000000000 x13: 0000000000000001 x12: 0000000000000001
x11: 0000000000000201 x10: 0000000000000000 x9 : f22f7eb937cced00
x8 : f22f7eb937cced00 x7 : 0000000000000001 x6 : 0000000000000001
x5 : ffff8000973b6b78 x4 : ffff80008df9ee80 x3 : ffff8000805974f4
x2 : 0000000000000001 x1 : 0000000100000201 x0 : 0000000000000086
Call trace:
skb_panic net/core/skbuff.c:196 [inline]
skb_under_panic+0x13c/0x140 net/core/skbuff.c:210
skb_push+0xf0/0x108 net/core/skbuff.c:2409
ip6gre_header+0xbc/0x738 net/ipv6/ip6_gre.c:1383
dev_hard_header include/linux/netdevice.h:3137 [inline]
lapbeth_data_transmit+0x1c4/0x298 drivers/net/wan/lapbether.c:257
lapb_data_transmit+0x8c/0xb0 net/lapb/lapb_iface.c:447
lapb_transmit_buffer+0x178/0x204 net/lapb/lapb_out.c:149
lapb_send_control+0x220/0x320 net/lapb/lapb_subr.c:251
lapb_establish_data_link+0x94/0xec
lapb_device_event+0x348/0x4e0
notifier_call_chain+0x1a4/0x510 kernel/notifier.c:93
raw_notifier_call_chain+0x3c/0x50 kernel/notifier.c:461
__dev_notify_flags+0x2bc/0x544
dev_change_flags+0xd0/0x15c net/core/dev.c:8643
devinet_ioctl+0x858/0x17e4 net/ipv4/devinet.c:1150
inet_ioctl+0x2ac/0x4d8 net/ipv4/af_inet.c:979
sock_do_ioctl+0x134/0x2dc net/socket.c:1201
sock_ioctl+0x4ec/0x858 net/socket.c:1318
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:870 [inline]
__se_sys_ioctl fs/ioctl.c:856 [inline]
__arm64_sys_ioctl+0x14c/0x1c8 fs/ioctl.c:856
__invoke_syscall arch/arm64/kernel/syscall.c:38 [inline]
invoke_syscall+0x98/0x2c0 arch/arm64/kernel/syscall.c:52
el0_svc_common+0x138/0x244 arch/arm64/kernel/syscall.c:142
do_el0_svc+0x64/0x198 arch/arm64/kernel/syscall.c:191
el0_svc+0x4c/0x160 arch/arm64/kernel/entry-common.c:647
el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:665
el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:591
Code: aa1803e6 aa1903e7 a90023f5 947730f5 (d4210000)

Fixes: 1da177e ("Linux-2.6.12-rc2")
Reported-by: syzbot <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Cc: Martin Schiller <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Inside macsec_add_dev() we free percpu macsec->secy.tx_sc.stats and
macsec->stats on some of the memory allocation failure paths. However, the
net_device is already registered to that moment: in macsec_newlink(), just
before calling macsec_add_dev(). This means that during unregister process
its priv_destructor - macsec_free_netdev() - will be called and will free
the stats again.

Remove freeing percpu stats inside macsec_add_dev() because
macsec_free_netdev() will correctly free the already allocated ones. The
pointers to unallocated stats stay NULL, and free_percpu() treats that
correctly.

Found by Linux Verification Center (linuxtesting.org) with Syzkaller.

Fixes: 0a28bfd ("net/macsec: Add MACsec skb_metadata_dst Tx Data path support")
Fixes: c09440f ("macsec: introduce IEEE 802.1AE driver")
Signed-off-by: Fedor Pchelkin <[email protected]>
Reviewed-by: Sabrina Dubroca <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
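
The macsec fix above relies on a single owner for cleanup: the setup path returns an error without freeing, and the destructor frees whatever was allocated, treating NULL as a no-op. A rough userspace sketch of that pattern follows. The struct, field names and function names are hypothetical, and free() stands in for free_percpu().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the device private data. */
struct fake_dev {
	int *tx_stats;  /* plays the role of macsec->secy.tx_sc.stats */
	int *dev_stats; /* plays the role of macsec->stats */
};

/* Destructor: the only place that frees. free(NULL) is a no-op. */
static void fake_free_dev(struct fake_dev *dev)
{
	assert(dev != NULL);
	free(dev->tx_stats);
	dev->tx_stats = NULL;
	free(dev->dev_stats);
	dev->dev_stats = NULL;
}

/* Setup: on failure, report the error and leave cleanup to the destructor. */
static int fake_add_dev(struct fake_dev *dev, int fail_second)
{
	assert(dev != NULL);
	dev->tx_stats = calloc(1, sizeof(*dev->tx_stats));
	if (!dev->tx_stats)
		return -1;

	/* fail_second simulates an allocation failure for the test. */
	dev->dev_stats = fail_second ? NULL : calloc(1, sizeof(*dev->dev_stats));
	if (!dev->dev_stats)
		return -1; /* deliberately no free() here */

	return 0;
}
```

Because the failing path does not free tx_stats, the later destructor call releases it exactly once.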
In systems without MSI-X capabilities, xdp_txq_queues_mode is calculated
in efx_allocate_msix_channels, but when enabling MSI-X fails, it was not
changed to a proper default value. This was leading to the driver
thinking that it has dedicated XDP queues, when it didn't.

Fix it by setting xdp_txq_queues_mode to the correct value if the driver
falls back to MSI or legacy IRQ mode. The correct value is
EFX_XDP_TX_QUEUES_BORROWED because there are no XDP dedicated queues.

The issue is easy to see when the kernel is booted with pci=nomsi:
a call trace is shown. The trace is not shown when only sfc's modparam
interrupt_mode=2 is set. Call trace example:
 WARNING: CPU: 2 PID: 663 at drivers/net/ethernet/sfc/efx_channels.c:828 efx_set_xdp_channels+0x124/0x260 [sfc]
 [...skip...]
 Call Trace:
  <TASK>
  efx_set_channels+0x5c/0xc0 [sfc]
  efx_probe_nic+0x9b/0x15a [sfc]
  efx_probe_all+0x10/0x1a2 [sfc]
  efx_pci_probe_main+0x12/0x156 [sfc]
  efx_pci_probe_post_io+0x18/0x103 [sfc]
  efx_pci_probe.cold+0x154/0x257 [sfc]
  local_pci_probe+0x42/0x80

Fixes: 6215b60 ("sfc: last resort fallback for lack of xdp tx queues")
Reported-by: Yanghang Liu <[email protected]>
Signed-off-by: Íñigo Huguet <[email protected]>
Acked-by: Martin Habets <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
…han()

Now that spi_geni_grab_gpi_chan() errors are correctly reported, the
-EPROBE_DEFER error should be returned from probe in case the
GPI dma driver is built as module and/or not probed yet.

Fixes: b59c122 ("spi: spi-geni-qcom: Add support for GPI dma")
Fixes: 6532582 ("spi: spi-geni-qcom: fix error handling in spi_geni_grab_gpi_chan()")
Signed-off-by: Neil Armstrong <[email protected]>
Link: https://lore.kernel.org/r/20230615-topic-sm8550-upstream-fix-spi-geni-qcom-probe-v2-1-670c3d9e8c9c@linaro.org
Signed-off-by: Mark Brown <[email protected]>
The addition of might_sleep() to down_timeout() caused the latter to
enable interrupts unconditionally in some cases, which in turn broke
the ACPI S3 wakeup path in acpi_suspend_enter(), where down_timeout()
is called by acpi_disable_all_gpes() via acpi_ut_acquire_mutex().

Namely, if CONFIG_DEBUG_ATOMIC_SLEEP is set, might_sleep() causes
might_resched() to be used and if CONFIG_PREEMPT_VOLUNTARY is set,
this triggers __cond_resched() which may call preempt_schedule_common(),
so __schedule() gets invoked and it ends up with enabled interrupts (in
the prev == next case).

Now, enabling interrupts early in the S3 wakeup path causes the kernel
to crash.

Address this by modifying acpi_suspend_enter() to disable GPEs without
attempting to acquire the sleeping lock which is not needed in that code
path anyway.

Fixes: 99409b9 ("locking/semaphore: Add might_sleep() to down_*() family")
Reported-by: Srinivas Pandruvada <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Cc: 5.15+ <[email protected]> # 5.15+
Since commit 955fb87 ("thermal/intel/intel_soc_dts_iosf: Use Intel
TCC library") intel_soc_dts_iosf is reporting the wrong temperature.

The driver expects tj_max to be in milli-degrees Celsius but after
the switch to the TCC library this is now in degrees Celsius so
instead of e.g. 90000 it is set to 90 causing a temperature 45
degrees below tj_max to be reported as -44910 milli-degrees
instead of as 45000 milli-degrees.

Fix this by adding back the lost factor of 1000.

Fixes: 955fb87 ("thermal/intel/intel_soc_dts_iosf: Use Intel TCC library")
Reported-by: Bernhard Krug <[email protected]>
Signed-off-by: Hans de Goede <[email protected]>
Acked-by: Zhang Rui <[email protected]>
Cc: 6.3+ <[email protected]> # 6.3+
Signed-off-by: Rafael J. Wysocki <[email protected]>
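
The numbers in the thermal commit above can be reproduced with a one-line calculation. This is only a sketch of the arithmetic quoted in the message, not the driver's actual code; reported_millicelsius() is a hypothetical helper.

```c
#include <assert.h>

/*
 * Temperature reported for a reading that is degrees_below degrees
 * under tj_max, where the caller expects tj_max in milli-degrees.
 */
static long reported_millicelsius(long tj_max, long degrees_below)
{
	assert(degrees_below >= 0);
	return tj_max - degrees_below * 1000;
}
```

With tj_max passed in degrees (90) and a reading 45 degrees below it, the result is 90 - 45000 = -44910. With the factor of 1000 restored (90000), the result is 45000.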
As described in commit 38d11da ("dm: don't lock fs when the map is
NULL in process of resume"), a deadlock may be triggered between
do_resume() and do_mount().

This commit preserves the fix from commit 38d11da but moves it to
where it also serves to fix a similar deadlock between do_suspend()
and do_mount().  It does so, if the active map is NULL, by clearing
DM_SUSPEND_LOCKFS_FLAG in dm_suspend() which is called by both
do_suspend() and do_resume().

Fixes: 38d11da ("dm: don't lock fs when the map is NULL in process of resume")
Signed-off-by: Li Lingfeng <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>
Must check pmd->fail_io before using pmd->data_sm since
pmd->data_sm may be destroyed by other processes.

       P1(kworker)                             P2(message)
do_worker
 process_prepared
  process_prepared_discard_passdown_pt2
   dm_pool_dec_data_range
                                    pool_message
                                     commit
                                      dm_pool_commit_metadata
                                        ↓
                                       // commit failed
                                      metadata_operation_failed
                                       abort_transaction
                                        dm_pool_abort_metadata
                                         __open_or_format_metadata
                                           ↓
                                          dm_sm_disk_open
                                            ↓
                                           // open failed
                                           // pmd->data_sm is NULL
    dm_sm_dec_blocks
      ↓
     // try to access pmd->data_sm --> UAF

As shown above, if dm_pool_commit_metadata() and
dm_pool_abort_metadata() fail in pool_message process, kworker may
trigger UAF.

Fixes: be500ed ("dm space maps: improve performance with inc/dec on ranges of blocks")
Cc: [email protected]
Signed-off-by: Li Lingfeng <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>
issue_discard() passes GFP_NOWAIT to __blkdev_issue_discard() despite
its code assuming bio_alloc() always succeeds.

Commit 3dba53a ("dm thin: use __blkdev_issue_discard for async
discard support") clearly shows where things went bad:

Before commit 3dba53a, dm-thin.c's open-coded
__blkdev_issue_discard_async() properly handled using GFP_NOWAIT.
Unfortunately __blkdev_issue_discard() doesn't and it was missed
during review.

Cc: [email protected]
Signed-off-by: Mike Snitzer <[email protected]>
Split abnormal IO in terms of the corresponding operation specific
max_sectors (max_discard_sectors, max_secure_erase_sectors or
max_write_zeroes_sectors).

This fixes a significant dm-thinp discard performance regression that
was introduced with commit e2dd8ac ("dm bio prison v1: improve
concurrent IO performance"). Relative to discard: max_discard_sectors
is used instead of max_sectors; which fixes excessive discard splitting
(e.g. max_sectors=128K vs max_discard_sectors=64M).

Tested by discarding a 1 Petabyte dm-thin device:
lvcreate -V 1125899906842624B -T test/pool -n thin
time blkdiscard /dev/test/thin

Before this fix (splitting discards every 128K): ~116m
 After this fix (splitting discards every 64M) : 0m33.460s

Reported-by: Zorro Lang <[email protected]>
Fixes: 06961c4 ("dm: split discards further if target sets max_discard_granularity")
Requires: 13f6fac ("dm: allow targets to require splitting WRITE_ZEROES and SECURE_ERASE")
Fixes: e2dd8ac ("dm bio prison v1: improve concurrent IO performance")
Signed-off-by: Mike Snitzer <[email protected]>
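
To give a feel for the scale of the dm-thinp discard regression above, the sketch below counts how many pieces a discard of the whole test device is split into at each limit. It assumes the 128K and 64M figures are byte sizes, as the "splitting discards every ..." lines suggest; split_count() is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

/* Number of pieces needed to cover device_bytes at max_bytes per piece. */
static uint64_t split_count(uint64_t device_bytes, uint64_t max_bytes)
{
	assert(max_bytes != 0);
	return (device_bytes + max_bytes - 1) / max_bytes;
}
```

For the 1125899906842624-byte device, a 128K limit gives 8589934592 pieces and a 64M limit gives 16777216, a factor of 512 fewer.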
According to nla_parse_nested_deprecated(), tb[] is supposed to be the
destination array with maxtype+1 elements. The current
tipc_nl_media_get() and __tipc_nl_media_set() use a larger array,
which is unnecessary. This patch resizes them to the proper size.

Fixes: 1e55417 ("tipc: add media set to new netlink api")
Fixes: 46f15c6 ("tipc: add media get/dump to new netlink api")
Signed-off-by: Lin Ma <[email protected]>
Reviewed-by: Florian Westphal <[email protected]>
Reviewed-by: Tung Nguyen <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
…open

Fix a possible memory leak in __stmmac_open when stmmac_init_phy fails.
It's also needed to free everything allocated by stmmac_setup_dma_desc
and not just the dma_conf struct.

Drop free_dma_desc_resources from __stmmac_open and correctly call
free_dma_desc_resources on each user of __stmmac_open on error.

Reported-by: Jose Abreu <[email protected]>
Fixes: ba39b34 ("net: ethernet: stmicro: stmmac: generate stmmac dma conf before open")
Signed-off-by: Christian Marangi <[email protected]>
Cc: [email protected]
Reviewed-by: Simon Horman <[email protected]>
Reviewed-by: Jose Abreu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
Previously, timestamps were printed using "%lld.%u" which is incorrect
for nanosecond values lower than 100,000,000 as they're fractional
digits, therefore leading zeros are meaningful.

This patch changes the format strings to "%lld.%09u" in order to add
leading zeros to the nanosecond value.

Fixes: 568ebc5 ("ptp: add the PTP_SYS_OFFSET ioctl to the testptp program")
Fixes: 4ec54f9 ("ptp: Fix compiler warnings in the testptp utility")
Fixes: 6ab0e47 ("Documentation: fix misc. warnings")
Signed-off-by: Alex Maftei <[email protected]>
Acked-by: Richard Cochran <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
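
The effect of the testptp format change above is easy to reproduce in isolation. The sketch below formats the same timestamp both ways; format_ts() is a hypothetical helper and not code from testptp.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format sec.nsec, with or without zero-padding the nanoseconds. */
static int format_ts(char *buf, size_t len, long long sec,
		     unsigned int nsec, int pad)
{
	assert(buf != NULL);
	if (pad)
		return snprintf(buf, len, "%lld.%09u", sec, nsec);
	return snprintf(buf, len, "%lld.%u", sec, nsec);
}
```

For 1 second and 5000000 nanoseconds, the unpadded form yields "1.5000000", which reads as 1.5 seconds, while the padded form yields "1.005000000".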
Add check for ioremap() and return the error if it fails in order to
guarantee the success of ioremap().

Fixes: 862cd65 ("octeon_ep: Add driver framework and device initialization")
Signed-off-by: Jiasheng Jiang <[email protected]>
Reviewed-by: Kalesh AP <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
Recently syzkaller reported a 7-year-old null-ptr-deref [0] that occurs
when a UDP-Lite socket tries to allocate a buffer under memory pressure.

Someone should have stumbled on the bug much earlier if UDP-Lite had been
used in a real app.  Also, we do not always need a large UDP-Lite workload
to hit the bug since UDP and UDP-Lite share the same memory accounting
limit.

Removing UDP-Lite would simplify UDP code removing a bunch of conditionals
in fast path.

Let's add a deprecation notice when UDP-Lite socket is created and schedule
its removal to 2025.

Link: https://lore.kernel.org/netdev/[email protected]/ [0]
Signed-off-by: Kuniyuki Iwashima <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
DCCP was marked as Orphan in the MAINTAINERS entry 2 years ago in commit
054c461 ("MAINTAINERS: dccp: move Gerrit Renker to CREDITS").  It says
we haven't heard from the maintainer for five years, so DCCP has not been
well maintained for 7 years now.

Recently DCCP only receives updates for bugs, and major distros disable it
by default.

Removing DCCP would allow for better organisation of TCP fields to reduce
the number of cache lines hit in the fast path.

Let's add a deprecation notice when DCCP socket is created and schedule its
removal to 2025.

Signed-off-by: Kuniyuki Iwashima <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
Kuniyuki Iwashima says:

====================
udplite/dccp: Print deprecation notice.

UDP-Lite is assumed to have no users for 7 years, and DCCP is
orphaned for 7 years too.

Let's add deprecation notice and see if anyone responds to it.
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
…cifs-2.6

Pull smb client fixes from Steve French:
 "Eight, mostly small, smb3 client fixes:

   - important fix for deferred close oops (race with unmount) found
     with xfstest generic/098 to some servers

   - important reconnect fix

   - fix problem with max_credits mount option

   - two multichannel (interface related) fixes

   - one trivial removal of confusing comment

   - two small debugging improvements (to better spot crediting
     problems)"

* tag '6.4-rc6-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: add a warning when the in-flight count goes negative
  cifs: fix lease break oops in xfstest generic/098
  cifs: fix max_credits implementation
  cifs: fix sockaddr comparison in iface_cmp
  smb/client: print "Unknown" instead of bogus link speed value
  cifs: print all credit counters in DebugData
  cifs: fix status checks in cifs_tree_connect
  smb: remove obsolete comment
…ux/kernel/git/tytso/ext4

Pull ext4 fixes from Ted Ts'o:
 "Fix two regressions in ext4, one reported by syzkaller[1], and one
  reported by multiple users (and tracked by regzbot[2])"

[1] https://syzkaller.appspot.com/bug?extid=4acc7d910e617b360859
[2] https://linux-regtracking.leemhuis.info/regzbot/regression/ZIauBR7YiV3rVAHL@glitch/

* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: drop the call to ext4_error() from ext4_get_group_info()
  Revert "ext4: remove unnecessary check in ext4_bg_num_gdb_nometa"
…p.org/agd5f/linux into drm-fixes

amd-drm-fixes-6.4-2023-06-14:

amdgpu:
- GFX9 preemption fixes
- Add missing radeon secondary PCI ID
- vblflash fixes
- SMU 13 fix
- VCN 4.0 fix
- Re-enable TOPDOWN flag for large BAR systems to fix regression
- eDP fix
- PSR hang fix
- DPIA fix

radeon:
- fbdev client warning fix

Signed-off-by: Dave Airlie <[email protected]>
From: Alex Deucher <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
This seems to have existed forever but is now more apparent after
commit 9bff18d ("drm/ttm: use per BO cleanup workers")

My analysis: two threads are running, one in the irq signalling the
fence, in dma_fence_signal_timestamp_locked, it has done the
DMA_FENCE_FLAG_SIGNALLED_BIT setting, but hasn't yet reached the
callbacks.

The second thread in nouveau_cli_work_ready, where it sees the fence is
signalled, so then puts the fence, cleans up the object and frees the
work item, which contains the callback.

Thread one goes again and tries to call the callback and causes the
use-after-free.

Proposed fix: lock the fence signalled check in nouveau_cli_work_ready,
so either the callbacks are done or the memory is freed.

Reviewed-by: Karol Herbst <[email protected]>
Fixes: 11e451e ("drm/nouveau: remove fence wait code from deferred client work handler")
Cc: [email protected]
Signed-off-by: Dave Airlie <[email protected]>
Link: https://lore.kernel.org/dri-devel/[email protected]/
…/git/mchehab/linux-media

Pull media fixes from Mauro Carvalho Chehab:
 "A fix for dvb-core to avoid a race condition during DVB board
  registration"

* tag 'media/v6.4-6' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  Revert "media: dvb-core: Fix use-after-free on race condition at dvb_frontend"
…/kernel/git/broonie/regmap

Pull regmap fix from Mark Brown:
 "Another fix for the maple tree cache, Takashi noticed that unlike
  other caches the maple tree cache didn't check for read only registers
  before trying to sync which would result in spurious syncs for read
  only registers where we don't have a default.

  This was due to the check being open coded in the caches, we now check
  in the shared 'does this register need sync' function so that is fixed
  for this and future caches"

* tag 'regmap-fix-v6.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
  regmap: regcache: Don't sync read-only registers
…nux/kernel/git/broonie/regulator

Pull regulator fix from Mark Brown:
 "The set of regulators described for the Qualcomm PM8550 just seems to
  have been completely wrong and would likely not have worked at all if
  anything tried to actually configure anything except for enabling and
  disabling at runtime"

* tag 'regulator-fix-v6.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: qcom-rpmh: Fix regulators for PM8550
…rnel/git/broonie/spi

Pull spi fixes from Mark Brown:
 "A few more driver specific fixes.

  The DesignWare fix is for an issue introduced by conversion to the
  chip select accessor functions and is pretty important but the other
  two are less severe"

* tag 'spi-fix-v6.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: dw: Replace incorrect spi_get_chipselect with set
  spi: fsl-dspi: avoid SCK glitches with continuous transfers
  spi: cadence-quadspi: Add missing check for dma_set_mask
Linux 6.4

Conflicts:
	fs/xfs/xfs_buf.c
	lib/raid6/algos.c
	scripts/kallsyms.c
	scripts/link-vmlinux.sh
	tools/Makefile
@ddiss ddiss marked this pull request as draft September 8, 2023 22:31
@ddiss
Author

ddiss commented Sep 8, 2023

Looks like I need to go through some of the test failures first before proceeding further. Flagging as draft.

Rebase atop commit 2b5a0e4 ("objtool/idle: Validate __cpuidle code
as noinstr")

Signed-off-by: David Disseldorp <[email protected]>
Rebase atop commit cf587db ("kernel: Allow a kernel thread's name
to be set in copy_process").

Signed-off-by: David Disseldorp <[email protected]>
Rebase atop commit ade1229 ("dma-mapping: no need to pass a
bus_type into get_arch_dma_ops()")

Signed-off-by: David Disseldorp <[email protected]>
@tavip
Member

tavip commented Sep 11, 2023

Thanks for the PR @ddiss ! I suspect the CI issues are due to something missing with the clang integration. Nothing obvious to me at first glance, maybe @rodionov can spot something?

@ddiss
Author

ddiss commented Sep 11, 2023

Thanks for the PR @ddiss ! I suspect the CI issues are due to something missing with the clang integration. Nothing obvious to me at first glance, maybe @rodionov can spot something?

Thanks @tavip ! Assistance with the CI issues would be much appreciated, as my local environment doesn't currently run the entire LKL test suite.
The minor v2 update I just pushed has the following modifications:

  • drop lkl/tests/disk.c parameter check, which isn't relevant for v6.4 (will submit separately)
  • fix checkpatch complaints

@rodionov

Thanks for the PR @ddiss ! I suspect the CI issues are due to something missing with the clang integration. Nothing obvious to me at first glance, maybe @rodionov can spot something?

Indeed, there are some minor changes to scripts/Makefile.clang which might break integration with clang. Investigating this.

@rodionov

rodionov commented Sep 12, 2023

@ddiss this one-liner for tools/lkl/Makefile.autoconf should fix the problem with clang-build and lkl-fuzzers build targets:

diff --git a/tools/lkl/Makefile.autoconf b/tools/lkl/Makefile.autoconf
index a15753193a4f..fc47f12a3093 100644
--- a/tools/lkl/Makefile.autoconf
+++ b/tools/lkl/Makefile.autoconf
@@ -166,7 +166,7 @@ endef
 define do_autoconf_llvm
   $(eval LLVM_PREFIX := $(if $(filter %/,$(LLVM)),$(LLVM)))
   $(eval LLVM_SUFFIX := $(if $(filter -%,$(LLVM)),$(LLVM)))
-  export CROSS_COMPILE := $(CROSS_COMPILE)
+  export CLANG_TARGET_FLAGS_lkl := $(CROSS_COMPILE)
   export CC := $(LLVM_PREFIX)clang$(LLVM_SUFFIX)
   export LD := $(LLVM_PREFIX)ld.lld$(LLVM_SUFFIX)
   export AR := $(LLVM_PREFIX)llvm-ar$(LLVM_SUFFIX)

Due to some changes in scripts/Makefile.clang, the build no longer uses the CROSS_COMPILE variable to pass the --target= argument to the LLVM toolchain. Instead, it uses CLANG_TARGET_FLAGS_$(SRCARCH), which for LKL is effectively CLANG_TARGET_FLAGS_lkl.

Regarding the kasan tests, it's still not clear why they are failing; it doesn't seem to be related to clang.

@tavip
Member

tavip commented Sep 12, 2023

Quick update on kasan test failure: kasan kunit tests have switched to tracepoints in 7ce0ea1 and LKL currently does not support tracepoints. Still looking into what it would take to implement tracepoint support for LKL.

@ddiss
Author

ddiss commented Sep 12, 2023

@rodionov thanks - I've pushed a new commit with your proposed Makefile.autoconf change.
edit: I've retained your authorship and added my sign-off tag to satisfy checkpatch.
edit again: arg, checkpatch still complains as the author sign-off is missing

@zouyonghao

Is it possible to add a new branch or a tag for each Linux release?

@ddiss
Author

ddiss commented Sep 21, 2023

Is it possible to add a new branch or a tag for each Linux release?

I think a signed tag (e.g. lkl-vX.Y) where X.Y matches the corresponding mainline tag would be nice to have.
Branching for each release would be a bit confusing IMO; it might be interpreted as indicating ongoing maintenance for old versions.

@ddiss
Author

ddiss commented Dec 14, 2023

Quick update on kasan test failure: kasan kunit tests have switched to tracepoints in 7ce0ea1 and LKL currently does not support tracepoints. Still looking into what it would take to implement tracepoint support for LKL.

@tavip any luck finding out what's needed for LKL tracepoints? I'm a bit clueless when it comes to tracepoint support, so would greatly appreciate any guidance you could offer.

@thehajime
Member

@ddiss @tavip @rodionov long time no see you all...

Quick update on kasan test failure: kasan kunit tests have switched to tracepoints in 7ce0ea1 and LKL currently does not support tracepoints. Still looking into what it would take to implement tracepoint support for LKL.

@tavip any luck finding out what's needed for LKL tracepoints? I'm a bit clueless when it comes to tracepoint support, so would greatly appreciate any guidance you could offer.

I took a quick look and the following diff makes the kasan test failure go away.

diff --git a/arch/lkl/Kconfig b/arch/lkl/Kconfig
index cb319652ff49..915b5400bbbd 100644
--- a/arch/lkl/Kconfig
+++ b/arch/lkl/Kconfig
@@ -41,6 +41,9 @@ config LKL
        select GENERIC_STRNCPY_FROM_USER
        select GENERIC_STRNLEN_USER
        select HAVE_ARCH_KASAN
+       select TRACING
 
 config LKL_FUZZING
        bool "LLVM fuzzing instrumentation"
diff --git a/tools/lkl/tests/boot.c b/tools/lkl/tests/boot.c
index d8601306ac34..118903dc68bd 100644
--- a/tools/lkl/tests/boot.c
+++ b/tools/lkl/tests/boot.c
@@ -511,7 +511,7 @@ static int lkl_test_kasan(void)
 
        line = strtok(log, "\n");
        while (line) {
-               if (sscanf(line, "[ %*f] ok %*d - kasa%c%c", &c, &d) == 1 &&
+               if (sscanf(line, "[ %*f] ok %*d kasa%c%c", &c, &d) == 1 &&
                           c == 'n') {
                        lkl_test_logf("%s", line);
                        return TEST_SUCCESS;

One thing that I don't understand is the difference in the kasan test report format. I cannot spot where it is coming from...
With the older version, it generates

  [    0.661363] # kasan: pass:43 fail:0 skip:13 total:56
  [    0.661367] # Totals: pass:43 fail:0 skip:13 total:56
  [    0.661371] ok 1 - kasan

while the new one does

  [    0.779125] # kasan: pass:45 fail:0 skip:13 total:58
  [    0.779131] # Totals: pass:45 fail:0 skip:13 total:58
  [    0.779137] ok 1 kasan

(with or without the -)

Hope this helps this PR move forward.
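
One way to avoid depending on which of the two report formats shown above the kernel emits is to accept both. The following is only a sketch of that idea, reusing the sscanf() patterns from the diff; is_kasan_ok_line() is a hypothetical helper and is not part of the patch in this PR.

```c
#include <assert.h>
#include <stdio.h>

/* Returns 1 if line is a passing kasan summary in either report format. */
static int is_kasan_ok_line(const char *line)
{
	char c = 0, d = 0;

	assert(line != NULL);

	/* Older format: "[ time] ok 1 - kasan" */
	if (sscanf(line, "[ %*f] ok %*d - kasa%c%c", &c, &d) == 1 && c == 'n')
		return 1;

	/* Newer format: "[ time] ok 1 kasan" */
	if (sscanf(line, "[ %*f] ok %*d kasa%c%c", &c, &d) == 1 && c == 'n')
		return 1;

	return 0;
}
```

As in the original test, requiring exactly one successful conversion means the line must end right after "kasan".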

@ddiss
Author

ddiss commented May 10, 2024

@thehajime it's great to hear from you and thanks for the patch!
Would you mind if I add it as a separate commit with your authorship / sign off to this PR?

@thehajime
Member

thehajime@d66e6c0

Yes, you can pick it from above.

rodionov and others added 2 commits May 10, 2024 20:24
Due to some changes in scripts/Makefile.clang it no longer uses
CROSS_COMPILE variable to pass --target= argument to llvm toolchain.
Instead, it is using CLANG_TARGET_FLAGS_$(SRCARCH) which effectively is
CLANG_TARGET_FLAGS_lkl.

Signed-off-by: Eugene Rodionov <[email protected]>
Signed-off-by: David Disseldorp <[email protected]>
This commit enables tracepoints and related kconfigs when
kasan is used.

Fixes: 7ce0ea1 ("kasan: switch kunit tests to console tracepoints")
Signed-off-by: Hajime Tazaki <[email protected]>
@ddiss
Author

ddiss commented May 10, 2024

thehajime@d66e6c0

Yes, you can pick it from above.

Done. I've also updated @rodionov 's Makefile.autoconf commit to include an author sign off as per #530 (comment) .

@ddiss ddiss marked this pull request as ready for review May 10, 2024 10:43
@thehajime
Member

lgtm. I'll wait for @tavip and @rodionov to review.

@tavip
Member

tavip commented May 10, 2024

LGTM, thanks @thehajime for tracking this down!

@rodionov

LGTM. Thanks, folks!

@thehajime thehajime merged commit 6757641 into lkl:master May 12, 2024
13 checks passed
@thehajime
Member

thanks all !
I'll take a look at further upstream merges (6.6, the longterm one especially) in my spare time.
