Replies: 50 comments 1 reply
-
@garceri Hello,
For all these outputs, please use Markdown formatting.
-
I think Genoa and Zen4/Hawk Point (#84) are missing the temperature for the same reason: no known thermal register.
-
Meanwhile, I can fix the Vcore voltage if you provide the CLI output requested above.
-
Can you please pull
-
Which temperature are we reading when the system is idle?
-
I let the server idle for ten minutes. The temperature readings reported by corefreq seem to be off. This is what lm-sensors reports:
And finally, this is what corefreq reports:
-
The first generation needed a temperature offset. Perhaps it is the same with Genoa. Can you compile my SMU tool zencli?
cc zencli.c -o zencli
As root, you can then peek the only thermal registers I know of:
## since Zen gen1
zencli smu 0x59800
## per CCD
zencli smu 0x59954
zencli smu 0x59958
zencli smu 0x5995C
zencli smu 0x59960
zencli smu 0x59964
zencli smu 0x59968
zencli smu 0x5996C
## Family 19h APU
zencli smu 0x59B08
By default, CoreFreq shows the relative frequency.
-
What is the reported MHz relative to? Here are the zencli outputs you requested:
-
Can you also dump the CCD range?
## per CCD
zencli smu 0x59954
zencli smu 0x59958
zencli smu 0x5995C
zencli smu 0x59960
zencli smu 0x59964
zencli smu 0x59968
zencli smu 0x5996C
-
I see: https://elixir.bootlin.com/linux/latest/source/drivers/hwmon/k10temp.c#L475
So the SMU address is 0x59B00 + (CCD number * 4).
-
Fortunately, the thermal function was already written for Raphael.
-
It may be a wrong register address. At Line 208 in 87344fa, replace the code with this one:
#define SMU_AMD_THM_TCTL_CCD_REGISTER_F19H_61H \
	(SMU_AMD_THM_TCTL_REGISTER_F17H + 0x300)
Next, please rebuild, unload, reload, and test the temperature.
EDIT: For DIMM geometry, can you also peek these addresses?
## Prior to Zen4
./zencli smu 0x50030
./zencli smu 0x50034
./zencli smu 0x50038
./zencli smu 0x5003C
## Since Zen4
./zencli smu 0x50040
./zencli smu 0x50044
./zencli smu 0x50048
./zencli smu 0x5004C
-
I pulled and recompiled the commit you mentioned (7c926af from develop) and it seems to have improved a bit, but there are still some cores whose temperatures are reported as 0 C.
Here are the zencli peeks you requested:
-
P-States
You can reprogram P-States from the UI, but also using the driver parameters. For example, your EPYC default P-States were discovered as below.
Now suppose you want to alter these two enabled P-States to respectively
insmod build/corefreqk.ko Register_ClockSource=1 \
Register_Governor=1 Register_CPU_Idle=1 Register_CPU_Freq=1 \
Turbo_Ratio_Unlock=1 TurboBoost_Enable="0,1" \
Ratio_Boost="-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,12,30"
Remarks
I hope you won't find this method too complicated.
-
It will be some time until I can test it: the server is running production workloads and I'll need to take it offline to test this.
-
The actions have been moved to the todo list.
To refresh the EPYC 9274F Wiki, however, can you pull the latest from
corefreq-cli -s -n -m -n -z -n -B -n -M -n -k -n -C 1 -n -i 1 -n -V 1 -n -W 1 -n -R
-
Here ya go:
-
Commit dc6280f fixes the spacing in the bottom area of the Power Monitoring view, especially when a three-digit power value is measured. Format:
CPU Freq(MHz) Accumulator Min Energy(J) Max Min Power(W) Max
000 3798.08 000000000000411306 0.00 6.28 8.87 0.00 6.28 8.87
001 3682.77 000000000000496376 0.00 7.57 8.20 0.00 7.57 8.20
002 3850.84 000000000000417714 0.00 6.37 9.53 0.00 6.37 9.53
003 3830.85 000000000000496311 0.00 7.57 8.59 0.00 7.57 8.59
004 3860.38 000000000000437088 0.00 6.67 8.64 0.00 6.67 8.64
005 3826.94 000000000000448859 0.00 6.85 8.19 0.00 6.85 8.19
006 3951.14 000000000000407283 0.00 6.21 8.10 0.00 6.21 8.10
007 3820.92 000000000000512015 0.00 7.81 8.20 0.00 7.81 8.20
008 3838.89 000000000000453518 0.00 6.92 7.74 0.00 6.92 7.74
009 3807.62 000000000000450171 0.00 6.87 8.38 0.00 6.87 8.38
010 3969.38 000000000000451074 0.00 6.88 7.74 0.00 6.88 7.74
011 4025.41 000000000000339855 0.00 5.19 7.88 0.00 5.19 7.88
012 3892.19 000000000000417007 0.00 6.36 7.74 0.00 6.36 7.74
013 3905.71 000000000000386342 0.00 5.90 7.68 0.00 5.90 7.68
014 3726.84 000000000000527824 0.00 8.05 9.04 0.00 8.05 9.04
015 3908.34 000000000000494423 0.00 7.54 8.21 0.00 7.54 8.21
016 3984.11 000000000000000000 0.00 0.00 0.00 0.00 0.00 0.00
017 3749.12 000000000000000000 0.00 0.00 0.00 0.00 0.00 0.00
018 3783.17 000000000000000000 0.00 0.00 0.00 0.00 0.00 0.00
019 3755.73 000000000000000000 0.00 0.00 0.00 0.00 0.00 0.00
020 3774.03 000000000000000000 0.00 0.00 0.00 0.00 0.00 0.00
021 3810.57 000000000000000000 0.00 0.00 0.00 0.00 0.00 0.00
022 4013.79 000000000000000000 0.00 0.00 0.00 0.00 0.00 0.00
023 3953.22 000000000000000000 0.00 0.00 0.00 0.00 0.00 0.00
024 3711.33 000000000000000000 0.00 0.00 0.00 0.00 0.00 0.00
025 4015.93 000000000000000000 0.00 0.00 0.00 0.00 0.00 0.00
026 3878.68 000000000000000000 0.00 0.00 0.00 0.00 0.00 0.00
027 3880.01 000000000000000000 0.00 0.00 0.00 0.00 0.00 0.00
028 3820.70 000000000000000000 0.00 0.00 0.00 0.00 0.00 0.00
029 3956.70 000000000000000000 0.00 0.00 0.00 0.00 0.00 0.00
030 3721.36 000000000000000000 0.00 0.00 0.00 0.00 0.00 0.00
031 3987.81 000000000000000000 0.00 0.00 0.00 0.00 0.00 0.00
Energy(J) Package[0] Cores Uncore Memory
17.68 145.5 145.50 0.02 109.1 126.85 10.50 17.1 18.67 0.00 0.0 0.00
Power(W)
17.68 145.5 145.50 0.02 109.1 126.85 10.50 17.1 18.67 0.00 0.0 0.00
Can you please show me yours?
-
Compiled with that commit:
-
Thank you. Version
-
It looks like some bits have been architecturally fixed since kernel
I'm now reading the
Is it still the case on Genoa?
-
Hello,
In this AMD HSMP source code, I'm reading an address exception for Zen family 1A.
Can you edit and replace the value of Line 222 in 3a25d36 with this value:
#define SMU_HSMP_CMD 0x3b10934
I would also like to change Line 7892 in 3a25d36 as below:
switch (PUBLIC(RO(Proc))->ArchID) {
case AMD_Zen4_PHX2:
case AMD_Zen4_HWK:
case AMD_Zen4_PHX:
case AMD_Zen4_RPL:
case AMD_Zen3Plus_RMB:
case AMD_Zen3_VMR:
case AMD_Zen2_MTS:
	Core_AMD_SMN_Read(XtraCOF,
			SMU_AMD_F17H_MATISSE_COF,
			PRIVATE(OF(Zen)).Device.DF);
	break;
case AMD_Zen4_Bergamo:
case AMD_EPYC_Rome_CPK:
case AMD_Zen4_Genoa:
	Core_AMD_SMN_Read(XtraCOF,
			SMU_AMD_F17H_ZEN2_MCM_COF,
			PRIVATE(OF(Zen)).Device.DF);
	break;
}
Rebuild and reload CoreFreq.
Finally, post the output of
Thank you for helping.
-
Will do as soon as I can, pleased to help!
-
Here you go, sorry for the delay:
-
Thanks.
-
@garceri Hello. No clue about the
EDIT: you may also find thermal limits in the BMC screens and/or the CPU cooler datasheet.
-
Source: BIOS / BMC / Bundle Firmware for MBD-H13SSL-N : IPMI Firmware / BIOS Release Notes
-
I can't get any temperature readings under Ubuntu 22.04 running kernel 6.5