Merge tag 'x86_cache_for_v6.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 resource control updates from Borislav Petkov:
 "Add support on AMD for assigning QoS bandwidth counters to resources
  (RMIDs) with the ability for those resources to be tracked by the
  counters as long as they're assigned to them.

  Previously, due to hw limitations, bandwidth counts from untracked
  resources would get lost when those resources are not tracked.

  Refactor the code and user interfaces to be able to also support
  other, similar features on ARM, for example"

* tag 'x86_cache_for_v6.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (35 commits)
  fs/resctrl: Fix counter auto-assignment on mkdir with mbm_event enabled
  MAINTAINERS: resctrl: Add myself as reviewer
  x86/resctrl: Configure mbm_event mode if supported
  fs/resctrl: Introduce the interface to switch between monitor modes
  fs/resctrl: Disable BMEC event configuration when mbm_event mode is enabled
  fs/resctrl: Introduce the interface to modify assignments in a group
  fs/resctrl: Introduce mbm_L3_assignments to list assignments in a group
  fs/resctrl: Auto assign counters on mkdir and clean up on group removal
  fs/resctrl: Introduce mbm_assign_on_mkdir to enable assignments on mkdir
  fs/resctrl: Provide interface to update the event configurations
  fs/resctrl: Add event configuration directory under info/L3_MON/
  fs/resctrl: Support counter read/reset with mbm_event assignment mode
  x86/resctrl: Implement resctrl_arch_reset_cntr() and resctrl_arch_cntr_read()
  x86/resctrl: Refactor resctrl_arch_rmid_read()
  fs/resctrl: Introduce counter ID read, reset calls in mbm_event mode
  fs/resctrl: Pass struct rdtgroup instead of individual members
  fs/resctrl: Add the functionality to unassign MBM events
  fs/resctrl: Add the functionality to assign MBM events
  x86,fs/resctrl: Implement resctrl_arch_config_cntr() to assign a counter with ABMC
  fs/resctrl: Introduce event configuration field in struct mon_evt
  ...
@@ -26,6 +26,7 @@ MBM (Memory Bandwidth Monitoring) "cqm_mbm_total", "cqm_mbm_local"
MBA (Memory Bandwidth Allocation)               "mba"
SMBA (Slow Memory Bandwidth Allocation)         ""
BMEC (Bandwidth Monitoring Event Configuration) ""
ABMC (Assignable Bandwidth Monitoring Counters) ""
=============================================== ================================

Historically, new features were made visible by default in /proc/cpuinfo. This
@@ -256,6 +257,144 @@ with the following files:
          # cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
          0=0x30;1=0x30;3=0x15;4=0x15

"mbm_assign_mode":
        The supported counter assignment modes. The enclosed brackets indicate which mode
        is enabled. The MBM events associated with counters may reset when "mbm_assign_mode"
        is changed.
        ::

          # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
          [mbm_event]
          default

        "mbm_event":

        mbm_event mode allows users to assign a hardware counter to an RMID, event
        pair and monitor the bandwidth usage as long as it is assigned. The hardware
        continues to track the assigned counter until it is explicitly unassigned by
        the user. Each event within a resctrl group can be assigned independently.

        In this mode, a monitoring event can only accumulate data while it is backed
        by a hardware counter. Use "mbm_L3_assignments" found in each CTRL_MON and MON
        group to specify which of the events should have a counter assigned. The number
        of counters available is described in the "num_mbm_cntrs" file. Changing the
        mode may cause all counters on the resource to reset.

        Moving to mbm_event counter assignment mode requires users to assign the counters
        to the events. Otherwise, the MBM event counters will return 'Unassigned' when read.

        The mode is beneficial for AMD platforms that support more CTRL_MON
        and MON groups than available hardware counters. By default, this
        feature is enabled on AMD platforms with the ABMC (Assignable Bandwidth
        Monitoring Counters) capability, ensuring counters remain assigned even
        when the corresponding RMID is not actively used by any processor.

        "default":

        In default mode, resctrl assumes there is a hardware counter for each
        event within every CTRL_MON and MON group. On AMD platforms, it is
        recommended to use the mbm_event mode, if supported, to prevent reset of MBM
        events between reads resulting from hardware re-allocating counters. Such
        re-allocation can result in misleading values or in "Unavailable" being
        displayed if no counter is assigned to the event.

        * To enable "mbm_event" counter assignment mode:
          ::

            # echo "mbm_event" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode

        * To enable "default" monitoring mode:
          ::

            # echo "default" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
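
The bracket convention of "mbm_assign_mode" can be read by a script. The following is a minimal Python sketch, not part of resctrl: the helper name `parse_assign_mode` is hypothetical, and it assumes the file lists one mode per line with the enabled mode in brackets, as in the example output.

```python
def parse_assign_mode(text):
    """Return (enabled_mode, supported_modes) from mbm_assign_mode contents.

    Assumption: one mode per line, the enabled mode wrapped in [brackets].
    """
    enabled = None
    supported = []
    for line in text.splitlines():
        name = line.strip()
        if not name:
            continue
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            enabled = name
        supported.append(name)
    return enabled, supported


# Sample contents copied from the documentation example.
sample = "[mbm_event]\ndefault\n"
enabled, supported = parse_assign_mode(sample)
print("enabled:", enabled, "supported:", supported)
```

On a real system the text would come from reading /sys/fs/resctrl/info/L3_MON/mbm_assign_mode; the sample string keeps the sketch runnable without resctrl mounted.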

"num_mbm_cntrs":
        The maximum number of counters (total of available and assigned counters) in
        each domain when the system supports mbm_event mode.

        For example, on a system with a maximum of 32 memory bandwidth monitoring
        counters in each of its L3 domains:
        ::

          # cat /sys/fs/resctrl/info/L3_MON/num_mbm_cntrs
          0=32;1=32

"available_mbm_cntrs":
        The number of counters available for assignment in each domain when mbm_event
        mode is enabled on the system.

        For example, on a system with 30 assignable hardware counters available
        in each of its L3 domains:
        ::

          # cat /sys/fs/resctrl/info/L3_MON/available_mbm_cntrs
          0=30;1=30
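
Because "num_mbm_cntrs" is the total of available and assigned counters, the number of counters currently assigned in each domain follows by subtraction. The Python sketch below illustrates this under the assumption that both files use the `<domain>=<count>;<domain>=<count>` layout shown in the examples; the function names are hypothetical.

```python
def parse_domain_counts(text):
    """Parse a per-domain list such as '0=32;1=32' into {0: 32, 1: 32}."""
    counts = {}
    for field in text.strip().split(";"):
        if not field:
            continue
        domain, value = field.split("=", 1)
        counts[int(domain)] = int(value)
    return counts


def counters_in_use(num_text, available_text):
    """Per-domain assigned counters: total minus available."""
    total = parse_domain_counts(num_text)
    free = parse_domain_counts(available_text)
    return {domain: total[domain] - free[domain] for domain in total}


# Sample values copied from the documentation examples.
print(counters_in_use("0=32;1=32", "0=30;1=30"))
```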

"event_configs":
        Directory that exists when "mbm_event" counter assignment mode is supported.
        Contains a sub-directory for each MBM event that can be assigned to a counter.

        Two MBM events are supported by default: mbm_local_bytes and mbm_total_bytes.
        Each MBM event's sub-directory contains a file named "event_filter" that is
        used to view and modify which memory transactions the MBM event is configured
        with. The file is accessible only when "mbm_event" counter assignment mode is
        enabled.

        List of memory transaction types supported:

        ==========================  ========================================================
        Name                        Description
        ==========================  ========================================================
        dirty_victim_writes_all     Dirty Victims from the QOS domain to all types of memory
        remote_reads_slow_memory    Reads to slow memory in the non-local NUMA domain
        local_reads_slow_memory     Reads to slow memory in the local NUMA domain
        remote_non_temporal_writes  Non-temporal writes to non-local NUMA domain
        local_non_temporal_writes   Non-temporal writes to local NUMA domain
        remote_reads                Reads to memory in the non-local NUMA domain
        local_reads                 Reads to memory in the local NUMA domain
        ==========================  ========================================================

        For example::

          # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_total_bytes/event_filter
          local_reads,remote_reads,local_non_temporal_writes,remote_non_temporal_writes,
          local_reads_slow_memory,remote_reads_slow_memory,dirty_victim_writes_all

          # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_local_bytes/event_filter
          local_reads,local_non_temporal_writes,local_reads_slow_memory

        Modify the event configuration by writing to the "event_filter" file within
        the "event_configs" directory. The read/write "event_filter" file contains the
        configuration of the event that reflects which memory transactions are counted by it.

        For example::

          # echo "local_reads, local_non_temporal_writes" > \
              /sys/fs/resctrl/info/L3_MON/event_configs/mbm_total_bytes/event_filter

          # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_total_bytes/event_filter
          local_reads,local_non_temporal_writes
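
A script that edits "event_filter" may want to check names against the transaction types in the table before writing. The following Python sketch shows one way to do that. The helper names are hypothetical, and the comma-joined output with no spaces mirrors what the file shows on read; the write examples in the text use a space after each comma, so treating both forms as acceptable input is an assumption.

```python
# Transaction type names taken from the table in the documentation.
TRANSACTION_TYPES = (
    "local_reads",
    "remote_reads",
    "local_non_temporal_writes",
    "remote_non_temporal_writes",
    "local_reads_slow_memory",
    "remote_reads_slow_memory",
    "dirty_victim_writes_all",
)


def parse_event_filter(text):
    """Split event_filter contents into names, tolerating spaces and line wraps."""
    names = [name.strip() for name in text.replace("\n", ",").split(",")]
    return [name for name in names if name]


def format_event_filter(names):
    """Validate names and join them with commas for writing to event_filter."""
    unknown = [name for name in names if name not in TRANSACTION_TYPES]
    if unknown:
        raise ValueError(f"unknown memory transaction type(s): {unknown}")
    return ",".join(names)


current = parse_event_filter("local_reads,local_non_temporal_writes,local_reads_slow_memory\n")
print(format_event_filter(current + ["remote_reads"]))
```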

"mbm_assign_on_mkdir":
        Exists when "mbm_event" counter assignment mode is supported. Accessible
        only when "mbm_event" counter assignment mode is enabled.

        Determines if a counter will automatically be assigned to an RMID, MBM event
        pair when its associated monitor group is created via mkdir. Enabled by default
        on boot, also when switched from "default" mode to "mbm_event" counter assignment
        mode. Users can disable this capability by writing to the interface.

        "0":
                Auto assignment is disabled.
        "1":
                Auto assignment is enabled.

        Example::

          # echo 0 > /sys/fs/resctrl/info/L3_MON/mbm_assign_on_mkdir
          # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_on_mkdir
          0

"max_threshold_occupancy":
                Read/write file provides the largest value (in
                bytes) at which a previously used LLC_occupancy
@@ -380,10 +519,77 @@ When monitoring is enabled all MON groups will also contain:
        for the L3 cache they occupy). These are named "mon_sub_L3_YY"
        where "YY" is the node number.

        When the 'mbm_event' counter assignment mode is enabled, reading
        an MBM event of a MON group returns 'Unassigned' if no hardware
        counter is assigned to it. For CTRL_MON groups, 'Unassigned' is
        returned if the MBM event does not have an assigned counter in the
        CTRL_MON group or in any of its associated MON groups.
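
Since an MBM event file can return either a byte count or a status word such as 'Unassigned', a reader has to tell the two apart. The Python sketch below is one possible way to do so; `parse_mbm_reading` is a hypothetical helper, and treating every non-numeric value as a status string is an assumption.

```python
def parse_mbm_reading(text):
    """Return (byte_count, None) for a numeric read, else (None, status)."""
    value = text.strip()
    if value.isdigit():
        return int(value), None
    return None, value


# Sample values in the style of the documentation examples.
for raw in ("779247936\n", "Unassigned\n"):
    count, status = parse_mbm_reading(raw)
    print(count, status)
```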

"mon_hw_id":
        Available only with debug option. The identifier used by hardware
        for the monitor group. On x86 this is the RMID.

When monitoring is enabled all MON groups may also contain:

"mbm_L3_assignments":
        Exists when "mbm_event" counter assignment mode is supported and lists the
        counter assignment states of the group.

        The assignment list is displayed in the following format:

        <Event>:<Domain ID>=<Assignment state>;<Domain ID>=<Assignment state>

        Event: A valid MBM event in the
               /sys/fs/resctrl/info/L3_MON/event_configs directory.

        Domain ID: A valid domain ID. When writing, '*' applies the changes
                   to all the domains.

        Assignment states:

        _ : No counter assigned.

        e : Counter assigned exclusively.

        Example:

        To display the counter assignment states for the default group.
        ::

          # cd /sys/fs/resctrl
          # cat /sys/fs/resctrl/mbm_L3_assignments
          mbm_total_bytes:0=e;1=e
          mbm_local_bytes:0=e;1=e

        Assignments can be modified by writing to the interface.

        Examples:

        To unassign the counter associated with the mbm_total_bytes event on domain 0:
        ::

          # echo "mbm_total_bytes:0=_" > /sys/fs/resctrl/mbm_L3_assignments
          # cat /sys/fs/resctrl/mbm_L3_assignments
          mbm_total_bytes:0=_;1=e
          mbm_local_bytes:0=e;1=e

        To unassign the counter associated with the mbm_total_bytes event on all the domains:
        ::

          # echo "mbm_total_bytes:*=_" > /sys/fs/resctrl/mbm_L3_assignments
          # cat /sys/fs/resctrl/mbm_L3_assignments
          mbm_total_bytes:0=_;1=_
          mbm_local_bytes:0=e;1=e

        To assign a counter associated with the mbm_total_bytes event on all domains in
        exclusive mode:
        ::

          # echo "mbm_total_bytes:*=e" > /sys/fs/resctrl/mbm_L3_assignments
          # cat /sys/fs/resctrl/mbm_L3_assignments
          mbm_total_bytes:0=e;1=e
          mbm_local_bytes:0=e;1=e
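
The `<Event>:<Domain ID>=<Assignment state>` format is regular enough to parse and generate mechanically. The following Python sketch assumes exactly the layout shown in the examples; the helper names are hypothetical, and `format_assignment` only builds the single-domain and `*` forms that the examples demonstrate.

```python
def parse_assignments(text):
    """Parse mbm_L3_assignments contents into {event: {domain_id: state}}."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        event, domains = line.split(":", 1)
        states = {}
        for field in domains.split(";"):
            if not field:
                continue
            domain, state = field.split("=", 1)
            states[int(domain)] = state
        result[event] = states
    return result


def format_assignment(event, state, domain=None):
    """Build a line to write; domain=None selects all domains via '*'."""
    if state not in ("_", "e"):
        raise ValueError("state must be '_' (unassigned) or 'e' (exclusive)")
    target = "*" if domain is None else str(domain)
    return f"{event}:{target}={state}"


listing = "mbm_total_bytes:0=_;1=e\nmbm_local_bytes:0=e;1=e\n"
print(parse_assignments(listing))
print(format_assignment("mbm_total_bytes", "e"))
```

The string returned by `format_assignment` is what a script would write to the group's mbm_L3_assignments file, in the same way as the `echo` commands above.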

When the "mba_MBps" mount option is used all CTRL_MON groups will also contain:

"mba_MBps_event":
@@ -1429,6 +1635,125 @@ View the llc occupancy snapshot::
  # cat /sys/fs/resctrl/p1/mon_data/mon_L3_00/llc_occupancy
  11234000


Examples on working with mbm_assign_mode
========================================

a. Check if MBM counter assignment mode is supported.
::

  # mount -t resctrl resctrl /sys/fs/resctrl/

  # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
  [mbm_event]
  default

The "mbm_event" mode is detected and enabled.

b. Check how many assignable counters are supported.
::

  # cat /sys/fs/resctrl/info/L3_MON/num_mbm_cntrs
  0=32;1=32

c. Check how many assignable counters are available for assignment in each domain.
::

  # cat /sys/fs/resctrl/info/L3_MON/available_mbm_cntrs
  0=30;1=30

d. To list the default group's assign states.
::

  # cat /sys/fs/resctrl/mbm_L3_assignments
  mbm_total_bytes:0=e;1=e
  mbm_local_bytes:0=e;1=e

e. To unassign the counter associated with the mbm_total_bytes event on domain 0.
::

  # echo "mbm_total_bytes:0=_" > /sys/fs/resctrl/mbm_L3_assignments
  # cat /sys/fs/resctrl/mbm_L3_assignments
  mbm_total_bytes:0=_;1=e
  mbm_local_bytes:0=e;1=e

f. To unassign the counter associated with the mbm_total_bytes event on all domains.
::

  # echo "mbm_total_bytes:*=_" > /sys/fs/resctrl/mbm_L3_assignments
  # cat /sys/fs/resctrl/mbm_L3_assignments
  mbm_total_bytes:0=_;1=_
  mbm_local_bytes:0=e;1=e

g. To assign a counter associated with the mbm_total_bytes event on all domains in
exclusive mode.
::

  # echo "mbm_total_bytes:*=e" > /sys/fs/resctrl/mbm_L3_assignments
  # cat /sys/fs/resctrl/mbm_L3_assignments
  mbm_total_bytes:0=e;1=e
  mbm_local_bytes:0=e;1=e

h. Read the events mbm_total_bytes and mbm_local_bytes of the default group. There is
no change in reading the events with the assignment.
::

  # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
  779247936
  # cat /sys/fs/resctrl/mon_data/mon_L3_01/mbm_total_bytes
  562324232
  # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
  212122123
  # cat /sys/fs/resctrl/mon_data/mon_L3_01/mbm_local_bytes
  121212144

i. Check the event configurations.
::

  # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_total_bytes/event_filter
  local_reads,remote_reads,local_non_temporal_writes,remote_non_temporal_writes,
  local_reads_slow_memory,remote_reads_slow_memory,dirty_victim_writes_all

  # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_local_bytes/event_filter
  local_reads,local_non_temporal_writes,local_reads_slow_memory

j. Change the event configuration for mbm_local_bytes.
::

  # echo "local_reads, local_non_temporal_writes, local_reads_slow_memory, remote_reads" > \
      /sys/fs/resctrl/info/L3_MON/event_configs/mbm_local_bytes/event_filter

  # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_local_bytes/event_filter
  local_reads,local_non_temporal_writes,local_reads_slow_memory,remote_reads

k. Now read the local events again. The first read may come back with "Unavailable"
status. The subsequent read of mbm_local_bytes will display the current value.
::

  # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
  Unavailable
  # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
  2252323
  # cat /sys/fs/resctrl/mon_data/mon_L3_01/mbm_local_bytes
  Unavailable
  # cat /sys/fs/resctrl/mon_data/mon_L3_01/mbm_local_bytes
  1566565
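
A monitoring script can absorb the initial "Unavailable" read by trying again. The Python sketch below illustrates the idea only: `read_until_available` is a hypothetical helper, the reader is passed in as a callable so the example runs without resctrl mounted, and the choice of two attempts is an assumption based on the sequence shown in step k.

```python
def read_until_available(read_once, attempts=2):
    """Call read_once() until it stops returning 'Unavailable'.

    Returns the byte count as an int, or None if no numeric value was read.
    """
    for _ in range(attempts):
        value = read_once().strip()
        if value != "Unavailable":
            return int(value) if value.isdigit() else None
    return None


# Simulated reads copied from step k: first 'Unavailable', then a value.
replies = iter(["Unavailable\n", "2252323\n"])
print(read_until_available(lambda: next(replies)))
```

On a real system `read_once` would open and read the mbm_local_bytes file of the chosen domain.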

l. Users have the option to go back to 'default' mbm_assign_mode if required. This can be
done using the following command. Note that switching the mbm_assign_mode may reset all
the MBM counters (and thus all MBM events) of all the resctrl groups.
::

  # echo "default" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
  # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
  mbm_event
  [default]

m. Unmount the resctrl filesystem.
::

  # umount /sys/fs/resctrl/

Intel RDT Errata
================