1
0
Fork 0
mirror of synced 2025-03-06 20:59:54 +01:00
Commit graph

15 commits

Author SHA1 Message Date
Srinivas Pandruvada
01c10f88c9 platform/x86/intel-uncore-freq: tpmi: Provide cluster level control
The new generation of CPUs have granular control at a cluster level.
Each package/die can have multiple power domains, which further can
have multiple fabric clusters. The TPMI interface allows control at
fabric cluster level.

Use the updated uncore sysfs feature to expose controls at cluster
level. At each cluster level there is a control for maximum and minimum
uncore frequency. Also present current uncore frequency at a cluster
level.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Tested-by: Wendy Wang <wendy.wang@intel.com>
Link: https://lore.kernel.org/r/20230418171340.681662-4-srinivas.pandruvada@linux.intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-05-09 11:54:42 +02:00
Srinivas Pandruvada
9b8dea80e3 platform/x86/intel-uncore-freq: Support for cluster level controls
An SoC can contain multiple power domains with individual or collection
of mesh partitions. This partition is called fabric cluster.

Certain type of meshes will need to run at the same frequency, they will
be placed in the same fabric cluster. Benefit of fabric cluster is that
it offers a scalable mechanism to deal with partitioned fabrics in a SoC.

The current sysfs interface supports control at package and die level.
This interface is not enough to support more granular control at
fabric cluster level.

SoCs with the support of TPMI (Topology Aware Register and PM Capsule
Interface), can have multiple power domains. Each power domain can
contain one or more fabric clusters.

To support such granular controls, enhance uncore common to optionally
create new directories to provide controls at fabric cluster level. It
is also important to have flexibility to change granularity for future
version of SoCs. If the directory name contains scope like:
"package_*_die_*_power_domain_*_cluster_*", then this is not expandable.

The cpufreq policies also have different scopes. There the scope of the
policy (affected_cpus) specified by attributes inside each policy.
So, follow the same model for uncore frequency scaling sysfs as:
"sys/devices/system/cpu/cpufreq/policy*"

Allow client drivers to optionally support granular control for each
fabric cluster. Here, the directory name will be "uncore" suffixed with
an unique instance number. For example: uncore00, uncore01 etc.
Attributes in the directory identify package id, power domain and
fabric cluster id. This interface is expandable even if some new level
of granularity is introduced. A new sysfs attribute can identify new
level.

For compatibility with the existing sysfs and provide easy way to set
limits for each fabric cluster in the package/die, the existing control
at package/die levels are still provided. For majority of users, this is
an easy approach.

For example: On a single package/die system, with three power domains
and one fabric cluster per power domain:

$tree -L 2 /sys/devices/system/cpu/intel_uncore_frequency/
/sys/devices/system/cpu/intel_uncore_frequency/
├── package_00_die_00
│   ├── current_freq_khz
│   ├── initial_max_freq_khz
│   ├── initial_min_freq_khz
│   ├── max_freq_khz
│   └── min_freq_khz
├── uncore00
│   ├── current_freq_khz
│   ├── domain_id
│   ├── fabric_cluster_id
│   ├── initial_max_freq_khz
│   ├── initial_min_freq_khz
│   ├── max_freq_khz
│   ├── min_freq_khz
│   └── package_id
├── uncore01
│   ├── current_freq_khz
│   ├── domain_id
│   ├── fabric_cluster_id
│   ├── initial_max_freq_khz
│   ├── initial_min_freq_khz
│   ├── max_freq_khz
│   ├── min_freq_khz
│   └── package_id
└── uncore02
    ├── current_freq_khz
    ├── domain_id
    ├── fabric_cluster_id
    ├── initial_max_freq_khz
    ├── initial_min_freq_khz
    ├── max_freq_khz
    ├── min_freq_khz
    └── package_id

The attribute for cluster id is "fabric_cluster_id" instead of just
"cluster_id" is to avoid confusion with usage of term clusters in
other part of the Linux kernel.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Tested-by: Wendy Wang <wendy.wang@intel.com>
Link: https://lore.kernel.org/r/20230418171340.681662-3-srinivas.pandruvada@linux.intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-05-09 11:54:42 +02:00
Srinivas Pandruvada
8a54e2253e platform/x86/intel-uncore-freq: Uncore frequency control via TPMI
Implement support of uncore frequency control via TPMI (Topology Aware
Register and PM Capsule Interface). This driver provides the similar
functionality as the current uncore frequency driver using MSRs.

The hardware interface to read/write is basically substitution of MSR
0x620 and 0x621. There are specific MMIO offset and bits to get/set
minimum and maximum uncore ratio, similar to MSRs.

The scope of the uncore MSRs is package/die. But new generation of CPUs
have more granular control at a cluster level. Each package/die can have
multiple power domains, which further can have multiple clusters. The
TPMI interface allows control at cluster level.

The primary use case for uncore sysfs is to set maximum and minimum
uncore frequency to reduce power consumption or latency. The current
uncore sysfs control is per package/die. This is enough for the majority
of users as workload will move to different power domains as it moves
between different CPUs.

The current uncore sysfs provides controls at package/die level. When
user sets maximum/minimum limits, the driver sets the same limits to
each cluster.

Here number of power domains = number of resources in this aux device.
There are offsets and bits to discover number of clusters and offset for
each cluster level controls.

The TPMI documentation can be downloaded from:
https://github.com/intel/tpmi_power_management

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Tested-by: Wendy Wang <wendy.wang@intel.com>
Link: https://lore.kernel.org/r/20230420220514.747573-1-srinivas.pandruvada@linux.intel.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-05-09 11:54:42 +02:00
Srinivas Pandruvada
75e406b540 platform/x86/intel-uncore-freq: Return error on write frequency
Currently when the uncore_write() returns error, it is silently
ignored. Return error to user space when uncore_write() fails.

Fixes: 49a474c7ba ("platform/x86: Add support for Uncore frequency control")
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Tested-by: Wendy Wang <wendy.wang@intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20230418153230.679094-1-srinivas.pandruvada@linux.intel.com
Cc: stable@vger.kernel.org
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-05-09 10:53:58 +02:00
Linus Torvalds
556eb8b791 Driver core changes for 6.4-rc1
Here is the large set of driver core changes for 6.4-rc1.
 
 Once again, a busy development cycle, with lots of changes happening in
 the driver core in the quest to be able to move "struct bus" and "struct
 class" into read-only memory, a task now complete with these changes.
 
 This will make the future rust interactions with the driver core more
 "provably correct" as well as providing more obvious lifetime rules for
 all busses and classes in the kernel.
 
 The changes required for this did touch many individual classes and
 busses as many callbacks were changed to take const * parameters
 instead.  All of these changes have been submitted to the various
 subsystem maintainers, giving them plenty of time to review, and most of
 them actually did so.
 
 Other than those changes, included in here are a small set of other
 things:
   - kobject logging improvements
   - cacheinfo improvements and updates
   - obligatory fw_devlink updates and fixes
   - documentation updates
   - device property cleanups and const * changes
   - firwmare loader dependency fixes.
 
 All of these have been in linux-next for a while with no reported
 problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZEp7Sw8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ykitQCfamUHpxGcKOAGuLXMotXNakTEsxgAoIquENm5
 LEGadNS38k5fs+73UaxV
 =7K4B
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here is the large set of driver core changes for 6.4-rc1.

  Once again, a busy development cycle, with lots of changes happening
  in the driver core in the quest to be able to move "struct bus" and
  "struct class" into read-only memory, a task now complete with these
  changes.

  This will make the future rust interactions with the driver core more
  "provably correct" as well as providing more obvious lifetime rules
  for all busses and classes in the kernel.

  The changes required for this did touch many individual classes and
  busses as many callbacks were changed to take const * parameters
  instead. All of these changes have been submitted to the various
  subsystem maintainers, giving them plenty of time to review, and most
  of them actually did so.

  Other than those changes, included in here are a small set of other
  things:

   - kobject logging improvements

   - cacheinfo improvements and updates

   - obligatory fw_devlink updates and fixes

   - documentation updates

   - device property cleanups and const * changes

   - firwmare loader dependency fixes.

  All of these have been in linux-next for a while with no reported
  problems"

* tag 'driver-core-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (120 commits)
  device property: make device_property functions take const device *
  driver core: update comments in device_rename()
  driver core: Don't require dynamic_debug for initcall_debug probe timing
  firmware_loader: rework crypto dependencies
  firmware_loader: Strip off \n from customized path
  zram: fix up permission for the hot_add sysfs file
  cacheinfo: Add use_arch[|_cache]_info field/function
  arch_topology: Remove early cacheinfo error message if -ENOENT
  cacheinfo: Check cache properties are present in DT
  cacheinfo: Check sib_leaf in cache_leaves_are_shared()
  cacheinfo: Allow early level detection when DT/ACPI info is missing/broken
  cacheinfo: Add arm64 early level initializer implementation
  cacheinfo: Add arch specific early level initializer
  tty: make tty_class a static const structure
  driver core: class: remove struct class_interface * from callbacks
  driver core: class: mark the struct class in struct class_interface constant
  driver core: class: make class_register() take a const *
  driver core: class: mark class_release() as taking a const *
  driver core: remove incorrect comment for device_create*
  MIPS: vpe-cmp: remove module owner pointer from struct class usage.
  ...
2023-04-27 11:53:57 -07:00
Srinivas Pandruvada
4f59630a5e platform/x86: intel-uncore-freq: Add client processors
Make Intel uncore frequency driver support to client processor starting
from Alder Lake.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://lore.kernel.org/r/20230330145939.1022261-1-srinivas.pandruvada@linux.intel.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-04-07 15:27:33 +02:00
Greg Kroah-Hartman
c8e15075b2 platform/x86: intel-uncore-freq: move to use bus_get_dev_root()
Direct access to the struct bus_type dev_root pointer is going away soon
so replace that with a call to bus_get_dev_root() instead, which is what
it is there for.

Cc: Mark Gross <markgross@kernel.org>
Cc: platform-driver-x86@vger.kernel.org
Acked-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://lore.kernel.org/r/20230313182918.1312597-5-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-17 15:29:12 +01:00
Artem Bityutskiy
9c252ecf30 platform/x86: intel-uncore-freq: add Emerald Rapids support
Make Intel uncore frequency driver support Emerald Rapids by adding its
CPU model to the match table.

Emerald Rapids uncore frequency control is the same as in Sapphire
Rapids.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-11-23 17:00:11 +01:00
ye xingchen
76a13da75d platform/x86: intel-uncore-freq: Use sysfs_emit() to instead of scnprintf()
Replace the open-code with sysfs_emit() to simplify the code.

Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn>
Link: https://lore.kernel.org/r/20220923063314.239146-1-ye.xingchen@zte.com.cn
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-09-27 15:06:32 +02:00
Srinivas Pandruvada
8d75f7b4a3 platform/x86: intel-uncore-freq: Prevent driver loading in guests
Loading this driver in guests results in unchecked MSR access error for
MSR 0x620.

There is no use of reading and modifying package/die scope uncore MSRs
in guests. So check for CPU feature X86_FEATURE_HYPERVISOR to prevent
loading of this driver in guests.

Fixes: dbce412a77 ("platform/x86/intel-uncore-freq: Split common and enumeration part")
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=215870
Suggested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://lore.kernel.org/r/20220427100304.2562990-1-srinivas.pandruvada@linux.intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-04-27 16:55:54 +02:00
Dan Carpenter
f2a6c7e747 platform/x86: intel-uncore-freq: fix uncore_freq_common_init() error codes
Currently the uncore_freq_common_init() return one on success and
zero on failure.  There is only one caller and it has a "forgot to set
the error code" bug.  Change uncore_freq_common_init() to return
negative error codes which makes the code simpler and avoids this kind
of bug in the future.

Fixes: dbce412a77 ("platform/x86/intel-uncore-freq: Split common and enumeration part")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://lore.kernel.org/r/20220304131925.GG28739@kili
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-03-08 16:33:23 +01:00
Srinivas Pandruvada
dbce412a77 platform/x86/intel-uncore-freq: Split common and enumeration part
Split the current driver in two parts:
- Common part: All the commom function other than enumeration function.
- Enumeration/HW specific part: The current enumeration using CPU model
is left in the old module. This uses service of common driver to register
sysfs objects. Also provide callbacks for MSR access related to uncore.
- Add MODULE_DEVICE_TABLE to uncore-frequency.c

No functional changes are expected.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20220204000306.2517447-5-srinivas.pandruvada@linux.intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-02-17 13:20:30 +01:00
Srinivas Pandruvada
414eef2728 platform/x86/intel/uncore-freq: Display uncore current frequency
Add a new sysfs attribute "current_freq_khz" to display current uncore
frequency. This value is read from MSR 0x621.

Root user permission is required to read uncore current frequency.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20220204000306.2517447-4-srinivas.pandruvada@linux.intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-02-17 13:20:30 +01:00
Srinivas Pandruvada
ae7b2ce578 platform/x86/intel/uncore-freq: Use sysfs API to create attributes
Use of sysfs API is always preferable over using kobject calls to create
attributes. Remove usage of kobject_init_and_add() and use
sysfs_create_group(). To create relationship between sysfs attribute
and uncore instance use device_attribute*, which is defined per
uncore instance.

To create uniform locking for both read and write attributes take
lock in the sysfs callbacks, not in the actual functions where
the MSRs are read or updated.

No functional changes are expected.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20220204000306.2517447-3-srinivas.pandruvada@linux.intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-02-17 13:20:30 +01:00
Srinivas Pandruvada
ce2645c458 platform/x86/intel/uncore-freq: Move to uncore-frequency folder
Move the current driver from platform/x86/intel/uncore-frequency.c
to platform/x86/intel/uncore-frequency/uncore-frequency.c.

No functional changes are expected.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20220204000306.2517447-2-srinivas.pandruvada@linux.intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-02-17 13:20:30 +01:00