1
0
Fork 0
mirror of synced 2025-03-06 20:59:54 +01:00
Commit graph

8 commits

Author SHA1 Message Date
Kan Liang
ff4207f793 perf evsel: Add arch_evsel__hw_name()
The commit 55bcf6ef31 ("perf: Extend PERF_TYPE_HARDWARE and
PERF_TYPE_HW_CACHE") extends the two types to become PMU aware types for
a hybrid system. However, current evsel__hw_name doesn't take the PMU
type into account. It mistakenly returns the "unknown-hardware" for the
hardware event with a specific PMU type.

Add an arch specific arch_evsel__hw_name() to specially handle the PMU
aware hardware event.

Currently, the extend PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE is only
supported by X86. Only implement the specific arch_evsel__hw_name() for
X86 in the patch.

Nothing is changed for the other archs.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220721065706.2886112-3-zhengjun.xing@linux.intel.com
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-29 13:41:19 -03:00
Ravi Bangoria
9ab95b0b15 perf record ibs: Warn about sampling period skew
Samples without an L3 miss are discarded and counter is reset with
random value (between 1-15 for fetch PMU and 1-127 for op PMU) when IBS
L3 miss filtering is enabled. This causes a sampling period skew but
there is no way to reconstruct aggregated sampling period. So print a
warning at perf record if user sets l3missonly=1.

Ex:

  # perf record -c 10000 -C 0 -e ibs_op/l3missonly=1/
  WARNING: Hw internally resets sampling period when L3 Miss Filtering is enabled
  and tagged operation does not cause L3 Miss. This causes sampling period skew.

Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <rrichter@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: like.xu.linux@gmail.com
Cc: x86@kernel.org
Link: http://lore.kernel.org/lkml/20220604044519.594-2-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-24 13:18:22 -03:00
Zhengjun Xing
151e7d7503 perf record: Support sample-read topdown metric group for hybrid platforms
With the hardware TopDown metrics feature, the sample-read feature should
be supported for a TopDown group, e.g., sample a non-topdown event and read
a Topdown metric group. But the current perf record code errors are out.

For a TopDown metric group,the slots event must be the leader of the group,
but the leader slots event doesn't support sampling. To support sample-read
the TopDown metric group, uses the 2nd event of the group as the "leader"
for the purposes of sampling.

Only the platform with the TopDown metric feature supports sample-read the
topdown group. In commit acb65150a4 ("perf record: Support sample-read
topdown metric group"), it adds arch_topdown_sample_read() to indicate
whether the TopDown group supports sample-read, it should only work on the
non-hybrid systems, this patch extends the support for hybrid platforms.

Before:

  # ./perf record -e "{cpu_core/slots/,cpu_core/cycles/,cpu_core/topdown-retiring/}:S" -a sleep 1
  Error:
  The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu_core/topdown-retiring/).
  /bin/dmesg | grep -i perf may provide additional information.

After:

  # ./perf record -e "{cpu_core/slots/,cpu_core/cycles/,cpu_core/topdown-retiring/}:S" -a sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.238 MB perf.data (369 samples) ]

Fixes: acb65150a4 ("perf record: Support sample-read topdown metric group")
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220602153603.1884710-1-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-03 21:30:10 +02:00
Zhengjun Xing
e69a5c0102 perf evlist: Extend arch_evsel__must_be_in_group to support hybrid systems
For the hybrid system, the "slots" event changes to "cpu_core/slots/", need
extend API arch_evsel__must_be_in_group() to support hybrid systems.

In the origin code, for hybrid system event "cpu_core/slots/", the output
of the API arch_evsel__must_be_in_group() is "false" (in fact,it should be
"true"). Currently only one API evsel__remove_from_group() calls it. In
evsel__remove_from_group(), it adds the second condition to check, so the
output of evsel__remove_from_group() still is correct. That's the reason
why there isn't an instant error. I'd like to fix the issue found in API
arch_evsel__must_be_in_group() in case someone else using the function in
the other place.

Fixes: d98079c05b ("perf evlist: Keep topdown counters in weak group")
Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20220601152544.1842447-1-zhengjun.xing@linux.intel.com
Cc: peterz@infradead.org
Cc: adrian.hunter@intel.com
Cc: alexander.shishkin@intel.com
Cc: acme@kernel.org
Cc: ak@linux.intel.com
Cc: jolsa@redhat.com
Cc: mingo@redhat.com
Cc: linux-kernel@vger.kernel.org
Cc: linux-perf-users@vger.kernel.org
2022-06-03 21:12:34 +02:00
Kan Liang
39d5f412da perf evsel: Fixes topdown events in a weak group for the hybrid platform
The patch ("perf evlist: Keep topdown counters in weak group") fixes the
perf metrics topdown event issue when the topdown events are in a weak
group on a non-hybrid platform. However, it doesn't work for the hybrid
platform.

  $./perf stat -e '{cpu_core/slots/,cpu_core/topdown-bad-spec/,
  cpu_core/topdown-be-bound/,cpu_core/topdown-fe-bound/,
  cpu_core/topdown-retiring/,cpu_core/branch-instructions/,
  cpu_core/branch-misses/,cpu_core/bus-cycles/,cpu_core/cache-misses/,
  cpu_core/cache-references/,cpu_core/cpu-cycles/,cpu_core/instructions/,
  cpu_core/mem-loads/,cpu_core/mem-stores/,cpu_core/ref-cycles/,
  cpu_core/cache-misses/,cpu_core/cache-references/}:W' -a sleep 1

  Performance counter stats for 'system wide':

       751,765,068      cpu_core/slots/                        (84.07%)
   <not supported>      cpu_core/topdown-bad-spec/
   <not supported>      cpu_core/topdown-be-bound/
   <not supported>      cpu_core/topdown-fe-bound/
   <not supported>      cpu_core/topdown-retiring/
        12,398,197      cpu_core/branch-instructions/          (84.07%)
         1,054,218      cpu_core/branch-misses/                (84.24%)
       539,764,637      cpu_core/bus-cycles/                   (84.64%)
            14,683      cpu_core/cache-misses/                 (84.87%)
         7,277,809      cpu_core/cache-references/             (77.30%)
       222,299,439      cpu_core/cpu-cycles/                   (77.28%)
        63,661,714      cpu_core/instructions/                 (84.85%)
                 0      cpu_core/mem-loads/                    (77.29%)
        12,271,725      cpu_core/mem-stores/                   (77.30%)
       542,241,102      cpu_core/ref-cycles/                   (84.85%)
             8,854      cpu_core/cache-misses/                 (76.71%)
         7,179,013      cpu_core/cache-references/             (76.31%)

         1.003245250 seconds time elapsed

A hybrid platform has a different PMU name for the core PMUs, while
the current perf hard code the PMU name "cpu".

The evsel->pmu_name can be used to replace the "cpu" to fix the issue.
For a hybrid platform, the pmu_name must be non-NULL. Because there are
at least two core PMUs. The PMU has to be specified.
For a non-hybrid platform, the pmu_name may be NULL. Because there is
only one core PMU, "cpu". For a NULL pmu_name, we can safely assume that
it is a "cpu" PMU.

In case other PMUs also define the "slots" event, checking the PMU type
as well.

With the patch,

  $ perf stat -e '{cpu_core/slots/,cpu_core/topdown-bad-spec/,
  cpu_core/topdown-be-bound/,cpu_core/topdown-fe-bound/,
  cpu_core/topdown-retiring/,cpu_core/branch-instructions/,
  cpu_core/branch-misses/,cpu_core/bus-cycles/,cpu_core/cache-misses/,
  cpu_core/cache-references/,cpu_core/cpu-cycles/,cpu_core/instructions/,
  cpu_core/mem-loads/,cpu_core/mem-stores/,cpu_core/ref-cycles/,
  cpu_core/cache-misses/,cpu_core/cache-references/}:W' -a sleep 1

  Performance counter stats for 'system wide':

     766,620,266   cpu_core/slots/                                        (84.06%)
      73,172,129   cpu_core/topdown-bad-spec/ #    9.5% bad speculation   (84.06%)
     193,443,341   cpu_core/topdown-be-bound/ #    25.0% backend bound    (84.06%)
     403,940,929   cpu_core/topdown-fe-bound/ #    52.3% frontend bound   (84.06%)
     102,070,237   cpu_core/topdown-retiring/ #    13.2% retiring         (84.06%)
      12,364,429   cpu_core/branch-instructions/                          (84.03%)
       1,080,124   cpu_core/branch-misses/                                (84.24%)
     564,120,383   cpu_core/bus-cycles/                                   (84.65%)
          36,979   cpu_core/cache-misses/                                 (84.86%)
       7,298,094   cpu_core/cache-references/                             (77.30%)
     227,174,372   cpu_core/cpu-cycles/                                   (77.31%)
      63,886,523   cpu_core/instructions/                                 (84.87%)
               0   cpu_core/mem-loads/                                    (77.31%)
      12,208,782   cpu_core/mem-stores/                                   (77.31%)
     566,409,738   cpu_core/ref-cycles/                                   (84.87%)
          23,118   cpu_core/cache-misses/                                 (76.71%)
       7,212,602   cpu_core/cache-references/                             (76.29%)

       1.003228667 seconds time elapsed

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220518143900.1493980-2-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-20 11:09:41 -03:00
Ian Rogers
d98079c05b perf evlist: Keep topdown counters in weak group
On Intel Icelake, topdown events must always be grouped with a slots
event as leader. When a metric is parsed a weak group is formed and
retried if perf_event_open fails. The retried events aren't grouped
breaking the slots leader requirement. This change modifies the weak
group "reset" behavior so that topdown events aren't broken from the
group for the retry.

  $ perf stat -e '{slots,topdown-bad-spec,topdown-be-bound,topdown-fe-bound,topdown-retiring,branch-instructions,branch-misses,bus-cycles,cache-misses,cache-references,cpu-cycles,instructions,mem-loads,mem-stores,ref-cycles,baclears.any,ARITH.DIVIDER_ACTIVE}:W' -a sleep 1

   Performance counter stats for 'system wide':

    47,867,188,483      slots                                                         (92.27%)
   <not supported>      topdown-bad-spec
   <not supported>      topdown-be-bound
   <not supported>      topdown-fe-bound
   <not supported>      topdown-retiring
     2,173,346,937      branch-instructions                                           (92.27%)
        10,540,253      branch-misses             #    0.48% of all branches          (92.29%)
        96,291,140      bus-cycles                                                    (92.29%)
         6,214,202      cache-misses              #   20.120 % of all cache refs      (92.29%)
        30,886,082      cache-references                                              (76.91%)
    11,773,726,641      cpu-cycles                                                    (84.62%)
    11,807,585,307      instructions              #    1.00  insn per cycle           (92.31%)
                 0      mem-loads                                                     (92.32%)
     2,212,928,573      mem-stores                                                    (84.69%)
    10,024,403,118      ref-cycles                                                    (92.35%)
        16,232,978      baclears.any                                                  (92.35%)
        23,832,633      ARITH.DIVIDER_ACTIVE                                          (84.59%)

       0.981070734 seconds time elapsed

After:

  $ perf stat -e '{slots,topdown-bad-spec,topdown-be-bound,topdown-fe-bound,topdown-retiring,branch-instructions,branch-misses,bus-cycles,cache-misses,cache-references,cpu-cycles,instructions,mem-loads,mem-stores,ref-cycles,baclears.any,ARITH.DIVIDER_ACTIVE}:W' -a sleep 1

   Performance counter stats for 'system wide':

       31040189283      slots                                                         (92.27%)
        8997514811      topdown-bad-spec          #     28.2% bad speculation         (92.27%)
       10997536028      topdown-be-bound          #     34.5% backend bound           (92.27%)
        4778060526      topdown-fe-bound          #     15.0% frontend bound          (92.27%)
        7086628768      topdown-retiring          #     22.2% retiring                (92.27%)
        1417611942      branch-instructions                                           (92.26%)
           5285529      branch-misses             #    0.37% of all branches          (92.28%)
          62922469      bus-cycles                                                    (92.29%)
           1440708      cache-misses              #    8.292 % of all cache refs      (92.30%)
          17374098      cache-references                                              (76.94%)
        8040889520      cpu-cycles                                                    (84.63%)
        7709992319      instructions              #    0.96  insn per cycle           (92.32%)
                 0      mem-loads                                                     (92.32%)
        1515669558      mem-stores                                                    (84.68%)
        6542411177      ref-cycles                                                    (92.35%)
           4154149      baclears.any                                                  (92.35%)
          20556152      ARITH.DIVIDER_ACTIVE                                          (84.59%)

       1.010799593 seconds time elapsed

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220517052724.283874-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-17 12:01:18 -03:00
Ravi Bangoria
eb39bf3256 perf evsel: Don't set exclude_guest by default
Perf tool sets exclude_guest by default while calling perf_event_open().
Because IBS does not have filtering capability, it always gets rejected
by IBS PMU driver and thus perf falls back to non-precise sampling. Fix
it by not setting exclude_guest by default on AMD.

Before:
  $ sudo ./perf record -C 0 -vvv true |& grep precise
    precise_ip                       3
  decreasing precise_ip by one (2)
    precise_ip                       2
  decreasing precise_ip by one (1)
    precise_ip                       1
  decreasing precise_ip by one (0)

After:
  $ sudo ./perf record -C 0 -vvv true |& grep precise
    precise_ip                       3
  decreasing precise_ip by one (2)
    precise_ip                       2

Committer notes:

Fixup init to zero for perf_env in older compilers:

  arch/x86/util/evsel.c:15:26: error: missing field 'os_release' initializer [-Werror,-Wmissing-field-initializers]
          struct perf_env env = {0};
                                  ^

Committer notes:

Namhyung remarked:

  It'd be nice if it can cover explicit "-e cycles:pp" as well.

Ravi clarified:

  For explicit :pp modifier, evsel->precise_max does not get set and thus perf
  does not try with different attr->precise_ip values while exclude_guest set.
  So no issue with explicit :pp:

    $ sudo ./perf record -C 0 -e cycles:pp -vvv |& grep "precise_ip\|exclude_guest"
      precise_ip                       2
      exclude_guest                    1
      precise_ip                       2
      exclude_guest                    1
    switching off exclude_guest, exclude_host
      precise_ip                       2
    ^C

  Also, with :P modifier, evsel->precise_max gets set but exclude_guest does
  not and thus :P also works fine:

    $ sudo ./perf record -C 0 -e cycles:P -vvv |& grep "precise_ip\|exclude_guest"
      precise_ip                       3
    decreasing precise_ip by one (2)
      precise_ip                       2
    ^C

Reported-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20211103072112.32312-1-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-07 12:26:24 -03:00
Kan Liang
ea8d0ed6ea perf tools: Support PERF_SAMPLE_WEIGHT_STRUCT
The new sample type, PERF_SAMPLE_WEIGHT_STRUCT, is an alternative of the
PERF_SAMPLE_WEIGHT sample type. Users can apply either the
PERF_SAMPLE_WEIGHT sample type or the PERF_SAMPLE_WEIGHT_STRUCT sample
type to retrieve the sample weight, but they cannot apply both sample
types simultaneously.

The new sample type shares the same space as the PERF_SAMPLE_WEIGHT
sample type. The lower 32 bits are exactly the same for both sample
type. The higher 32 bits may be different for different architecture.

Add arch specific arch_evsel__set_sample_weight() to set the new sample
type for X86. Only store the lower 32 bits for the sample->weight if the
new sample type is applied. In practice, no memory access could last
than 4G cycles. No data will be lost.

If the kernel doesn't support the new sample type. Fall back to the
PERF_SAMPLE_WEIGHT sample type.

There is no impact for other architectures.

Committer notes:

Fixup related to PERF_SAMPLE_CODE_PAGE_SIZE, present in acme/perf/core
but not upstream yet.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/1612296553-21962-6-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-08 16:25:00 -03:00