1
0
Fork 0
mirror of synced 2025-03-06 20:59:54 +01:00
linux/tools/perf/arch/x86/util/evsel.c
Ravi Bangoria eb39bf3256 perf evsel: Don't set exclude_guest by default
Perf tool sets exclude_guest by default while calling perf_event_open().
Because IBS does not have filtering capability, it always gets rejected
by IBS PMU driver and thus perf falls back to non-precise sampling. Fix
it by not setting exclude_guest by default on AMD.

Before:
  $ sudo ./perf record -C 0 -vvv true |& grep precise
    precise_ip                       3
  decreasing precise_ip by one (2)
    precise_ip                       2
  decreasing precise_ip by one (1)
    precise_ip                       1
  decreasing precise_ip by one (0)

After:
  $ sudo ./perf record -C 0 -vvv true |& grep precise
    precise_ip                       3
  decreasing precise_ip by one (2)
    precise_ip                       2

Committer notes:

Fixup init to zero for perf_env in older compilers:

  arch/x86/util/evsel.c:15:26: error: missing field 'os_release' initializer [-Werror,-Wmissing-field-initializers]
          struct perf_env env = {0};
                                  ^

Committer notes:

Namhyung remarked:

  It'd be nice if it can cover explicit "-e cycles:pp" as well.

Ravi clarified:

  For explicit :pp modifier, evsel->precise_max does not get set and thus perf
  does not try with different attr->precise_ip values while exclude_guest set.
  So no issue with explicit :pp:

    $ sudo ./perf record -C 0 -e cycles:pp -vvv |& grep "precise_ip\|exclude_guest"
      precise_ip                       2
      exclude_guest                    1
      precise_ip                       2
      exclude_guest                    1
    switching off exclude_guest, exclude_host
      precise_ip                       2
    ^C

  Also, with :P modifier, evsel->precise_max gets set but exclude_guest does
  not and thus :P also works fine:

    $ sudo ./perf record -C 0 -e cycles:P -vvv |& grep "precise_ip\|exclude_guest"
      precise_ip                       3
    decreasing precise_ip by one (2)
      precise_ip                       2
    ^C

Reported-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20211103072112.32312-1-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-07 12:26:24 -03:00

31 lines
813 B
C

// SPDX-License-Identifier: GPL-2.0
#include <stdio.h>
#include <stdlib.h>
#include "util/evsel.h"
#include "util/env.h"
#include "linux/string.h"
void arch_evsel__set_sample_weight(struct evsel *evsel)
{
evsel__set_sample_bit(evsel, WEIGHT_STRUCT);
}
void arch_evsel__fixup_new_cycles(struct perf_event_attr *attr)
{
struct perf_env env = { .total_mem = 0, } ;
if (!perf_env__cpuid(&env))
return;
/*
* On AMD, precise cycles event sampling internally uses IBS pmu.
* But IBS does not have filtering capabilities and perf by default
* sets exclude_guest = 1. This makes IBS pmu event init fail and
* thus perf ends up doing non-precise sampling. Avoid it by clearing
* exclude_guest.
*/
if (env.cpuid && strstarts(env.cpuid, "AuthenticAMD"))
attr->exclude_guest = 0;
free(env.cpuid);
}