linux

mirror of synced 2025-03-06 20:59:54 +01:00

Author	SHA1	Message	Date
Adrian Hunter	87bbaf1a4b	x86/insn: Add support for APX EVEX to the instruction decoder logic Intel Advanced Performance Extensions (APX) extends the EVEX prefix to support: - extended general purpose registers (EGPRs) i.e. r16 to r31 - Push-Pop Acceleration (PPX) hints - new data destination (NDD) register - suppress status flags writes (NF) of common instructions - new instructions Refer to the Intel Advanced Performance Extensions (Intel APX) Architecture Specification for details. The extended EVEX prefix does not need amended instruction decoder logic, except in one area. Some instructions are defined as SCALABLE which means the EVEX.W bit and EVEX.pp bits are used to determine operand size. Specifically, if an instruction is SCALABLE and EVEX.W is zero, then EVEX.pp value 0 (representing no prefix NP) means default operand size, whereas EVEX.pp value 1 (representing 66 prefix) means operand size override i.e. 16 bits Add an attribute (INAT_EVEX_SCALABLE) to identify such instructions, and amend the logic appropriately. Amend the awk script that generates the attribute tables from the opcode map, to recognise "(es)" as attribute INAT_EVEX_SCALABLE. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20240502105853.5338-8-adrian.hunter@intel.com	2024-05-02 13:13:45 +02:00
Adrian Hunter	eada38d575	x86/insn: Add support for REX2 prefix to the instruction decoder logic Intel Advanced Performance Extensions (APX) uses a new 2-byte prefix named REX2 to select extended general purpose registers (EGPRs) i.e. r16 to r31. The REX2 prefix is effectively an extended version of the REX prefix. REX2 and EVEX are also used with PUSH/POP instructions to provide a Push-Pop Acceleration (PPX) hint. With PPX hints, a CPU will attempt to fast-forward register data between matching PUSH and POP instructions. REX2 is valid only with opcodes in maps 0 and 1. Similar extension for other maps is provided by the EVEX prefix, covered in a separate patch. Some opcodes in maps 0 and 1 are reserved under REX2. One of these is used for a new 64-bit absolute direct jump instruction JMPABS. Refer to the Intel Advanced Performance Extensions (Intel APX) Architecture Specification for details. Define a code value for the REX2 prefix (INAT_PFX_REX2), and add attribute flags for opcodes reserved under REX2 (INAT_NO_REX2) and to identify opcodes (only JMPABS) that require a mandatory REX2 prefix (INAT_REX2_VARIANT). Amend logic to read the REX2 prefix and get the opcode attribute for the map number (0 or 1) encoded in the REX2 prefix. Amend the awk script that generates the attribute tables from the opcode map, to recognise "REX2" as attribute INAT_PFX_REX2, and "(!REX2)" as attribute INAT_NO_REX2, and "(REX2)" as attribute INAT_REX2_VARIANT. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20240502105853.5338-6-adrian.hunter@intel.com	2024-05-02 13:13:44 +02:00
Nikolay Borisov	07a5d4bcbf	x86/insn: Directly assign x86_64 state in insn_init() No point in checking again as this was already done by the caller. Signed-off-by: Nikolay Borisov <nik.borisov@suse.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240222111636.2214523-3-nik.borisov@suse.com	2024-02-22 12:23:27 +01:00
Nikolay Borisov	427e1646f1	x86/insn: Remove superfluous checks from instruction decoding routines It's pointless checking if a particular part of an instruction is decoded before calling the routine responsible for decoding it as this check is duplicated in the routines itself. Streamline the code by removing the superfluous checks. No functional difference. Signed-off-by: Nikolay Borisov <nik.borisov@suse.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240222111636.2214523-2-nik.borisov@suse.com	2024-02-22 12:23:04 +01:00
Borislav Petkov	f96b467583	x86/insn: Use get_unaligned() instead of memcpy() Use get_unaligned() instead of memcpy() to access potentially unaligned memory, which, when accessed through a pointer, leads to undefined behavior. get_unaligned() describes much better what is happening there anyway even if memcpy() does the job. In addition, since perf tool builds with -Werror, it would fire with: util/intel-pt-decoder/../../../arch/x86/lib/insn.c: In function '__insn_get_emulate_prefix': tools/include/../include/asm-generic/unaligned.h:10:15: error: packed attribute is unnecessary [-Werror=packed] 10 \| const struct { type x; } __packed *__pptr = (typeof(__pptr))(ptr); \ because -Werror=packed would complain if the packed attribute would have no effect on the layout of the structure. In this case, that is intentional so disable the warning only for that compilation unit. That part is Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> No functional changes. Fixes: `5ba1071f75` ("x86/insn, tools/x86: Fix undefined behavior due to potential unaligned accesses") Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Tested-by: Stephen Rothwell <sfr@canb.auug.org.au> Link: https://lkml.kernel.org/r/YVSsIkj9Z29TyUjE@zn.tnic	2021-10-06 11:56:37 +02:00
Numfor Mbiziwo-Tiapo	5ba1071f75	x86/insn, tools/x86: Fix undefined behavior due to potential unaligned accesses Don't perform unaligned loads in __get_next() and __peek_nbyte_next() as these are forms of undefined behavior: "A pointer to an object or incomplete type may be converted to a pointer to a different object or incomplete type. If the resulting pointer is not correctly aligned for the pointed-to type, the behavior is undefined." (from http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf) These problems were identified using the undefined behavior sanitizer (ubsan) with the tools version of the code and perf test. [ bp: Massage commit message. ] Signed-off-by: Numfor Mbiziwo-Tiapo <nums@google.com> Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Link: https://lkml.kernel.org/r/20210923161843.751834-1-irogers@google.com	2021-09-24 12:37:38 +02:00
Borislav Petkov	0705ef64d1	tools/insn: Restore the relative include paths for cross building Building perf on ppc causes: In file included from util/intel-pt-decoder/intel-pt-insn-decoder.c:15: util/intel-pt-decoder/../../../arch/x86/lib/insn.c:14:10: fatal error: asm/inat.h: No such file or directory 14 \| #include <asm/inat.h> /__ignore_sync_check__ / \| ^~~~~~~~~~~~ Restore the relative include paths so that the compiler can find the headers. Fixes: `93281c4a96` ("x86/insn: Add an insn_decode() API") Reported-by: Ian Rogers <irogers@google.com> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Ian Rogers <irogers@google.com> Tested-by: Stephen Rothwell <sfr@canb.auug.org.au> Link: https://lkml.kernel.org/r/20210317150858.02b1bbc8@canb.auug.org.au	2021-03-17 20:17:05 +01:00
Borislav Petkov	f935178b5c	x86/insn: Make insn_complete() static ... and move it above the only place it is used. Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210304174237.31945-22-bp@alien8.de	2021-03-15 13:03:46 +01:00
Borislav Petkov	93281c4a96	x86/insn: Add an insn_decode() API Users of the instruction decoder should use this to decode instruction bytes. For that, have insn*() helpers return an int value to denote success/failure. When there's an error fetching the next insn byte and the insn falls short, return -ENODATA to denote that. While at it, make insn_get_opcode() more stricter as to whether what has seen so far is a valid insn and if not. Copy linux/kconfig.h for the tools-version of the decoder so that it can use IS_ENABLED(). Also, cast the INSN_MODE_KERN dummy define value to (enum insn_mode) for tools use of the decoder because perf tool builds with -Werror and errors out with -Werror=sign-compare otherwise. Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Link: https://lkml.kernel.org/r/20210304174237.31945-5-bp@alien8.de	2021-03-15 11:05:47 +01:00
Borislav Petkov	d30c7b820b	x86/insn: Add a __ignore_sync_check__ marker Add an explicit __ignore_sync_check__ marker which will be used to mark lines which are supposed to be ignored by file synchronization check scripts, its advantage being that it explicitly denotes such lines in the code. Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Link: https://lkml.kernel.org/r/20210304174237.31945-4-bp@alien8.de	2021-03-15 11:00:57 +01:00
Borislav Petkov	508ef28674	x86/insn: Add @buf_len param to insn_init() kernel-doc comment It wasn't documented so add it. No functional changes. Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Link: https://lkml.kernel.org/r/20210304174237.31945-3-bp@alien8.de	2021-03-15 11:00:19 +01:00
Vasily Gorbik	5ed934e57e	x86/insn: Fix vector instruction decoding on big endian cross-compiles Running instruction decoder posttest on an s390 host with an x86 target with allyesconfig shows errors. Instructions used in a couple of kernel objects could not be correctly decoded on big endian system. insn_decoder_test: warning: objdump says 6 bytes, but insn_get_length() says 5 insn_decoder_test: warning: Found an x86 instruction decoder bug, please report this. insn_decoder_test: warning: ffffffff831eb4e1: 62 d1 fd 48 7f 04 24 vmovdqa64 %zmm0,(%r12) insn_decoder_test: warning: objdump says 7 bytes, but insn_get_length() says 6 insn_decoder_test: warning: Found an x86 instruction decoder bug, please report this. insn_decoder_test: warning: ffffffff831eb4e8: 62 51 fd 48 7f 44 24 01 vmovdqa64 %zmm8,0x40(%r12) insn_decoder_test: warning: objdump says 8 bytes, but insn_get_length() says 6 This is because in a few places instruction field bytes are set directly with further usage of "value". To address that introduce and use a insn_set_byte() helper, which correctly updates "value" on big endian systems. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>	2021-01-13 18:13:17 -06:00
Martin Schwidefsky	1d509f2a6e	x86/insn: Support big endian cross-compiles The x86 instruction decoder code is shared across the kernel source and the tools. Currently objtool seems to be the only tool from build tools needed which breaks x86 cross-compilation on big endian systems. Make the x86 instruction decoder build host endianness agnostic to support x86 cross-compilation and enable objtool to implement endianness awareness for big endian architectures support. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Co-developed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>	2021-01-13 18:13:11 -06:00
Masami Hiramatsu	4d65adfcd1	x86: xen: insn: Decode Xen and KVM emulate-prefix signature Decode Xen and KVM's emulate-prefix signature by x86 insn decoder. It is called "prefix" but actually not x86 instruction prefix, so this adds insn.emulate_prefix_size field instead of reusing insn.prefixes. If x86 decoder finds a special sequence of instructions of XEN_EMULATE_PREFIX and 'ud2a; .ascii "kvm"', it just counts the length, set insn.emulate_prefix_size and fold it with the next instruction. In other words, the signature and the next instruction is treated as a single instruction. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: x86@kernel.org Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: Borislav Petkov <bp@alien8.de> Cc: xen-devel@lists.xenproject.org Cc: Randy Dunlap <rdunlap@infradead.org> Link: https://lkml.kernel.org/r/156777564986.25081.4964537658500952557.stgit@devnote2	2019-10-17 21:31:57 +02:00
Josh Poimboeuf	00a263902a	perf intel-pt: Use shared x86 insn decoder Now that there's a common version of the decoder for all tools, use it instead of the local copy. Also use perf's check-headers.sh script to diff the decoder files to make sure they remain in sync with the kernel version. Objtool has a similar check. Committer notes: Had to keep this all pointing explicitely to x86 headers/files, i.e. instead of asm/isnn.h we had to use ../include/asm/insn.h when the files were in differemt dirs, or just replace "<asm/foo.h>" with "foo.h". This way we continue to be able to process perf.data files with Intel PT traces in distros other than x86. Also fixed up the awk script paths to use $(srcdir)/tools/arch instead or relative directories so that we keep detached tarballs (make help \| grep perf) working. For now the include lines in these headers are being ignored so as not to flag false reports of kernel/tools out of sync. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: x86@kernel.org Link: http://lore.kernel.org/lkml/8a37e615d2880f039505d693d1e068a009358a2b.1567118001.git.jpoimboe@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-08-31 22:27:52 -03:00
Josh Poimboeuf	d046b72548	objtool: Move x86 insn decoder to a common location The kernel tree has three identical copies of the x86 instruction decoder. Two of them are in the tools subdir. The tools subdir is supposed to be completely standalone and separate from the kernel. So having at least one copy of the kernel decoder in the tools subdir is unavoidable. However, we don't need two of them. Move objtool's copy of the decoder to a shared location, so that perf will also be able to use it. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: x86@kernel.org Link: http://lore.kernel.org/lkml/55b486b88f6bcd0c9a2a04b34f964860c8390ca8.1567118001.git.jpoimboe@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-08-31 22:27:52 -03:00

16 commits