1
0
Fork 0
mirror of synced 2025-03-06 20:59:54 +01:00
Commit graph

65 commits

Author SHA1 Message Date
Holger Dengler
b3840c8bfc s390/ap: rename ap debug configuration option
The configuration option ZCRYPT_DEBUG is used only in ap queue code,
so rename it to AP_DEBUG. It also no longer depends on ZCRYPT but on
AP. While at it, also update the help text.

Signed-off-by: Holger Dengler <dengler@linux.ibm.com>
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-09 17:29:56 +02:00
Harald Freudenberger
6a2892d09d s390/ap: add debug possibility for AP messages
This patch introduces two dynamic debug hexdump
invocation possibilities to be able to a) dump an
AP message immediately before it goes into the
firmware queue and b) dump a fresh from the
firmware queue received AP message.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-02-16 14:30:13 +01:00
Harald Freudenberger
08b2c3706d s390/zcrypt: introduce dynamic debugging for AP and zcrypt code
This patch replaces all the s390 debug feature calls with
debug level by dynamic debug calls pr_debug. These calls
are much more flexible and each single invocation can get
enabled/disabled at runtime wheres the s390 debug feature
debug calls have only one knob - enable or disable all in
one bunch. The benefit is especially significant with
high frequency called functions like the AP bus scan. In
most debugging scenarios you don't want and need them, but
sometimes it is crucial to know exactly when and how long
the AP bus scan took.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-02-16 14:30:12 +01:00
Harald Freudenberger
207022d39d s390/ap: handle outband SE bind state change
This patch addresses some weird scenarios where an outband
manipulation of the SE bind state of a queue assigned and
maybe in use by an SE guest with AP pass-through support
took place. So for example when the guest has bound and
associated a queue and then this domain has been zeroed on
the service element.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-11-30 16:24:23 +01:00
Harald Freudenberger
d4c53ae8e4 s390/ap: store TAPQ hwinfo in struct ap_card
As of now the AP card struct held only part of the
queue's hwinfo (that is the GR2 register content returned
with an TAPQ invocation). This patch reworks struct ap_card
to hold the whole hwinfo now.

As there is a nice bit field union on top of this
ap_tapq_hwinfo struct, all the ugly bit checkings can
now get replaced by simple evaluations of the required
bit field.

Suggested-by: Ingo Franzki <ifranzki@linux.ibm.com>
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-11-30 16:24:23 +01:00
Harald Freudenberger
c40284b364 s390/ap: re-enable interrupt for AP queues
This patch introduces some code lines which check
for interrupt support enabled on an AP queue after
a reply has been received. This invocation has been
chosen as there is a good chance to have the queue
empty at that time. As the enablement of the irq
imples a state machine change the queue should not
have any pending requests or unreceived replies.

Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-11-05 22:34:57 +01:00
Harald Freudenberger
01c89ab7f8 s390/ap: rework to use irq info from ap queue status
This patch reworks the irq handling and reporting code
for the AP queue interrupt handling to always use the
irq info from the queue status.

Until now the interrupt status of an AP queue was stored
into a bool variable within the ap_queue struct. This
variable was set on a successful interrupt enablement
and cleared with kicking a reset. However, it may be
that the interrupt state is manipulated outband for
example by a hypervisor. This patch removes this variable
and instead the irq bit from the AP queue status which is
always reflecting the current irq state is used.

Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-11-05 22:34:57 +01:00
Harald Freudenberger
a19a161482 s390/zcrypt: introduce new internal AP queue se_bound attribute
This patch introduces a new AP queue internal attribute
se_bound which reflects the bound state of an APQN within
a Secure Execution environment.

With introduction of Secure Execution guests now an
AP firmware queue needs to be bound to the guest before
usage. This patch introduces a new internal attribute
reflecting this bound state and some glue code to handle
this new field during lifetime of an AP queue device.

Together with that now the zcrypt scheduler considers
the state of the AP queues when a message is about to be
distributed among the existing queues. There is a new
function ap_queue_usable() which returns true only when
all conditions for using this AP queue device are fulfilled.
In details this means: the AP queue needs to be configured,
not checkstopped and within an SE environment it needs
to be bound. So the new function gives and indication
if the AP queue device is ready to serve requests or not.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-10-16 13:04:09 +02:00
Harald Freudenberger
32d1d9204f s390/ap: re-init AP queues on config on
On a state toggle from config off to config on and on the
state toggle from checkstop to not checkstop the queue's
internal states was set but the state machine was not
nudged. This did not care as on the first enqueue of a
request the state machine kick ran.

However, within an Secure Execution guest a queue is
only chosen by the scheduler when it has been bound.
But to bind a queue, it needs to run through the initial
states (reset, enable interrupts, ...). So this is like
a chicken-and-egg problem and the result was in fact
that a queue was unusable after a config off/on toggle.

With some slight rework of the handling of these states
now the new function _ap_queue_init_state() is called
which is the core of the ap_queue_init_state() function
but without locking handling. This has the benefit that
it can be called on all the places where a (re-)init
of the AP queue's state machine is needed.

Fixes: 2d72eaf036 ("s390/ap: implement SE AP bind, unbind and associate")
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-10-16 13:04:09 +02:00
Harald Freudenberger
5ac8c72462 s390/zcrypt: remove CEX2 and CEX3 device drivers
Remove the legacy device driver code for CEX2 and CEX3 cards.

The last machines which are able to handle CEX2 crypto cards
are z10 EC first available 2008 and z10 BC first available 2009.
The last machines able to handle a CEX3 crypto card are
z196 first available 2010 and z114 first available 2011.

Please note that this does not imply to drop CEX2 and CEX3
support in general. With older kernels on hardware up to the
aforementioned machine models these crypto cards will get
support by IBM.

The removal of the CEX2 and CEX3 device drivers code opens up
some simplifications, for example support for crypto cards
without rng support can be removed also.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-07-24 12:12:22 +02:00
Harald Freudenberger
0fdcc88bb9 s390/zcrypt: cleanup some debug code
This patch removes most of the debug code which
is build in when CONFIG_ZCRYPT_DEBUG is enabled.
There is no real exploiter for this code any more and
at least one ioctl fails with this code enabled.

The CONFIG_ZCRYPT_DEBUG kernel config option still
makes sense as some debug sysfs entries can get
enabled with this and maybe long term a new better
designed debug and error injection way will get
introduced.

This patch only removes code surrounded by the named
kernel config option. This option should by default
always be off anyway. The structs and defines removed
by the patch have been used only by code surrounded
by a CONFIG_ZCRYPT_DEBUG ifdef and thus can be removed
also.

In the end this patch removes all the failure-injection
possibilities which had been available when the kernel
had been build with CONFIG_ZCRYPT_DEBUG. It has never
been used that much and was too unflexible anyway.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-07-03 11:19:41 +02:00
Harald Freudenberger
038c5bedbc s390/ap: add ap status asynch error support
Review and extend the low level AP code to be able to
deal with asynchronous reported errors on APQNs.

The hypervisor and the SE guest may be confronted with
an asynchronously reported error at return of an AP
instruction. So all places where AP instructions are
called need review and may eventually need extensions.
However, not all places need rework. As together with
the AP status and the enabled asynch bit there is always
a response code set. The asynch error reporting comes
with new response codes which may be simple handled in
the default case of a switch statement.

The idea behind this patch is to report asynch errors
as -EPERM (read this as "Operation not permitted") which
reflects the fact that only a rapq (with F bit enabled)
is a valid AP instruction when an asynch error is flagged.

The AP queue state machine functions return
AP_SM_WAIT_NONE when a asynch error is detected to reflect
the fact, that the state machine can't do anything with
such an error as long as the queue is reset.

Unfortunately the ap bus scan function needed some
update as the ap_queue_info() now needs to return
3 states: 1 if an APQN exists and info is available,
-1 if it is assumed an APQN does not exist and the new
return value 0 without any info values filled. This 0
returncode is handled as "there is an APQN but we currently
don't know any more hw info about this, so please use
your previous info and try again later".

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-03-20 11:12:49 +01:00
Harald Freudenberger
2d72eaf036 s390/ap: implement SE AP bind, unbind and associate
Implementation of the new functions for SE AP support:
bind, unbind and associate. There are two new sysfs
attributes for this:

/sys/devices/ap/cardxx/xx.yyyy/se_bind
/sys/devices/ap/cardxx/xx.yyyy/se_associate

Writing a 1 into the se_bind attribute triggers the
SE AP bind for this AP queue, writing a 0 into does
an unbind - that's a reset (RAPQ) with the F bit enabled.

The se_associate attribute needs an integer value in
range 0...2^16-1 written in. This is the index into a
secrets table feed into the ultravisor. For more details
please see the Architecture documents.

These both new ap queue attributes are only visible
inside a SE guest with SB (Secure Binding) available.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-03-20 11:12:49 +01:00
Harald Freudenberger
263c8454db s390/ap: introduce low frequency polling possibility
For some events the ap bus needs to poll. For example
when an AP queue is reset until the reset is through.
Also when no interrupt support is available (e.g. zVM)
there is a need to poll until all requests have been
processed and all replies have been delivered.

Polling is done with a high resolution timer by default
run with a rate of 4kHz (LPAR) or 666Hz (zVM guest).

For some events (wait for reset complete, wait for irq
enabled complete) this is a much too high poll rate
which triggers a lot of TAPQ invocations.

This patch introduces the possibility for the state
machine functions to return a new wait enum
AP_SM_WAIT_LOW_TIMEOUT which gives a hint to the
ap_wait() function to eventually set up the timer
with a more relaxed timeout value of 25Hz.

This patch also includes a slight rework of the sysfs
functions parsing the timer related stuff: Use of
kstrtobool and kstrtoul instead of sscanf.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-03-20 11:12:49 +01:00
Harald Freudenberger
4bdf3c3956 s390/ap: provide F bit parameter for ap_rapq() and ap_zapq()
Extent the ap inline functions ap_rapq() (calls PQAP(RAPQ))
and ap_zapq() (calls PQAP(ZAPQ)) with a new parameter to
enable the new architectured F bit which forces an
unassociate and/or unbind on a secure execution associated
and/or bound queue.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-03-20 11:12:49 +01:00
Harald Freudenberger
088174960e s390/ap: filter ap card functions, new queue functions attribute
With SE SB (Secure Binding) some currently unused and thus always
zero bits in the TAPQ GR2 result are now used to show the binding
state of a queue. So to check if a card has changed the comparing
base is exactly this GR2 value shown as 'ap_function' in sysfs
(/sys/devices/ap/cardxx/ap_functions). Now there is some queue
specific info in this info and so a new mask TAPQ_CARD_FUNC_CMP_MASK
is used to filter out only the relevant bits for card compare.

For the same reason now the function bits (including exactly this
bind/associate information) need to be exposed to user space now.
So tools like lszcrypt can evaluate binding/association state on a
queue base. So here comes a new sysfs attribute

  /sys/devices/ap/cardxx/xx.yyyy/ap_functions

This sysfs attribute is similar to the already existing
ap_functions attribute at ap card level. It shows the
upper 32 bits of GR2 from an invocation of TAPQ for this
AP queue.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-03-20 11:12:48 +01:00
Harald Freudenberger
964d581daf s390/zcrypt: replace scnprintf with sysfs_emit
Replace scnprintf() with sysfs_emit() and friends
where possible.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-03-20 11:12:48 +01:00
Harald Freudenberger
8794c59613 s390/zcrypt: rework length information for dqap
The inline ap_dqap function does not return the number of
bytes actually written into the message buffer. The calling
code inspects the AP message header to figure out what kind
of AP message has been received and pulls the length
information from this header. This processing may not work
correctly in cases where only a fragment of the reply is
received.

With this patch the ap_dqap inline function now returns
the number of actually written bytes in the *length parameter.
So the calling function has a chance to compare the number of
received bytes against what the AP message header length
field states. This is especially useful in cases where a
message could only get partially received.

The low level reply processing functions needed some rework
to be able to catch this new length information and compare
it the right way. The rework also deals with some situations
where until now the reply length was not correctly calculated
and/or set.

All this has been heavily tested as the modifications on
the reply length information may affect crypto load.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-03-20 11:12:47 +01:00
Harald Freudenberger
003d248fee s390/zcrypt: make psmid unsigned long instead of long long
Since s390 kernel build does not support 32 bit build any
more there is no difference between long and long long.
So this patch reworks all occurrences of psmid (a 64 bit
value) to use unsigned long now.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-03-20 11:12:47 +01:00
Harald Freudenberger
ebf95e8846 s390/ap,zcrypt,vfio: introduce and use ap_queue_status_reg union
Introduce a new ap queue status register wrapper union to access register
wide values. So the inline assembler only sees register wide values but the
surrounding code may use a more structured view of the same value and a
reader of the code (and the compiler) gets a clear understanding about the
mapping between fields and register values.

All the changes to access the ap queue status are local to the inline
functions within ap.h. However, the struct ap_qirq_ctrl has been replaces
by a union for same reason and this needed slight adaptions in the calling
code.

Suggested-by: Halil Pasic <pasic@linux.ibm.com>
Suggested-by: Andreas Arnez <arnez@linux.ibm.com>
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-27 15:29:36 +01:00
Nicolin Chen
10e19d492a vfio/ap: Pass in physical address of ind to ap_aqic()
The ap_aqic() is called by vfio_ap_irq_enable() where it passes in a
virt value that's casted from a physical address "h_nib". Inside the
ap_aqic(), it does virt_to_phys() again.

Since ap_aqic() needs a physical address, let's just pass in a pa of
ind directly. So change the "ind" to "pa_ind".

Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Tested-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Link: https://lore.kernel.org/r/20220723020256.30081-4-nicolinc@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-23 07:29:10 -06:00
Harald Freudenberger
2004b57cde s390/zcrypt: code cleanup
This patch tries to fix as much as possible of the
checkpatch.pl --strict findings:
  CHECK: Logical continuations should be on the previous line
  CHECK: No space is necessary after a cast
  CHECK: Alignment should match open parenthesis
  CHECK: 'useable' may be misspelled - perhaps 'usable'?
  WARNING: Possible repeated word: 'is'
  CHECK: spaces preferred around that '*' (ctx:VxV)
  CHECK: Comparison to NULL could be written "!msg"
  CHECK: Prefer kzalloc(sizeof(*zc)...) over kzalloc(sizeof(struct...)...)
  CHECK: Unnecessary parentheses around resp_type->work
  CHECK: Avoid CamelCase: <xcRB>

There is no functional change comming with this patch, only
code cleanup, renaming, whitespaces, indenting, ... but no
semantic change in any way. Also the API (zcrypt and pkey
header file) is semantically unchanged.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Jürgen Christ <jchrist@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-04-25 13:54:14 +02:00
Harald Freudenberger
a7e701dba1 s390/zcrypt: handle checkstopped cards with new state
A crypto card may be in checkstopped state. With this
patch this is handled as a new state in the ap card and
ap queue structs. There is also a new card sysfs attribute

  /sys/devices/ap/cardxx/chkstop

and a new queue sysfs attribute

  /sys/devices/ap/cardxx/xx.yyyy/chkstop

displaying the checkstop state of the card or queue. Please
note that the queue's checkstop state is only a copy of the
card's checkstop state but makes maintenance much easier.

The checkstop state expressed here is the result of an
RC 0x04 (CHECKSTOP) during an AP command, mostly the
PQAP(TAPQ) command which is 'testing' the queue.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Jürgen Christ <jchrist@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-03-08 00:33:00 +01:00
Harald Freudenberger
d64e5e9120 s390/ap/zcrypt: debug feature improvements
This patch adds some debug feature improvements related
to some failures happened in the past. With CEX8 the max
request and response sizes have been extended but the
user space applications did not rework their code and
thus ran into receive buffer issues. This ffdc patch
here helps with additional checks and debug feature
messages in debugging and pointing to the root cause of
some failures related to wrong buffer sizes.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Jürgen Christ <jchrist@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-03-08 00:33:00 +01:00
Harald Freudenberger
3f74eb5f78 s390/zcrypt: rework of debug feature messages
This patch reworks all the debug feature invocations to be
more uniform. All invocations now use the macro with the
level already part of the macro name. All messages now start
with %s filled with __func__ (well there are still some
exceptions), and some message text has been shortened or
reworked.

There is no functional code touched with this patch.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-10-26 15:21:27 +02:00
Harald Freudenberger
3826350e6d s390/ap: Fix hanging ioctl caused by orphaned replies
When a queue is switched to soft offline during heavy load and later
switched to soft online again and now used, it may be that the caller
is blocked forever in the ioctl call.

The failure occurs because there is a pending reply after the queue(s)
have been switched to offline. This orphaned reply is received when
the queue is switched to online and is accidentally counted for the
outstanding replies. So when there was a valid outstanding reply and
this orphaned reply is received it counts as the outstanding one thus
dropping the outstanding counter to 0. Voila, with this counter the
receive function is not called any more and the real outstanding reply
is never received (until another request comes in...) and the ioctl
blocks.

The fix is simple. However, instead of readjusting the counter when an
orphaned reply is detected, I check the queue status for not empty and
compare this to the outstanding counter. So if the queue is not empty
then the counter must not drop to 0 but at least have a value of 1.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-10-26 15:21:27 +02:00
Heiko Carstens
948e50551b s390/ap: fix kernel doc comments
Get rid of warnings like:

drivers/s390/crypto/ap_bus.c:216: warning:
 bad line:
drivers/s390/crypto/ap_bus.c:444:
 warning: Function parameter or member 'floating' not described in 'ap_interrupt_handler'

Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-09-15 14:29:21 +02:00
Harald Freudenberger
cabebb697c s390/ap: fix state machine hang after failure to enable irq
If for any reason the interrupt enable for an ap queue fails the
state machine run for the queue returned wrong return codes to the
caller. So the caller assumed interrupt support for this queue in
enabled and thus did not re-establish the high resolution timer used
for polling. In the end this let to a hang for the user space process
waiting "forever" for the reply.

This patch reworks these return codes to return correct indications
for the caller to re-establish the timer when a queue runs without
interrupt support.

Please note that this is fixing a wrong behavior after a first
failure (enable interrupt support for the queue) failed. However,
looks like this occasionally happens on KVM systems.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-26 20:22:12 +02:00
Harald Freudenberger
1f0d22defd s390/ap: Rework ap_dqap to deal with messages greater than recv buffer
Rework of the ap_dqap() inline function with the dqap inline assembler
invocation and the caller code in ap_queue.c to be able to handle
replies which exceed the receive buffer size.

ap_dqap() now provides two additional parameters to handle together
with the caller the case where a reply in the firmware queue entry
exceeds the given message buffer size. It depends on the caller how to
exactly handle this. The behavior implemented now by ap_sm_recv() in
ap_queue.c is to simple purge this entry from the firmware queue and
let the caller 'receive' a -EMSGSIZE for the request without
delivering any reply data - not even a truncated reply message.

However, the reworked ap_dqap() could now get invoked in a way that
the message is received in multiple parts and the caller assembles the
parts into one reply message.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Suggested-by: Juergen Christ <jchrist@linux.ibm.com>
Reviewed-by: Juergen Christ <jchrist@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-07-08 15:37:27 +02:00
Harald Freudenberger
bd39654a22 s390/AP: support new dynamic AP bus size limit
This patch provides support for new dynamic AP bus message limit
with the existing zcrypt device driver and AP bus core code.

There is support for a new field 'ml' from TAPQ query. The field
gives if != 0 the AP bus limit for this card in 4k chunk units.
The actual message size limit per card is shown as a new read-only
sysfs attribute. The sysfs attribute

  /sys/devices/ap/cardxx/max_msg_size

shows the upper limit in bytes used by the AP bus and zcrypt device
driver for requests and replies send to and received from this card.
Currently up to CEX7 support only max 12kB msg size and thus the field
shows 12288 meaning the upper limit of a valid msg for this card is
12kB. Please note that the usable payload is somewhat lower and
depends on the msg type and thus the header struct which is to be
prepended by the zcrypt dd.

The dispatcher responsible for choosing the right card and queue is
aware of the individual card AP bus message limit. So a request is
only assigned to a queue of a card which is able to handle the size of
the request (e.g. a 14kB request will never go to a max 12kB card).
If no such card is found the ioctl will fail with ENODEV.

The reply buffer held by the device driver is determined by the ml
field of the TAPQ for this card. If a response from the card exceeds
this limit however, the response is not truncated but the ioctl for
this request will fail with errno EMSGSIZE to indicate that the device
driver has dropped the response because it would overflow the buffer
limit.

If the request size does not indicate to the dispatcher that an
adapter with extended limit is to be used, a random card will be
chosen when no specific card is addressed (ANY addressing). This may
result in an ioctl failure when the reply size needs an adapter with
extended limit but the randomly chosen one is not capable of handling
the broader reply size. The user space application needs to use
dedicated addressing to forward such a request only to suitable cards
to get requests like this processed properly.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ingo Tuchscherer <ingo.tuchscherer@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-07-05 12:44:23 +02:00
Harald Freudenberger
e73a99f328 s390/ap: Fix hanging ioctl caused by wrong msg counter
When a AP queue is switched to soft offline, all pending
requests are purged out of the pending requests list and
'received' by the upper layer like zcrypt device drivers.
This is also done for requests which are already enqueued
into the firmware queue. A request in a firmware queue
may eventually produce an response message, but there is
no waiting process any more. However, the response was
counted with the queue_counter and as this counter was
reset to 0 with the offline switch, the pending response
caused the queue_counter to get negative. The next request
increased this counter to 0 (instead of 1) which caused
the ap code to assume there is nothing to receive and so
the response for this valid request was never tried to
fetch from the firmware queue.

This all caused a queue to not work properly after a
switch offline/online and in the end processes to hang
forever when trying to send a crypto request after an
queue offline/online switch cicle.

Fixed by a) making sure the counter does not drop below 0
and b) on a successful enqueue of a message has at least
a value of 1.

Additionally a warning is emitted, when a reply can't get
assigned to a waiting process. This may be normal operation
(process had timeout or has been killed) but may give a
hint that something unexpected happened (like this odd
behavior described above).

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-06-16 23:32:02 +02:00
Harald Freudenberger
4366dd7251 s390/zcrypt: fix wrong format specifications
Fixes 5 wrong format specification findings found by the
kernel test robot in ap_queue.c:

warning: format specifies type 'unsigned char' but the argument has type 'int' [-Wformat]
                               __func__, status.response_code,

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reported-by: kernel test robot <lkp@intel.com>
Fixes: 2ea2a6099a ("s390/ap: add error response code field for ap queue devices")
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-10-09 23:45:30 +02:00
Harald Freudenberger
27c4f6738b s390/zcrypt: Introduce Failure Injection feature
Introduce a way to specify additional debug flags with an crpyto
request to be able to trigger certain failures within the zcrypt
device drivers and/or ap core code.

This failure injection possibility is only enabled with a kernel debug
build CONFIG_ZCRYPT_DEBUG) and should never be available on a regular
kernel running in production environment.

Details:

* The ioctl(ICARSAMODEXPO) get's a struct ica_rsa_modexpo. If the
  leftmost bit of the 32 bit unsigned int inputdatalength field is
  set, the uppermost 16 bits are separated and used as debug flag
  value. The process is checked to have the CAP_SYS_ADMIN capability
  enabled or EPERM is returned.

* The ioctl(ICARSACRT) get's a struct ica_rsa_modexpo_crt. If the
  leftmost bit of the 32 bit unsigned int inputdatalength field is set,
  the uppermost 16 bits are separated and used als debug flag
  value. The process is checked to have the CAP_SYS_ADMIN capability
  enabled or EPERM is returned.

* The ioctl(ZSECSENDCPRB) used to send CCA CPRBs get's a struct
  ica_xcRB. If the leftmost bit of the 32 bit unsigned int status
  field is set, the uppermost 16 bits of this field are used as debug
  flag value. The process is checked to have the CAP_SYS_ADMIN
  capability enabled or EPERM is returned.

* The ioctl(ZSENDEP11CPRB) used to send EP11 CPRBs get's a struct
  ep11_urb. If the leftmost bit of the 64 bit unsigned int req_len
  field is set, the uppermost 16 bits of this field are used as debug
  flag value. The process is checked to have the CAP_SYS_ADMIN
  capability enabled or EPERM is returned.

So it is possible to send an additional 16 bit value to the zcrypt API
to be used to carry a failure injection command which may trigger
special behavior within the zcrypt API and layers below. This 16 bit
value is for the rest of the test referred as 'fi command' for Failure
Injection.

The lower 8 bits of the fi command construct a numerical argument in
the range of 1-255 and is the 'fi action' to be performed with the
request or the resulting reply:

* 0x00 (all requests): No failure injection action but flags may be
  provided which may affect the processing of the request or reply.
* 0x01 (only CCA CPRBs): The CPRB's agent_ID field is set to
  'FF'. This results in an reply code 0x90 (Transport-Protocol
  Failure).
* 0x02 (only CCA CPRBs): After the APQN to send to has been chosen,
  the domain field within the CPRB is overwritten with value 99 to
  enforce an reply with RY 0x8A.
* 0x03 (all requests): At NQAP invocation the invalid qid value 0xFF00
  is used causing an response code of 0x01 (AP queue not valid).

The upper 8 bits of the fi command may carry bit flags which may
influence the processing of an request or response:

* 0x01: No retry. If this bit is set, the usual loop in the zcrypt API
  which retries an CPRB up to 10 times when the lower layers return
  with EAGAIN is abandoned after the first attempt to send the CPRB.
* 0x02: Toggle special. Toggles the special bit on this request. This
  should result in an reply code RY~0x41 and result in an ioctl
  failure with errno EINVAL.

This failure injection possibilities may get some further extensions
in the future. As of now this is a starting point for Continuous Test
and Integration to trigger some failures and watch for the reaction of
the ap bus and zcrypt device driver code.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-10-07 21:50:01 +02:00
Harald Freudenberger
e0332629e3 s390/ap/zcrypt: revisit ap and zcrypt error handling
Revisit the ap queue error handling: Based on discussions and
evaluatios with the firmware folk here is now a rework of the response
code handling for all the AP instructions. The idea is to distinguish
between failures because of some kind of invalid request where a retry
does not make any sense and a failure where another attempt to send
the very same request may succeed. The first case is handled by
returning EINVAL to the userspace application. The second case results
in retries within the zcrypt API controlled by a per message retry
counter.

Revisit the zcrpyt error handling: Similar here, based on discussions
with the firmware people here comes a rework of the handling of all
the reply codes.  Main point here is that there are only very few
cases left, where a zcrypt device queue is switched to offline. It
should never be the case that an AP reply message is 'unknown' to the
device driver as it indicates a total mismatch between device driver
and crypto card firmware. In all other cases, the code distinguishes
between failure because of invalid message (see above - EINVAL) or
failures of the infrastructure (see above - EAGAIN).

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-10-07 21:50:01 +02:00
Harald Freudenberger
4f2fcccdb5 s390/ap: add card/queue deconfig state
This patch adds a new config state to the ap card and queue
devices. This state reflects the response code
0x03 "AP deconfigured" on TQAP invocation and is tracked with
every ap bus scan.

Together with this new state now a card/queue device which
is 'deconfigured' is not disposed any more. However, for backward
compatibility the online state now needs to take this state into
account. So a card/queue is offline when the device is not configured.
Furthermore a device can't get switched from offline to online state
when not configured.

The config state is shown in sysfs at
  /sys/devices/ap/cardxx/config
for the card and
  /sys/devices/ap/cardxx/xx.yyyy/config
for each queue within each card.
It is a read-only attribute reflecting the negation of the
'AP deconfig' state as it is noted in the AP documents.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-10-07 21:50:00 +02:00
Harald Freudenberger
2ea2a6099a s390/ap: add error response code field for ap queue devices
On AP instruction failures the last response code is now
kept in the struct ap_queue. There is also a new sysfs
attribute showing this field (enabled only on debug kernels).

Also slight rework of the AP_DBF macros to get some more
content into one debug feature message line.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-10-07 21:50:00 +02:00
Harald Freudenberger
0b641cbd24 s390/ap: split ap queue state machine state from device state
The state machine for each ap queue covered a mixture of
device states and state machine (firmware queue state) states.

This patch splits the device states and the state machine
states into two different enums and variables. The major
state is the device state with currently these values:

  AP_DEV_STATE_UNINITIATED - fresh and virgin, not touched
  AP_DEV_STATE_OPERATING   - queue dev is working normal
  AP_DEV_STATE_SHUTDOWN	   - remove/unbind/shutdown in progress
  AP_DEV_STATE_ERROR	   - device is in error state

only when the device state is > UNINITIATED the state machine
is run. The state machine represents the states of the firmware
queue:

  AP_SM_STATE_RESET_START - starting point, reset (RAPQ) ap queue
  AP_SM_STATE_RESET_WAIT  - reset triggered, waiting to be finished
			    if irqs enabled, set up irq (AQIC)
  AP_SM_STATE_SETIRQ_WAIT - enable irq triggered, waiting to be
			    finished, then go to IDLE
  AP_SM_STATE_IDLE	  - queue is operational but empty
  AP_SM_STATE_WORKING	  - queue is operational, requests are stored
			    and replies may wait for getting fetched
  AP_SM_STATE_QUEUE_FULL  - firmware queue is full, so only replies
			    can get fetched

For debugging each ap queue shows a sysfs attribute 'states' which
displays the device and state machine state and is only available
when the kernel is build with CONFIG_ZCRYPT_DEBUG enabled.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-10-07 21:49:59 +02:00
Harald Freudenberger
dc4b6ded3c s390/ap: rename and clarify ap state machine related stuff
There is a state machine held for each ap queue device.
The states and functions related to this where somethimes
noted with _sm_ somethimes without. This patch clarifies
and renames all the ap queue state machine related functions,
enums and defines to have a _sm_ in the name.

There is no functional change coming with this patch - it's
only beautifying code.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2020-07-03 10:49:49 +02:00
Harald Freudenberger
74ecbef7b9 s390/zcrypt: code beautification and struct field renames
Some beautifications related to the internal only used
struct ap_message and related code. Instead of one int carrying
only the special flag now a u32 flags field is used.

At struct CPRBX the pointers to additional data are now marked
with __user. This caused some changes needed on code, where
these structs are also used within the zcrypt misc functions.

The ica_rsa_* structs now use the generic types __u8, __u32, ...
instead of char, unsigned int.

zcrypt_msg6 and zcrypt_msg50 use min_t() instead of min().

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2020-07-03 10:49:34 +02:00
Harald Freudenberger
bc4b295e87 s390/ap: introduce new ap function ap_get_qdev()
Provide a new interface function to be used by the ap drivers:
  struct ap_queue *ap_get_qdev(ap_qid_t qid);
Returns ptr to the struct ap_queue device or NULL if there
was no ap_queue device with this qid found. When something is
found, the reference count of the embedded device is increased.
So the caller has to decrease the reference count after use
with a call to put_device(&aq->ap_dev.device).

With this patch also the ap_card_list is removed from the
ap core code and a new hashtable is introduced which stores
hnodes of all the ap queues known to the ap bus.

The hashtable approach and a first implementation of this
interface comes from a previous patch from
Anthony Krowiak and an idea from Halil Pasic.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Suggested-by: Tony Krowiak <akrowiak@linux.ibm.com>
Suggested-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-05-20 10:22:51 +02:00
Harald Freudenberger
41677b1d94 s390/ap: remove power management code from ap bus and drivers
The s390 power management support has been removed. So the
api registration and the suspend and resume callbacks and
all the code related to this for the ap bus and the ap drivers
is removed with this patch.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-03-27 10:22:47 +01:00
Joe Perches
fcf0220abc s390/zcrypt: use fallthrough;
Convert the various uses of fallthrough comments to fallthrough;

Done via script
Link: https://lore.kernel.org/lkml/b56602fcf79f849e733e7b521bb0e17895d390fa.1582230379.git.joe.com/

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-03-25 12:39:37 +01:00
Harald Freudenberger
40501c70e3 s390/zcrypt: replace snprintf/sprintf with scnprintf
snprintf() may not always return the correct size of used bytes but
instead the length the resulting string would be if it would fit into
the buffer. So scnprintf() is the function to use when the real length
of the resulting string is needed.

Replace all occurrences of snprintf() with scnprintf() where the return
code is further processed. Also find and fix some occurrences where
sprintf() was used.

Suggested-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-03-23 13:41:54 +01:00
Harald Freudenberger
fcd98d4002 s390/zcrypt: fix card and queue total counter wrap
The internal statistic counters for the total number of
requests processed per card and per queue used integers. So they do
wrap after a rather huge amount of crypto requests processed. This
patch introduces uint64 counters which should hold much longer but
still may wrap. The sysfs attributes request_count for card and queue
also used only %ld and now display the counter value with %llu.

This is not a security relevant fix. The int overflow which happened
is not in any way exploitable as a security breach.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-02-10 12:49:35 +01:00
Harald Freudenberger
0c874cd042 s390/zcrypt: move ap device reset from bus to driver code
This patch moves the reset invocation of an ap device when
fresh detected from the ap bus to the probe() function of
the driver responsible for this device.

The virtualisation of ap devices makes it necessary to
remove unconditioned resets on fresh appearing apqn devices.
It may be that such a device is already enabled for guest
usage. So there may be a race condition between host ap bus
and guest ap bus doing the reset. This patch moves the
reset from the ap bus to the zcrypt drivers. So if there
is no zcrypt driver bound to an ap device - for example
the ap device is bound to the vfio device driver - the
ap device is untouched passed to the vfio device driver.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-01-09 16:59:18 +01:00
Vasily Gorbik
3cdd986067 s390/zcrypt: adjust switch fall through comments for -Wimplicit-fallthrough
Silence the following warnings when built with -Wimplicit-fallthrough=3
enabled by default since 5.3-rc2:
In file included from ./include/linux/preempt.h:11,
                 from ./include/linux/spinlock.h:51,
                 from ./include/linux/mmzone.h:8,
                 from ./include/linux/gfp.h:6,
                 from ./include/linux/slab.h:15,
                 from drivers/s390/crypto/ap_queue.c:13:
drivers/s390/crypto/ap_queue.c: In function 'ap_sm_recv':
./include/linux/list.h:577:2: warning: this statement may fall through [-Wimplicit-fallthrough=]
  577 |  for (pos = list_first_entry(head, typeof(*pos), member); \
      |  ^~~
drivers/s390/crypto/ap_queue.c:147:3: note: in expansion of macro 'list_for_each_entry'
  147 |   list_for_each_entry(ap_msg, &aq->pendingq, list) {
      |   ^~~~~~~~~~~~~~~~~~~
drivers/s390/crypto/ap_queue.c:155:2: note: here
  155 |  case AP_RESPONSE_NO_PENDING_REPLY:
      |  ^~~~
drivers/s390/crypto/zcrypt_msgtype6.c: In function 'convert_response_ep11_xcrb':
drivers/s390/crypto/zcrypt_msgtype6.c:871:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
  871 |   if (msg->cprbx.cprb_ver_id == 0x04)
      |      ^
drivers/s390/crypto/zcrypt_msgtype6.c:874:2: note: here
  874 |  default: /* Unknown response type, this should NEVER EVER happen */
      |  ^~~~~~~
drivers/s390/crypto/zcrypt_msgtype6.c: In function 'convert_response_rng':
drivers/s390/crypto/zcrypt_msgtype6.c:901:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
  901 |   if (msg->cprbx.cprb_ver_id == 0x02)
      |      ^
drivers/s390/crypto/zcrypt_msgtype6.c:907:2: note: here
  907 |  default: /* Unknown response type, this should NEVER EVER happen */
      |  ^~~~~~~
drivers/s390/crypto/zcrypt_msgtype6.c: In function 'convert_response_xcrb':
drivers/s390/crypto/zcrypt_msgtype6.c:838:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
  838 |   if (msg->cprbx.cprb_ver_id == 0x02)
      |      ^
drivers/s390/crypto/zcrypt_msgtype6.c:844:2: note: here
  844 |  default: /* Unknown response type, this should NEVER EVER happen */
      |  ^~~~~~~
drivers/s390/crypto/zcrypt_msgtype6.c: In function 'convert_response_ica':
drivers/s390/crypto/zcrypt_msgtype6.c:801:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
  801 |   if (msg->cprbx.cprb_ver_id == 0x02)
      |      ^
drivers/s390/crypto/zcrypt_msgtype6.c:808:2: note: here
  808 |  default: /* Unknown response type, this should NEVER EVER happen */
      |  ^~~~~~~

Acked-by: Patrick Steuer <patrick.steuer@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-08-02 13:58:23 +02:00
Harald Freudenberger
16222cfb96 s390/zcrypt: fix possible deadlock situation on ap queue remove
With commit 01396a374c ("s390/zcrypt: revisit ap device remove
procedure") the ap queue remove is now a two stage process. However,
a del_timer_sync() call may trigger the timer function which may
try to lock the very same spinlock as is held by the function
just initiating the del_timer_sync() call. This could end up in
a deadlock situation. Very unlikely but possible as you need to
remove an ap queue at the exact sime time when a timeout of a
request occurs.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reported-by: Pierre Morel <pmorel@linux.ibm.com>
Fixes: commit 01396a374c ("s390/zcrypt: revisit ap device remove procedure")
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-04-10 17:46:24 +02:00
Harald Freudenberger
01396a374c s390/zcrypt: revisit ap device remove procedure
Working with the vfio-ap driver let to some revisit of the way
how an ap (queue) device is removed from the driver.
With the current implementation all the cleanup was done before
the driver even got notified about the removal. Now the ap
queue removal is done in 3 steps:
1) A preparation step, all ap messages within the queue
   are flushed and so the driver does 'receive' them.
   Also a new state AP_STATE_REMOVE assigned to the queue
   makes sure there are no new messages queued in.
2) Now the driver's remove function is invoked and the
   driver should do the job of cleaning up it's internal
   administration lists or whatever. After 2) is done
   it is guaranteed, that the driver is not invoked any
   more. On the other hand the driver has to make sure
   that the APQN is not accessed any more after step 2
   is complete.
3) Now the ap bus code does the job of total cleanup of the
   APQN. A reset with zero is triggered and the state of
   the queue goes to AP_STATE_UNBOUND.
   After step 3) is complete, the ap queue has no pending
   messages and the APQN is cleared and so there are no
   requests and replies lingering around in the firmware
   queue for this APQN. Also the interrupts are disabled.

After these remove steps the ap queue device may be assigned
to another driver.

Stress testing this remove/probe procedure showed a problem with the
correct module reference counting. The actual receive of an reply in
the driver is done asynchronous with completions. So with a driver
change on an ap queue the message flush triggers completions but the
threads waiting for the completions may run at a time where the queue
already has the new driver assigned. So the module_put() at receive
time needs to be done on the driver module which queued the ap
message. This change is also part of this patch.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-03-06 09:34:03 +01:00
Harald Freudenberger
b1af7528d2 s390/zcrypt: use new state UNBOUND during queue driver rebind
When an alternate driver (vfio-ap) has bound an ap queue and this
binding is revised the ap queue device is in an intermittent
state not bound to any driver. The internal state variable
covered this with the state AP_STATE_BORKED which is also used to
reflect broken devices. When now an ap bus scan runs such a
device is destroyed and on the next scan reconstructed.

So a stress test with high frequency switching the queue driver
between the default and the vfio-ap driver hit this gap and the
queue was removed until the next ap bus scan. This fix now
introduces another state for the in-between condition for a queue
momentary not bound to a driver and so the ap bus scan function
skips this device instead of removing it.

Also some very slight but maybe helpful debug feature messages
come with this patch - in particular a message showing that a
broken card/queue device will get removed.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-02-13 08:27:44 +01:00
Harald Freudenberger
42a87d4103 s390/zcrypt: make sysfs reset attribute trigger queue reset
Until now there is no way to reset a AP queue or card. Driving a card
or queue offline and online again does only toggle the 'software'
online state. The only way to trigger a (hardware) reset is by running
hot-unplug/hot-plug for example on the HMC.

This patch makes the queue reset attribute in sysfs writable.
Writing into this attribute triggers a reset on the AP queue's state
machine. So the AP queue is flushed and state machine runs through the
initial states which cause a reset (PQAP(RAPQ)) and a re-registration
to interrupts (PQAP(AQIC)) if available.

The reset sysfs attribute is writable by root only. So only an
administrator is allowed to initiate a reset of AP queues. Please note
that the queue's counter values are left untouched by the reset.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-12-13 10:42:26 +01:00