The nl_type and nl_arg arrays defined in vfwprintf may be accessed
with an index up to and including NL_ARGMAX, but they are only of size
NL_ARGMAX, meaning they may be written to or read from 1 element too
far.
commit d42269d7c8 appropriated the
stream error flag temporarily to let the printf family of functions
suppress further output attempts after encountering a write error.
since the wide printf code relies on (narrow) vfprintf to print
padding and numeric conversions, a hack was put in vfprintf not to
clear the initial error status unless the stream is narrow oriented.
this was okay, because calling vfprintf on a wide-oriented stream
(outside of internal use by the implementation) produces undefined
behavior. however, it was highly non-obvious to anyone reading the
wide printf code, where the calls to fprintf without first checking
for error status appeared erroneous.
this patch removes all direct use of fprintf from the wide printf
core, except in the numeric conversions case where it was already
checked before starting processing of the directive that the error
status is not set. the other calls, which were performing padding, are
replaced by a new pad() helper function, which performs the check and
abstracts out the mechanism of writing the padding.
direct use of the error flag is also replaced by ferror, which is
defined as a macro in stdio_impl.h, expanding directly to the flag
check with no call or locking overhead.
this fixes a broader bug for which a special case was reported by
Bruno Haible, in the form of %n getting processed (and reporting the
number of wide characters which would have been written, but weren't)
after an encoding error (EILSEQ). in addition to the %n case, some but
not all of the format specifiers continued to attempt output after an
error. in particular, %c, %lc, and %s all used fputwc directly without
any check for error status.
as long as the error condition was permanent rather than transient,
these write attempts had no visible side effects, but in theory it
could be visible, for example with EAGAIN/EWOULDBLOCK or ENOSPC, if
the condition precluding output came to an end. this could produce
output with missing non-final data, rather than just truncated output,
albeit with the function still returning -1 as expected to report an
error.
to fix this, a check is added to stop processing of any new directive
(including %n) if the stream is already in error state, and direct use
of fputwc is replaced with calls to the out() helper function, which
checks for error status.
note that fprintf is also used directly without checking error status,
but due to how commit d42269d7c8
previously attempted to solve the issue of output after error, the
call to fprintf does not attempt to write anything when the
wide-oriented stream is already in error state. this is non-obvious,
and is quite a hack, so it should be changed, but I've left it alone
for now to make the bug fix commit itself as non-invasive as possible.
libc.h was intended to be a header for access to global libc state and
related interfaces, but ended up included all over the place because
it was the way to get the weak_alias macro. most of the inclusions
removed here are places where weak_alias was needed. a few were
recently introduced for hidden. some go all the way back to when
libc.h defined CANCELPT_BEGIN and _END, and all (wrongly implemented)
cancellation points had to include it.
remaining spurious users are mostly callers of the LOCK/UNLOCK macros
and files that use the LFS64 macro to define the awful *64 aliases.
in a few places, new inclusion of libc.h is added because several
internal headers no longer implicitly include libc.h.
declarations for __lockfile and __unlockfile are moved from libc.h to
stdio_impl.h so that the latter does not need libc.h. putting them in
libc.h made no sense at all, since the macros in stdio_impl.h are
needed to use them correctly anyway.
The switch statement has no 'default:' case and the function ends
immediately following the switch, so the extra comparison did not
communicate any extra information to the compiler.
this patch fixes a large number of missed internal signed-overflow
checks and errors in determining when the return value (output length)
would exceed INT_MAX, which should result in EOVERFLOW. some of the
issues fixed were reported by Alexander Cherepanov; others were found
in subsequent review of the code.
aside from the signed overflows being undefined behavior, the
following specific bugs were found to exist in practice:
- overflows computing length of floating point formats with huge
explicit precisions, integer formats with prefix characters and huge
explicit precisions, or string arguments or format strings longer
than INT_MAX, resulted in wrong return value and wrong %n results.
- literal width and precision values outside the range of int were
misinterpreted, yielding wrong behavior in at least one well-defined
case: string formats with precision greater than INT_MAX were
sometimes truncated.
- in cases where EOVERFLOW is produced, incorrect values could be
written for %n specifiers past the point of exceeding INT_MAX.
in addition to fixing these bugs, we now stop producing output
immediately when output length would exceed INT_MAX, rather than
continuing and returning an error only at the end.
the idiom fprintf(f, "%.*s", n, "") was wrongly used in vfwprintf as a
means of producing n spaces; instead it produces no output. the
correct form is fprintf(f, "%*s", n, ""), using width instead of
precision, since for %s the later is a maximum rather than a minimum.
the old idiom, f->mode |= f->mode+1, was adapted from the idiom for
setting byte orientation, f->mode |= f->mode-1, but the adaptation was
incorrect. unless the stream was alreasdy set byte-oriented, this code
incremented f->mode each time it was executed, which would eventually
lead to overflow. it could be fixed by changing it to f->mode |= 1,
but upcoming changes will require slightly more work at the time of
wide orientation, so it makes sense to just call fwide. as an
optimization in the single-character functions, fwide is only called
if the stream is not already wide-oriented.
previously, write errors neither stopped further output attempts nor
caused the function to return an error to the caller. this could
result in silent loss of output, possibly in the middle of output in
the event of a non-permanent error.
the simplest solution is temporarily clearing the error flag for the
target stream, then suppressing further output when the error flag is
set and checking/restoring it at the end of the operation to determine
the correct return value.
since the wide version of the code internally calls the narrow fprintf
to perform some of its underlying operations, initial clearing of the
error flag is suppressed when performing a narrow vfprintf on a
wide-oriented stream. this is not a problem since the behavior of
narrow operations on wide-oriented streams is undefined.
in some cases, these functions internally call a byte-based input or
output function before calling getwc/putwc, so they cannot rely on the
latter to set the orientation.
this header evolved to facilitate the extremely lazy practice of
omitting explicit includes of the necessary headers in individual
stdio source files; not only was this sloppy, but it also increased
build time.
now, stdio_impl.h is only including the headers it needs for its own
use; any further headers needed by source files are included directly
where needed.
to deal with the fact that the public headers may be used with pre-c99
compilers, __restrict is used in place of restrict, and defined
appropriately for any supported compiler. we also avoid the form
[restrict] since older versions of gcc rejected it due to a bug in the
original c99 standard, and instead use the form *restrict.
this implementation is extremely ugly and inefficient, but it avoids a
good deal of code duplication and bloat. it may be cleaned up later to
eliminate the remaining code duplication and some of the warts, but i
don't really care about its performance.
note that swprintf is not yet implemented.