6.3.1.1 Boolean, characters, and integers
673
If an int can represent all values of the original type, the value is converted to an int;
Commentary
Type conversions occur at translation time, when actual values are usually unknown. The standard requires
the translator to assume that the value of the expression can be any one of the representable values supported
by its type. While flow analysis could reduce the range of possible values, the standard does not require such
analysis to be performed. (If it is performed, a translator cannot use it to change the external behavior of a
program; that is, optimizations may be performed but the semantics specified by the standard is followed.)
Other Languages
Most languages have a single signed integer type, so there is rarely a smaller integer type that needs implicit
conversion.
Coding Guidelines
Some developers incorrectly assume that objects declared using typedef names do not take part in the integer
promotions. Incorrect assumptions by a developer are very difficult to deduce from an analysis of the source
code. In some cases the misconception will be harmless, the actual program behavior being identical to
the misconstrued behavior. In other cases the behavior is different. Guideline recommendations are not a
substitute for proper developer training.
Example
typedef short SHORT;

extern SHORT es_1,
             es_2;

void f(void)
{
    unsigned int ui = 3; /* Value representable in a signed int. */

    if (es_1 == (es_2 + 1)) /* Operands converted to int. */
        ;
    if (ui > es_1) /* Right operand converted to unsigned int. */
        ;
}
674
otherwise, it is converted to an unsigned int.
Commentary
This can occur for the types unsigned short, or unsigned char, if either of them has the same representation
as an unsigned int. Depending on the type chosen to be compatible with an enumeration type, it is
possible for an object that has an enumerated type to be promoted to the type unsigned int.
Common Implementations
On 16-bit processors the types short and int usually have the same representation, so unsigned short
promotes to unsigned int. On 32-bit processors the type short usually has less precision than int, so the
type unsigned short promotes to int. There are a few implementations, mostly on DSP-based processors,
where the character types have the same width as the type int. [984]
Coding Guidelines
Existing source code ported, from an environment in which the type int has greater width than short, to
an environment where they both have the same width may have its behavior changed. If the following is
executed on a host where the width of type int is greater than the width of short:
June 24, 2009 v 1.2
#include <stdio.h>

extern unsigned short us;
extern signed int si; /* Can hold negative values. */

void f(void)
{
    if (us > si)
        printf("Pass\n");
    else
        printf("Fail\n");
}
the object us will be promoted to the type int. There will not be any change of values. On a host where the
types int and short have the same width, an unsigned short will be promoted to unsigned int. This
will lead to si being converted to unsigned int (the usual arithmetic conversions) and a potential change in
its value. (If it has a small negative value, it will convert to a large positive value.) The relational comparison
will then return a different result than in the previous promotion case.
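The two outcomes can be reproduced on a single host by writing out the conversions that each environment would apply implicitly. The following is a minimal sketch; the function names are invented for illustration and do not appear in the original example.

```c
/* Hypothetical helper: the comparison as performed where int is wider
 * than short, so the unsigned short operand is promoted to int.
 * (Assumes the value of us is representable in int.) */
int compare_with_int_promotion(unsigned short us, int si)
{
    return (int)us > si;
}

/* Hypothetical helper: the comparison as performed where int and short
 * have the same width, so both operands end up as unsigned int. */
int compare_with_unsigned_promotion(unsigned short us, int si)
{
    return (unsigned int)us > (unsigned int)si;
}
```

With us equal to 1 and si equal to -1, the first function returns 1 (the "Pass" branch) and the second returns 0 (the "Fail" branch), because (unsigned int)-1 is UINT_MAX.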
Cg 674.1
An object having an unsigned integer type shall not be implicitly converted to unsigned int through the
application of the integer promotions.
The consequence of this guideline recommendation is that such conversions need to be made explicit, using a
cast to an integer type whose rank is greater than or equal to int.
675
These are called the integer promotions.48)
Commentary
This defines the term integer promotions. Integer promotions occur when an object having a rank less than
int appears in certain contexts. This behavior differs from arithmetic conversions where the type of a
different object is involved. Integer promotions are affected by the relative widths of types (compared to the
width of int). If the type int has greater width than short then, in general (the presence of extended integer
types whose rank is also less than int can complicate the situation), all types of less rank will convert to int.
If short has the same precision as int, an unsigned short will invariably promote to an unsigned int.
It is possible to design implementations where the integer conversions don’t follow a simple pattern, such
as the following:
    signed short   16 bits including sign     unsigned short   24 bits
    signed int     24 bits including sign     unsigned int     32 bits
Your author does not know of any implementation that uses this kind of unusual combination of bits for
its integer type representation.
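Which of the two promotions an implementation applies to unsigned short can be checked at translation time. The sketch below is an illustration only: it relies on _Generic, which was added in C11 and so is not available in the C99 implementations this commentary describes, and the macro and function names are invented.

```c
#include <limits.h>

/* Unary plus applies the integer promotions to its operand, so the type
 * of +(x) is the promoted type of x. */
#define PROMOTES_TO_INT(x)          _Generic(+(x), int: 1, default: 0)
#define PROMOTES_TO_UNSIGNED_INT(x) _Generic(+(x), unsigned int: 1, default: 0)

int unsigned_short_promotes_to_int(void)
{
    unsigned short us = 0;
    return PROMOTES_TO_INT(us);
}

int unsigned_short_promotes_to_unsigned_int(void)
{
    unsigned short us = 0;
    return PROMOTES_TO_UNSIGNED_INT(us);
}

/* The rule being checked: promotion is to int exactly when int can
 * represent every value of unsigned short. */
int int_can_represent_unsigned_short(void)
{
    return USHRT_MAX <= INT_MAX;
}
```

On a host where int is wider than short the first function returns 1; where the two types have the same width the second one does.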
C90
These are called the integral promotions.27)
C++
The C++ Standard uses the C90 Standard terminology (and also points out, 3.9.1p7, “A synonym for integral
type is integer type.”).
Other Languages
The unary numeric promotions and binary numeric promotions in Java have the same effect.
Common Implementations
Many processors have load instructions that convert values having narrower types to a wider type. For
instance, loading a byte into a register and either sign extending (signed char), or zero filling (unsigned
char) the value to occupy 32 bits (promotion to int). On processors having instructions that operate on
values having a type narrower than int more efficiently than type int, optimizers can make use of the as-if
rule to improve efficiency. For instance, in some cases an analysis of the behavior of a program may find that
operand values and the result value are always representable in their unpromoted type. Implementations need
only to act as if the object had been converted to the type int, or unsigned int.
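The effect of sign extension and zero filling can be observed from C by applying the promotions explicitly and looking at the resulting bits. This is a minimal sketch with invented function names; it assumes the common case in which int is wider than the character types.

```c
#include <limits.h>

/* Hypothetical helper: unary plus promotes a signed char to int,
 * preserving its value (sign extension on typical hardware). */
int promote_signed_char(signed char sc)
{
    return +sc;
}

/* Hypothetical helper: unary plus promotes an unsigned char, preserving
 * its value (zero fill on typical hardware). The result is returned as
 * unsigned int so that the upper bits can be inspected. */
unsigned int promote_unsigned_char(unsigned char uc)
{
    return (unsigned int)(+uc);
}
```

Promoting a signed char holding -1 yields -1 (all bits set when viewed as unsigned int), while promoting an unsigned char holding UCHAR_MAX yields UCHAR_MAX with every higher-order bit zero.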
Coding Guidelines
If the guideline recommendation specifying use of a single integer type is followed there would never be any
integer promotions. The issue of implicit conversions versus explicit conversions might be a possible cause
of a deviation from this recommendation and is discussed elsewhere.
Example
signed short s1, s2, s3;
unsigned short us1, us2, us3;

void f(void)
{
    s1 = s2 + s3;    /*
                      * The result of + may be undefined.
                      * The conversion for the = may be implementation-defined.
                      */
                     /* s1 = (short)((int)s2 + (int)s3); */
    s1 = us2 + s3;   /* The conversion for the = may be implementation-defined. */
                     /*
                      * The result of the binary + is always defined (unless
                      * the type int is only one bit wider than a short; no
                      * known implementations have this property).
                      *
                      * Either both shorts promote to a wider type:
                      *
                      * s1 = (short)((int)us2 + (int)s3);
                      *
                      * or they both promote to an unsigned type of the same width:
                      *
                      * s1 = (short)((unsigned int)us2 + (unsigned int)s3);
                      */
    s1 = us2 + us3;  /* The conversion for the = may be implementation-defined. */
    us1 = us2 + us3; /* Always defined */
    us1 = us2 + s3;  /* Always defined */
    us1 = s2 + s3;   /* The result of + may be undefined. */
}
Table 675.1: Occurrence of integer promotions (as a percentage of all operands appearing in all expressions). Based on the
translated form of this book’s benchmark programs.
    Original Type      %      Original Type     %
    unsigned char     2.3     char             1.2
    unsigned short    1.9     short            0.5
676
All other types are unchanged by the integer promotions.
Commentary
The integer promotions are only applied to values whose integer type has a rank less than that of the int type.
C++
This is not explicitly specified in the C++ Standard. However, clause 4.5, Integral promotions, discusses no
other types, so the statement is also true in C++.
677
The integer promotions preserve value including sign.
Commentary
These rules are sometimes known as value preserving promotions. They were chosen by the Committee
because they result in the least number of surprises to developers when applied to operands. The promoted
value would remain unchanged whichever of the two rules used by implementations were used. However,
in many cases this promoted value appears as an operand of a binary operator. If unsigned preserving
promotions were used (see Common Implementations below), the value of the operand could have its sign
changed (e.g., if the operands had types unsigned char and signed char, the final type of both operands
would have been unsigned int), potentially leading to a change of that value (if it was negative). The
unsigned preserving promotions (sometimes called rules rather than promotions) are sometimes also known
as sign preserving rules because the form of the sign is preserved.
Most developers think in terms of values, not signedness. A rule that attempts to preserve sign can cause a
change of value, something that is likely to be unexpected. Value preserving rules can also produce results
that are unexpected, but these occur much less often.
Rationale
The unsigned preserving rules greatly increase the number of situations where unsigned int confronts signed
int to yield a questionably signed result, whereas the value preserving rules minimize such confrontations.
Thus, the value preserving rules were considered to be safer for the novice, or unwary, programmer. After
much discussion, the C89 Committee decided in favor of value preserving rules, despite the fact that the UNIX
C compilers had evolved in the direction of unsigned preserving.
Other Languages
This is only an issue for languages that contain more than one signed integer type and an unsigned integer
type.
Common Implementations
The base document specified unsigned preserving rules. If the type being promoted was either unsigned
char or unsigned short, it was converted to an unsigned int. The corresponding signed types were
promoted to signed int. Some implementations provide an option to change their default behavior to
follow unsigned preserving rules. [610,1342,1370]
Coding Guidelines
Existing, very old, source code may rely on using the unsigned preserving rules. It can only do this if the
translator is also running in such a mode, either because that is the only one available or because the translator
is running in a compatibility mode to save on the porting (to the ISO rules) cost. Making developers aware of
any of the issues involved in operating in a nonstandard C environment is outside the scope of these coding
guidelines.
Example
extern unsigned char uc;

void f(void)
{
    int si = -1;
    /*
     * Value preserving rules promote uc to an int -> comparison succeeds.
     *
     * Unsigned preserving rules promote uc to an unsigned int, usual arithmetic
     * conversions then convert si to unsigned int -> comparison fails.
     */
    if (uc > si)
        ;
}
678
As discussed earlier, whether a “plain” char is treated as signed is implementation-defined.
Commentary
The implementation-defined treatment of “plain” char will only affect the result of the integer promotions if
any of the character types can represent the same range of values as an object of type int or unsigned int.
679
Forward references: enumeration specifiers (6.7.2.2), structure and union specifiers (6.7.2.1).
6.3.1.2 Boolean type
680
When any scalar value is converted to _Bool, the result is 0 if the value compares equal to 0;
Commentary
Converting a scalar value to type _Bool is effectively the same as a comparison against 0; that is, (_Bool)x
is effectively the same as (x != 0) except in the latter case the type of the result is int.
Conversion to _Bool is different from other conversions, appearing in a strictly conforming program, in
that it is not commutative: (T1)(_Bool)x need not equal (_Bool)(T1)x. For instance:
(int)(_Bool)0.5 ⇒ 1
(_Bool)(int)0.5 ⇒ 0
Reordering the conversions in a conforming program could also return different results:
(signed)(unsigned)-1 ⇒ implementation-defined
(unsigned)(signed)-1 ⇒ UINT_MAX
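The first pair of conversions above can be checked directly. The following is a minimal sketch (the function names are invented for illustration) that applies the two casts in each order to the same double value.

```c
/* Hypothetical helper: convert to _Bool first, then to int. Any nonzero
 * value, including a fraction such as 0.5, becomes 1. */
int bool_then_int(double x)
{
    return (int)(_Bool)x;
}

/* Hypothetical helper: convert to int first (truncating toward zero),
 * then to _Bool. A fraction such as 0.5 truncates to 0 before the
 * comparison against zero is made. The argument must be within the
 * range of int. */
int int_then_bool(double x)
{
    return (_Bool)(int)x;
}
```

For an argument of 0.5 the first function returns 1 and the second returns 0; for an argument such as 2.0 both return 1.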
C90
Support for the type _Bool is new in C99.
C++
4.12p1
An rvalue of arithmetic, enumeration, pointer, or pointer to member type can be converted to an rvalue of type
bool. A zero value, null pointer value, or null member pointer value is converted to false;
The value of false is not defined by the C++ Standard (unlike true, it is unlikely to be represented using
any value other than zero). But in contexts where the integer conversions are applied:
4.7p4
. . . the value false is converted to zero . . .
Other Languages
Many languages that include a boolean type specify that it can hold the values true and false, without
specifying any representation for those values. Java only allows boolean types to be converted to boolean
types. It does not support the conversion of any other type to boolean.
Coding Guidelines
The issue of treating boolean values as having a well-defined role independent of any numeric value is
discussed elsewhere; for instance, treating conversions of values to the type _Bool as representing a change
of role, not as representing the values 0 and 1. The issue of whether casting a value to the type _Bool, rather
than comparing it against zero, represents an idiom that will be recognizable to C developers is discussed
elsewhere.
681
otherwise, the result is 1.
Commentary
In some contexts C treats any nonzero value as representing true; for instance, controlling expressions
(which are also defined in terms of a comparison against zero). A conversion to _Bool reduces all nonzero
values to the value 1.
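The difference between converting to _Bool and converting to a narrow integer type shows up for values whose low-order bits are all zero. A minimal sketch, with invented function names:

```c
#include <limits.h>

/* Hypothetical helper: conversion to _Bool compares against zero, so
 * every nonzero value yields 1. */
int to_bool(unsigned int v)
{
    return (_Bool)v;
}

/* Hypothetical helper: conversion to unsigned char reduces the value
 * modulo UCHAR_MAX + 1, so some nonzero values yield 0. */
int to_unsigned_char(unsigned int v)
{
    return (unsigned char)v;
}
```

Assuming unsigned int is wider than unsigned char, the value UCHAR_MAX + 1u converts to 1 as a _Bool but to 0 as an unsigned char.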
C++
4.12p1
. . . ; any other value is converted to true.
The value of true is not defined by the C++ Standard (implementations may choose to represent it internally
using any nonzero value). But in contexts where the integer conversions are applied:
4.7p4
. . . the value true is converted to one.
6.3.1.3 Signed and unsigned integers
682
When a value with integer type is converted to another integer type other than _Bool, if the value can be
represented by the new type, it is unchanged.
Commentary
While it would be very surprising to developers if the value was changed, the standard needs to be complete and
specify the behavior of all conversions. For integer types this means that the value has to be within the range
specified by the corresponding numerical limits macros.
The type of a bit-field is more than just the integer type used in its declaration. The width is also considered
to be part of its type. This means that assignment, for instance, to a bit-field object may result in the value
being assigned having its value changed.
void DR_120(void)
{
    struct {
        unsigned int mem : 1;
    } x;
    /*
     * The value 3 can be represented in an unsigned int,
     * but is changed by the assignment in this case.
     */
    x.mem = 3;
}
C90
Support for the type _Bool is new in C99, and the C90 Standard did not need to include it as an exception.
Other Languages
This general statement holds true for conversions in other languages.
Common Implementations
The value being in range is not usually relevant because most implementations do not perform any range
checks on the value being converted. When converting to a type of lesser rank, the common implementation
behavior is to ignore any bit values that are not significant in the destination type. (The sequence of bits in
the value representation of the original type is truncated to the number of bits in the value representation
of the destination type.) If the representation of a value does not have any bits set in these ignored bits, the
converted value will be the same as the original value. In the case of conversions to value representations
containing more bits, implementations simply sign-extend for signed values and zero-fill for unsigned values.
Coding Guidelines
One way of reducing the possibility that converted values are not representable in the converted type is to
reduce the number of conversions. This is one of the rationales behind the general guideline on using a single
integer type.
683
Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more
than the maximum value that can be represented in the new type until the value is in the range of the new
type.49)
Commentary
This behavior is what all known implementations do for operations on values having unsigned types. The
standard is enshrining existing processor implementation practices in the language. As footnote 49 points
out, this adding and subtracting is done on the abstract mathematical value, not on a value with a given C
type. There is no need to think in terms of values wrapping (although this is a common way developers think
about the process).
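The rule can be followed literally for a small type. The sketch below is an illustration rather than how any implementation performs the conversion: it uses long long as a stand-in for the abstract mathematical value (an assumption that holds only while the value fits in long long), and the function name is invented.

```c
#include <limits.h>

/* Hypothetical helper: convert value to unsigned char by repeatedly
 * adding or subtracting one more than UCHAR_MAX, as the standard's
 * wording describes. Intended for small values; the loops take time
 * proportional to the distance from the target range. */
unsigned char convert_to_uchar_by_rule(long long value)
{
    const long long one_more_than_max = (long long)UCHAR_MAX + 1;

    while (value < 0)
        value += one_more_than_max;
    while (value > UCHAR_MAX)
        value -= one_more_than_max;

    /* The value is now in range, so this final conversion leaves it unchanged. */
    return (unsigned char)value;
}
```

For example, -1 becomes -1 + (UCHAR_MAX + 1), which is UCHAR_MAX, the same result as the cast (unsigned char)-1.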
C90
Otherwise: if the unsigned integer has greater size, the signed integer is first promoted to the signed integer
corresponding to the unsigned integer; the value is converted to unsigned by adding to it one greater than the
largest number that can be represented in the unsigned integer type.28)
When a value with integral type is demoted to an unsigned integer with smaller size, the result is the nonnegative
remainder on division by the number one greater than the largest unsigned number that can be represented in the
type with smaller size.
The C99 wording is a simpler way of specifying the C90 behavior.
Common Implementations
For unsigned values and signed values represented using two’s complement, the above algorithm can be
implemented by simply chopping off the significant bits that are not available in the representation of the
new type.
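For an unsigned source value, chopping off the high-order bits is the same as masking with the maximum value of the new type. A minimal sketch with invented function names; for negative signed source values the equivalence relies on a two’s complement representation, as noted above.

```c
#include <limits.h>

/* Hypothetical helper: keep only the bits that fit in unsigned char. */
unsigned char low_bits_by_mask(unsigned int v)
{
    return (unsigned char)(v & UCHAR_MAX);
}

/* Hypothetical helper: let the conversion rule do the reduction. */
unsigned char low_bits_by_conversion(unsigned int v)
{
    return (unsigned char)v;
}
```

Both functions return the same result for every unsigned int argument, because reducing modulo a power of two and discarding the high-order bits are the same operation.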
Coding Guidelines
The behavior for this conversion may be fully specified by the standard. The question is whether a conversion
should be occurring in the first place.
684
Otherwise, the new type is signed and the value cannot be represented in it;
Commentary
To be exact, the standard defines no algorithm for reducing the value to make it representable (because there
is no universal agreement between different processors on what to do in this case).
Other Languages
The problem of what to do with a value that, when converted to a signed integer type, cannot be represented
is universal to all languages supporting more than one signed integer type, or supporting an unsigned integer
type (the overflow that can occur during an arithmetic operation is a different case).
Coding Guidelines
A guideline recommendation that the converted value always be representable might be thought to be
equivalent to one requiring that a program not contain defects. However, while the standard may not specify
an algorithm for this conversion, there is a commonly seen implementation behavior. Developers sometimes
intentionally make use of this common behavior and the applicable guideline is the one dealing with the use
of representation information.
685
either the result is implementation-defined or an implementation-defined signal is raised.
Commentary
There is no universally agreed-on behavior (mathematical or processor) for the conversion of out-of-range
signed values, so the C Standard’s Committee could not simply define this behavior as being what happens in
practice. The definition of implementation-defined behavior does not permit an implementation to raise a
signal; hence, the additional permission to raise a signal is specified here.
C90
The specification in the C90 Standard did not explicitly specify that a signal might be raised. This is because
the C90 definition of implementation-defined behavior did not rule out the possibility of an implementation
raising a signal. The C99 wording does not permit this possibility, hence the additional permission given
here.
C++
4.7p3
. . . ; otherwise, the value is implementation-defined.
The C++ Standard follows the wording in C90 and does not explicitly permit a signal to be raised in
this context because this behavior is considered to be within the permissible range of implementation-defined
behaviors.
Other Languages
Languages vary in how they classify the behavior of a value not being representable in the destination type.
Java specifies that all the unavailable significant bits (in the destination type) are discarded. Ada requires that
an exception be raised. Other languages tend to fall between these two extremes.
Common Implementations
The quest for performance and simplicity means that few translators generate machine code to check for
nonrepresentable conversions. The usual behavior is for the appropriate number of least significant bits from
the original value to be treated as the converted value. The most significant bit of this new value is treated
as a sign bit, which is sign-extended to fill the available space if the value is being held in a register. If the
conversion occurs immediately before a store (i.e., a right-hand side value is converted before being assigned
into the left hand side object), there is often no conversion; the appropriate number of value bits are simply
written into storage.
Some older processors [287] have the ability to raise a signal if the result of a conversion operation on an integer value is
not representable. On such processors an implementation can choose to use this instruction or use a sequence
of instructions having the same effect, that do not raise a signal.
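Code that must not depend on the implementation-defined result (or signal) can test the value against the limits of the destination type before converting. The following is a minimal sketch; the function name and its return convention are invented for illustration.

```c
#include <limits.h>

/* Hypothetical helper: convert value to signed char only when it is
 * representable. Returns 1 and stores the result through out on
 * success; returns 0 and leaves *out untouched otherwise. */
int long_to_schar_checked(long value, signed char *out)
{
    if (value < SCHAR_MIN || value > SCHAR_MAX)
        return 0;

    *out = (signed char)value; /* In range, so the value is unchanged. */
    return 1;
}
```

A caller can then decide how to handle the out-of-range case itself rather than relying on the commonly seen truncation behavior.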
6.3.1.4 Real floating and integer
686
When a finite value of real floating type is converted to an integer type other than _Bool, the fractional part is
discarded (i.e., the value is truncated toward zero).
Commentary
NaNs are not finite values and neither are they infinities.
IEEE-754 6.3
The Sign Bit. . . . and the sign of the result of the round floating-point number to integer operation is the sign of
the operand. These rules shall apply even when operands or results are zero or infinite.
When a floating-point value in the range (-1.0, -0.0) is converted to an integer type, the result is required to
be a positive zero.
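Truncation toward zero means that positive and negative values both lose their fractional part without any rounding. A minimal sketch, with an invented function name; the argument is assumed to be a finite value whose integral part is representable in int.

```c
/* Hypothetical helper: the cast discards the fractional part, so the
 * result moves toward zero for both positive and negative arguments. */
int truncate_to_int(double d)
{
    return (int)d;
}
```

An argument of 2.9 gives 2, -2.9 gives -2 (not -3), and -0.5 gives 0.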
C90
Support for the type _Bool is new in C99.
Other Languages
This behavior is common to most languages.
Common Implementations
Many processors include instructions that perform truncation when converting values of floating type to
an integer type. On some processors the rounding mode, which is usually set to round-to-nearest, has to
be changed to round-to-zero for this conversion, and then changed back after the operation. This is an
execution-time overhead. Some implementations give developers the choice of faster execution provided
they are willing to accept round-to-nearest behavior. In some applications the difference in behavior is
significantly less than the error in the calculation, so it is acceptable.
Coding Guidelines
An expression consisting of a cast of a floating constant to an integer type is an integer constant expression.
Such a constant can be evaluated at translation time. However, there is no requirement that the translation-time
evaluation produce exactly the same results as the execution-time evaluation. Neither is there a requirement
that the translation-time handling of floating-point constants be identical. In the following example it is
possible that a call to printf will occur.
#include <stdio.h>

void f(void)
{
    float fval = 123456789.0;
    long lval = (long)123456789.0;

    if (lval != (long)fval)
        printf("(long)123456789.0 == %ld and %ld\n", lval, (long)fval);
}
There is a common, incorrect, developer assumption that floating constants whose fractional part is zero
are always represented exactly by implementations (i.e., many developers have a mental model that such
constants are really integers with the characters .0 appended to them). While it is technically possible to
convert many such constants exactly, experience shows that a surprising number of translators fail to achieve
the required degree of accuracy (e.g., the floating constant 6.0 might be translated to the same internal
representation as the floating constant 5.999999 and subsequently converted to the integer constant 5).
Rev 686.1
A program shall not depend on the value of a floating constant being converted to an integer constant
having the same value.
A developer who has made the effort of typing a floating constant is probably expecting it to be used as a
floating type. Based on this assumption a floating constant that is implicitly converted to an integer type is
unexpected behavior. Such an implicit conversion can occur if the floating constant is the right operand of an
assignment or the argument in a function call. Not only is the implicit conversion likely to be unexpected by
the original author, but subsequent changes to the code that cause a function-like macro to be invoked, rather
than a function call, can result in a significant change in behavior.
In the following example, a floating constant passed to CALC_1 results in glob being converted to a floating
type. If the value of glob contains more significant digits than supported by the floating type, the final result
assigned to loc will not be the value expected. Using explicit casts, as in CALC_2, removes the problem
caused by the macro argument having a floating type. However, as discussed elsewhere, other dependencies
are introduced. Explicitly performing the cast, where the argument is passed, mimics the behavior of a
function call and shows that the developer is aware of the type of the argument.
#define X_CONSTANT 123456789.0
#define Y_CONSTANT 2

#define CALC_1(a) ((a) + (glob))
#define CALC_2(a) ((long)(a) + (glob))
#define CALC_3(a) ((a) + (glob))

extern long glob;

void f(void)
{
    long loc;

    loc = CALC_1(X_CONSTANT);
    loc = CALC_1(Y_CONSTANT);

    loc = CALC_2(X_CONSTANT);
    loc = CALC_2(Y_CONSTANT);

    loc = CALC_3((long)X_CONSTANT);
    loc = CALC_3(Y_CONSTANT);
}
The previous discussion describes some of the unexpected behaviors that can occur when a floating constant
is implicitly converted to an integer type. Some of the points raised also apply to objects having a floating
type. The costs and benefits of relying on implicit conversions or using explicit casts are discussed, in general,
elsewhere. That discussion did not reach a conclusion that resulted in a guideline recommendation being
made. Literals differ from objects in that they are a single instance of a single value. As such developers
have greater control over their use, on a case by case basis, and a guideline recommendation is considered to
be more worthwhile. This guideline recommendation is similar to the one given for conversions of suffixed
integer constants.
Cg 686.2
A floating constant shall not be implicitly converted to an integer type.
687
If the value of the integral part cannot be represented by the integer type, the behavior is undefined.50)
Commentary
The exponent part in a floating-point representation allows very large values to be created; these could
significantly exceed the representable range supported by any integer type. The behavior specified by
the standard reflects both the fact that there is no commonly seen processor behavior in this case and the
execution-time overhead of performing some defined behavior.
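Because the behavior is undefined, a program that cannot guarantee the range of a floating value has to test it before converting. The sketch below is one way of doing this, not a prescribed technique; the function name is invented, and it assumes that INT_MIN - 1 and INT_MAX + 1 are exactly representable in double (true for a 32-bit int and an IEEE-754 double).

```c
#include <limits.h>

/* Hypothetical helper: convert d to int only when its integral part is
 * representable. Returns 1 and stores the result through out on
 * success; returns 0 and leaves *out untouched otherwise. A NaN fails
 * both comparisons and is therefore rejected. */
int double_to_int_checked(double d, int *out)
{
    if (!(d > (double)INT_MIN - 1.0 && d < (double)INT_MAX + 1.0))
        return 0;

    *out = (int)d; /* Truncates toward zero; the result is in range. */
    return 1;
}
```

The bounds are exclusive and one unit outside the range of int because truncation brings any value strictly between them back into range.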
Other Languages
Other languages vary in their definition of behavior. Like integer values that are not representable in the
destination type, some languages require an exception to be raised while others specify undefined behavior. In
this case Java uses a two-step process. It first converts the real value to the most negative, or largest positive