Common type of ubyte and const ubyte is int (page 3)

May 04

Re: Common type of ubyte and const ubyte is int

Posted by Daniel N
in reply to Walter Bright


On Friday, 3 May 2024 at 19:40:36 UTC, Walter Bright wrote:
> On 5/1/2024 7:42 AM, Steven Schveighoffer wrote:
>> It seems rule 2 would apply instead of rule 6? but I don't like it.
>
> ```
> #include <stdio.h>
>
> void main()
> {
>     char u;
>     const char v;
>     printf("%ld %ld\n", sizeof(u), sizeof(1?u:v));
> }
> ```
>
> This prints "1 4". D follows the same integral promotion rules, and the reason is if one translates C code to D, one doesn't get an unpleasant hidden surprise.

However as you can see in my post
https://forum.dlang.org/post/fhmzjloxfzzwgmohkxnc@forum.dlang.org
C++ actually went the other direction on this. In the rare cases where they differ, shouldn't D be free to choose the best option from C or C++?

Considering that C++ has had auto since C++11 while C only gained it in C23, there's a lot more C++ code that depends on the type than C code.

May 04

Re: Common type of ubyte and const ubyte is int

Posted by Walter Bright
in reply to Daniel N


On 5/4/2024 8:34 AM, Daniel N wrote:
> However as you can see in my post
> https://forum.dlang.org/post/fhmzjloxfzzwgmohkxnc@forum.dlang.org
> C++ actually went the other direction on this. In the rare cases where they differ, shouldn't D be free to choose the best option from C or C++?

True, but D is more compatible with C than with C++.

May 04

Re: Common type of ubyte and const ubyte is int

Posted by Walter Bright
in reply to matheus


On 5/3/2024 1:14 PM, matheus wrote:
> Like this D code:
> 
> import std.stdio;
> 
> void main(){
>      char u;
>      const char v;
>      writefln("%d %d", (u.sizeof), (1?u:v).sizeof);
> }
> 
> Prints: "1 1".

You're right.

But change `char` to `ubyte` and you'll get "1 4".

This is because the table in impcnvtab.d does not deal with char types:

https://github.com/dlang/dmd/blob/master/compiler/src/dmd/impcnvtab.d#L165

So, char types do not undergo integral promotion when trying to bring two expressions to a common type, whereas byte, ubyte, short, and ushort do.

This can be debated as to which it should be, but there's a lot of water under that bridge.
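
For comparison on the C side, here is a minimal sketch. The macro `DEFINE_CHECK` and the function names are hypothetical and come from neither dmd nor this thread. It checks that C promotes every operand type narrower than int in a conditional expression, plain char included, which is where D's table differs.

```c
#include <stdbool.h>

/* Defines a function that reports whether a conditional expression over a
   T operand and a const T operand has the size of int, i.e. whether the
   operands were promoted. Hypothetical helper for illustration only. */
#define DEFINE_CHECK(name, T)                      \
    bool name(void)                                \
    {                                              \
        T a = 0;                                   \
        const T b = 0;                             \
        return sizeof(1 ? a : b) == sizeof(int);   \
    }

DEFINE_CHECK(char_promotes, char)
DEFINE_CHECK(schar_promotes, signed char)
DEFINE_CHECK(uchar_promotes, unsigned char)
DEFINE_CHECK(short_promotes, short)
DEFINE_CHECK(ushort_promotes, unsigned short)
```

Each function returns true in C. On a platform where short is as wide as int the size check cannot distinguish promoted from unpromoted operands, so the result is only informative where int is wider than the operand type.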

May 05

Re: Common type of ubyte and const ubyte is int

Posted by Steven Schveighoffer
in reply to Walter Bright


On Friday, 3 May 2024 at 19:40:36 UTC, Walter Bright wrote:
> On 5/1/2024 7:42 AM, Steven Schveighoffer wrote:
>> It seems rule 2 would apply instead of rule 6? but I don't like it.
>
>     #include <stdio.h>
>
>     void main()
>     {
>         char u;
>         const char v;
>         printf("%ld %ld\n", sizeof(u), sizeof(1?u:v));
>     }
>
> This prints "1 4". D follows the same integral promotion rules, and the reason is if one translates C code to D, one doesn't get an unpleasant hidden surprise.

Cool, now let's try char against char:

    printf("%ld\n", sizeof(1? u : u));

This prints 4 still. Wait, what does D do?

    ubyte u;
    writeln((1 ? u : u).sizeof);

This prints "1". This must be a mistake, right? How does anyone ever port C code with this glaring change in functionality?!

I'm being a little bit overdramatic here, but you get the drift.

C has no function overloading and, before C23, had no auto, so the inferred type of a ternary expression in terms of integer promotion is meaningless. Before C23 there wasn't even a standard typeof in C, so you have to resort to sizeof expressions. As far as I can tell, this is the only place where you can see the difference. Since C allows implicit truncation, nobody will ever notice that this type is int.
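
A minimal sketch, assuming a C11 compiler: `_Generic` can name the type of the conditional directly instead of inferring it from `sizeof`. The macro `TYPE_NAME` and the function names are hypothetical and only cover the types relevant here.

```c
/* Maps the static type of an expression to a readable name (C11 _Generic).
   Only the types relevant to this discussion are listed. */
#define TYPE_NAME(expr) _Generic((expr), \
    char: "char",                        \
    signed char: "signed char",          \
    unsigned char: "unsigned char",      \
    int: "int",                          \
    unsigned int: "unsigned int",        \
    default: "other")

/* Type of the conditional when both operands are plain char. */
const char *char_ternary_type(void)
{
    char u = 0;
    return TYPE_NAME(1 ? u : u);
}

/* Type when one operand is const-qualified, as in the quoted example. */
const char *const_char_ternary_type(void)
{
    char u = 0;
    const char v = 0;
    return TYPE_NAME(1 ? u : v);
}

/* Type of a plain char variable on its own, for comparison. */
const char *plain_char_type(void)
{
    char u = 0;
    return TYPE_NAME(u);
}
```

On mainstream platforms, where int is wider than char, both conditional forms report "int" while the bare variable reports "char". The difference is visible only to code that inspects the type, not to code that uses the value.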

As pointed out by Daniel N, C++ gets this right. You know who would be surprised by compiling C code if it did something so different it affected outcomes? C++ developers. They use C libraries as-is, not even porting, whenever they want. If those things started misbehaving, they would notice.

But they don't care. Why? Because it doesn't affect anything in C. Please, just change this.

-Steve

May 04

Re: Common type of ubyte and const ubyte is int

Posted by Jonathan M Davis
in reply to Walter Bright


On Friday, May 3, 2024 1:40:36 PM MDT Walter Bright via Digitalmars-d wrote:
> On 5/1/2024 7:42 AM, Steven Schveighoffer wrote:
> > It seems rule 2 would apply instead of rule 6? but I don't like it.
>
> ```
> #include <stdio.h>
>
> void main()
> {
>      char u;
>      const char v;
>      printf("%ld %ld\n", sizeof(u), sizeof(1?u:v));
> }
> ```
>
> This prints "1 4". D follows the same integral promotion rules, and the reason is if one translates C code to D, one doesn't get an unpleasant hidden surprise.

Sure, but are there actually any unpleasant surprises if C code using the ternary operator is converted to D? If you have

    uint8_t left;
    const uint8_t right;

    uint8_t result = cond ? left : right;

whether integer promotion occurs or not is irrelevant, because no arithmetic is happening, and C will happily downcast an int to a uint8_t. So, whether the ternary operator results in uint8_t or int doesn't affect the result at all.

On the flip side, if you assign the result to an int,

    uint8_t left;
    const uint8_t right;

    int result = cond ? left : right;

whether integer promotion occurred is again irrelevant, because no arithmetic is occurring, and both results will fit in an int whether the ternary operator converted the result to int or left it as uint8_t.

And since char in C is the same as either uint8_t or int8_t depending on the platform, using char in those two examples would have the same result.
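
The two cases above can be checked exhaustively. This is a sketch only, and the function name `ternary_value_is_preserved` is hypothetical. It walks every pair of uint8_t values with both conditions and confirms that the stored value equals the selected operand, whether the destination is uint8_t or int.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if, for every pair of uint8_t values and both conditions,
   storing the conditional's result in a uint8_t or in an int yields exactly
   the selected operand's value. */
bool ternary_value_is_preserved(void)
{
    for (int l = 0; l <= UINT8_MAX; ++l) {
        for (int r = 0; r <= UINT8_MAX; ++r) {
            for (int cond = 0; cond <= 1; ++cond) {
                uint8_t left = (uint8_t)l;
                const uint8_t right = (uint8_t)r;
                int expected = cond ? l : r;

                /* The conditional has type int; C converts it back silently. */
                uint8_t narrow = cond ? left : right;
                /* Here the int result is stored without any conversion. */
                int wide = cond ? left : right;

                if (narrow != expected || wide != expected)
                    return false;
            }
        }
    }
    return true;
}
```

The function returns true: promoting a uint8_t to int and converting it back never changes the value, so the promotion cannot be observed through the result.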

So, as far as I can tell, the C code does not care whether integer promotion occurs with the ternary operator or not. It would care if it had type inference via something like D's auto, but it doesn't. The value of the result is the same whether integer promotion occurs or not, and that's all that C actually cares about unless you're doing something like sizeof on the expression, which is not exactly a typical thing to be doing. And that's probably why C++ was perfectly fine with changing the behavior. It doesn't actually break code in practice.

This issue only comes up in D, because we have auto, and we make downcasting with integer types illegal. So, promoting byte to int when no arithmetic is actually occurring just because the other branch of the ternary used const is highly surprising and increases the chances of code breakage due to downcasting not being legal.

It's also pretty terrible for generic code, because with almost all types if you have

    T left;
    const T right;

    auto result = cond ? left : right;

the result will be const T, whereas with smaller integer types, it would be int, which almost no one will expect. And unless VRP kicks in

    T left;
    const T right;

    const T result = cond ? left : right;

will fail to compile for small integer types while it compiles for every other type in the language.

So, from what I can tell, we're not going to break C code whether the ternary operator does integer promotion or not, but it _does_ cause problems for D code that integer promotion occurs even though there is no arithmetic expression involved.

And on top of that, by changing D to not do integer promotion with the ternary operator, we would be more compatible with C++ where the difference _does_ matter, because they have their own auto, and they have function overloading.

So, I don't see how sticking to the C behavior in this case helps at all, and it clearly hurts us for both straight up D code and for porting C++ code to D.

- Jonathan M Davis

May 05

Re: Common type of ubyte and const ubyte is int

Posted by Dom DiSc
in reply to Jonathan M Davis


On Sunday, 5 May 2024 at 04:13:16 UTC, Jonathan M Davis wrote:
> And on top of that, by changing D to not do integer promotion with the ternary operator, we would be more compatible with C++ where the difference _does_ matter, because they have their own auto, and they have function overloading.
>
Integer promotion wouldn't hurt us if it promoted to something sensible. Promoting (ubyte, uint) to int is not sensible: it introduces a sign where there was none before and thereby destroys large uint values. I see no benefit in staying compatible with such strange behaviour of C.

The common type of (T, const T) should be const T for any T and not something arbitrary. (And int is arbitrary, because why not short or long? And don't invoke C compatibility here, because C *does* use short on 16-bit machines and long on some 64-bit machines!)

May 05

Re: Common type of ubyte and const ubyte is int

Posted by Don Allen
in reply to Jonathan M Davis


I think Jonathan, Steve and others have made a strong case for fixing this language anomaly. The terms "unpleasant surprise" and "no one would expect" are not what you want to read in descriptions of your language. We all expect to have a mental model of what the code we are writing does. Discovering the hard way that, despite reasonable diligence, our model was wrong tends to drive people away. This is the Principle of Least Surprise.

Have any of you written any PL-1? No? Here's an example of why (this may not be syntactically correct; I haven't written any PL-1 in over 50 years):
```
declare foo bit(1);

foo = 1;
```
What is the value of foo after this executes? The answer is 0.

Why? 1 is a decimal constant with precision (p) of 1. To do the assignment, it is first converted to a binary constant of length p+3: 0001. This is converted to the bit string '0001' of length 4, which is then assigned to foo, a bit string of length 1. Assignment of longer to shorter bit strings is done high order bit first.
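
The steps above can be simulated in C. This is a rough sketch of the conversion as described in this post, not a model of any real PL-1 compiler, and the helper names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Writes the low `width` bits of `value` into `out` as '0'/'1' characters,
   most significant bit first, and NUL-terminates the result.
   `out` must have room for width + 1 characters. */
void to_bit_string(unsigned value, size_t width, char *out)
{
    for (size_t i = 0; i < width; ++i) {
        size_t shift = width - 1 - i;
        out[i] = ((value >> shift) & 1u) ? '1' : '0';
    }
    out[width] = '\0';
}

/* Assigns a longer bit string to a shorter target by keeping the leftmost
   (high-order) `target_len` bits, as the post describes.
   Assumes target_len <= strlen(source). */
void assign_bit_string(const char *source, size_t target_len, char *target)
{
    memcpy(target, source, target_len);
    target[target_len] = '\0';
}

/* Follows the post's steps for `foo = 1` with foo declared bit(1):
   decimal precision p = 1, so the intermediate bit string has p + 3 = 4 bits. */
char pl1_assign_one_to_bit1(void)
{
    char bits[5];
    char foo[2];
    to_bit_string(1u, 4, bits);      /* "0001" */
    assign_bit_string(bits, 1, foo); /* keeps only the leading '0' */
    return foo[0];
}
```

`pl1_assign_one_to_bit1()` returns '0': the single 1 bit sits at the low-order end of "0001", and the assignment keeps only the high-order end.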

A lot of other issues contributed to PL-1's ultimate lack of success, but things like this certainly did not help.
