Van Rossum on Strong vs. Weak Typing

Am I right if I say that by your definition Ruby is strongly typed?

(just doing some brain gymnastics :slight_smile:)

···

On Sat, Feb 15, 2003 at 05:48:57AM +0900, Marcin ‘Qrczak’ Kowalczyk wrote:

Sat, 15 Feb 2003 04:18:57 +0900, Paul Brannan pbrannan@atdesk.com writes:

Type inferencing doesn’t mean the types aren’t fully checked at
compile time. Weak typing implies a bit of run time type checking,
which OCaml doesn’t do.

C is weakly typed, but it does not do run-time type checking.

For me “weakly typed” means that either there is no well-defined
and interesting concept of the type of each object, or it exists but
in practice it too often lies.


Running Debian GNU/Linux Sid (unstable)
batsman dot geo at yahoo dot com

The only “intuitive” interface is the nipple. After that, it’s all learned.
– Bruce Ediger, bediger@teal.csn.org, on X interfaces

Sat, 15 Feb 2003 08:20:52 +0900, Mauricio Fernández batsman.geo@yahoo.com writes:

For me “weakly typed” means that either there is no well-defined
and interesting concept of the type of each object, or it exists but
in practice it too often lies.

Am I right if I say that by your definition Ruby is strongly typed?

Sort of, among dynamically typed languages.

Static typing can lead to yet stronger typing - there you not only can
ask about the type of an object, but about the type of an expression
or about the type of argument a function expects.

These questions may require a complex language to formulate the answer.
Parametric polymorphism, bounded polymorphism, Haskell’s/Clean’s
typeclasses are tools which make describing these types statically
possible even for generic functions. The answer may look like “the type
of that parameter is list-of-(the type of the other parameter), where
the type of the other parameter is anything which has equality defined”
(this is the type of the second parameter of Haskell’s function ‘elem’
which checks existence of an element in a list).

···


__("< Marcin Kowalczyk
__/ qrczak@knm.org.pl
^^ http://qrnik.knm.org.pl/~qrczak/

C is weakly typed, but it does not do run-time type checking.

I dunno, C lives outside the set of languages that really care about
strong or weak typing.

C is portable machine-code.

It lets you define named constant offsets from index registers. These names
are called “struct members”.

:slight_smile:

Increasingly off-topic, but I’m curious about which three types you
consider C to have. Integer, pointer, float? Surely not string or
array, which are both pointers with as little syntax sugar as Ritchie
could get away with. But you might consider a pointer a glorified
integer.

···

On Friday, February 14, 2003, at 12:21 PM, Dan Sugalski wrote:

I dunno, C lives outside the set of languages that really care about
strong or weak typing. If you really need to find its place in the
dichotomy, it’s a strongly typed language that lets you lie to it.
(That it only really has three types is a separate issue, of course…)


Pictures of things as they used to be,
Don’t show me no more, please.

Integer, pointer, and float are it. I don’t consider integer buffers
with in-band markers as “strings”. Ick. I’m used to dealing with
split I/D space machines, so there’s a natural split between integers
and pointers.

Given that I’ve decoded IEEE floats using int-cast pointer tricks, I
could see arguing that C has only one real type, but that might be
going a little too far… :slight_smile:
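
For anyone curious what decoding a float by hand can look like, here is a minimal sketch. It uses memcpy rather than the int-cast pointer trick mentioned above, because the cast runs into aliasing rules in modern C. It assumes float is a 32-bit IEEE 754 value, which the C standard does not guarantee, and the helper names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the raw bit pattern of a float.
   Assumes float is a 32-bit IEEE 754 single-precision value. */
uint32_t float_bits(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);  /* copy the bytes, no pointer cast */
    return bits;
}

/* Split the pattern into its three fields (1 + 8 + 23 bits). */
unsigned float_sign(float f)     { return float_bits(f) >> 31; }
unsigned float_exponent(float f) { return (float_bits(f) >> 23) & 0xFFu; }
uint32_t float_fraction(float f) { return float_bits(f) & 0x7FFFFFu; }
```

Under those assumptions, float_exponent(1.0f) yields the biased exponent 127 and float_sign(-2.0f) yields 1.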

···

At 7:08 AM +0900 2/15/03, Chris Thomas wrote:

On Friday, February 14, 2003, at 12:21 PM, Dan Sugalski wrote:

I dunno, C lives outside the set of languages that really care
about strong or weak typing. If you really need to find its place
in the dichotomy, it’s a strongly typed language that lets you lie
to it. (That it only really has three types is a separate issue,
of course…)

Increasingly off-topic, but I’m curious about which three types you
consider C to have. Integer, pointer, float? Surely not string or
array, which are both pointers with as little syntax sugar as
Ritchie could get away with. But you might consider a pointer a
glorified integer.


Dan

--------------------------------------“it’s like this”-------------------
Dan Sugalski even samurai
dan@sidhe.org have teddy bears and even
teddy bears get drunk

I dunno, C lives outside the set of languages that really care about
strong or weak typing. If you really need to find its place in the
dichotomy, it’s a strongly typed language that lets you lie to it.
(That it only really has three types is a separate issue, of course…)

Increasingly off-topic, but I’m curious about which three types you
consider C to have. Integer, pointer, float? Surely not string or

If the integer is large enough you can fit the pointer inside and there’s
only two left :slight_smile:

···

On Sat, Feb 15, 2003 at 07:08:11AM +0900, Chris Thomas wrote:

On Friday, February 14, 2003, at 12:21 PM, Dan Sugalski wrote:

array, which are both pointers with as little syntax sugar as Ritchie
could get away with. But you might consider a pointer a glorified
integer.


Running Debian GNU/Linux Sid (unstable)
batsman dot geo at yahoo dot com

Being overloaded is the sign of a true Debian maintainer.
– JHM on #Debian

I would argue that the C language defines many types, some of which are
builtin types:

  • various integer types (char, short int, int, long int, long long int
    in C99, and signed and unsigned versions of each)
  • enumerated types (which in C are really just fancy integers)
  • various real types (float and double, plus long double in C99)
  • pointer types (data pointers and function pointers)
  • arrays (which can decay into pointers, but cannot be implicitly
    converted from a pointer)
  • structs and unions (which cannot be implicitly converted to or from
    any other type)
  • functions (which are different from function pointers, but can
    decay into function pointers)

C99 also defines some additional types:

  • a boolean type (_Bool)
  • a complex type (_Complex)
  • a wide character type (wchar_t)

Some of these types can be implicitly converted to and/or from other
types, but not all of them can.

All of these types have a specific set of operators that work with them;
trying to use an operator with the wrong type results in code that will
not compile:

  • the function call () operator works only with function pointers and
    functions
  • the bitwise operators work only with integer types
  • the unary * operator works only with pointers

Paul

···

On Sat, Feb 15, 2003 at 08:08:55AM +0900, Dan Sugalski wrote:

Integer, pointer, and float are it. I don’t consider integer buffers
with in-band markers as “strings”. Ick. I’m used to dealing with
split I/D space machines, so there’s a natural split between integers
and pointers.

I would argue that C doesn’t define any types, but merely makes it
convenient to reference memory of various lengths, and call instructions
on them.

Consider that a char is just 1 byte, and while there’s pretty syntax:

char a = 'A';

there’s no difference between that and

char a = 65;

because it looks identical to the compiler. Same for short/int/long,
and whether it’s signed or unsigned just tells the compiler which
instructions to use. Structs just allow convenient access to larger
allocations of memory, and unions just let you look at the same piece of
memory in multiple ways. Floats are just a different size, and a
different set of instructions. Pointers are just a bit of memory with
some nice syntax for referring to other bits of memory based on them.

This isn’t just missing the forest for the trees, like saying “well,
any code is eventually machine code and data”. C’s mask over the
numbers is very thin (which is usually useful).

But they aren’t really lots of physically different data types. After
it’s been compiled, it’s all just a sea of memory, and before it’s
compiled, it’s just a bunch of numbers. (This is from a perspective of
the language… you can’t inspect your code and data after compilation.
Ruby and other languages give you more than a sea of memory at runtime.)

Bringing this back on topic, C is statically typed, which gives the
compiler the ability to do some nice optimizations, but results in
weaker typing, too (at least by matz’s definition).

···

On Sat, 15 Feb 2003 12:21:55 +0900 Paul Brannan pbrannan@atdesk.com wrote:

On Sat, Feb 15, 2003 at 08:08:55AM +0900, Dan Sugalski wrote:

Integer, pointer, and float are it. I don’t consider integer buffers

with in-band markers as “strings”. Ick. I’m used to dealing with
split I/D space machines, so there’s a natural split between
integers and pointers.

I would argue that the C language defines many types, some of which
are builtin types:

Ryan Pavlik rpav@users.sf.net

“I have tasted sweet, sweet invincibility.
It tastes like varnish.” - 8BT

Consider that a char is just 1 byte, and while there’s pretty syntax:

char a = 'A';

there’s no difference between that and

char a = 65;

Both ‘A’ and 65 are literals with type int. You haven’t shown that C is
typeless; only that ‘A’ and 65 have the same type, and that this type
can be implicitly converted to a char.
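
A quick way to check this is to compare sizes. The sketch below is C only (in C++ a character constant has type char), and the helper names are hypothetical.

```c
#include <stddef.h>

/* In C a character constant such as 'A' has type int, so its size
   is sizeof(int), not sizeof(char). */
size_t size_of_char_constant(void) {
    return sizeof('A');
}

/* Once stored in a char object, the value occupies exactly one byte. */
size_t size_of_char_object(void) {
    char a = 'A';   /* implicit conversion from int to char */
    return sizeof a;
}
```

The first helper returns sizeof(int) and the second returns 1, so the constant and the object it initializes do have different types.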

Bringing this back on topic, C is statically typed, which gives the
compiler the ability to do some nice optimizations, but results in
weaker typing, too (at least by matz’s definition).

Static typing does not result in weak typing. C++, Java, and OCaml are
examples of languages that have strong static typing. C’s weak typing
is the result of being able to implicitly convert from almost any type
to almost any other type (or as matz says, the type “can be mixed and
changed very easily”).

Paul

···

On Sat, Feb 15, 2003 at 02:35:08PM +0900, Ryan Pavlik wrote:

Consider that a char is just 1 byte, and while there’s pretty
syntax:

char a = 'A';

there’s no difference between that and

char a = 65;

Both ‘A’ and 65 are literals with type int. You haven’t shown that C
is typeless; only that ‘A’ and 65 have the same type, and that this
type can be implicitly converted to a char.

I said C doesn’t really define types, but it just lets you address
memory of various sizes. I also said everything was just a number. :wink:
This is an example of that. A char is really just a byte-length piece
of memory; an int is word-length. It may be called ‘char’, but it’s
still just a number, a bit of memory. Nothing really makes it a special
type.

Bringing this back on topic, C is statically typed, which gives the
compiler the ability to do some nice optimizations, but results in
weaker typing, too (at least by matz’s definition).

Static typing does not result in weak typing. C++, Java, and OCaml
are examples of languages that have strong static typing. C’s weak
typing is the result of being able to implicitly convert from almost
any type to almost any other type (or as matz says, the type “can be
mixed and changed very easily”).

I am referring to C, not static typing in general. C’s typing
specifically weakens itself. The reason C’s types can be mixed and
changed is a result of what I described above: everything is just a
number, or a piece of memory of a given size. I won’t argue one way or
the other about what determines strong typing, or if casting makes
something weak. (I don’t believe it does… I would define ruby’s #to_X
calls as a type of casting, but we’d just be arguing definitions, and
that’s pointless.)

Of course, it does bring up an interesting point. If a language has
only a single type, is it static or dynamic, strong or weak? :wink:

···

On Sun, 16 Feb 2003 13:44:06 +0900 Paul Brannan pbrannan@atdesk.com wrote:

On Sat, Feb 15, 2003 at 02:35:08PM +0900, Ryan Pavlik wrote:


Ryan Pavlik rpav@users.sf.net

“Spinal hazards are hazardous…” - 8BT

Actually, I would say C is weakly typed because it is easy to accidentally
interpret data as the wrong type … especially in pre-ANSI C (without
prototypes).

For example …

One file contains …

extern int f();
int main () {
    f(3.1416);
}

In another file …

int f(x)
int x;
{

}

Unions are also problematic, although in that case you expect problems
so accidents are less likely.

···

On Sat, 2003-02-15 at 23:44, Paul Brannan wrote:

On Sat, Feb 15, 2003 at 02:35:08PM +0900, Ryan Pavlik wrote:

Consider that a char is just 1 byte, and while there’s pretty syntax:

char a = 'A';

there’s no difference between that and

char a = 65;

Both ‘A’ and 65 are literals with type int. You haven’t shown that C is
typeless; only that ‘A’ and 65 have the same type, and that this type
can be implicitly converted to a char.

Bringing this back on topic, C is statically typed, which gives the
compiler the ability to do some nice optimizations, but results in
weaker typing, too (at least by matz’s definition).

Static typing does not result in weak typing. C++, Java, and OCaml are
examples of languages that have strong static typing. C’s weak typing
is the result of being able to implicitly convert from almost any type
to almost any other type (or as matz says, the type “can be mixed and
changed very easily”).


– Jim Weirich jweirich@one.net http://w3.one.net/~jweirich

“Beware of bugs in the above code; I have only proved it correct,
not tried it.” – Donald Knuth (in a memo to Peter van Emde Boas)

Actually, I would say C is weakly typed because it is easy to
accidentally interpret data as the wrong type … especially in pre-ANSI
C (without prototypes).

I’m not sure… is it weakly typed just because you can lie to it? :wink:
C types are basically just a promise as to the size. If you break your
promise, it’s not really the compiler’s fault. However, I don’t think
that’s the case here:

For example …

One file contains …

extern int f();

IIRC, in K&R C (and ANSI too), () makes a default int parameter. This
also happens to be the result of no prototype.

int main () {
    f(3.1416);
}

This should just be demoted to int by the compiler.

Unions are also problematic, although in that case you expect problems
so accidents are less likely.

Well, if we’re going by matz’s “interchangeable” definition, C is only
“sort-of” weakly typed, because what happens is fully determined at
compile time. You can do weird things which can prove useful:

struct A {
   int a, b, c, d;
};

int main() {
   struct A  a;
   int      *ptr = (int*)&a;
   int       i;
   
   for(i = 0; i < 4; i++) *(ptr + i) = i;
}

…but it’s arguable whether interchanging and lying to the compiler
are the same thing. :wink: For instance, a fairly black-and-white case of
interchangeable types is numbers and strings in Perl or PHP. They can
(mostly) be interchanged. Odd things sometimes happen at boundary
conditions (like 0), but “123” + 456 results in 579. Telling the
gullible compiler that my &a is an int* works, but it’s me treating
one thing in two ways, not the compiler treating two things in one
way.

I say they’re all just numbers, so there’s nothing to do differently
anyway… which brought up the one-type question in another reply.

···

On Sun, 16 Feb 2003 15:14:02 +0900 Jim Weirich jweirich@one.net wrote:


Ryan Pavlik rpav@users.sf.net

“Spinal hazards are hazardous…” - 8BT

Actually, no. K&R C will just pass whatever data you give it to the
function without regard to the type of the arguments. Indeed, there was
no way to declare the argument types of an external function in K&R C.
That’s why you can’t pass float arguments to functions. Even if you
declared an argument to take a float rather than a double, C would
silently promote all float arguments to doubles at both the calling site
and the function definition.
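
The float-to-double promotion can still be observed in standard C through a variadic function, since arguments passed through "..." go through the same default argument promotions as arguments to a function without a prototype. A minimal sketch, where first_vararg is a hypothetical name chosen for the example:

```c
#include <stdarg.h>

/* Hypothetical helper: reads the first variadic argument as a double.
   count only serves as the named parameter that va_start requires. */
double first_vararg(int count, ...) {
    va_list ap;
    double value;

    va_start(ap, count);
    value = va_arg(ap, double);  /* a float argument arrives here as a double */
    va_end(ap);
    return value;
}
```

Calling first_vararg(1, 1.5f) hands the function a double even though the caller wrote a float. Asking va_arg for a float instead would be undefined behavior for the same reason.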

In ANSI C, a function without a prototype is automatically given a
“Miranda”[0] prototype that matches the arguments of its first use.
Usages after the first use will use the Miranda prototype.

That’s one of the changes that C++ introduced to the C language for more
type safety … required prototypes for all functions.

···

On Sun, 2003-02-16 at 01:37, Ryan Pavlik wrote:

IIRC, in K&R C (and ANSI too), () makes a default int parameter. This
also happens to be the result of no prototype


– Jim Weirich jweirich@one.net http://w3.one.net/~jweirich

“Beware of bugs in the above code; I have only proved it correct,
not tried it.” – Donald Knuth (in a memo to Peter van Emde Boas)

[0] Miranda Prototypes. So named for the US Miranda rights read to
criminals upon arrest … paraphrased in part “… if you do not have an
attorney, one will be appointed to you …”. Substitute “prototype”
for “attorney”.