java sucks. © 1997-2000 Jamie Zawinski <jwz@jwz.org>
I think Java is the best language going today, which is to say, it’s the marginally acceptable one among the set of complete bagbiting loser languages that we have to work with out here in the real world. Java is far, far more pleasant to work with than C or C++ or Perl or Tcl/Tk or even Emacs-Lisp. When I first started using Java, it felt like an old friend: like finally I was back using a real object system, before the blights of C (the PDP-11 assembler that thinks it’s a language) and C++ (the PDP-11 assembler that thinks it’s an object system) took over the world.
However, as I settled in, I found a lot of things about Java that irritate me. As this happened, I wrote them down. The following document was mostly written while I was learning the language, during the design and development of Grendel back in 1997. Therefore, some of the following complaints might have been addressed in later versions of the language, or they might have been misunderstandings on my part.
It’s too bad Sun has been working as hard as they can, in their typical Sun way, to destroy Java by holding on to it so closely that nobody else can actually improve it.
I’ve also merged in a few complaints from Dan Bornstein and Richard Mlynarik. Thanks guys!
About the Java name, and associated politics:
The fact is that there are four completely different things that go by the name ``Java’’:
- A language;
- An enormous class library;
- A virtual machine;
- A security model.
Sun would like you to believe that these are all the same thing, and that the name ``Java’’ implies all of them, but this is marketing fiction. Worse than that, the fact that Sun has tried so hard to push this idea has done grievous damage to the acceptance of Java.
- Java-the-language is, overall, a very good thing, and works well.
- Java-the-class-library is mostly passable.
- Java-the-virtual-machine is an interesting research project, a nice proof of concept, and is basically usable for a certain class of problems (those problems where speed isn’t all that important: basically, those tasks where you could get away with using Perl instead of C.)
- Java-the-security-model is another interesting research project, but it only barely works right now. In a few years, maybe they’ll have it figured out or replaced.
If Sun hadn’t tried so hard to conflate these four completely different things, if they had first shipped native-code Java compilers, then the VM, then the security model, then Java probably would have completely displaced C++ by now.
The whole ``write once run anywhere’’ idea (which is to say, the virtual machine) is a wonderful idea, and I wish it the best of luck. But it’s not true yet. It might be someday; in the meantime, I’d like to write programs in Java today, the way I can write programs in C today. So I have to recompile for every architecture on which I want to run. Sure, I wish I didn’t have to, but that’s exactly what I do today anyway; the only difference is that today I have to do it in C instead of Java.
Virtual machines are cool. Security models that allow network-distributed code are cool. Serialization and agent-like behavior are also cool.
But these are not what I’m most interested in. There are a lot of people who are most interested in those things, but me, I just want to write a program that will run on some suitable number of architectures. I’m happy distributing binaries for each architecture to do that. Sure, having one binary that ran on everything would be nice, but you know, it’s just not a hard requirement.
Today, I program in C.
I think C is a pretty crummy language. I would like to write the same kinds of programs in a better language.
First the good stuff:
- Java doesn’t have free().
I have to admit right off that, after that, all else is gravy. That one point makes me able to forgive just about anything else, no matter how egregious. Given this one point, everything else in this document fades nearly to insignificance.
But...
About the Java language itself:
(I’m separating my complaints about Java the language from my complaints about Java the class library, despite Sun’s repeated attempts to blur this important and fundamental distinction.)
- It’s hard to live with none of: lexically scoped local functions; a macro system; and inlined functions.
- I really hate the lack of downward-funargs; anonymous classes are a lame substitute. (I can live without long-lived closures, but I find the lack of function pointers a huge pain.)
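To make that concrete: passing the moral equivalent of a function pointer means writing something like this sketch (the StringFn interface is one I’m making up for illustration):

// The anonymous-class workaround for the missing function pointer.
// `StringFn' is an interface invented here just for illustration.
interface StringFn { void apply(String s); }

class FunargDemo {
    static void forEach(String[] items, StringFn fn) {
        for (int i = 0; i < items.length; i++) fn.apply(items[i]);
    }
    public static void main(String[] args) {
        // Five lines of ceremony to pass a one-line function:
        forEach(args, new StringFn() {
            public void apply(String s) { System.out.println(s); }
        });
    }
}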
- The fact that static methods aren’t really class methods (they’re actually global functions: you can’t override them in a subclass) is pretty dumb.
- It’s far from obvious how one hints that a method should be inlined, or otherwise go real fast. Does `final’ do it? Does `private final’ do it? Given that there is no preprocessor to let you do per-function shorthand, and no equivalent of Common Lisp’s flet (or even macrolet), one ends up either duplicating code, or allowing the code to be inefficient. Those are both bad choices.
- Two identical byte[] arrays aren’t equal and don’t hash the same. Maybe this is just a bug, but:
  - You can’t fix this by subclassing Hashtable.
  - You can’t fix this by subclassing Array, because it’s not really an object. What you can do is wrap an Object around an Array and let that implement hashCode and equals by digging around in its contained array, but that adds not-insignificant memory overhead (16 bytes per object, today.)
  - Gee, I know, I’ll write my own hash table. I’ve only done that a thousand times.
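Here, for the record, is a sketch of both the bug and the wrapper workaround (ByteKey is my name, not the library’s):

// Two identical byte[] arrays: equals() and hashCode() are identity-based.
// `ByteKey' is a made-up wrapper that pays those 16 bytes per object to fix it.
class ByteKey {
    byte[] bytes;
    ByteKey(byte[] b) { bytes = b; }
    public int hashCode() {
        int h = 0;
        for (int i = 0; i < bytes.length; i++) h = h * 31 + bytes[i];
        return h;
    }
    public boolean equals(Object o) {
        if (!(o instanceof ByteKey)) return false;
        byte[] other = ((ByteKey) o).bytes;
        if (other.length != bytes.length) return false;
        for (int i = 0; i < bytes.length; i++)
            if (other[i] != bytes[i]) return false;
        return true;
    }
    public static void main(String[] args) {
        byte[] a = { 1, 2, 3 };
        byte[] b = { 1, 2, 3 };
        System.out.println(a.equals(b));                            // false!
        System.out.println(new ByteKey(a).equals(new ByteKey(b)));  // true, at a price
    }
}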
- I can’t seem to manage to iterate the characters in a String without implicitly involving half a dozen method calls per character.
  - The other alternative is to convert the String to a byte[] first, and iterate the bytes, at the cost of creating lots of random garbage.
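Concretely, the two unappealing options look something like this sketch:

class IterateDemo {
    static void demo(String s) {
        // Option 1: a method call (or several, under the hood) per character.
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            // ... use c ...
        }
        // Option 2: copy the whole thing up front, creating garbage to collect.
        byte[] bytes = s.getBytes();
        for (int i = 0; i < bytes.length; i++) {
            byte b = bytes[i];
            // ... use b ...
        }
    }
}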
- Generally, I’m dissatisfied with the overhead added by Unicode support in those cases where I’m sure that there are no non-ASCII characters. There ought to be two subclasses of an abstract String class, one that holds Unicode, and one that holds 8-bit quantities. They should offer identical APIs and be indistinguishable, except for the fact that if a string has only 8-bit characters, it takes up half as much memory!
  - Of course, String being final eliminates even the option of implementing that.
- Interfaces seem a huge, cheesy copout for avoiding multiple inheritance; they really seem like they were grafted on as an afterthought. Maybe there’s a good reason for them being the way they are, but I don’t see it; it looks like they were just looking for a way to multiply-inherit methods without allowing call-next-method and without allowing instance variables?
- There’s something kind of screwy going on with type promotion that I don’t totally understand yet, but that I’m pretty sure I don’t like. This gets a compiler error about type conflicts:
abstract class List {
    abstract List next();
}

class foo extends List {
    foo n;
    foo next() { return n; }
}
I think that’s wrong, because every foo is-a List. The compiler seems to be using type-of rather than typep.
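The workaround, presumably, is to declare the override with the parent’s return type and make every caller cast, which is exactly the kind of noise a type system is supposed to eliminate:

abstract class List {
    abstract List next();
}

class foo extends List {
    foo n;
    List next() { return n; }   // declare the parent's type...
}
// ...and then cast at every single call site:
//     foo f2 = (foo) f.next();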
- This ``integers aren’t objects’’ nonsense really pisses me off. Why did they do that? Is the answer as lame as, ``we wanted the `int’ type to be 32 bits instead of 31’’? (You only really need one bit of type on the pointer if you don’t need small conses, after all.)
The way this bit me is, I’ve got code that currently takes an array of objects, and operates on them in various opaque ways (all it cares about is equality, they’re just cookies.) I was thinking of changing these objects to be shorts instead of objects, for compactness of their containing objects: they’d be indexes into a shared table, instead of pointers to shared objects.
To do this, I would have to rewrite that other code to know that they’re shorts instead of objects. Because one can’t assign a short to a variable or argument that expects an Object, and consequently, one can’t invoke the equal method on a short.
Wrapping them up in Short objects would kind of defeat the purpose: then they’d be bigger than the pointer to the original object rather than smaller.
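In other words (at least under the Java of this era), neither of these options is acceptable:

class CookieDemo {
    void demo() {
        short id = 42;
        // Object cookie = id;         // compile error: a short is not an Object
        Object boxed = new Short(id);  // legal, but the box costs more memory
                                       // than the original Object pointer did
    }
}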
- And in related news, it’s a total pain that one can’t iterate over the contents of an array without knowing intimate details about its contents: you have to know whether it’s byte[], or int[], or Object[]. I mean, it is not rocket science to have a language that can transparently access both boxed and unboxed storage. It’s not as if Java isn’t doing all the requisite runtime type checks already! It’s as if they went out of their way to make this not work...
Is there some philosophical point I’m missing? Is the notion of separating your algorithms from your data structures suddenly no longer a part of the so-called ``object oriented’’ pantheon?
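So you end up writing the same loop once per element type, as in this sketch:

class SumDemo {
    // One copy of the identical loop per array type, because byte[],
    // int[], and Object[] share no supertype you can iterate through:
    static int sum(byte[] a) {
        int s = 0;
        for (int i = 0; i < a.length; i++) s += a[i];
        return s;
    }
    static int sum(int[] a) {
        int s = 0;
        for (int i = 0; i < a.length; i++) s += a[i];
        return s;
    }
}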
- After all this time, people still think that integer overflow is better than degrading to bignums, or raising an exception?
Of course, they have Bignums now (ha!) All you have to do (ha!) is rewrite your code to look like this:
result = x.add(y.multiply(BigInteger.valueOf(7))).pow(3).abs().setBit(27);
Note that some parameters must be BigIntegers, some must be ints, and some must be longs, with little rhyme or reason. (This complaint is in the ``language’’ section and not the ``library’’ section because this shit should be part of the language, i.e., at the syntax level.)
- I miss typedef. If I have integers that represent something, I can’t make type assertions about them except that they are ints. Unless I’m willing to swaddle them in blankets by wrapping Integer objects around them.
- Similarly, I think the available idioms for simulating enum and :keywords are fairly lame. (There’s no way for the compiler to issue that life-saving warning, ``enumeration value `x’ not handled in switch’’, for example.)
They go to the trouble of building a single two-element enumerated type into the language (Boolean) but won’t give us a way to define our own?
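The standard idiom, such as it is, is a pile of int constants; the compiler sees nothing but ints, so it can’t possibly warn you about the case you forgot. A sketch:

class Color {
    static final int RED = 0, GREEN = 1, BLUE = 2;
}

class PaintDemo {
    static String name(int color) {
        switch (color) {
            case Color.RED:   return "red";
            case Color.GREEN: return "green";
            // Forgot BLUE?  No warning; any int at all is a "valid" Color.
            default:          return "???";
        }
    }
}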
- As far as I can see, there’s no efficient way to implement `assert’ or `#ifdef DEBUG’. Java gets half a point for this by promising that if you have a static final boolean, then conditionals that use it will get optimized away if appropriate. This means you can do things like
if (randomGlobalObject.DEBUG) { assert(whatever, "whatever!"); }
but that’s so gratuitously verbose that it makes my teeth hurt. (See also, lack of any kind of macro system.)
- By having `new’ be the only possible interface to allocation, and by having no back door through which you can escape from the type safety prison, there is a whole class of ancient, well-known optimizations that one just cannot perform. If something isn’t done about this, the language is never going to be fast enough for some tasks, no matter how good the JITs get. And ``write once run anywhere’’ will continue to be the marketing fantasy that it is today.
- I sure miss multi-dispatch. (That is, the CLOS notion of doing method lookup based on the types of all of the arguments, rather than just on the type of the implicit zeroth argument.)
- The finalization system is lame. Worse than merely being lame, they brag about how lame it is! To paraphrase the docs: ``Your object will only be finalized once, even if it’s resurrected in finalization! Isn’t that grand?!’’ Post-mortem finalization was figured out years ago and works well. Too bad Sun doesn’t know that.
- Relatedly, there are no ``weak pointers.’’ Without weak pointers and a working finalization system, you can’t implement a decent caching mechanism for, e.g., a communication framework that maintains proxies to objects on other machines, and likewise keeps track of other machines’ references to your objects.
- You can’t close over anything but final variables in an inner class! Their rationale is that it might be ``confusing.’’ Of course you can get the effect you want by manually wrapping your variables inside of one-element arrays. The very first time I tried using inner classes, I got bitten by this: I naively attempted to modify a closed-over variable, the compiler complained at me, and so I did the one-element array thing. The only other time I’ve used inner classes, I needed the same functionality again; I started writing it the obvious way, and let out a huge sigh of frustration when, halfway through, I realized what I had done, and manually walked back through the code turning my
    Object foo = <whatever>;
into
    final Object[] foo = { <whatever> };
and all the occurrences of foo into foo[0]. Arrrgh!
- The access model with respect to the mutability (or read-only-ness) of objects blows. Here’s an example:
System.in, out and err (the stdio streams) are all final variables. They didn’t used to be, but some clever applet-writer realized that you could change them and start intercepting all output and do all sorts of nasty stuff. So, the whip-smart folks at Sun went and made them final. But hey! Sometimes it’s okay to change them! So, they also added System.setIn, setOut, and setErr methods to change them!
``Change a final variable?!’’ I hear you cry. Yep. They sneak in through native code and change finals now. You might think it’d give ’em pause to think and realize that other people might also want to have public read-only yet privately writable variables, but no.
Oh, but it gets even better: it turns out they didn’t really have to sneak in through native code anyway, at least as far as the JVM is concerned, since the JVM treats final variables as always writable by the class they’re defined in! There’s no special case for constructors: they’re just always writable. The javac compiler, on the other hand, pretends that they’re only assignable once: in static init code for static finals, or once per constructor for instance variables. It will also optimize access to finals, despite the fact that doing so is actually unsafe.
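For the record, the blessed way to reassign those unassignable finals looks something like this (a sketch; error handling omitted):

import java.io.FileOutputStream;
import java.io.PrintStream;

class RedirectDemo {
    public static void main(String[] args) throws java.io.IOException {
        // Reassigning the "final" System.out, through the back door:
        System.setOut(new PrintStream(new FileOutputStream("out.log")));
        System.out.println("this goes to out.log now");
    }
}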
- Something else related to this absurd lack of control over who can modify an object and who cannot is that there is no notion of constant space: constantry is all per-class, not per-object. If I’ve got a loop that does
String foo = "x";
it does what you’d expect, because the loader happens to have special-case magic that interns strings, but if I do:
String foo[] = { "x", "y" };
then guess what, it conses up a new array each time through the loop! Um, thanks, but don’t most people expect literal constants to be immutable? If I wanted to copy it, I would copy it. The language should also impose the contract that literal constants are immutable.
Even without the language having immutable objects, a non-losing compiler could eliminate the consing in some limited situations through static analysis, but I’m not holding my breath.
Using final on variables doesn’t do anything useful in this case; as far as I can tell, the only reason that final works on variables at all is to force you to specify it on variables that are closed over in inner classes.
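The workaround, of course, is to do the hoisting yourself, though nothing then stops anyone from scribbling on your ``constant’’; a sketch:

class HoistDemo {
    // Intern the array by hand, since the compiler won't:
    static final String[] FOO = { "x", "y" };

    void loop() {
        for (int i = 0; i < 1000; i++) {
            String[] foo = FOO;   // no consing; but FOO[0] = "z" is
                                  // still perfectly legal, alas
        }
    }
}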
- The locking model is broken.
  - First, they impose a full word of overhead on each and every object, just in case someone somewhere sometime wants to grab a lock on that object. What, you say that you know that nobody outside of your code will ever get a pointer to this object, and that you do your locking elsewhere, and you have a zillion of these objects so you’d like them to take up as little memory as possible? Sorry. You’re screwed.
  - Any piece of code can assert a lock on an object and then never un-lock it, causing deadlocks. This is a gaping security hole for denial-of-service attacks.
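The attack is a one-liner; nothing the class’s author can do will prevent it. A sketch:

class DenialOfService {
    static void wedge(Object victim) {
        synchronized (victim) {   // any code can take any object's lock...
            while (true) { }      // ...and simply never let go: every
        }                         // synchronized method on `victim'
    }                             // now blocks forever
}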
In any half-way-rational design, the lock associated with an object would be treated just like any other slot, and only methods statically ``belonging’’ to that class could frob it.
But then you get into the bug of Java not doing closures properly. See, you want to write a method:
public synchronized void with_this_locked (thunk f) { f.funcall (); }
but then actually writing any code becomes a disaster because of the mind-blowing worthlessness of inner classes.
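Here’s roughly what that looks like fleshed out, inner-class and final-array contortions on full display (a sketch; Thunk is my stand-in for a real funcallable type):

interface Thunk { void funcall(); }

class Counter {
    private int count = 0;

    public synchronized void withThisLocked(Thunk f) { f.funcall(); }

    void increment(int n) {
        // Captured locals must be final: the one-element-array trick again.
        final int[] by = { n };
        withThisLocked(new Thunk() {
            public void funcall() { count += by[0]; }
        });
    }
}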
- There is no way to signal without throwing: that is, there is no way to signal an exceptional condition and have some condition handler tell you ``go ahead and proceed anyway.’’ By the time the condition handler is run, the excepting scope has already been exited.
- The distinction between slots and methods is stupid. Doing foo.x should be defined to be equivalent to foo.x(), with lexical magic for ``foo.x = ...’’ assignment. Compilers should be able to trivially inline zero-argument accessor methods down to object+offset loads. That way programmers wouldn’t break every single one of their callers when they change the internal implementation of something from a plain ``slot’’ into something with slightly more complicated behavior.
- The notion of methods ``belonging’’ to classes is lame. Anybody, anytime, should be allowed to define new, non-conflicting methods on any class (without overriding existing methods). This causes no abstraction breakage, since code which cares couldn’t, by definition, be calling the new, ``externally-defined’’ methods.
This is just another way of saying that the pseudo-Smalltalk object model loses and that generic functions (suitably constrained by the no-external-overrides rule) win.
About the Java class library:
- It comes with hash tables, but not qsort? Thanks!
- String has length+24 bytes of overhead over byte[]:
class String implements java.io.Serializable {
    private char value[];  // 4 bytes, plus 12 bytes of array header
    private int offset;    // 4 bytes
    private int count;     // 4 bytes
}
- The only reason for this overhead is so that String.substring() can return strings which share the same value array. Doing this at the cost of adding 8 bytes to each and every String object is not a net savings...
- If you have a huge string, pull out a substring() of it, hold on to the substring, and allow the longer string to become garbage (in other words, the substring has a longer lifetime), the underlying bytes of the huge string never go away.
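If you know about this you can defend against it, since (at least in the implementations I’ve looked at) the String copy constructor trims the oversized shared array; but you have to know to do it:

class TrimDemo {
    static String firstTen(String huge) {
        String small = huge.substring(0, 10); // shares huge's char array
        return new String(small);             // copies just the 10 chars, so
    }                                         // huge's megabytes can be collected
}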
- The file manipulation primitives are inadequate; for example, there’s no way to ask questions like ``is the file system case-insensitive?’’ or ``what is the maximum file name length?’’ or ``is it required that file extensions be exactly three characters long?’’ Which could be worked around, but for:
- The architecture-interrogation primitives are inadequate; there is no robust way to ask ``am I running on Windows?’’ or ``am I running on Unix?’’
- There is no way to access link() on Unix, which is the only reliable way to implement file locking.
- There is no way to do ftruncate(), except by copying and renaming the whole file.
- Is "%10s %03d" really too much to ask? Yeah, I know there are packages out on the net trying to reproduce every arcane nuance of printf(), but controlling field width and padding seems pretty darned basic to me.
- A RandomAccessFile cannot be used as a FileInputStream. More specifically, there is no class or interface which those two classes have in common. So, despite the fact that both implement read() and a slew of other like-functioning methods, there is no way to write a method which works on streams of either type.
Identical lossage exists for the pairing of RandomAccessFile and FileOutputStream. WHAT WERE THEY THINKING?
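The fix, as usual, is to write the missing abstraction yourself; a sketch (the ByteSource interface and both adapter classes are mine, not the library’s):

import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;

// The common interface the library forgot to provide:
interface ByteSource { int read() throws IOException; }

class StreamSource implements ByteSource {
    InputStream in;
    StreamSource(InputStream in) { this.in = in; }
    public int read() throws IOException { return in.read(); }
}

class RandomSource implements ByteSource {
    RandomAccessFile f;
    RandomSource(RandomAccessFile f) { this.f = f; }
    public int read() throws IOException { return f.read(); }
}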
- markSupported is stupid.
- What in the world is the difference between System and Runtime? The division seems completely random and arbitrary to me.
- What in the world is application-level crap like checkPrintJobAccess() doing in the base language class library? There’s all kinds of special-case abstraction-breaking garbage like this.
Stay tuned, I’m sure I’ll have found something new to hate by tomorrow.
(Well, that’s how this document originally ended. But it’s not true, because I’m back to hacking in C, since it’s still the only way to ship portable programs.)