What if null was an Object in Java? (donraab.medium.com)
77 points by ingve 15 days ago | 160 comments



The problem isn't really that null isn't an object, it's that the type system allows any object reference of a particular class to also be null.

After using Typescript for the past couple years, it's such a joy when you define a variable of type Foo and know it won't contain null. Granted, TS also has to deal with undefined vs null weirdness (and even more weirdness in that an object not containing a property is subtly different from that object property being undefined), but in general the support for "optional" typing works very well.

Java tried to add things like the Optional class, but without first-class language support it's just a mess and you have developers doing crazy things, like having a method return an Optional that will sometimes also return null, by design (yes I've actually seen this).
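
To make the contrast concrete, here is a minimal Java sketch of that anti-pattern next to the intended contract (class and method names are invented):

  import java.util.Optional;

  class UserRepository {
      // The anti-pattern described above: the Optional itself can be null,
      // so callers have to null-check the wrapper and then also unwrap it.
      Optional<String> findNickname(int userId) {
          if (userId < 0) {
              return null; // legal, but defeats the whole point of Optional
          }
          return Optional.empty();
      }

      // The intended contract: never return null; absence is always Optional.empty().
      Optional<String> findNicknameSafely(int userId) {
          return userId < 0 ? Optional.empty() : Optional.of("user" + userId);
      }
  }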


I'm writing some PowerShell at the moment. After a few weeks of work I realized that the return types of some functions would sometimes change even though I explicitly declare them (for example a 'hashtable' would become a 'System.Object[]', which is short for 'table full of garbage').

Turns out some operations without assignment and some printed messages would go into the pipeline (as in, to be piped with `|`) and become part of the return value of the functions; to make that work PowerShell changes the return type, appends the message or the unassigned result to the hashtable, and returns that as an object array.

To this day I'm not aware of any other language doing this kind of thing, but maybe JS does? Anyway, I fear I'm going to get some grey hairs because of that.


> having a method return an Optional that will sometimes also return null

Wow that's terrible. When Optional landed in Java (8?) I came to the conclusion that the point was to use it as a contract from the developer of a method: "I return an Optional in this method so I guarantee it won't ever be null". If this is not respected, there's absolutely no reason to use Optional.


Totally agree, which is why I was baffled when I saw this code. But it goes to show that if something isn't enforced at the compiler level, "clever" developers will find a way around it.


> it's such a joy when you define a variable of type Foo and know it won't contain null

Why is that? I've never seen a good explanation of this point of view.

This all feels like just syntactic sugar to me. Sure you might not have null, you might have an empty object or any other variation. At the end of the day, if a value was expected to be available but is not available (for whatever reason), reliable code is going to have to deal with that absence. No matter how it is expressed.


The point is that every function is explicitly telling you if it can or can not return null. Your editor and typechecker will tell you if you need to handle null after calling that function.

If you have a non-null returning function that you need to change so it can return null, then your typechecker will tell you all the places where you now need to handle the new null return.

It doesn't need to be a very large codebase before this becomes a very useful tool to help when refactoring.


Seeing all this discussion reinforces the idea that no new programming language should have this "every type implicitly contains null" thing


Tony Hoare calls it (null) his Billion Dollar Mistake.

https://www.infoq.com/presentations/Null-References-The-Bill...


We use None in Python - is that similar?


Yes, and no dynamically typed language can do much better than this, because in those your value of type X can always be another type (and you need to defensively check if you want to catch errors)


Also, no new language should default to mutable variables, fields, and types when it is supposed to play in the same field as Java, e.g. large enterprise services. And for both problems, languages should be consistent, because right now they are a mess (there is an example in the article regarding nulls in collections, and for mutability a great example is collect(Collectors.toList()) vs toList()).
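
To make that collect(Collectors.toList()) vs toList() inconsistency concrete, a small sketch of what happens on current JDKs (Java 16+; the mutability of Collectors.toList()'s result is an implementation detail, not a guarantee):

  import java.util.List;
  import java.util.stream.Collectors;
  import java.util.stream.Stream;

  class ListMutability {
      public static void main(String[] args) {
          List<Integer> a = Stream.of(1, 2, 3).collect(Collectors.toList());
          a.add(4); // works today: currently an ArrayList, though the spec doesn't promise mutability

          List<Integer> b = Stream.of(1, 2, 3).toList();
          b.add(4); // throws UnsupportedOperationException: Stream.toList() is unmodifiable
      }
  }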


Yes, sum types and generics solve this nicely.

It's a shame Java was designed before they had generics, so the core of the language can't take advantage of them.


It’s still possible to do something like TypeScript, where everything is like generics in Java. But Java tries to be consistent, while TypeScript purposefully isn’t. There are a ton of features in TypeScript which are awesome, but they break in some rare cases. Java won’t allow that.


Kotlin makes nullability part of the typesystem. It's a good reference for what would happen if Java did this better because if you are using Java you can just switch to Kotlin and experience this in your own code base (you can mix Java and Kotlin code easily).

You can actually write extension functions for nullable types, and calling them on a null value does not cause a NullPointerException. This also works with Java classes.

For example, there's a CharSequence?.isNullOrBlank(): Boolean function in the Kotlin standard library that allows you to implement a null or blank check on strings. There's a similar isNullOrEmpty function on Lists (and yes, this does work for nullable generic types too).

There are many good reasons to switch from Java to Kotlin. But this is probably one of the bigger ones and genuinely low hanging fruit. Made me feel stupid after years of dealing with Java NPEs routinely and writing a lot of very verbose and needlessly defensive Java code in attempts to avoid them. All that cruft goes away with a proper language. Other languages that do this too are available of course. Pick one. Any one that isn't Java. Dealing with this stuff in 2024 is stupid.


But null is still null, a very special non-reference signifying absence. And not some pretend-object that walks like a valid reference, swims like a valid reference, but somehow doesn't quite quack like one. The problems surfacing as an NPE don't magically go away by null-less approaches, they just bite you at different times, in different form. Sometimes sooner, which is good, but sometimes also later. Java made null worse by not taking nullability into the type system (these days nullability annotations do a tolerable job at pretending it was) and lacking scope functions that make it easy to check without the eyesore of flooding local scope with names. Kotlin is really awesome for not falling into the trap of chasing the pipe dream of a world without absence.


> CharSequence?.isNullOrBlank()

In java I can make my own static function

isNullOrBlank(String input)

Which works on nulls too, right?

It's annoying to have to reference another class - but otherwise it's not much less ergonomic.


You can, you can even call the kotlin version that way if it happens to be on the classpath, but it's truly horrible ergonomics if for every call you have to guess between class-ref-otherargs and ref-otherargs forms. It's one of those little cuts you grow numb to.


CharSequence is the interface that things like java.lang.String implement. So I understood that as an instance method you can call like "foo".isNullOrBlank() (in Kotlin syntax)


I think the nullable reference type system in c# is a perfectly workable compromise for languages that had nullability from the very beginning. Once a code base uses them fully, most null-related bugs are gone. And it’s far from an unwieldy bolt-on.


With a good IDE configured to break on nullability errors, @Nonnull and @Nullable can also be used to replace the C# system in practice (not the same, but close enough). Unfortunately, this requires coordination from different team members and external dependencies rather than standard language features.
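
For anyone who hasn't seen them, a minimal sketch using the JSR-305 spellings (javax.annotation); note that nothing here is checked by javac itself, only by whatever IDE or static analyzer you point at it:

  import javax.annotation.Nonnull;
  import javax.annotation.Nullable;

  class Greeter {
      // The analyzer flags callers that pass a possibly-null name,
      // and flags this method if it could return null.
      @Nonnull
      String greet(@Nonnull String name) {
          return "Hello, " + name;
      }

      // Callers get a warning if they dereference the result without a null check.
      @Nullable
      String findGreeting(@Nullable String key) {
          return key == null ? null : "Hello, " + key;
      }
  }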


It's funny because I really dislike the C# approach. It's not truly enforced - any code with nullable disabled can call public functions with `null`, even if on your side they were marked as non-null.


Yup, unfortunately for us there is a lot of C# code written without NRT. If all of your code utilizes NRT and you respect the warnings, you can nearly eliminate issues related to null. There are some paper cuts around generics, but generally it has been working quite well for me. Our entire code base has NRT enabled and the only place where null sneaks in is with EF Core missing includes. This is actually kinda nice because if you get a null-whatever exception, 99% of the time it's a missing include.


This is true, but you can fairly easily audit your solution for files/projects which have it disabled. And yeah, you need to verify the data at the boundaries of your trusted code - but that's the same situation as it's always been. Where NRT help is avoiding boilerplate null checks in the dense forest of your business logic.


Well my point is more that if they introduced an entirely new non-null reference type to the language then you could never make that mistake - it would be non-optional for those calling into your code to respect whether null can be passed in or not, based on the type you used. Since they instead made it an opt-in system, you can't rely on users of a library you make to actually turn it on; you have to accept that null _can_ be passed in even if you mark something as non-null on your side, which is dumb.


Yes, that is possible but you still get warnings and you should compile with warnings as errors anyway.


You only get warnings if you turn those on, you can use `<nullable>disable</nullable>` and then the language works exactly as it previously did. In fact, when you disable it you get warnings if you _use_ the nullable system and mark a reference as `?`.


The Nullable class is nice and I love to use it but it would be even better with an option type like in F#


There isn’t any big difference between T? and Option<T> semantically. Many code bases use an Option<T> in C# to indicate a 0-or-1 object result, but refactor those to T? instead once nullable is enabled. It combines better, e.g. Option<Option<T>> doesn’t need to be handled manually.


I'm going to be a bit pedantic here: There is a semantic difference between Option<Option<T>> and Option<T>. If I intend to retrieve a setting from a file, the former allows me to differentiate between a missing file or a missing setting, while the latter destroys that information. i.e.: There are 3 possible cases, while only 2 can be represented.

So T? doesn't compose, while Option<T> does, which I'd consider a big difference.

However, without a builtin option type, duck-typing, or a pleasant way of converting the types, Option<T> may become a hassle (especially when different dependencies ship their own). And as T? is shipped with the language this is probably why it is used when the composability is not required.

P.S.: In C# T? even neatly composes.

  return obj?.a?.b
is equivalent to

  if (obj != null && obj.a != null) {
    return obj.a.b
  } else return null;
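
To put the three-cases-vs-two point in Java terms, a sketch where the nesting carries the extra information (fileExists and lookup are invented stand-ins):

  import java.util.Optional;

  class Settings {
      // Outer Optional: was the file there at all?
      // Inner Optional: was the key present in the file?
      static Optional<Optional<String>> readSetting(String file, String key) {
          if (!fileExists(file)) {
              return Optional.empty();           // missing file
          }
          return Optional.of(lookup(file, key)); // file present; setting may still be missing
      }

      // stand-in helpers for the sketch
      static boolean fileExists(String file) { return false; }
      static Optional<String> lookup(String file, String key) { return Optional.empty(); }
  }

With T? (or plain null) both failure cases collapse into the same "no string" value.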


That sounds like you should be using Result<T, E> to handle the two E cases you are describing. Success with Ok(T), with Err(Missing File) plus Err(Missing Setting).


Probably. A case I see more often is a dictionary with nullable values. A T? return can't distinguish a missing key from a key that is present with a null value.

Which leads to a bit of an argument - nesting gives poor ergonomics, while collapsing impacts the ability to write generic code.
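
Java's Map shows exactly this ambiguity; a tiny sketch:

  import java.util.HashMap;
  import java.util.Map;

  class MapNullDemo {
      public static void main(String[] args) {
          Map<String, String> m = new HashMap<>();
          m.put("present-but-null", null);

          System.out.println(m.get("missing"));                  // null
          System.out.println(m.get("present-but-null"));         // also null, indistinguishable
          System.out.println(m.containsKey("missing"));          // false
          System.out.println(m.containsKey("present-but-null")); // true, needs a second lookup
      }
  }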


Which you can't do in C# either without sum types :)


> There is a semantic difference between Option<Option<T>> and Option<T>. If I intend to retrieve a setting from a file, the former allows me to differentiate between a missing file or a missing setting

Does nesting Options really have practical use, or does it quickly become confusing?

In your example, the Option<Option<T>> return type doesn't tell me by itself that it differentiates between a missing file and a missing setting. I would need to get that information from somewhere else.


Sometimes you'll have nested Options just because you're mapping a fallible operation over a fallible input. You don't want the resulting `Option<Option<T>>` to immediately collapse; then you wouldn't know which upstream operation failed. It's true that `Option<Option<T>>` is very generic (i.e. it doesn't inherently tell you what each None means), but flattening Options removes more information; it isn't a solution to the problem you're posing. At least you can post-process an `Option<Option<T>>` into a multi-variant, self-documenting result type before you pass the value off to some other consumer.

In other words, nested optionals might not be very readable, but they're a necessary product of having a highly modular, reusable toolkit, and you can always massage them into more informative, domain-specific types at whatever point that becomes appropriate.


The biggest impact is on generic code - you don't want supplying Foo? as T rather than Foo to mean that there are side effects where successful returns and error cases are both represented by null.


In some cases (limited) nesting really seems useful. Nullable parameters for a copyWith method is another one (does a null value mean 'no change' or 'set it to null').

The thing is that nullable covers majority of cases and is quicker to read and write (Foo? vs Option<Foo>). Doing a?.b?.c is also more elegant than anything equivalent using an Option type.

Are there any languages that successfully combine both nullable '?' syntax and an Option type?


The Haskell equivalent `c <$> b <$> a` is roughly as concise.


Yeah - this is something I noticed coming to Swift after spending a bunch of time in rust. Swift has T? syntax, and rust has Option<T>. The Swift syntax felt much more “lived in” - like, nullables were much easier to use and I found myself using them more.

Rust has dozens of helper methods for option - like map, map_or, map_or_else, and so on. When you’re starting out, it’s quite hard to find what you want in the morass of options. And many of the helper functions take a closure, which messes up your ability to do control flow (return, break, continue) from the containing function. In Swift, like typescript, I found the syntax sugar around options to be much easier to learn. Eg obj?.a?.b rather than checks docs obj.and_then(|o| o.a).and_then(|a| a.b).

“If let” in rust helps. (Similar to guard let in Swift) Ie, you can write if let Some(x) = x { … }. But you still can’t combine that with other conditions in the if statement - which drives me nuts.

https://doc.rust-lang.org/std/option/enum.Option.html


I think Option<T> is an overrated construct when it's not a monad. The whole value is being able to write a chain of operations

    myobj.dothing(x).otherthing(y)
Without having to check the intermediate results. When it's just a tagged union you end up having to check for null everywhere anyway. It's better than "untyped" null because it can't pop up anywhere but it's not super ergonomic. I think Nullable where the Option is implicit does it better.
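
For comparison, Java's Optional does give you that monadic chaining via map/flatMap; a small sketch with names adapted from the pseudocode above:

  import java.util.Optional;

  class Chain {
      record Thing(String value) {
          Optional<Thing> doThing(String x)    { return Optional.of(new Thing(value + x)); }
          Optional<Thing> otherThing(String y) { return Optional.of(new Thing(value + y)); }
      }

      public static void main(String[] args) {
          // flatMap short-circuits on empty, so there are no intermediate presence checks.
          Optional<String> result = Optional.of(new Thing("a"))
                  .flatMap(t -> t.doThing("x"))
                  .flatMap(t -> t.otherThing("y"))
                  .map(Thing::value);
          System.out.println(result.orElse("nothing")); // prints "axy"
      }
  }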


One important distinction I didn't get at first is that Nullable<T> is a struct (not a class). It is only for value types. Nullable reference types are implemented via attributes.


Why stop at null, when you can have both null and undefined? Throw in unknown, and you've got a hat trick, a holy trinity of nothingness!

Of course the Rumsfeld Matrix further breaks down the three different types of unknowns.

https://en.wikipedia.org/wiki/There_are_unknown_unknowns

>"Reports that say that something hasn't happened are always interesting to me, because as we know, there are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns—the ones we don't know we don't know. And if one looks throughout the history of our country and other free countries, it is the latter category that tends to be the difficult ones." -Donald Rumsfeld

1) Known knowns: These are the things we know that we know. They represent the clear, confirmed knowledge that can be easily communicated and utilized in decision-making.

2) Known unknowns: These are the things we know we do not know. This category acknowledges the presence of uncertainties or gaps in our knowledge that are recognized and can be specifically identified.

3) Unknown unknowns: These are the things we do not know we do not know. This category represents unforeseen challenges and surprises, indicating a deeper level of ignorance where we are unaware of our lack of knowledge.

And Microsoft COM hinges on the IUnknown interface.

https://en.wikipedia.org/wiki/Tony_Hoare#Research_and_career

>Speaking at a software conference in 2009, Tony Hoare apologized for inventing the null reference:

>"I call it my billion-dollar mistake. It was the invention of the null reference in 1965. At that time, I was designing the first comprehensive type system for references in an object oriented language (ALGOL W). My goal was to ensure that all use of references should be absolutely safe, with checking performed automatically by the compiler. But I couldn't resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years." -Tony Hoare

https://news.ycombinator.com/item?id=19568378

>"My favorite is always the Billion-Dollar Mistake of having null in the language. And since JavaScript has both null and undefined, it's the Two-Billion-Dollar Mistake." -Anders Hejlsberg

>"It is by far the most problematic part of language design. And it's a single value that -- ha ha ha ha -- that if only that wasn't there, imagine all the problems we wouldn't have, right? If type systems were designed that way. And some type systems are, and some type systems are getting there, but boy, trying to retrofit that on top of a type system that has null in the first place is quite an undertaking." -Anders Hejlsberg


I think there is a way to migrate to a union type myType | null instead.


Union types are pretty ugly.

Eg they can't distinguish between one or two layers of nullable. Or between null and an optional null.

(In eg Rust syntax between `Option<Option<MyType>>` vs `Option<MyType>`. Or `()` vs `Option<()>`, where `()` would be how you spell `null` in Rust.)

Using tags is simpler and cleaner.


Ruby has a `NilClass` and the best/worst part of it is the to_s is "", to_i is 0, to_f is 0.0, to_a is [], to_h is {}

It's incredibly clean and convenient until you wake up one morning and have no idea what is happening in your code


I don't think there is anything wrong with that once you think about what the null element (or identity) is in a group represented by a set of elements and a function:

Integer, + => 0

Float, + => 0.0

Array, add => []

Hash, merge => {}

and so on.

I think maybe we can debate the operations/functions, but they make sense. For Integer in some ways you can define almost all other operations that you commonly use based on the addition.

So while nil is an object, when trying to find its representation in another group I find the result logical or expected.

Also Ruby will not automatically try to coerce nil when not asked to do so

like for example 0 + nil will throw an error.


Integers support both addition and multiplication, as well as taking maximums and minimums, and a few other semigroup operations. Do you want to define different null elements for all of them?


No, I don't want to define a representation for Null for all possible combinations of a set and a function/operation. That can be done by each developer if they see it fit, and only for the operations they want to have this for.

But, for me, it makes sense to have a default representation for Null (one that is not automatically coerced, only used if the developer explicitly asks for it) for one of the most common operations in that specific group.


Of course since it's Ruby you can just monkey patch those to_s methods to do whatever the hell you want, confounding anyone else working on your codebase.

I love using Ruby when I'm the only one who will ever have to look at or touch it.


> the to_s is "", to_i is 0, to_f is 0.0, to_a is [], to_h is {}

I somehow can't help reading that as some sort of high school sports-cheer: "Gimme an S to the quote to the I to the oh to the F to the zero! Goooo Rubies!"


Those conversions make sense though. They all mean empty or null. It's what I would expect from a language like Ruby.


Integer 0 means empty with respect to a particular operation (addition) but is not empty with respect to all operations (ex. multiplication)


Indeed. One's greatest strength is also their greatest weakness.


It'd be fairly easy to modify Java so that the primitives, including null, behave more like objects. And certainly, Java has inched that direction.

It doesn't solve anything fundamental though. You could make null.toString() return "null", but in most cases that will just be a bug. You're missing a null check or failed to initialize something.


We already have None in e.g. Python, but that merely means that "x.method_name()" raises an AttributeError instead of throwing a NullPointerException, because None has no method "method_name". Okay? Not really meaningfully different.


Python isn't exactly statically typed.


If Null is not a subtype of MyType, then you wouldn't be able to assign null to a variable declared as MyType, without breaking the rest of the rules of Java. I don't really see how this could work, even theoretically.


There's no reason that something can't be both an object and the bottom of the type hierarchy.

It's, technically, an instance of multiple inheritance. Java doesn't generally allow this, but there's tons of special cases in the compiler for things that you can't do yourself. For example, defining operators is done in the compiler, but you can't define operators for your own classes.


It's the other way around: null needs to be a subtype of MyType, not a supertype: anywhere I can pass a MyType I should be able to pass a null.

You could get most of the way to the blog post's outcomes by simply adding a few methods to Object (is(Not)Null, primarily) and having null be a magic (because it's all classes and none) instance that will NPE on all but the few defined Object methods, I think, but none of that answers the burning question of _why_ that I feel the article doesn't really address.


Correct. In type theory, a type that is a subtype of all other types is called a bottom type: https://en.wikipedia.org/wiki/Bottom_type


It would be a bottom-type of reference-types. This wouldn’t work for value-types like int, at least not without boxing, which would be very painful.


But an int can't be null. If a method takes or returns an int it's guaranteed to be not null.


…yes; that’s my point.


Fair, but those are kind of off on their own in the Java hierarchy as is.


I don't actually think it solves any problems. You still have to null check.


Null should be valid.

Kotlin solved Java's problem by making it a compiler error to use a value that can be null without first checking whether it is null or an actual value, eliminating an entire class of exceptions.


I’m not familiar enough with kotlin to comment fully but from your description the checker framework [0] appears to do the same thing in Java.

I confess I’m not fond of checker framework. I find the error messages can be obtuse but it is very effective.

0 - https://checkerframework.org/


Kotlin supports nullability on the level of the type system, it is similar to TypeScript in this respect.

The problem with nullability annotations in Java is that they are not enforced, and there is no commonly adopted standard. There are like ten competing libraries with similar annotations. There was JSR 305 ("Annotations for Software Defect Detection"), but it has been dormant. When you import a third party library, you never know what kind of nullability annotations it uses and if it uses them at all.



null would just mean the zero value instead of the absence of a value

  String foo = null;
  String bar = "";
  foo.equals(bar) --> true

This works well provided the data type has a sensible zero value, like collection types

EDIT: I'm blocked from posting so I won't be responding further, thank you for discussion.


A null collection and an empty collection are two different things. A nullable collection is one that has the state “no collection” semantically separate from “empty collection”.

Similarly an Option<byte> has 257 different values while a byte has 256 different values. That the byte has a good zero value doesn’t change that - the whole reason for choosing a maybe-byte is having 257 values, not 256.


Right, that depends on whether you subscribe to the belief that null means the absence of a value (`Option<T>`) or the zero value (`T`).


If null and [] should be the same thing then I’d make absolutely sure you can’t represent both. You don’t want two states representing the same thing. That should be easy to ensure if a language is reasonable. E.g a field that can’t be null (best case a non-nullable type, otherwise maybe a constructor guaranteeing it’s never null)

As the example of byte vs option<byte> either you want 256 states or you want 257. If you have 256 or 257 states you want to represent will decide which type is correct. The other choice of type is incorrect.

In some languages, these things are blurred because the language doesn’t let you choose a correct type for the state space, but I’m talking about the case where you can (coincidentally the set of languages I’d use).


Null is the absence of a value. How do you distinguish 0 from no value?


The point is to eliminate the idea of an absence of a value. A variable is always assigned to a value, but there is a special value called null which behaves as a kind of sentinel value whose methods all return the null value for their respective type.


Yes - but as we keep repeating - if you do that why would you use null as a possible state to begin with? Not specific to Java, but in general.

E.g. a boolean in Java has two states, true/false, while a Boolean with a capital B has three states: true/false/null.

In that context you can choose a type that represents how many states you have. E.g. if it's a field representing a cache of a bool value, you can represent "not yet calculated" with null. If you were to magically convert null to false for a Boolean, it would only have two states! It's now unusable for that purpose.
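
That cached-Boolean case in Java, as a sketch:

  class FeatureCheck {
      // Three states: null = not yet computed, TRUE / FALSE = the cached answer.
      private Boolean cached;

      boolean isEnabled() {
          if (cached == null) {
              cached = expensiveCheck(); // computed once, reused afterwards
          }
          return cached; // auto-unboxes; safe here because cached is non-null by now
      }

      private boolean expensiveCheck() { return true; } // stand-in for the real check
  }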


> The point is to eliminate the idea of an absence of a value

The whole point is to track the absence of a value. Why would one want to eliminate it?

null is way more trackable than some other special value like 0, -1, 999, "", etc.


You can do that with some types, but making 0 be that sentinel is completely bonkers.

You need a 257th value to have a byte sentinel.


That just gets us back to the problem for which Null is introduced in almost every language: indicating the absence of a value. This is an important feature in every language, and null is the most popular solution to it (the only significant alternative is the Maybe monad).

To put this in more concrete terms, if this change were integrated in Java, how would you indicate the difference between a JSON document which doesn't contain a field VS one where the field is an empty string?


I dare say there's a lot more use of a magic value to indicate no value than a distinguishable representation of it :-)

I base this mostly on an assumption of C still being one of the most widely used languages for the code that's running out there in the world. In C, after all, NULL is just a magic value rather than a distinguishable representation of no value, though that's just one example out of a host of others, all the way down to the venerable NUL-terminated string.

As for your question: something like 'keys(jsonobject).contains("fieldName")' ? Or 'NoSuchFieldException' thrown if you do 'jsonobject.get("fieldName")' ?

(The latter of which, given the general uneasiness NullPointerException creates in us devs, is often how a Java API will work anyway! Checked exceptions won the day! Until the Functional interface was made, at least.)

Or to answer it in the same spirit as this overall comment, why would you need to distinguish between missing and empty? Can't you just define the semantics of the document s.t. those two things have the same effect?


I don't agree that C's NULL is any different from Java's null or modern C++'s nullptr, at least outside of embedded contexts (where sometimes people actually store stuff at the 0 address). Sure, it's normally just a macro that resolves to 0 at the implementation level. But people use it like in C++: you want to return an int, but also distinguish the case where no int could be returned? Return an int*, and NULL signifies no data.

> As for your question: something like 'keys(jsonobject).contains("fieldName")' ? Or 'NoSuchFieldException' thrown if you do 'jsonobject.get("fieldName")' ?

> (The latter of which, given the general uneasiness NullPointerException creates in us devs, is often how a Java API will work anyway! Checked exceptions won the day! Until the Functional interface was made, at least.)

I was thinking more of the case where you deserialize a JSON object into a Java object, and then inspect the Java object. Regardless, it was just an example - the problem of distinguishing "no value" from "any value" is pervasive in programming, and all languages must have some strategy for it. If we got rid of null from Java, then Option<T> would probably be the only general candidate. Which, to be fair, would be slightly better, as it would at least force you to check.

> Or to answer it in the same spirit as this overall comment, why would you need to distinguish between missing and empty? Can't you just define the semantics of the document s.t. those two things having the same effect?

Not in the general case, no. At least not without doubling every field and adding other inconsistency issues ({"result": "ABC", hasResult: false}).


In the end, you'll have a mixture of NULL and "" in your DB, and a couple of years later a piece of logic written in another language will fail spectacularly.


One response to this issue is a CHECK constraint LENGTH(column) > 0, so you can’t have empty strings.

Rarely do you have a textual database column where the empty-vs-NULL distinction is semantically meaningful in the application domain. Most of the time, either the column value is missing (arguably better represented by NULL) or has a non-blank value. “Present but blank” is rarely meaningful or useful

Sometimes I pair that with (TRIM(column)=column) to prevent leading or trailing whitespace being saved, which also stops all-blank values being saved

This works really well if you have an RDBMS which supports CREATE DOMAIN, so then you can attach these constraints to a user-defined type and don’t have to repeat them for each column, you just set the type of the column to that user-defined type


This is how I would do it.

  Go: *string

  Java: Option<String> or @Nullable String

  Rust: Option<String>

  TypeScript: string | undefined (or string | null)


The problem is, not all of these languages think that "" and null are equal.


I might choose to rephrase that as "the problem is, some of these languages think that "" and null are equal." :-)


Isn't the lack of strict equality a result of loose typing in those languages?


Maybe - often more to do with overeager coercion, which does tend to go hand in hand with loose typing.


> In the end, you'll have a mixture of NULL and "" in your DB

Not if you use Oracle.


It took many years to eliminate all the instances of "NULL" from the database.


Objective-c allows you to send messages to null objects. On one hand it allows for a form of null-coalescing, but on the other it allows bugs to slip in and get the program into an unexpected state, whereas a more rigorous treatment would result in a more deterministic crash.


It is part of what makes Objective-C phenomenally good for rapid development in the hands of competent developers. However, as team and codebase sizes increase, it allows for subtle bugs by less competent developers.


It’s also very much related to its use in GUI hierarchies: GUI objects generally have a lot of properties and behaviours, most of which you often don’t care for at any given point. Nil being a shorthand for “do nothing” without needing explicit checking is very convenient.


The only real problem with null is not having a real optional type in Java. If you have one, 99.999% of null errors go away.

Making Optional a standard library class instead of a language (type-system level) feature was one of the biggest mistakes in Java's recent history.


At least they plan on introducing null safety for value objects via the „Null-Restricted Value Class Types“ JEP: https://openjdk.org/jeps/8316779


If null-safety on the JVM is important to you, just use Kotlin.


Just use Kotlin. I'm honestly not seeing any need nor benefit in writing Java anymore.


I'd say it's the other way around: Java is closing the gap, and I say that as a Kotlin fan. Nullability in the type system is the big remaining advantage.


Scope functions are still huge. Yes, deep let chains can certainly be considered an antipattern (sometimes I like the approach of writing them, then transforming to more imperative for readability, I think readability peaks at a mix of imperative with some shallow .let), but I miss them in any language that is not kotlin.

Variables should be the exception, not the norm. I have no patience for names that designate some fluid work in progress instead of a value.


> Scope functions are still huge

They can be useful, but I can never remember which one of let, run, with, apply, and also I currently need. Also, I've noticed that they motivate overly "clever" code, especially but not exclusively in junior programmers.


They absolutely do come with a goldilocks problem attached, best used in moderation, and require a bit of an aesthetic to develop. But what doesn't. What appears "clever" to the reader is actually dead simple train of thought coding at write time, with many types of bugs waiting to bite in an imperative implementation simply not possible.

That's why I like writing scope-heavy and then transforming to an aesthetic middle ground between scope expression and imperative. Name stuff when you have something meaningful to say, when there's something helpful to express in a name, not when the syntax mandates a locally unique ID.


> Scope functions are still huge.

I’d put any one of extension methods, value and data classes, immutable variables, structured concurrency, and top-level functions ahead of scope functions for reasons to switch to Kotlin. But hey, if you’re switching, we’re already friends :) .


Really, top level functions? I fail to see their impact, outside of getting rid of ceremony around trivial code examples.

You missed optional and named arguments in the list, those really change cost/benefit decisions in API design. (and, unfortunately, make argument names part of the public interface, this must be the most controversial part in all of kotlin)

I singled out scope functions because all those other things feel very much like the usual set of language differences, whereas scope functions feel completely orthogonal. They could be added to every single language that has expressions and imperative elements, and they would be about the same improvement everywhere.


There is also the "Void" type.

> The {@code Void} class is an uninstantiable placeholder class to hold a reference to the {@code Class} object representing the Java keyword void.

When Java introduced generics they reused the "Void" type. Method calls need to use "null" when "Void" is the type. So in a way, the "type of null" is "Void".
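
A small sketch of that, using Callable<Void> as the example generic type:

  import java.util.concurrent.Callable;

  class VoidDemo {
      public static void main(String[] args) throws Exception {
          // With Void as the type argument, the only value that fits is null.
          Callable<Void> task = () -> {
              System.out.println("side effects only");
              return null;
          };
          task.call();
      }
  }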


Kotlin can do this -- `(null as String?).isNullOrBlank()`. I really like this feature because it prevents nullability cascade. I'd love to see pure Java adopt it, and I feel like Kotlin's edge language development has really helped the greater Java ecosystem (records + `data class`, exhaustive `when`, so many good examples).


Another option in the language design space which this article doesn't mention is to have different kinds of null indicating what kind of object this null value is intended to be a stand-in for. NULL is a generic non-value, but it's also possible to have null-number, null-string, null-character, etc. which can be handy for detecting certain kinds of type errors. In particular, a constructor for a class C can return a null-C if it fails, which can be very useful for debugging. One of the problems with generic NULLs is that it can be very hard to track down where they came from.


I like the way null is handled in Java, I think it’s pretty well thought out, other than the small awkwardness with basic types.

Contrast that with SQL, which has my eternal scorn for null != null

So you can do WHERE table.col = Null…. Ugh


> I like the way null is handled in Java, I think it’s pretty well thought out, other than the small awkwardness with basic types.

It's all inherited from C. Null works the same, references are just pointers, and primitives have to be boxed since collections store references. Not sure if I'd say it's well thought out, but coming from C, all the weirdness makes sense.


> Contrast that with SQL, which had my external scorn for null != null

For anyone not familiar, or who doesn't know the reason null behaves like this in SQL: in this world, null essentially means "don't know". So you can't assume a = b when a and b are null. They could be different values; we just don't have them.

This thing being called null, when null has another semantic in the languages from which we send SQL requests, is confusing, and I suspect many of us would prefer that null behaved the same way as everywhere else.


SQL is one of the very few places where I deem null to be valid (although you could normalize all nulls away, most of the time it isn't worth it)

The null != null thing is something I quite like, as it simplifies a lot of joins/queries.

and as jraph said, null in relational databases has a very different meaning than null in most programming languages.

Don't know vs not assigned


I proposed something similar for D a long time ago, I didn’t realize though that there was prior art there with Smalltalk!


Would a prototype based system like Self or JavaScript allow you to attach custom properties and methods to individual instances of Null? That would be so cool!

Nihilistic Oriented Programming is where you don't use any classes, just customized instances of null.


Java developers introduced Optional for this very case, but Optional's API enforces stream-style constructions. Moreover you must handle nulls sometimes (foreign API calls), and then you might run into type-checking problems, an NPE when putting the value into a Map.of(), etc.

For me the given examples from Smalltalk are more appealing. I would trade null for nil, or at least have some null-safe construction like ?. in Swift
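
The Map.of() trap mentioned above is easy to reproduce; a small sketch (Java 9+, variable names invented):

  import java.util.HashMap;
  import java.util.Map;
  import java.util.Optional;

  class MapOfNull {
      public static void main(String[] args) {
          String fromForeignApi = null;            // e.g. something a Java API handed back

          Map<String, String> legacy = new HashMap<>();
          legacy.put("key", fromForeignApi);       // HashMap happily stores the null

          Optional<String> wrapped = Optional.ofNullable(fromForeignApi); // the usual bridge
          System.out.println(wrapped.isPresent()); // false

          Map.of("key", fromForeignApi);           // throws NullPointerException: Map.of rejects nulls
      }
  }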


I totally support the idea of null having type Null, but only if null no longer has type Integer, and null no longer has type BufferedReader, etc.


Has anyone tried to make null a function? In Julia, there is a `get(null::Function, collection, key)` method which can be used with this syntax:

  value = get(collection, key) do
      error("$key in the collection not found")
  end

This syntax has captivated my attention lately as it seems like a viable alternative to doing an explicit null check before using the value.
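
There's no direct Java equivalent, but a rough analog of that "callback on absence" shape can be sketched with Optional (getOrFail is an invented helper, and ofNullable also treats a stored null value as absent):

  import java.util.Map;
  import java.util.NoSuchElementException;
  import java.util.Optional;

  class GetOrFail {
      // The supplier only runs when the key is absent, like the Julia do-block.
      static <K, V> V getOrFail(Map<K, V> collection, K key) {
          return Optional.ofNullable(collection.get(key))
                  .orElseThrow(() -> new NoSuchElementException(key + " not found in the collection"));
      }

      public static void main(String[] args) {
          System.out.println(getOrFail(Map.of("a", 1), "a")); // 1
          System.out.println(getOrFail(Map.of("a", 1), "b")); // throws NoSuchElementException
      }
  }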


That’s just a default callback.

Actually using this instead of nulls in any sort of general case would be using CPS for all fallible APIs, and that sounds heinous.


What if every null reference was an instance of a Null Object pattern? https://en.wikipedia.org/wiki/Null_object_pattern

Eliminates null checks, Optionals, and NPEs. Probably moots annotations like @NotNull too, but maybe some use cases need those for client APIs.

Makes iterating data structures like graphs simple. Faster too, because the JIT can NOP out a Null Object's methods. (Mostly; profile to confirm, then tweak as needed.)

I implement a Null Object for each base class.

Each base class has a static final member NULL referencing its Null Object implementation (flyweight, singleton). Then assign variables to AwesomeThing.NULL instead of null.

A spiffy javac could code-generate Null Object implementations. (It's on my todo list.) For scalars, just use what the default null value should be: int is 0, float is NaN, etc. Their boxed values will need small shims too.

Customizing javac (with some compiler plugin or something) is deep down on my TODO list. So I'm unlikely to be the (first) person to do this work. Sorry.
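
For reference, a minimal sketch of the pattern as described, with the invented name AwesomeThing:

  interface AwesomeThing {
      void doSomething();

      // Flyweight/singleton null object: always safe to call, does nothing.
      AwesomeThing NULL = new AwesomeThing() {
          @Override public void doSomething() { /* intentionally a no-op */ }
      };
  }

  class Client {
      private AwesomeThing thing = AwesomeThing.NULL; // never null, so never NPEs

      void setThing(AwesomeThing t) {
          thing = (t != null) ? t : AwesomeThing.NULL; // normalize at the boundary
      }

      void run() {
          thing.doSomething(); // no null check needed at any call site
      }
  }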


Objective-C does that but it has the effect of separating serious consequences from their causes. Places that really require non-null have a much more difficult time finding where the null value originated.

And since Objective-C contained a lot of C, it really just separated null dereferences from their causes making debugging much harder.


Thanks for replying. I really appreciate pushback, criticisms, and thoughtful opinions.

I don't know Obj-C, so I'll have to research those use cases, to understand better. eg: Are those across processes? Is this an Obj-C idiom? How to make illegal state representations impossible?

Per the Null Object pattern description, Null Object captures all of the null handling and behavior in one place, vs scattered around. Stuff like null checks and throwing exceptions. It does not eliminate the need for the concept of null.

I'm probably (mostly) wrong about eliminating @NotNull annotations. Especially in method signatures (defensive programming). Last night while commenting, I was thinking about annotating POJOs, to inform serialization.


Not everything has a sensible null object.

It also means that forgetting to set a value will often silently give wrong results instead of immediately producing an error.


Apologies, I omitted part of my proposal:

While most Null Objects would (probably) be code generated, you can provide your own as needed.

Said another way: Most of my Null Object implementations have been boilerplate, and therefore could be code generated. But occasionally some are not.


What I mean is that semantically there's often no such thing as a null object for a given domain. So any artificial one that can be made is indistinguishable from null in practice (i.e. all operations on it will throw etc).


I'd like a null on the same order as NaN. You can use it, compute new values with it, call methods on it, but all results are a similar NaV (not-a-value?).


That sounds horrible. Throwing an npe is almost certainly better than just blithely continuing to process broken state.


You’d love Objective-C: all sends to a nil ref return nil.


The author is now discovering Ruby and Objective-C :-)


What if it was a smart object and the system could figure out what was supposed to happen and give you that back?


I quite like the way dart handles nulls.


The way Crystal and Dart 2 handle it? Null as a separate and unique type unrelated to Object, where type unions solve the problem using syntax like `String?` that is equal to `String | Null`.


Yeah, means I don't need to worry about null unless I explicitly allow null.

NSNull says hi


Is a NullPointerException so much worse than a NoSuchElementException? Is there a difference?


Doesn't an Optional basically cover this case


An Optional is just a tri-valued null (null, None, and Some), so no.

It'd be nice if Java had a concept of a never-null reference (like a C++ reference vs. a C++ pointer), but the @NotNull annotation wasn't enforced the last time I checked.

Also, there's no way for an object to express that invariant because encapsulation is so weak. Given only this constructor (and no reflection):

   Foo() { foo.bar = new Bar(); /* foo.bar is final; Bar() does not throw */ }
callers can still get an instance of Foo with bar set to null.

Anyway, null handling in java somehow manages to be worse than C, where you can at least inline a struct into another, statically guaranteeing the instance of the inlined struct exists.

I can't think of another statically typed language that screws this up so badly. It just keeps getting worse with stuff like Optional and @NotNull.

(Disclaimer: I haven't followed java for 4-5 years; it's possible they finally fixed this stuff.)


Wait, Java's Optional is a reference type so it can be null? Doesn’t that almost defeat the purpose of it?


Arguably yes, but that doesn't stop people using it.

Basically Java had nulls from the start. A decade or so later some people who didn't like nulls introduced their own Optional type, as a third-party library. Enough people liked it that Optional was added to Java's standard library.

But as it's just an object, it can be null. Some null avoidance enthusiasts also use third-party @Nullable and @NotNull annotations, which some automated code checking tools will attempt to verify during compile/test.


It gives you the ability to treat nulls as something-to-fix.

In a team without Optionals, every time you touch a null that you didn't expect, you have to decide "Is this deliberately null, or was it a mistake?" Without that knowledge, you don't know whether your code should assert against the null, or allow it to pass through as a valid value.

With Optionals, it becomes much simpler to cut through that nonsense. A null is a bug and you fix it (with the exception of json at the boundaries of your system, etc.) If you do find a value where you change your mind about its nullability, changing it to/from Optional will give you compile errors in exactly those other parts of the code that you now have to check/change.


Yup. Your IDE will likely highlight it as an issue, but it's totally legal to return a null Optional. There's nothing special about it, it's just a wrapper class.


Did the project to add value types to Java (I’m sure I heard of it a decade ago) never finish?


Not yet, that's Project Valhalla iirc. It's coming along but hasn't been merged yet. I don't believe it's even a preview feature within the JDK yet.



Kinda.

Every non-primitive is nullable in Java. Adding Optional doesn't/can't change that.

You can have a gentlemen's agreement to prefer None to null.


It doesn’t defeat the problem in theory, but in my experience it does in practice. I’ve never come across an NPE on a nullable reference even in development - it would have to be the result of a really fundamental misunderstanding of the concept.

YMMV. Obviously it depends on your teammates.


Fortunately, Uber made tooling for languages with broken type systems

* https://github.com/uber/NullAway

* https://github.com/uber-go/nilaway


Lombok, Error Prone, and Kotlin also have their takes on the problem.


> I can't think of another statically typed language that screws this up so badly. It just keeps getting worse with stuff like Optional and @NotNull.

Java might be the only language where a simple assignment `x = y` can throw a NullPointerException (due to auto-unboxing)
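
For anyone who hasn't been bitten by it, a tiny sketch:

  class UnboxNpe {
      public static void main(String[] args) {
          Integer boxed = null;
          int x = boxed; // compiles to boxed.intValue() -> NullPointerException at runtime
          System.out.println(x);
      }
  }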


Assuming that was the only constructor you defined on class Foo, and you used this.bar instead of foo.bar (latter won't compile), then the caller can't possibly get a Foo with bar set to null (except by reflection, and there are ways to prevent that). Moreover, even if new Bar() did throw an (unchecked) exception, the invariant would still hold, since Foo would rethrow the exception. This has always been the case, as far as I know.


Doing it requires two threads.

Thread A sets a shared reference to a newly allocated and null initialized reference to Foo:

   shared = new Foo();
While that's running, thread B invokes a method on the reference that assumes bar is non-null:

   shared.useBar();  // null pointer exception
Later, thread A runs the constructor for Foo.


I think you're right, if access to shared is not in any way synchronized. But the correct way to handle this, at least in this case, is to mark shared as volatile, which guarantees thread B will only ever read null or a fully constructed Foo from shared. This has been the case since Java 5, released 20 years ago, thanks to JSR-133.


By that standard, C and C++ are much worse, since they offer no runtime encapsulation at all, and have much worse and more subtle multithreaded errors (e.g. Java at least guarantees that all native word sized reads/writes are atomic, if I recall correctly). C++ doesn't even guarantee that a reference can't be null, or worse, deallocated before it is dereferenced. They allow you to specify that a field is of some type and shouldn't be null, which is nice, but they don't enforce that in any way, they just call any code path that violates it UB.

For example, this is code that any C or C++ compiler will happily run and do something:

  #include <string.h> /* for strcpy */
  struct Bar {
    int b;
  };
  struct Foo {
    struct Bar bar;
  } foo;

  strcpy((char*)(&foo), "ABC");
Or in relation to null C++ references:

  int& foo(int* p) {
    return *p;
  } 
  
  int &r = foo(nullptr); //UB, but in practice will likely result in a null reference at runtime
Similarly, accessing an object from multiple threads without synchronization means its value is not fully defined in Java. Unlike C or C++, it is at least known to be a Java type, not a memory corruption vulnerability.


We can quibble on definitions here, but a reference in C++ can not be null. The undefined behavior happens before any assignment to a reference is executed so that at the moment that the assignment happens, the reference is guaranteed to not be null.

In your example, it's the dereference of the null pointer p that is undefined behavior, so anything that happens after that point is also undefined behavior. Note that means there is never an actual assignment to r if p is null.

As I mentioned earlier, this might seem like quibbling with definitions, but this is the proper mental model to have with respect to C++'s semantics.

Having said that, I don't disagree with the main crux of your point, which is that C++'s semantics are terrible and there is little that the language provides to write correct code, but I do think there are subtleties on this matter that are worth clarifying.


I agree, but by that same definition, a private reference field in Java that is initialized in the constructor also can't be null.

My point is that we can compare two things: valid programs, or programs that compile.

In valid C++ programs, references can't be null and there are no data races. In valid Java programs, all final fields initialized in an object's constructor have that value for the lifetime of the object.

If we compare invalid programs that compile, which is an important point as well, then those guarantees go out the window. But here Java is much more forgiving than C++: if you have improper synchronization, you may see fields which are null instead of having their final value, which is bad and confusing. But in C++ with improper synchronization, you can see literally any outcome at all.


That's not accurate for a final field, as final fields are initialized in a special way.

https://docs.oracle.com/javase/specs/jls/se21/html/jls-17.ht...

>An object is considered to be completely initialized when its constructor finishes. A thread that can only see a reference to an object after that object has been completely initialized is guaranteed to see the correctly initialized values for that object's final fields.

If you don't publish a reference to the object from within the constructor, you will not see a null value of the final field, even if the object itself was unsafely published across threads via a non-volatile field.


This too is thanks to JSR-133. In fact, it seems that, thanks to this part of JSR-133, what I said above about marking shared volatile is actually unnecessary. There must be a memory barrier somewhere, under the hood, though.


At least on android arm64, looks like a `dmb ishst` is emitted after the constructor, which allows future loads to not need an explicit barrier. Removing `final` from the field causes that barrier to not be emitted.

https://godbolt.org/#g:!((g:!((g:!((h:codeEditor,i:(filename...


Yeah, I wish the VM would prevent null assignment of an Optional and force it to empty. There are probably side effects I can’t think of here, and it would certainly cause problems with legacy code misusing Optionals.


null can be avoided with a good linter


Not avoided altogether. Static checkers cannot possibly follow all code paths, and they generally err on the side of false negatives rather than risking too many false positives causing people to disable them.


I wasn’t aware they preferred type II errors. That makes sense, but I don’t really expect tools like that to work across modules.


It depends on the specific tool and how it's configured. But that has been my experience with many tools configured with their recommended settings.


It would be fun to subclass null, and make multiple instances of it.

I did hack MACLISP to have multiple NILs once. A surprising number of things appeared to run OK for a while. Of course the PDP-10 had a hardware addressing mode just for NIL so if you weren't the official one compiled code didn't believe you were a legit NIL.

Making another T was more boring, except that the few bugs that were triggered seemed to appear more quickly than in the NIL case. I guess a lot more people explicitly compared with 'T. Eh, grad students, what do you expect?

As to why I did this, well it was decades ago but surely a case of work avoidance.



