Dr. David J. Pearce

A Story of Cast Expressions


Issue #427Bug with Dereferencing Array Types” seemed like just another bug.  Submitted by a user last week (@Matt–), I didn’t think too much of it.  But, as sometimes happens, appearances can be deceiving.  In this case, the bug identified a flaw in the syntax of Whiley with respect to the treatment of cast expressions!

Cast expressions are well known to be a thorny issue.  In Java, there are well-known limitations with parsing cast expressionsFor example, did you know this Java program is invalid?

public class Test {
    public static void main(String[] args) {
       Double y = new Double(0.1);
       System.out.println((Double) -y);
    }
}

Compiling this gives a rather cryptic error message (with javac):

Test.java:6: cannot find symbol
symbol  : variable Double
location: class Test
       System.out.println((Double) -y);
                           ^
Test.java:6: illegal start of type
       System.out.println((Double) -y);
                          ^
2 errors

The problem is that the Java compiler cannot determine whether the name Double refers to a type (in which case we have a cast) or some other variable (in which case we have a bracketed expression). For example, it could be a field of some sort declared in this or another class. Since, at the point of parsing the source file, the process of name resolution has not been performed, the parser does have the information necessary to disambiguate these cases. Of course, we could integrate the parse and name resolution stages together, but this introduces a lot of unnecessary coupling and complexity (and the designers of Java obviously decided against this).

To work around this problem, the Java grammar prevents the use of a non-primitive type in a cast when followed by certain expressions that cause ambiguity (e.g. -y, +y, etc).``

The Problem

The problem of parsing cast expressions in Whiley is largely the same as for the Java compiler, although Whiley has different syntax which needs to be accounted for. The example from Issue #427 which exposed the problem is the following:

method main(System.Console console):
    &[real] b = new [1.0, 2.0, 3.0]
    real x = (*b)[1]
    ...

The Whiley compiler was reporting that (*b) was not a valid type. This is because it had mistakenly considered the expression (*b)[1] to be cast. Clearly it’s not, and it should never have thought this! But, solving this is not completely straightforward.

To really understand the problem, we need to consider how the Whiley compiler was originally disambiguating casts from bracketed expressions. Roughly speaking the procedure upon encountering a left brace, (, was:

At this point, you may have spotted some critical mistakes in the above logic.  Unfortunately, I didn’t until recently! There are two problems with the above:

The Solution

The solution at this point is fairly straightforward. I’ve simply added an additional stage in the process between steps (2) and (3), which checks whether the parsed expression must definitely be an expression or not. This is a simple solution but, like Java, it still leaves open some cases which the compiler cannot disambiguate. These all involve the small set of tokens which can appear both at the start of an expression and in the middle of an expression. This set, to the best of my knowledge, is: +, -, [, * & and |. Thus, the following expression forms cannot be correctly disambiguated: (N) + e, (N) - e, (N) * e, (N) & e, (N) | e, (N) [ e ].

At the moment, when such an ambiguity arises, the compiler takes a punt and assumes it has a bracketed expression, rather than a cast. Unfortunately, this results in difficult to understand error messages. In the future, it might be better for the compiler to simply report the ambiguity rather than proceeding on an assumption…