You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Comparison of Erlang type checkers: Dialyzer, ETC, and Gradualizer
Dialyzer is a static analysis tool that identifies software discrepancies, such as definite type errors,
code that has become dead or unreachable because of programming error, and unnecessary tests,
in single Erlang modules or entire (sets of) applications.
It was developerd by Kostis Sagonas and is now maintained by the OTP team.
ETC is a type checker described
in "Bidirectional typing for Erlang"
by Nithin Rajendrakumar and Annette Bieniusa.
It's the result of a short academic project into higher-rank polymorphism enabled bidirectional type checking of Erlang code.
Gradualizer is a gradual type checker for Erlang and,
thanks to Gradient, also for Elixir.
The project is experimental, but has decent coverage of the Erlang and Elixir syntax.
The assumption of this comparison is to show that Gradualizer, while still experimental,
is already useful in practice thanks to:
good enough Erlang syntax coverage; as it turned out it's definitely better then ETC,
though there are some language constructs which lead to incorrect reports
better performance than Dialyzer (significantly shorter run times on the same files,
no PLT build/check time);
in fact, thanks to ErlangLS,
Gradualizer can be used real-time in the background of a programmer's editor
easily understandable error messages
This gist compares Dialyzer, ETC, and Gradualizer functionality on the tests accumulated over time
in Gradualizer's repository.
Thanks go to the volunteers who dedicated their time and effort to build the test harness.
The Elixir script used to generate the TSV results file is also available in this gist.
The tests are grouped into four categories:
"should pass" tests - we know Gradualizer should pass when run on these
"should fail" tests - we know Gradualizer should NOT pass when run on
these, i.e. it should detect and report type errors
"known problems which should pass" - as the name implies, code examples
which should not lead to warnings or crashes of Gradualizer
"known problems which should fail" - negative of the above, i.e. code
examples with known issues, which should be detected by Gradualizer, but
are not yet (or Gradualizer crashes on them)
The results are summed up in the attached TSV file.
Dialyzer is quite mature, very stable, and even known by the slogan that "it's never wrong".
ETC and Gradualizer, on the other hand, are still considered experimental.
One interesting example of a polymorphic function that is a Gradualizer false negative
(it does not raise an error though it should), but does not type check with ETC is:
$ cat t3.erl
-module(t3).
-export([p/2]).
-spec p(A, A) -> A.
p(A, B) -> A + B.
-spec test() -> integer().
test() ->
p(1, 3.2).
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
headers=[:type,:dialyzer,:etc,:gradualizer,:dialyzer_seconds,:etc_seconds,:gradualizer_seconds,:file]# series by headerby_header=tsv|>Enum.zip()|>Enum.map(&Tuple.to_list/1)by_header=Enum.zip(headers,by_header)|>Enum.into(%{})
This chart depicts the number of tests which should pass, i.e. should type check. After all, we do not want our type checkers to raise warnings about valid code.
Given we're considering code that's assumed to be valid, none of the type checkers should report warnings here. Cross-checking these tests with different type checkers allows to find bugs in tests or the type checkers themselves.
Dialyzer reports only a few errors. Gradualizer, as expected, reports none. ETC fares quite poorly, passing only 1/3 of the tests, which suggests it might not cover the complete Erlang syntax.
Known problems: tests that should pass, but do not
This chart depicts the number of tests which should pass, i.e. should type check, yet raise Gradualizer errors. In other words, these are the false positives - invalid or misleading reports about non-issues.
Dialyzer reports only a single false positive and is a clear winner here! This confirms the slogan that Dialyzer is never wrong.
ETC reports some of the errors, but not all of them.
Gradualizer, as expected, reports errors for all of the tests.
This chart depicts the number of tests which should fail, i.e. should not type check. These tests check that the warnings we want to see in our buggy code are actually generated.
Dialyzer seems to be somewhat permissive. ETC reports the majority of errors. Gradualizer properly reports all the errors.
Known problems: tests that should fail, but do not
This chart depicts the number of tests which should fail, i.e. should not type check, but are known to type check with Gradualizer. In other words, these are the errors that considered type checkers cannot find. We have to rely on tests, code review, or other techniques to find them.
Dialyzer detects some of the errors in these examples, ETC seems to detect even more. Gradualizer doesn't detect any of them, but the examples are crafted against this type checker, so it's expected.
Dialyzer warns about an unexported function.
Gradualizer type checks functions no matter if they're exported or not.
Dialyzer warning goes away after we export l/1.
Dialyzer sees no other local call sites than bar(apa),
therefore infers the argument type of bar/1 to be just apa,
which in turn means that the second clause of bar/1 is redudant.
This leads to a valid warning.
Adding -export([bar/1]) at the top makes Dialyzer accept the code.
Gradualizer does not warn either with or without the extra export attribute.
Dialyzer detects an improper list creation, which is not an error for Gradualizer (but maybe it should?) - passing --infer to Gradualizer doesn't change the outcome.
should_pass/named_fun_infer_pass.erl
Dialyzer infers that 0 in F(Atoms, 0) is not a list of integers.
Gradualizer does not.
should_pass/named_fun_pass.erl
Dialyzer fails due to arity mismatch.
Gradualizer does so only with --infer. Dialyzer is inferring more aggressively.
should_pass/negate_none.erl
Dialyzer returns "no local return" from foo/0.
Gradualizer is silent.
should_pass/records.erl
Dialyzer reported unexported local functions which would never be called.
Gradualizer did not, but the Erlang compiler would, so it's not a big deal.
should_pass/return_fun.erl
One of the few cases in should_pass tests where Dialyzer actually detects errors,
whereas Gradualizer does not.
return_fun.erl:28:2: Invalid type specification for function return_fun:return_fun_no_spec/0.
The success typing is
() -> fun(() -> 'ok')
ETC:
test/should_pass/return_fun.erl: error in parse transform
Gradualizer: no error!
Let's remember that by default Gradualizer doesn't infer types of functions with no specs.
Gradualizer with --infer:
return_fun.erl:29:25: The fun expression is expected to have type integer()
but it has type fun(() -> any())
Dialyzer seems to infer more types by default.
However, if we ask Gradualizer to try a bit harder, it's even with Dialyzer on this one.
The file is too complex for ETC to process - it crashes :(
should_pass/scope.erl
Another one of the few cases in should_pass tests where Dialyzer detects errors,
but Gradualizer does not.
f(X) ->
case g(X) of
true -> A = 5,
B = 7;
false -> B = 6
end,
B.
g(_X) ->
true.
Dialyzer:
scope.erl:9:9: The pattern 'false' can never match the type 'true'
ETC: ok
Gradualizer: ok
Another example showing that Dialyzer type inference is quite aggressive and quite accurate!
It knows that g/1 can never return anything else than true,
so the latter clause of the case expression enclosing the g(X) call is redundant.
In this case the entire case expression is redundant and B will always be 7.
It's interesting to note that Gradualizer cannot detect the error even if we modify the code as follows:
TODO: create a bug report about it!
-spec f(any()) -> atom().
f(X) ->
case g(X) of
true -> A = 5,
B = 7;
false -> B = 6
end,
B.
-spec g(any()) -> true.
g(_X) ->
true.
known_problems/should_pass/list_tail.erl
The only false positive raised by Dialyzer, which is known for never being wrong. Is it a Dialyzer bug?
atom_tail() ->
[ 1 | list_to_atom("banana") ].
Dialyzer:
list_tail.erl:7:5: Cons will produce an improper list since its 2nd argument is atom()
ETC:
unify failed with types [integer] :=: atom
Gradualizer:
list_tail.erl:7: The expression of type atom() is not a list type
Apparently, all the type checkers agree. It definitely is not a Dialyzer bug. Maybe the test should be moved from known_problems/should_pass to should_fail?
known_problems/should_pass/arith_op_arg_types.erl
Gradualizer is wrong about integer type inference in presence of arithmetic operations. Dialyzer is fine with it.
Gradualizer cannot tell that <<_:24>> is a subtype of <<_:_*8>>.
should_fail/annotated_types_fail.erl
Gradualizer provides type annotation/assertion macros which can be used to provide extra type information by the programmer. Gradualizer correctly finds types discrepancies between specs and annotations.
Dialyzer does not understand the annotation macros and does not have enough info to find type errors. Dialyzer is more permissive.
should_fail/branch.erl
Given the code:
-spec c(boolean()) -> integer().
c(X) ->
X.
Dialyzer (with no extra flags) does not detect type mismatch. With -Wspecdiffs or -Woverspecs it reports:
branch.erl:5:2: Type specification branch:c
(boolean()) -> integer() is a subtype of the success typing: branch:c
(_) -> any()
The report is correct. Arguably, though, it's a bit hard to understand.
Gradualizer reports:
The variable is expected to have type integer() but it has type false | true
Dialyzer (with no options) does not detect type mismatch. With -Wspecdiffs it reports:
branch2.erl:5:2: The success typing for branch2:c/1 implies that the function might also return
'apa' but the specification return is
integer()
Which is expected.
Gradualizer by default reports:
The atom is expected to have type integer() but it has type apa:
1;
c(false) ->
apa.
^^^
should_fail/case_pattern.erl
Given the code:
-spec f(integer(), atom()) -> ok.
f(X, Y) ->
case Y of
X -> ok
end.
Dialyzer (with no options) cannot figure out that the pattern (matching integers) will never match an atom. With -Wspecdiffs a misleading warning is printed:
case_pattern.erl:4:2: Type specification case_pattern:f
(integer(), atom()) -> 'ok' is a subtype of the success typing: case_pattern:f
(_, _) -> 'ok'
The warning suggests that our spec is too specific, whereas in practice its explicitness enables the type checker to figure out that the case expression will never match.
Gradualizer reports:
The variable is expected to have type atom() but it has type integer():
f(X, Y) ->
case Y of
X -> ok
^
At runtime, the code fails with:
2> case_pattern:f(3, a).
** exception error: no case clause matching a
in function case_pattern:f/2 (test/should_fail/case_pattern.erl, line 6)
So Gradualizer detects an actual bug that Dialyzer does not.
should_fail/case_pattern2.erl
Given the code:
-spec f(integer(), atom()) -> ok.
f(X, Y) ->
case {X, Y} of
{Z, Z} -> ok
end.
Dialyzer (with no options) cannot figure out that the pattern will never match the expression as X and Y are of different types. With -Wspecdiffs it suggests the spec is too strict:
case_pattern2.erl:4:2: Type specification case_pattern2:f
(integer(), atom()) -> 'ok' is a subtype of the success typing: case_pattern2:f
(_, _) -> 'ok'
Gradualizer reports that:
The variable is expected to have type atom() but it has type integer():
f(X, Y) ->
case {X, Y} of
{Z, Z} -> ok
^
So Gradualizer prevents the runtime error:
2> case_pattern2:f(3, a).
** exception error: no case clause matching {3,a}
in function case_pattern2:f/2 (test/should_fail/case_pattern2.erl, line 6)
should_fail/covariant_map_keys_fail.erl
Given the code:
-spec good(#{ good := A }) -> A.
good(#{ good := X }) -> X.
-spec not_good(#{good | bad := A}) -> A.
not_good(M) -> good(M). %% This call should fail
-spec kaboom() -> integer().
kaboom() -> not_good(#{ bad => 0 }).
Dialyzer (with no options) does not detect any errors. With -Wspecdiffs it reports:
covariant_map_keys_fail.erl:6:2: Type specification covariant_map_keys_fail:good
(#{'good' := A}) -> A is a subtype of the success typing: covariant_map_keys_fail:good
(#{'good' := _, _ => _}) -> any()
covariant_map_keys_fail.erl:9:2: Type specification covariant_map_keys_fail:not_good
(#{'good' | 'bad' := A}) -> A is a supertype of the success typing: covariant_map_keys_fail:not_good
(#{'good' := _}) -> any()
covariant_map_keys_fail.erl:12:2: The specification for covariant_map_keys_fail:kaboom/0 states that the function might also return
integer() but the inferred return is
none()
Dialyzer suggests the information we pass in good/1 spec is redundant. It hints at an error in not_good/1, but is not explicit about it. It correctly detects an error in kaboom/0.
Gradualizer reports:
The variable is expected to have type #{good := A} but it has type #{bad | good := A}
-spec not_good(#{good | bad := A}) -> A.
not_good(M) -> good(M).
^
Gradualizer does not detect the error in kaboom/0, but it does detect the more local error in not_good/1. If we fix the error in not_good/1, the error in kaboom/0 will automatically be fixed, too.
However, Gradualizer, even with the --infer flag, does not infer the type of #{ bad => 0 } expression passed into not_good/1 as an argument. This is now tracked as josefs/Gradualizer#432.
Interestingly, if we modify the code to:
-spec kaboom() -> integer().
kaboom() ->
M = #{ bad => 0 },
not_good(M).
Gradualizer is able to report also this error:
The variable is expected to have type #{good | bad := A} but it has type #{bad => 0}
kaboom() ->
M = #{ bad => 0 },
not_good(M).
^
The runtime error Gradualizer protects from is:
2> covariant_map_keys_fail:not_good(#{bad => 0}).
** exception error: no function clause matching covariant_map_keys_fail:good(#{bad => 0}) (test/should_fail/covariant_map_keys_fail.erl, line 7)
should_fail/cyclic_type_vars.erl
Given the code:
-spec foo(A) -> B when
A :: true | B,
B :: [A].
Gradualizer reports a cyclic dependency between type variables A and B. Dialyzer does not.
should_fail/exhaustive.erl
Dialyzer (with no options) does not fail type checking the file. With -Wspecdiffs it hints at spec and implementation mismatches, but with not clear indications which are wrong.
Gradualizer assumes the specs to be right and therefore returns explicit exhaustiveness failures:
Whereas Dialyzer (with -Wspecdiffs) only reports a hint that our passed in type is wider than the implementation expects:
exhaustive.erl:50:2: Type specification exhaustive:integer_2
(integer()) -> {} is a supertype of the success typing: exhaustive:integer_2
(0) -> {}
should_fail/exhaustive_float.erl
Given the code:
-type t() :: {int, integer()}
| {float, float()}.
-spec ef(t()) -> ok.
ef(T) ->
case T of
{int, _} -> ok
end.
Dialyzer (with no options) does not report any errors. With -Wspecdiffs it reports the following hint, which suggests that we might be missing some clauses:
exhaustive_float.erl:8:2: Type specification exhaustive_float:ef
(t()) -> 'ok' is not equal to the success typing: exhaustive_float:ef
({'int', _}) -> 'ok'
-type list_variant_t() :: {non_list, integer()}
| {list, [integer()]}.
-spec list_variant_omitted(list_variant_t()) -> ok.
list_variant_omitted(T) ->
case T of
{non_list, _} -> ok
end.
Dialyzer (with no options) does not report any errors. With -Wspecdiffs it reports the following hint, which suggests that we might be missing some clauses:
exhaustive_list_variants.erl:8:2: Type specification exhaustive_list_variants:list_variant_omitted
(list_variant_t()) -> 'ok' is not equal to the success typing: exhaustive_list_variants:list_variant_omitted
({'non_list', _}) -> 'ok'
-type t() :: ala | ola.
-spec f(t(), any()) -> ok.
f(ala, _) -> ok.
By default none of the typecheckers can tell that f/2 is a partial function.
Dialyzer with -Wspecdiffs reports a warning that the success typing is narrower then the spec:
exhaustive_argumentwise.erl:7:2: Type specification exhaustive_argumentwise:f
(t(), any()) -> 'ok' is a supertype of the success typing: exhaustive_argumentwise:f
('ala', _) -> 'ok'
Examples from "Bidirectional Typing for Erlang" by Nithin Rajendrakumar and Annette Bieniusa
By default, Dialyzer and Gradualizer do not report any errors. Dialyzer with -Wspecdiffs reports a misleading warning:
bdtfe1.erl:9:2: Type specification bdtfe1:lookup
(T1, [{T1, T2}]) -> 'none' | T2 is a subtype of the success typing: bdtfe1:lookup
(_, maybe_improper_list()) -> any()
The warning suggests that the spec is narrower than the actual implementation can handle. In practice, though, Dialyzer ignores the extra information we provide in the spec, which is crucial to identifying the bug in find/0.
ETC reports a unification error:
union-unify failed with types [char] :=: (atom|?`A)
ETC reports that a list of characters cannot be unified with a union of atom (none) and a type variable.
Gradualizer still lacks a constraint solver, so it cannot find the polymorphic type error.
In the code above the original example from the paper is copied twice, the latter copy having one type error fixed (the mismatch between true and {}) - this is to verify if our type checkers can find none, one, or both of the errors.
Dialyzer (with no options) reports nothing. With -Wspecdiffs it reports:
bdtfe3_should_fail.erl:8:2: The success typing for bdtfe3_should_fail:foo/1 implies that the function might also return
'true' but the specification return is
{}
bdtfe3_should_fail.erl:12:2: Type specification bdtfe3_should_fail:foo2
(integer()) -> {} is a supertype of the success typing: bdtfe3_should_fail:foo2
(1 | 42) -> {}
So Dialyzer can detect both the return type mismatch against the spec and that the function is partial.
ETC reports a unification failure of boolean and tuple:
unify failed with types boolean :=: tuple
Interestingly, if we comment out the first definition, ETC still fails on the second one with a unification failure of tuple and {}:
unify failed with types {} :=: tuple
Gradualizer with no extra options reports both a type mismatch and that the function is partial:
/Users/erszcz/work/vrnithinkumar/ETC/bdtfe3_should_fail.erl: The atom on line 9 at column 11 is expected to have type {} but it has type true
-spec foo(integer()) -> {}.
foo(1) -> true;
^^^^
/Users/erszcz/work/vrnithinkumar/ETC/bdtfe3_should_fail.erl: Nonexhaustive patterns on line 13 at column 1
Example values which are not covered:
0
Partial function 2/2
It's also interesting to consider the behaviour of the type checkers when we apply some changes:
The style above is somewhat more defensive than just "letting it crash". However, its explicitness makes it easier to understand that the function is partial on purpose, not by omission.
Dialyzer (with no options) reports nothing. With -Wspecdiffs it still reports that the spec is wider than the success typing:
bdtfe3_should_pass.erl:8:2: Type specification bdtfe3_should_pass:foo3
(integer()) -> {} is a supertype of the success typing: bdtfe3_should_pass:foo3
(1 | 42) -> {}
So with Dialyzer there's no point in using the more expressive style.
Interestingly, ETC can't unify the example:
unify failed with types {} :=: tuple
Gradualizer reports no warnings or errors. The explicit style pays off with a quick and successful type check.
3. Rank-2 polymorphism
Erlang spec syntax allows for higher-ranked polymorphism. Let's see how the type checkers cope with that.
The above code is correct and type checks with Dialyzer, Gradualizer, and ETC. Dialyzer with -Wspecdiffs reports that the spec is narrower than the success typing. It's a symptom of Dialyzer ignoring the extra type information provided in the spec and assuming that the implementation is the source of truth, not the spec.
Dialyzer (with no options) does not report any errors. With -Wspecdiffs it reports the usual warning about the spec being narrower than the success typing. However, there's no mention of the actual return value and spec return type mismatch.
ETC reports a unification failure:
unify failed with types integer :=: boolean
Gradualizer does not report any errors.
Working on Gradualizer
graph TD;
A("Run Gradualizer on some code")-->B{Errors reported?};
B -- yes --> I
I(Verify the code with review and/or tests) --> C{Errors actually present?};
B -- no --> F{Errors should be reported?};
F -- yes - we've found a false-negative --> E;
F -- no --> G(Good job writing bug-free code!);
C -- yes --> D(Fix your bugs!);
C -- no - we've found a false-positive --> E(Send a PR with a test/known_problems case!);
E --> H(Ideally, send a PR with a fix!);
08:05:19.634 [warn] Description: 'Authenticity is not established by certificate path validation'
Reason: 'Option {verify, verify_peer} and cacertfile/cacerts is missing'
headers=[:type,:dialyzer,:etc,:gradualizer,:dialyzer_seconds,:etc_seconds,:gradualizer_seconds,:file]# series by headerby_header=tsv|>Enum.zip()|>Enum.map(&Tuple.to_list/1)by_header=Enum.zip(headers,by_header)|>Enum.into(%{})
%{
"known_problems_should_fail" => ["all tests", "gradualizer", "all tests", "dialyzer", "etc",
"gradualizer", "all tests", "dialyzer", "etc", "gradualizer", "all tests", "dialyzer", "etc",
"gradualizer", "all tests", "dialyzer", "etc", "gradualizer", "all tests", "dialyzer",
"gradualizer", "all tests", "gradualizer", "all tests", "gradualizer", "all tests", "dialyzer",
"etc", "gradualizer", "all tests", "dialyzer", "gradualizer", "all tests", "dialyzer",
"gradualizer"],
"known_problems_should_pass" => ["all tests", "dialyzer", "etc", "all tests", "dialyzer",
"all tests", "dialyzer", "all tests", "dialyzer", "etc", "all tests", "dialyzer", "all tests",
"dialyzer", "etc", "all tests", "dialyzer", "etc", "all tests", "all tests", "dialyzer", "etc",
"all tests", "dialyzer", "all tests", "dialyzer", "all tests", "dialyzer", "etc", "all tests",
"dialyzer", "etc", "all tests", "dialyzer", "all tests", "dialyzer", "etc", "all tests",
"dialyzer", "etc"],
"should_fail" => ["all tests", "dialyzer", "all tests", "all tests", "all tests", "all tests",
"all tests", "all tests", "all tests", "dialyzer", "all tests", "dialyzer", "all tests",
"all tests", "dialyzer", "all tests", "dialyzer", "all tests", "dialyzer", "all tests",
"all tests", "dialyzer", "etc", "all tests", "dialyzer", "etc", "all tests", "etc", "all tests",
"dialyzer", "all tests", "dialyzer", "all tests", "dialyzer", "all tests", "dialyzer", "etc",
"all tests", "dialyzer", "all tests", "dialyzer", "all tests", "dialyzer", "etc", "all tests",
"dialyzer", "all tests", ...],
"should_pass" => ["all tests", "dialyzer", "gradualizer", "all tests", "gradualizer", "all tests",
"dialyzer", "gradualizer", "all tests", "dialyzer", "gradualizer", "all tests", "dialyzer",
"etc", "gradualizer", "all tests", "dialyzer", "gradualizer", "all tests", "dialyzer",
"gradualizer", "all tests", "gradualizer", "all tests", "dialyzer", "gradualizer", "all tests",
"dialyzer", "gradualizer", "all tests", "dialyzer", "gradualizer", "all tests", "dialyzer",
"gradualizer", "all tests", "dialyzer", "gradualizer", "all tests", "dialyzer", "etc",
"gradualizer", "all tests", "dialyzer", "gradualizer", "all tests", ...]
}