Hindley-Milner Type Inference (2012) (steshaw.org)
177 points by vyuh on July 10, 2020 | 184 comments


OCaml uses Hindley-Milner type inference. I was taught programming with it and ended up deeply spoiled.

I could not understand why most static languages required writing so many types that were easy to deduce, or why people were saying that not having to write types made dynamic languages better.

I can count on the fingers of my hands the number of times when I needed to write explicit types in OCaml (my single use case for explicit types in OCaml was data deserialization).

Plus, the compiler is fast.


> I can count on the fingers of my hands the number of times when I needed to write explicit types in OCaml

Do you genuinely rely on whole-program type inference? Writing fewer than ten explicit types in your whole career writing OCaml really surprises me. For example even the OCaml standard library is not written this way.

I find in practice most people explicitly type their top-level functions; if you don't, type errors at the point of use can 'poison' the function types and cause very confusing errors far away in the program. Do you not see this when you program without any explicit types?
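(A tiny hypothetical OCaml illustration of the 'poisoning', with made-up function names: the bug is in `width`, but without an annotation the error is only reported at a use site.)

    let width s = s   (* bug: meant String.length s; silently inferred as 'a -> 'a *)

    let pad line = String.make (width line) ' ' ^ line
    (* The type error is reported here, complaining that `line` is an int where a
       string was expected, even though the real mistake is in `width`. *)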

We can see this in other projects, for example Crystal, which had to back-track on its goal of whole-program inference because it just doesn't really seem tractable in practice.


Why not let the computer do the work? My programming flow in Ocaml has always been:

(1) Hack until I've got something stable. Note: at the start of coding I typically don't quite know what the right abstractions are, so a lot of code (and associated types) are in constant flux.

(2) Once the program is stable, wrap it in a module, let the compiler work out all types and put them in as module interface.

That gives me the best of all possible worlds: type annotations at abstraction boundaries but you don't have to type them out. Writing out explicit types inhibits fast change. I wish the languages I'm using at the moment (Scala, Rust, C++) allowed a flow like that. Alas, full type inference stops being possible with more expressive typing systems.
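For those who haven't seen it: step (2) is mostly mechanical, because `ocamlc -i` prints the inferred interface of a file, which you can paste into an `.mli` and trim. A tiny made-up example:

    (* counter.ml -- no annotations anywhere *)
    let make () = ref 0
    let bump c = incr c; !c

    (* $ ocamlc -i counter.ml
       val make : unit -> int ref
       val bump : int ref -> int *)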


> Why not let the computer do the work?

Because a program's types can serve as part of its documentation.


The other benefit of top-level/function annotations is that they ensure that your mental model of the code is consistent with what the code actually does.

So many times I will accidentally return 42 or Result<Result<42>> when I mean to return Result<42>. With annotations, my error is instantly obvious. Otherwise I accidentally end up with a function that returns Result<Result<number>>, and that can lead to a very weird error message when I use `foo().map(_ + 100)` somewhere else.

That's letting the computer do more work, not less. Relying too much on inference can swing back to the human doing more of the work to figure out what the types actually are in a given spot.
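(A small hypothetical OCaml version of the same mistake: with the return type written out, the doubly-wrapped value is flagged right where it happens rather than at a call site.)

    let find_port (config : (string * int) list) : int option =
      Some (List.assoc_opt "port" config)
      (* type error reported on this line: the body is an int option option,
         not the declared int option *)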


Sometimes an IDE hover can be the best of both worlds - the information is there when you want it (and guaranteed to be accurate since it's using the language compiler's own type inference), but not taking up space when you don't want it.


That helps, but it's orthogonal to adding type annotations.

The computer can only tell you what types your code _has_ (from inference), but not what type it _should_ have based on your intentions. Type inference can only complain when your code contradicts itself (on a type level), but can't tell you when it contradicts your intentions.

If you don't write out your intentions, you'd have to hover over everything to see if the inferred types differ from your expectations.

IDE hover types and collapsible type annotations (and alternatively also collapsible implementation) are still useful as a way to declutter your UI when reading code.


This is under-appreciated, and as I have aged I have discovered that I prefer verbosity over brevity. Especially when mentoring more junior engineers who can barely write code at all.


It doesn't always make sense to plan the types ahead of time; letting the compiler do the work while you code and then creating a nice interface after is a reasonable way to work.


> let the compiler work out all types and put them in as module interface ... type annotations at abstraction boundaries

Right - that's not global type inference. It's intra-module type-inference.


I don't understand this objection. OCaml does full program inference. I just prefer to wrap things into modules from time to time. That's a pragmatic choice I make, not a restriction on type-inference.

Aside, when I decide to change where I put module boundaries, full program inference helps too: I just throw away the old module definitions, draw different boundaries and re-run inference.


I'm not objecting to anything. This is how I thought most people used inference in OCaml.


You do need to write the types for records though.


Is there anything specific that you find Rust is missing in this regard?


I think Rust deliberately limits type inference to information available in a single function?


Yes.


> Do you genuinely rely on whole-program type inference?

I think their point is that there were vanishingly few times where they were required to write explicit type annotations. As they say, mostly when dealing with data deserialization. That doesn't say whether they wrote more, however.


Except that would then not count the many times where they 'weren't' required to write explicit type annotations solely because they had already written those annotations. Which is not necessarily an unreasonable distinction to make[0], but I can say the same even about C: there have been fewer than ten times that I got a compiler error because I declared a function or variable without specifying a type for it.

0: There are plenty of places where not supplying a type annotation is more trouble than just writing it. Eg (Haskell since I'm not familiar with OCaml offhand):

  data Foo = FooBar Int String
           | FooBaz [Int]
I'm not even sure how to avoid writing explicit type annotations for Int, String, and [Int] there; maybe OCaml makes it easy, but I doubt I'd bother in all but fewer than ten cases.


> I'm not even sure how to avoid writing explicit type annotations for Int, String, and [Int] there; maybe OCaml makes it easy

It does make it easy. OCaml has a full matrix of types:

- Nominal product types (records)

- Nominal sum types (variants)

- Structural product types (objects)

- Structural sum types (polymorphic variants)

You could actually always use structural types and never have to define types at all. But OCaml people really like the nominal types because their typing rules are stricter and easier for people to understand. Still, the structural types can come in really handy, e.g. they give you a very convenient way to exchange data between libraries that don't need to know about each other. For example, there is an effort to add JSON-structured output to the OCaml compiler itself. This uses a polymorphic variant type to represent the JSON data type, to avoid having a dependency on any particular JSON library while still being compatible with popular OCaml JSON libraries ( https://ocaml-community.github.io/yojson/yojson/Yojson/index... ).
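Rough sketch of that pattern (field names made up): the producer below never declares or names a JSON type, yet the inferred open polymorphic variant is structurally compatible with e.g. `Yojson.Basic.t`.

    (* No type definitions and no dependency on any JSON library: *)
    let summary ~name ~warnings =
      `Assoc [ "name", `String name;
               "warnings", `List (List.map (fun w -> `String w) warnings) ]
    (* The compiler infers an open polymorphic variant type here, so the result
       can be consumed by any library whose JSON type includes `Assoc, `List
       and `String constructors with these argument types. *)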


That's a type definition. You can't get away without writing out that definition.

If you wanted to, you could use

    Either (Int, String) [Int]
in your code and rely on type inference to do all the work.


> You can't get away without writing out that definition.

Not in Haskell as far as I know, but (assuming MonomorphismRestriction) there's no technical reason why

  data Qwe = Q a
  qwe_eps :: Qwe
  qwe_eps = Q ""
couldn't infer a ~ String in the type definition for Qwe. It's just annoying and pointless, because you never actually care about having to write an explicit type annotation there. (Which was my point.) But it's still an explicit type annotation that you're required to write. (Which was (kind of) chrisseaton's point.)


I guess the idea is:

data Foo a b c = FooBar a b | FooBaz c

or

data Foo a b = FooBar a b | FooBaz [a]

But I agree that writing explicit type annotations is overall better practice.


That declares 'Foo' with kind * -> * -> * -> * , i.e. a generic type. Whereas I was declaring with kind * , a specific type that might have inferred components. Compare:

  quux :: a -> a
  quux x = x
  # quux takes any type genericly
  
  frob y = y () ++ [()]
  # frob takes some specific type, but we don't need to specify which (spoiler alert: it's () -> [()])
Edit: actually the type is generic (not polymorphic) as well.


Why does the kind matter? Both of your examples use type inference and polymorphism - the type 'a' is inferred when quux is used, the type for the second example is inferred from the polymorphic types of (++) and `$`, and both allow to code more freely while in flux.


How much simpler would OCaml or Standard ML be if there were no type inference? I'm perfectly happy to never use type inference (maybe only for local variables).

It feels like people get distracted by H-M inference and forget that there's a pretty nice high level language in SML or OCaml anyway (although I'd rather have modular implicits).

Am I wrong in thinking implementation would be much simpler without type inference?


I think it depends on the algorithm & implementation, but in at least some cases type inference adds very little code over type checking. (To some degree, both are propagating and unifying types through the program and throwing an error when trying to unify two things that aren't the same. And depending on the language, you've got to have some non-trivial machinery for handling function overloads, which can require inference-y reasoning anyway (like calling an overloaded function immediately with the result of another overloaded function).)

On the other hand, non-local inference can make things more complicated in some situations, depending on how hard you try. For instance, Rust gives up if you don't know the type of a variable and you try to access a member of it, whereas I once wrote a more complicated iterative solver that would see if it could infer the type of the variable from later context and return to the member access later. For example, consider the following Rust code, where the compiler could figure out the types with some additional complexity but doesn't bother:

  fn main() {
      let a = (1..10).collect();
      println!("{:?}", a.len());
      let b: Vec<i32> = a;
  }

  error[E0282]: type annotations needed


> To some degree, both are propagating and unifying types through the program and throwing an error when trying to unify two things that aren't the same

In fact they're so similar that they can be combined to give 'bidirectional typing'. This is mostly useful for systems that are more expressive than Hindley Milner. For example, dependently-typed languages like Agda and Idris require type annotations for top-level definitions (local definitions can sometimes be inferred). Bidirectional typing is a neater approach to such hit-or-miss inference approaches, and also allows values to be inferred (your function-overloading example is a case of value inference: using the types to figure out which particular function to call).


>Rust gives up if you don't know the type of a variable and you try to access a member of it, whereas I once wrote a more complicated iterative solver that would see if it could infer the type of the variable from later context

My guess is that Rust's designers would reply that their compiler is slow enough without the complication you describe.


Your comment has me feeling quite bullish on TypeScript type inference


HM is pretty easy to implement, actually. You just need a unification function; the rest is just a laundry list of rules available everywhere, like Wikipedia. Source: I implemented a typed Python dialect as a college project that used HM.
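In case anyone's curious what "just a unification function" looks like, here is a bare-bones sketch in OCaml (my own naming, arrow-and-int types only, no generalization):

    type ty =
      | TVar of tvar ref
      | TArrow of ty * ty
      | TInt
    and tvar = Unbound of int | Link of ty

    let counter = ref 0
    let fresh () = incr counter; TVar (ref (Unbound !counter))

    (* Follow Link chains so we always work on the representative. *)
    let rec repr = function
      | TVar ({ contents = Link t } as r) ->
          let t' = repr t in
          r := Link t';   (* path compression *)
          t'
      | t -> t

    (* Occurs check: does variable v appear inside t? *)
    let rec occurs v t =
      match repr t with
      | TVar r -> r == v
      | TArrow (a, b) -> occurs v a || occurs v b
      | TInt -> false

    let rec unify t1 t2 =
      match repr t1, repr t2 with
      | TVar r1, TVar r2 when r1 == r2 -> ()
      | TVar r, t | t, TVar r ->
          if occurs r t then failwith "occurs check failed (infinite type)"
          else r := Link t
      | TArrow (a1, b1), TArrow (a2, b2) -> unify a1 a2; unify b1 b2
      | TInt, TInt -> ()
      | _ -> failwith "type mismatch"

    (* e.g. for `fun f -> f 1` the rules generate: unify (type of f) (int -> 'b) *)
    let () =
      let f = fresh () and b = fresh () in
      unify f (TArrow (TInt, b))

The laundry list of rules then just walks the syntax tree calling `fresh` and `unify`.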


> How much simpler would OCaml or Standard ML be if there were no type inference? I'm perfectly happy to never use type inference (maybe only for local variables).

Just the opposite: type inference forces the type systems to remain simpler. Without type inference you could make them more complicated.


Implementation would be simpler, but only by a small degree. To be honest, HM is relatively straightforward to actually implement.


There's some really cool programming language theory that makes things like type inference easier than you think. It's still hard but there's rules and math and stuff for all of it.


You can't actually have a statically-typed language without type inference. Otherwise if you tell the compiler `(5 + 2) : String`, how would it report a type error?


That’s not type inference, just type propagation - tons simpler.


Sorry but what's the difference? Inference is a way to infer the type of any given expression. How does type propagation contrast?


Propagation is way easier, and basically just requires tracking types of values & parameters. To infer the type of `a` in

    var a = 1;
I just need to figure out the type of `1`, then realize that the type of `a` is the same.

Slightly harder would be typing union / intersection types - e.g. this:

    var a = some_test() ? "foo" : new Bar();
Java would "infer" (kind-of) the type of `a` to be something like `Serializable & Comparable<? extends Serializable & Comparable<...>>` as both `Serializable` and `Comparable` are supertypes of both `String` and `Bar` (well, depends on `Bar` of course). This starts to show how much trickier things can get.

Now, imagine knowing nothing about what can be stored in the variable, and only seeing how it's used - this is the case for function parameters.

    (x) -> x + 1;
Is `x` a `float`? An `int`? A `long`? Maybe a vector? Basically, imagine the most complex type you can write in a given language, and then consider that the compiler would need to be able to prove that this is the only (or at least the most specific) type you could use.


Hmm, all of your examples can be inferred by Hindley-Milner. Of course, for the second one it would be a type error assuming that `Bar` is not a subtype of `String`. I'm not seeing how type propagation of a complex expression could be any simpler or easier than full-fledged type inference of that expression. For example,

    let x = "foo" |> bar |> baz 5
What's the type of `x`? You might say that we can figure it out by propagation, but how's that different from inference?


Indeed, what I'm saying is that inference is more powerful (i.e. better) than propagation. Most compilers only implement propagation, not full inference, allowing programmers to skip type annotations on local variables (since their types can be propagated from parameters and expressions), but not on parameters.

Basically, there's two complications with type inference. One is type variables that you need to then "solve" - in my last example above, the type of parameter `x` would start as a type parameter α (usually denoted by Greek letters in type systems literature), then it would be "unified" with `int` when used "like" an int (i.e. passed to a function `+ : (int, int) -> int`).

This is what happens in a language with a "simple" type system like OCaml, with no subtyping. (OCaml is slightly more complicated, but still, all the equations it needs to solve only involve equality (I think)). If your type system features subtyping, things get a bit more complicated.

    (x) -> new List(x + 1, concat("test", x))
This now generates two constraints, `α <: int` and `α <: string`, that the compiler needs to solve. One solution would be an intersection type `string & int`. Does this type make sense? Does it even exist? Will the programmer understand it? The question isn't only finding the solution, but also which solution is "the best" - how to even define "the best"? I'm not aware of any good answers to this question (not even MLSub, it only works for simple cases).

The other big issue type inference deals with is generalization. If, at the end of inferring an expression's type, there are no equations involving a type variable (a bit of a simplification, but let's go with it), then it can be "generalized" - turned into a polymorphic type.

    (x) -> new List(x)
We might be able to assign the type `forall α . α -> list[α]` (or, as this would be written in Java / C++, `<T> T -> List<T>`), but again this depends on multiple conditions (is `x` used elsewhere? does it escape the scope? is `List` mutable?), and can get much more complicated with more features (e.g. upper and lower bounds, like Java's `<U extends Number>`)

In your example, figuring the type of `x` is trivial compared to above - just take the return type of `baz`.
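(To make the generalization point concrete, this is roughly what the OCaml toplevel does; the printed types are in the comments, and '_weak1 is how recent versions print a variable it refused to generalize.)

    let id x = x          (* generalized:     val id : 'a -> 'a *)
    let singleton x = [x] (* generalized:     val singleton : 'a -> 'a list *)
    let cell = ref []     (* NOT generalized: val cell : '_weak1 list ref
                             -- the "value restriction": a mutable cell must not
                             be polymorphic, so the variable stays fixed until
                             its first use determines it *)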


Local variables seems to be the sweet-spot for me, especially when I'm working on a project with others.


Yes, the main reason to give type annotations is to make debugging easier and to serve as documentation.

In eg Haskell, you often write the types first, even though in principle you seldom would have to write the types at all.


In Rust, I'll usually add a type hint for clarity if I have a massive method chain full of confusing transformations. Sometimes I'll also temporarily add them to closure arguments so that autocomplete is happier.

But otherwise you're right... they're almost never actually needed unless there's an actual ambiguity (e.g. what kind of collection should .collect() produce?)

It's pretty great. I can confidently do a whole pile of reckless-looking stuff and be reasonably confident that I've done what I intended to do once the compiler is happy.


I do this in Scala, given that scalac is much slower than the OCaml compiler and its inference is not as stellar. Regardless, explicit type annotations make code so much easier to read, anonymous lambdas being an important exception.


The real footgun in Scala, to my mind, is implicits.

They are inferred, and that's the point of their ergonomics, but when they are inferred incorrectly, it becomes pretty confusing and sometimes cryptic.


Not to make excuses for the poor errors in the default language, but the "splain" plugin can help a lot, as can ctrl-shift-P in IntelliJ.


The HM type inference used by OCaml breaks down when you add in nominal subtyping, which is common in mainstream static languages like Java. OCaml handles this by giving up subtyping; there is none in the core language, and the (rarely-used) object system uses a form of structural subtyping based on row types.


OCaml also has a really good subtyping implementation in the often-used polymorphic variant system. I'll just quote from http://roscidus.com/blog/blog/2014/02/13/ocaml-what-you-gain... :

> Their real purpose is to support subsets and supersets, which are useful all over the place.

And indeed they are super useful, e.g. I am using them to implement functions that deal with web responses: some can deal with only HTTP responses, some with only WebSocket responses, and some with either. All completely type-safe and extensible.
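A stripped-down sketch of that setup (constructor names made up): the handlers never share a declared type, and the more general one accepts anything the narrower one does.

    let describe_http = function
      | `Http code -> Printf.sprintf "HTTP %d" code

    let describe_any = function
      | `Http code -> Printf.sprintf "HTTP %d" code
      | `Ws frame -> Printf.sprintf "WebSocket frame %S" frame

    (* Inferred, with no type declarations anywhere:
       val describe_http : [< `Http of int ] -> string
       val describe_any  : [< `Http of int | `Ws of string ] -> string *)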


I once wondered about subtyping[1] too. Looks like there's a possible solution - MLsub[2][3]. But it would require a significant amount of work to modify the OCaml inference engine to work that way. For now I just use functors for types with shared fields. It doesn't add too many lines of code either.

[1] https://discuss.ocaml.org/t/function-argument-polymorphism/3...

[2] https://www.cl.cam.ac.uk/~sd601/papers/mlsub-preprint.pdf

[3] https://github.com/stedolan/mlsub


While the object system is pretty rarely used [1], polymorphic variants see quite a bit of use and are also based on row types.

[1] Recently have been seeing a lot more use in BuckleScript because of the JS object representation


Just curious, how would one handle something like this in OCaml:

  x = function() { 
        if (rand() > 0.5) {
          return "abc"
        } else {
          return 2.01
        }
      }
So far, union types and boxing/unboxing are possible answers. Would type inference work in this case? (It would for boxing/unboxing, but what about union types?)


In Standard ML, it would be a type error to do that specifically:

    - if true then "abc" else 2.01;
    stdIn:1.2-1.30 Error: types of if branches do not agree [tycon mismatch]
      then branch: string
      else branch: real
      in expression:
        if true then "abc" else 2.01
You would need to define a new data type to wrap them:

    - datatype string_or_real = r of real | s of string;
    - if true then s "abc" else r 2.01;
    val it = s "abc" : string_or_real


Type error: your lambda doesn't have a valid static type signature. You would wrap that in a variant (sum) type:

type wtf = A of string | B of float

And then you'd return A("abc") or B(2.01) and your function would be unit -> wtf


Why's it a type error rather than inferring a sum type?


You can get it to infer the sum type by giving tags to your values with backticks, like `A 1.23 and `B "foo". This will be inferred as "`A of float | `B of string" without you having to declare any type. Without those tags, the system assumes you have made a mistake in returning values of separate types. If it didn't do that, you would not get type errors in many/most (all?) of the situations where you want the type system to help by indicating the inconsistency to you.


Giving tags seems like declaring a type though?


It's not though. If we think about Result as an example, `1`, `Success(1)` and `Failure(1)` are different values, not just different types.

I think the point you're making is that Standard ML has certain restrictions in its type system that are necessary to make type inference possible: there are no union types, no subtypes (more recent work has found a way to make subtyping compatible with full H-M), polymorphism has to be introduced explicitly with `let`, and there are no higher-kinded types. And in some cases you'd argue those restrictions might incur more code overhead compared to a language that loosens those restrictions at the cost of less reliable type inference (e.g. Scala).

That's sort-of-but-not-really true: in my experience union types and subtypes are always bad ideas and there are better alternatives for any use case where you would use them. You never actually want "String | Int", you want a type with meaningful semantics (like `Result<String, Int>`).


If you use the "polymorphic variants" feature with backticks it automatically infers the union. However, most of the time the regular non-polymorphic variants are easier to work with.

A good discussion of this can be found in https://dev.realworldocaml.org/variants.html#scrollNav-4


because "ssdf" is a string but B("asd") is user-defined sum type.

So in the original example, you can't unify "ssdf" and the int 33, but in the re-written version B("sfd") and A(33) have the same type and can be unified.


You might like the language Crystal, it works this way.

https://crystal-lang.org/


But Crystal recently had to walk back from global type inference as they found it was not realistic in practice, didn't they?


Why is it not realistic? Is it too slow or too unreadable?


You could dig up the mailing list messages at the time, but yeah it was slow and I think it produced error messages that were extremely hard to use - it would decide the type of functions was something nonsensical and then report that as an error at all the call sites instead.


Makes sense; bad errors are also why I don't use type inference in my code.


Don't take my word for it, but I don't think you can implicitly unify[0] two base types in HM.

https://www.cs.cornell.edu/courses/cs3110/2011sp/Lectures/le...


You cannot unify a string and an integer. 'Base type' is just an abstraction over the enumerable primitive types (bool, unit, char, etc). It's generally understood that these are simple cases to unify, so they aren't even mentioned most of the time.


If you are interested in a language that does something like that, then check out https://archive.is/Op8Mf or the other work by Stephen Dolan


In Haskell you would do it like this:

    import System.Random
    x = do y <- randomIO :: IO Float -- note: this is before there are instances of randomIO for IO Int and other types
           if y > 0.5 then
             return (Left "abc")
           else
             return (Right 2.01)


I think another interesting question is, how would you handle the return value of such a function in your code?

You would have to either:

- test for the type of that value in order to handle it properly,

- rely on the implicit cast rules of JS, which wouldn't be very useful here I suppose

As it has been said by others, in Ocaml (as well as in many other strongly typed languages) you can use a sum type to solve that problem.

Edit: In some languages, there is also the concept of "intersection types" which, if I understand correctly, let one also handle that sort of situation. The corresponding Wikipedia entry [1] gives a list of languages supporting that concept, and provides examples.

[1]: https://en.wikipedia.org/wiki/Intersection_type


I've found this type of function useful for duck typing, and I've generally found duck typing to be a really useful tool to add polymorphism to code.


The point I was trying to make is that if you have to rely on runtime type information, you might as well use sum types in a more strongly typed language, where the scope of the code gets limited to the type itself.

With "duck typing", you need to have an extra test for types which your code is not supposed to work with, whereas with sum types, the type checker will help you verify that it cannot happen.

That being said, there are situations which a strong type system will not be able to model.

Anyway, at the end of the day, what matters is that you (and your coworkers) feel comfortable with your code. I know by experience that it's not always the case.


I find unions without a tag hard to handle.

To expand on another commenters example, suppose you have a sum type that models successful computation with a return value or a failure with an error message.

If you have tags / constructors, that's all easy. But if you just use naked unions, you can not write code that deals with error messages (or strings in general) in the success case.


I can't speak for OCaml, but in Elm, an if expression (as well as a function) must return the same type from every branch. Therefore the code you posted would be invalid.


To expand slightly, the Elm compiler will help you out here and show you what you need to fix. Given this function body

  randomReturner n =
      if n < 0.5 then
          123
      else
          "abc"

The compiler will give you the following info:

  The 2nd branch of this `if` does not match all the previous branches:

  22|     if n < 0.5 then
  23|         123
  24|     else
  25|         "abc"
            ^^^^^
  The 2nd branch is a string of type:

      String

  But all the previous branches result in:

      number

  Hint: All branches in an `if` must produce the same type of values. This way, no
  matter which branch we take, the result is always a consistent shape. Read
  <https://elm-lang.org/0.19.1/custom-types> to learn how to “mix” types.

  Hint: Try using String.toInt to convert it to an integer?
You can try it out on Ellie here: https://ellie-app.com/9nXxfKSStG4a1


In TypeScript, x would be of type () => "abc" | number.


Wouldn't it be string|number unless using the `as const` modifier?


It used to be. But more recent versions of TypeScript (I believe since 3.5) are much stricter with strings in implicit unions. It's really quite helpful.

Here's an example where the feature can save you:

```
let x = () => {
    if (Math.random() > 0.5) {
        return "abc"
    } else {
        return 4
    }
}

// ERROR: This condition will always return 'false' since the types '"abc" | 4' and '"sdf"' have no overlap.
if (x() === "sdf") {
   do_work()
}
```


In F# (which is very similar to OCaml), you’d have to explicitly cast the return value to “obj” to get this to compile.


Just to clarify to people not familiar with the two languages and their differences, this is somewhere where the two languages diverge greatly. OCaml does not have an "obj" type that we can cast to (and in general casts are quite rare).


You would need to declare that the function returned a sum type (string | int).


> I could not understand why most static languages required writing so many types that were easy to deduce, or why people were saying that not having to write types made dynamic languages better.

In Scala, a lot of libraries end up with deduction-only types, i.e. types that can only be deduced, because actually typing out the type signature is very long, complicated, and filled with generics-upon-generics. I don't find that to be a pleasant outcome: if I can't fill in the type signature myself, trying to understand what the code is going to do ends up that much more complicated. Yet that's not an uncommon situation.


But why the complexity? Most libraries should have "simple" types.


They mostly do. The cases where you get complex types is where you are using dependent typing. For example, imagine you wanted to write an "update" method. So we have this datatype:

    data Address(line1: String, line2: String, postcode: String)
    data User(name: String, address: Address, phoneNumber: String)
And here's an example of what "update" should do, in JSON:

    // Argument 1
    {
      "name": "Bob",
      "address": {
        "line1": "123 Fake Street",
        "line2": "Beverly Hills",
        "postcode": "90210"
      },
      "phoneNumber": "555-555-5555"
    }
    // Argument 2
    {
      "address": {
        "line1": "377 Fake Street"
      },
      "phoneNumber": "555-555-4444"
    }
    // Result
    {
      "name": "Bob",
      "address": {
        "line1": "377 Fake Street",
        "line2": "Beverly Hills",
        "postcode": "90210"
      },
      "phoneNumber": "555-555-4444"
    }
This kind of function is the kind of thing that keeps people using dynamic languages. It's easy, and obvious, what it should do, and what its type signature is, but in most typed languages you can't write it without constructing a whole parallel hierarchy of datatypes by hand.

Scala is one of about 3 languages where you can actually write this in a nice way and it will be correctly typed. But the type you get will be:

    (User, Field[Label['name], Option[String]] :: Field[Label['address], Option[Field[Label['line1], Option[String]] :: Field[Label['line2], Option[String]] :: Field[Label['postcode], Option[String]] :: HNil]] :: Field[Label['phoneNumber], Option[String]] :: HNil) => User
And until recently it was even worse than that because the compiler didn't know to print the "::" types as infix.

Now with first-class record support in the compiler you could improve this a little bit, but fundamentally there's not really any better way to write that type. Maybe if the language had first-class support for dependent types like Idris you could write it as `(User, UpdateTo(User)) => User` or something, but the reader would still need a way to find out what `UpdateTo(User)` was concretely. So what do you do?


They should, but when the types are all inferred, it’s easier to hide that complexity.


Interesting that you say that. I come from a very different background but have always thought that type inference hurts the readability of code and should be used sparingly.

I wrote up the thoughts in a post [1] that seemed to get extreme reactions from both camps [2].

[1] Type inference has usability problems. http://web.eecs.utk.edu/~azh/blog/typeinference.html

[2] HN discussion. https://news.ycombinator.com/item?id=20296123


Careful, there are multiple ways to use type inference.

In practice, many people use type inference when exploring and writing code; but then add explicit annotations for readability and documentation. You could even do the last step automatically.


Does the bytecode OCaml compiler compile faster than Go? What about Bucklescript? I've never experienced issues with OCaml compile times but I wonder how it compares against a widely acknowledged "fast" compiler.


I think it's safe to say it's fast in the general sense; the first time I compiled Revery I didn't believe that it had actually worked, it was so fast. Just tried it again: 0.98 secs to compile 20-25k lines of native OCaml/Reason.

I'd be interested if anyone found/ran any benchmarks, though I think it's pretty hard to write a fair compile-time benchmark. I did a quick search and came across this post comparing solving the same problem with Haskell/OCaml/Go, which just kind of offhandedly mentions the compile times. OCaml versions range from 0.3-0.8s and Go was reported as ~1s.

https://pl-rants.net/posts/haskell-vs-go-vs-ocaml-vs/


In that case maybe generics aren't a compile time killer. (Although I'm quite curious how the OCaml compiler compiles so quickly).


Go people have a very weird understanding of generics, and that messes up many discussions.

They typically mix up parametric polymorphism and ad-hoc polymorphism.

I blame C++'s templates for the confusion. (And Go people's common refusal to look much further, and especially not into academia..)


> And Go people's common refusal to look much further, and especially not into academia..

I don't think that's a fair statement, given Phil Wadler's formal work on generics in Go - "Featherweight Go" (https://arxiv.org/abs/2005.11710) - which is acknowledged here: https://blog.golang.org/generics-next-step


Yes, hence my weasel word with 'common' and 'most'.

It's good to see that they are taking the likes of Phil Wadler seriously. (Perhaps his beard helps?)

> We’d like to thank Philip Wadler and his collaborators for thinking formally about generics in Go and helping us clarify the theoretical aspects of the design.

In the older post at https://blog.golang.org/why-generics you still see them pretty much mix up parametric polymorphism and ad-hoc polymorphism.

Thanks for the link to the paper!


Technically OCaml typechecking is exponential in the length of your program, but really that's just because it's exponential in the depth of your most deeply nested `let` construct, and most people don't actually write programs that look like

    let a = ...
      in let b = ...
        in let c = ...
          in let d = ...
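(For the curious, the textbook way to actually trigger the blowup looks something like this: each binding uses the previous one twice polymorphically, so the size of the inferred type explodes and the printed type of `d` is already enormous.)

    let a = fun x -> (x, x) in
    let b = fun y -> a (a y) in
    let c = fun z -> b (b z) in
    let d = fun w -> c (c w) in
    d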


Wait, that's actually really common in OCaml though. I mean not the indentation of course, but every series of let-bindings is essentially a single nested expression. That's just the way let-bindings work.


Generics in the sense of C++ templates can be a compile time killer. However generics in the sense of HM type systems are basically just asking the compiler to fill in the type for you, which it has already figured out. HM type inference in general is ~O(n) (and exponential in the worst case, but you basically have to try to do that)


Right, so there are two reasons that C++ templates hurt compile times:

1. C++ templates don't require you to treat the type variables as opaque; whether the body of a template type-checks can depend on how its type parameters are instantiated. By contrast, in OCaml (and most languages) generic functions cannot assume anything about their type parameter. This means that the compiler can type check the body of the generic function once and be done, rather than having to essentially check it at every call site.

2. Aside from type checking, C++ templates generate separate code for each instantiation of the type parameters. This (generally) results in faster code as it is specialized to the type in question, but it means more code is generated, which of course takes time (and it can also bloat the executables). This is probably a bigger deal than the type checking part. By contrast, OCaml only generates one copy of the code for a function, which simply doesn't care about the types of its arguments. This is also a common implementation strategy; Java does something similar.

Note that Rust also suffers from (2), but not (1).


You can get OCaml to behave like (2), but you need e.g. Generalised Algebraic Datatypes or some magic with modules to do so.

See https://blog.janestreet.com/why-gadts-matter-for-performance... and https://blog.janestreet.com/more-expressive-gadt-encodings-v... for why you'd sometimes want the behaviour in 2.

(I do agree, that that behaviour is a bad default.)


Yeah I wonder what the actual cost is of (2). I know the Standard ML compiler MLton takes the monomorphization (2) approach as well, with good success. I'm currently writing a toy SML compiler for fun, and I'm planning on doing the same, since you get better run time performance, assuming you're using that info to do unboxing.


There is ongoing discussion [1][2][3] about enabling more aggressive optimization for "Flambda" switches; in some cases this can bring more than a 2x speedup.

[1] https://github.com/ocaml/opam-repository/pull/16753

[2] https://discuss.ocaml.org/t/ocaml-4-11-first-beta-release/60...

[3] https://discuss.ocaml.org/t/ann-bap-2-1-0-release/5906/9


Note that code that relies on inference too much is really hard to read without an IDE. So reading it e.g. in the github web interface is a really bad experience.


It's because Hindley-Milner doesn't play well with subtyping.

You do get subtyping in Ocaml with modules and "module types", but in that case you are explicitly spelling out the module types. They're not inferred.


Note that there's some more recent research that bridges the HM-subtyping gap:

http://stedolan.net/research/mlsub.pdf

So this can be done, though afaik there's still some theory work to be done to combine it with some of the more advanced features in modern FP languages (e.g. higher rank types)


> my single use case for explicit types in OCaml was data deserialization

Data deserialization is where types are most useful


Local type inference can give you 90% of the benefit at a fraction of the cost of a global type inference system. Simply introducing local variable inference probably captures 50+% of that and it's the simplest programming trick I ever saw.

When I first came across it in the gosu code base (https://gosu-lang.github.io) I said:

"Wait. That's all you do? You just take the type from the right hand side and then put it on the left hand side?"

"Yep."

"That's type inference?"

"Yep."

It called into question many aspects of my computer science education.


I agree. C++ has had some form of local type inference when it comes to inferring template types, but in C++11 the auto keyword makes a tremendous improvement.

Even in Haskell, where global type inference is supported, people end up putting type signatures on all top-level declarations, because doing otherwise makes code less readable and makes type error messages way too confusing. Global type inference means a single type error in a single function can be propagated by the compiler arbitrarily far away, to the call site, or even several layers deep in a chain of call sites. Solving such errors resulting from global type inference simply isn't worth the time.


I've never written Haskell, but I'm an Agda programmer (a Haskell-like language with dependent types) and I almost always write all the types and don't rely on type inference. The obvious exception is something quick like

  a = b
otherwise I'd always do

  a : Some -> Type -> MoreType
  a = f b
Readability is a reason why. Types are extremely pedagogic tools. When I think about my code, I just think about types. I don't think about computation, just the topology of types. Can I get to Int from X? If not, why? This is very different from when I write Python code. So seeing types in code helps me a lot in understanding what's going on.

But the real reason is: errors. It's so much easier to understand errors when types are explicit.


Agda doesn't have HM-style type inference. Furthermore many times the type is the main content of the program, not so much the value.


This is true, I should have mentioned it, although my point still applies to HM style type inference as well.


To a point at least, HM type inference is decidable, but type inference in haskell with certain extensions or in agda (at least with certain pragmas) is not.

The possibility of a diverging type-level function would require the type inference algorithm to solve the halting problem.

(As I understand it at least, been a while since I got into the details)


so, I'm curious, do you write the type of the function first?

When I used to write haskell, I'd very often write a function, iterate on it until it did what I want, then infer the type of the function and insert that into the program. Now, I was relatively new to haskell, so was probably less comfortable writing types than reading them (so I assume it was nicer to have the compiler do that for me, eyeball it to check it's what I meant to do, and put it in).

I'm wondering if someone more familiar would think "alright, what type do I want my function to be", fill that in, and then write the type out?

I definitely agree that a program is more readable with types, which is why I always filled them in (I just did it afterwards!)


I do both in Haskell. Sometimes I write the types first and sometimes I write the function first.

Usually, the functions with more imperative flavour get their implementation first, and the more functionally flavoured bits get their type first. But that's not always the case.

Sometimes I even write the QuickCheck properties first.


Again, I can't speak for Haskell.

In Agda you usually have two goals when creating a function:

* Proving a property of some other object. In this case you have to write the type first, since the type is the theorem you want to prove and the implementation is the proof.

* Implementing a transformation. In this case, as you described, you might get away with writing the implementation first and the type afterwards, but I personally never even attempt that, since writing out the type gives me a lot to think about for the implementation. Once the types are explicit, the implementation usually ends up being a lot more obvious.


Python has optional typing though.


Python has optional type hinting for optional static analysis tools (e.g. mypy). Types aren't checked at runtime.


The terminology I used was correct. Python has optional typing. What you are thinking of is gradual typing which Python lacks. See "Is Sound Gradual Typing Dead?" for a description on what the differences are.


If you pass a string-typed var to a func with an int-typed parameter, it will not fail at runtime. `python the_file.py` will not even emit a warning, much less a failure, because there's absolutely no runtime difference between "has types" and "does not".

That's not optional typing. It's just a standardized magic comment (which the reference implementation ignores). It's far from useless, but it's not adding typing to the language.


Yes, that is "optional typing." Read the article I referenced.


That appears to be a doc about Racket, which it implies does have run-time checks, from the third sentence here:

>In particular, Typed Racket’s run-time system remains largely untyped. As a result, even the completely typed configurations of our benchmarks usually import constants, functions, and classes from an untyped module in the run-time system. When these values cross this boundary at run-time, the contract system performs checks, and that imposes additional costs.

But I haven't read the whole thing. It doesn't appear to be very related - it seems to be about gradual typing in a language that does run-time checks (with ways to disable them for performance purposes).

Typed python has no additional checks than untyped. So it's still untyped. You wouldn't claim my "// int" comments add optional typing to JavaScript, would you?

---

edit: here, take it from the source instead: https://docs.python.org/3/library/typing.html Note the title is "support for type hints" and this prominent note is right at the top:

>Note: The Python runtime does not enforce function and variable type annotations. They can be used by third party tools such as type checkers, IDEs, linters, etc.


It's not a doc about Racket, it's a conference paper. Had you read it you would have found "Optional typing can be traced as far back as MACLISP, which allowed users to declare (unchecked) type specifications [15, §14.2] in an otherwise untyped language." and "Contemporary optional type systems have been developed for Clojure [8], Lua [14], Python, [10]," where the footnote references MyPy. Type hinting is the mechanism through which optional typing is implemented.


MACLISP I can't really comment on as I don't know it and Wikipedia makes no mention of this, but yeah - MyPy is definitely an optionally-typed python code checker. It fits that category for both comment-based and in-language type hints, because it performs checks based on that info.

For Python itself though: if you have a syntactic element which does not change behavior in literally any way, whether present or absent and regardless of what it contains: sounds like a comment to me.


Type inference in C++ (and Go) is still very limited. It only works in the 'forward' direction.

In Haskell, OCaml and eg Rust, the compiler can infer the type of a new variable depending on how you are using it later.


auto can be dangerous in that it can introduce unexpected copies, i.e. auto vs auto&. You could just use auto&& and be correct most of the time (unless you actually want to make a copy).

Herb Sutter's Almost Always Auto advice was not good, in my opinion. I still find I need to know the actual type, so when I do use auto it's typically only when the RHS of the assignment already mentions the type: static_cast, make_unique, etc.

I find it saves on typing but not so much for readability. There are different rules if you're writing an STL algorithm versus a final concrete program.


That's "type inference" according to some people's definitions but not others. In any case it's pretty trivial; it only gives you 90% of the benefit if you're starting from a really low baseline.


What do you mean by local type inference? Things like let? If so even languages without HM like Agda allow that. HM type inference is specifically for finding the type of the arguments of functions.


> You just take the type from the right hand side and then put it on the left hand side?

That's what they mean by local variable inference


I do not really understand what this means. Something like given y : yt and let x = y we can infer that x : yt?


Yeah.


Would it help with this case? https://news.ycombinator.com/item?id=23777197


This falls short with generics, e.g.

    let map = HashMap::new(); // <K,V> unknown
    map.insert("hello", 1);


I think that works in Rust, and they only do local inference (in the same function).


I've been really enamored of bidirectional type systems [0] lately, which seem to really give the best of all worlds. They explicitly separate inference from type checking, with the type checker switching between both modes during checking/inference. It's also more syntax directed, which makes good error messages easy.

A pure HM type-system is pretty restrictive, so most ML family languages don't have global type inference anyway.

[0] https://arxiv.org/pdf/1908.05839.pdf


Yea bidirectional systems are nice - one of the big benefits is that you can also have polymorphic function arguments, something you can't do in standard HM.

I recently implemented the bidirectional system in Rust from the paper you linked, it was fun (and a challenge!)

[0] https://bit.ly/2W96K2r


Wow! Very cool!


The great thing about languages that implement HM types is that they allow you work as quickly or as carefully as you'd like without needing to change languages.

If you've done something a million times before and are familiar with how it works, you can leave off all the type hints, and it'll figure them out for you and stay out of your way so that you can work more quickly. If you make a mistake somewhere, the compiler will still tell you.

If you're doing something new, or don't quite understand something, you're free to add type hints wherever you need them to gain clarity. You can even do stuff like:

    let var: () = ...;
if you have no idea what type something is and just want to ask the compiler for help.

Once you're done, you can tidy up if you think it makes your code more readable. You can still leave top level hints to be kind to your coworkers or future you.

I think the ability to dynamically shift between "cautious exploratory mode" and "reckless I-know-what-I'm-doing mode" at the drop of a hat is why you hear a lot of anecdotes about people using languages like OCaml or Rust as if they were scripting languages.


To expand on this, if you use CLion as your Rust IDE it will actually show you the types for all your variables automatically. For example, if you had a line like this:

    let foo = vec![1, 2, 3];
CLion would then show the type in a faded font where the type would go if you wrote it out manually like so:

    let foo: Vec<i32> = vec![1, 2, 3];
This is extremely useful when working with more complicated types such as iterator chains, because it lets you see exactly what type everything is without having to manually write it all out and have the compiler check it.


That's available with VSCode and rust-analyzer too (and I suspect other editors that piggyback on rust-analyzer get that ability). It can get noisy fast, but you can toggle hints on for when you need them.


Haskell, which uses a HM type system, has a type called Dynamic, which is described as follows:

>Finally, dynamically typed programming in Haskell made easy! Tired of making data types in your Haskell programs just to read and manipulate basic JSON/CSV files? Tired of writing imports? Use dynamic, dynamically typed programming for Haskell! [1]

If using an HM type system is really as easy and as fluid as you suggest, what purpose would be served by a type such as Haskell's Dynamic?

Nor is Dynamic a panacea: to use it requires a lot of extra typing (er, extra keypresses) IIRC, and

>A Dynamic may only represent a monomorphic value; an attempt to create a value of type Dynamic from a polymorphically-typed expression will result in an ambiguity error (see toDyn). [2]

1: https://hackage.haskell.org/package/dynamic

2: https://hackage.haskell.org/package/base-4.14.0.0/docs/Data-...


It's probably worth noting that the majority of the development on that library was done on April 1st 2019.


The relevant file of source code has, "Copyright (c) The University of Glasgow 2001", which jibes with my memory of having used Dynamic in GHC in 2004 or earlier:

https://hackage.haskell.org/package/base-4.14.0.0/docs/src/D...

Also notice that Dynamic is in the Haskell Package Repository's "base" package, described as containing "Basic libraries", not part of an ancillary area for user-contributed libraries:

https://hackage.haskell.org/package/base-4.14.0.0

There exist type systems of which great grandparent is an accurate description (the most well-known of which is probably that used by Typescript) but the type system of, e.g., Haskell, Ocaml or Rust is not it.

(Now I am getting curious what specific programming language's type system great grandparent had in mind.)


I assume aweinstock was referring to the first link you posted:

https://hackage.haskell.org/package/dynamic

Which is an entirely different package, and the relevant types are entirely different. Only the name is the same. The `dynamic` package does indeed appear to be an april fools joke.

The Dynamic type in base is about runtime reflection, which is occasionally useful but not something that comes up often and not really about rapid prototyping.


I didn't notice they were different packages.

>The Dynamic type in base is about runtime reflection, which is occasionally useful but not something that comes up often

It is not something that comes up often, but since I am used to languages like Javascript or Scheme, when it does come up, figuring out how to do it in Haskell, e.g., satisfying the monomorphism restriction, has occasionally diverted my focus away from the problem to be solved for hours while I learn more about Haskell's type system.

But yeah, what I just wrote is in no way inconsistent with the comment I initially replied to, namely,

https://news.ycombinator.com/item?id=23795687

so I am going to admit that my comments in this thread have not exactly improved the level of discourse.


This tutorial seems to miss the level-based generalization optimization, which is crucial for production-strength HM inference. For that, you can look at:

http://okmij.org/ftp/ML/generalization.html
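For anyone who hasn't read it, the core trick is small. A minimal sketch of just the representation, with my own naming (the article has the full algorithm):

    (* Every unification variable records the let-depth at which it was created.
       Entering the right-hand side of a `let` bumps the level; when that `let`
       is generalized, exactly the variables whose level is deeper than the
       current one are quantified, with no scan of the typing environment. *)
    type level = int

    type ty =
      | TVar of tvar ref
      | TArrow of ty * ty
    and tvar =
      | Unbound of int * level
      | Link of ty

    let current_level = ref 1
    let enter_level () = incr current_level
    let leave_level () = decr current_level

    (* Quantify a variable only if it cannot escape into an enclosing scope.
       (During unification a variable's level is lowered to the minimum of the
       two sides, which is what keeps escaping variables un-generalized.) *)
    let can_generalize = function
      | Unbound (_, l) -> l > !current_level
      | Link _ -> false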


Level-based generalization is a massive improvement in speed - I'm currently writing a Standard ML compiler for fun, and I saw a 30-50% decrease in elaboration (type-checking) time when I switched to using levels. And it's not difficult at all to implement.


I had to implement this in Haskell while in university, we started with an interpreter for a Scheme-like language, then added static typing and inference.

Parser combinators, monads, monad transformers, Curry-Howard. Think for 15 minutes, write 5 lines of code, repeat. What I remember is that the code was really elegant but my brain hurt so much.


Eventually you program enough FP, and then imperative starts to hurt your brain. Programming is much like a muscle. As you program in one paradigm, your brain starts to reason that way; then you switch paradigms and you're suddenly struck with a bunch of programming atrophy.


Imperative doesn't hurt my brain, but because many operations in FP are generally much more succinct and expressive than in an imperative style, writing imperatively instead just makes me groan about all the manual keyboard-typing I'll have to do (e.g. Linq vs foreach).

I really wish I could do more FP, but the languages and libraries I use for my day-job aren’t as-accommodating (mostly C#) - while C# has some FP features, it’s really held-back by the CLR’s type-system - so until that fundamental plumbing gets done we’ll never see features like type-classes, true immutable types, algebraic-types, and so on. Without those features we’ll have to keep on writing more code than is necessary.

<digress>Heck, it’s bad enough that IDictionary doesn’t implement IReadOnlyDictionary - or that none of the IReadOnly* types make any guarantees about immutability, which means having to review documentation or disassembly in ILSpy. And the famed non-nullable-reference-types in C# 8.0 is actually all just syntactic sugar for yet more attributes rather than true code-contracts, a built-in Option type, or extending the CLR’s type-system to understand nullability. Grumble.</digress>


> while C# has some FP features....we’ll never see features like type-classes, true immutable types, algebraic-types, and so on.

IIRC you just described some features of F#, excepting type classes. Or at least F# gets you closer to the goal.

Maybe the ever elusive F* has that stuff?


Why is Damas dropped when referring to HM type inference? Not being critical, just genuinely curious. I imagine there's a good reason.


I put 2012 as an upper bound on the year above: https://web.archive.org/web/20120324053910/http://www.ian-gr.... Perhaps it is earlier?


Learnt this in my compiler class. It was the most brutal week of my life.


IMO, the most brutal week was realizing that nothing you will ever use on the job will use H-M type inference. Back to smashing rocks together.


Take it from someone who occasionally has to work on an overcomplicated type system which uses HM inference: it ain't all that fun.


Could you be more specific? What makes it painful?

(I mean, "overly complicated" is bad news no matter what the specifics of what we're talking about, but how does it cause pain when it's specifically in the types?)


For me personally it's just a lot of really heavy and very "meta" code that is really hard for me to reason about. I also touch it very rarely, so there's always a steep learning curve when I'm jumping back into it.


You could use ReasonML in JS projects. It's even compatible with the rest of your codebase, if you don't want to start by rewriting everything.


Almost all modern statically typed languages use the ideas first coherently described in the Damas-Hindley-Milner approach. The main reason we do not see more full type inference is that full type inference becomes uncomputable pretty quickly once you move to more expressive typing systems.


You can go work for Jane street.


Perhaps more achievable: iOS dev with Swift. Not technically HM I don't think, but it functions more or less the same.


If my understanding is correct, type inference producing a set of possible types - instead of a single type - for each term has proven to be undecidable?

This is a shame, because sum types (e.g. Shape := Circle U Square U Triangle) are actually pretty useful, and languages such as Java ended up implementing them via class or interface inheritance. Languages often also required type coercion to deal with things like 0.2 + 5, otherwise + could have had a type signature like: float U int -> float U int -> float U int.
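For reference, in an HM-family language the Shape example is a tagged sum rather than an untagged union, and inference handles it fine; a minimal Haskell sketch (names are just for illustration):

  -- A tagged sum: each case carries a constructor label.
  data Shape = Circle Double | Square Double | Triangle Double Double

  -- No annotation needed; HM infers  area :: Shape -> Double
  area (Circle r)     = pi * r * r
  area (Square s)     = s * s
  area (Triangle b h) = 0.5 * b * h

The difficulties described above concern untagged unions like float U int, which force the compiler to track sets of possible types; tagged sums like this one fit HM inference without trouble.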


If my understanding is correct, yes, it's undecidable.

In practice nobody cares. It being undecidable just means you need an extra check in your compiler to verify that the types are converging, and a new error for the problematic case.


One of my favorite resources that develops the HM algorithm from scratch is Ben Lynn's[0], who treats it like a series of interview questions. Really, type inference algorithms match up two trees into constraints, which can contain concrete types or type variables, then solve those constraints (a rough sketch of the solving half follows below).

[0] https://crypto.stanford.edu/~blynn/lambda/hm.html
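To make the constraints idea concrete, here is a stripped-down sketch of the solving (unification) half in Haskell; the type and function names (Ty, Subst, unify, ...) are my own choices, not taken from Ben Lynn's post:

  import qualified Data.Map as M

  -- A tiny type language: type variables and applied constructors,
  -- e.g.  TCon "->" [TVar "a", TCon "Int" []]  for  a -> Int.
  data Ty = TVar String | TCon String [Ty] deriving (Eq, Show)

  type Subst = M.Map String Ty

  apply :: Subst -> Ty -> Ty
  apply s (TVar v)    = M.findWithDefault (TVar v) v s
  apply s (TCon c ts) = TCon c (map (apply s) ts)

  -- Compose substitutions: apply s2 throughout s1, then add s2's own bindings.
  compose :: Subst -> Subst -> Subst
  compose s2 s1 = M.map (apply s2) s1 `M.union` s2

  -- Solve one equality constraint t1 ~ t2, producing a substitution or failing.
  unify :: Ty -> Ty -> Maybe Subst
  unify (TVar v) t = bind v t
  unify t (TVar v) = bind v t
  unify (TCon c1 ts1) (TCon c2 ts2)
    | c1 == c2 && length ts1 == length ts2 = unifyMany ts1 ts2
    | otherwise                            = Nothing

  bind :: String -> Ty -> Maybe Subst
  bind v t
    | t == TVar v = Just M.empty
    | occurs v t  = Nothing              -- occurs check: reject infinite types
    | otherwise   = Just (M.singleton v t)

  occurs :: String -> Ty -> Bool
  occurs v (TVar w)    = v == w
  occurs v (TCon _ ts) = any (occurs v) ts

  unifyMany :: [Ty] -> [Ty] -> Maybe Subst
  unifyMany []       []       = Just M.empty
  unifyMany (t1:ts1) (t2:ts2) = do
    s1 <- unify t1 t2
    s2 <- unifyMany (map (apply s1) ts1) (map (apply s1) ts2)
    Just (compose s2 s1)
  unifyMany _        _        = Nothing

For example, unifying  a -> Int  with  Bool -> b  yields the substitution {a := Bool, b := Int}. The other half of the algorithm, which the post covers, walks the syntax tree of the program and emits these equality constraints.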


I code both ways simultaneously.

I'm interested in what types I need to solve my business problem. The rest of it, I don't care. The reason I care about the types I need for my business problem is that I don't want my code doing something that it shouldn't: DDD-style coding allows me to code in such a way that it's impossible for the program to exist in a state that's invalid. That's a great time-saver. Other than that, polymorphic coding FTW.
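A minimal Haskell-flavored sketch of that "impossible to be in an invalid state" idea (no language is named above, so the module and names here are just illustrative):

  -- Only this module can build an EmailAddress, so any value of that type
  -- elsewhere in the program has already passed validation.
  module Email (EmailAddress, mkEmail, emailText) where

  newtype EmailAddress = EmailAddress String

  mkEmail :: String -> Maybe EmailAddress
  mkEmail s
    | '@' `elem` s = Just (EmailAddress s)
    | otherwise    = Nothing

  emailText :: EmailAddress -> String
  emailText (EmailAddress s) = s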


The algorithm, the type system, and some of the logical background are explained in this tutorial, along with an implementation in Standard ML.

I found this 30-page PDF very helpful in understanding the algorithm: http://steshaw.org/hm/hindley-milner.pdf


One of the principal maintainers of the Racket language once gave a talk, I think it was at the main Clojure conference, about Typed Racket, where he strongly outlined the reasons why HM inference is a bad idea. It was interesting to hear these differences in philosophies.


Would you mind providing a link to this talk?


I've been searching for a few minutes and can't find it -- it was several years ago, maybe at least 5 years, and at one of the Clojure conferences (there used to be several big Clojure conferences each year back in the language's golden age, but now there's 0 - 1 per year), and I can't remember which conference.



That looks interesting as well, but it’s not the talk I was thinking of.


This may be my favourite explanation of the algorithm's logic: https://stackoverflow.com/a/42034379/20371


I'm starting to understand why HMTI and OCaml/Haskell et al. languish in such obscurity despite their fanboys swearing by them. The problem is that it's a crap technology in both easy mode and hard mode.

- easy mode: beginners just want to learn how to program as quickly as possible, and something like Python is fantastic for that. As a side benefit, you can do amazing stuff with it, like bypass the whole of symbolic AI of the 80s and 90s (including HMTI, OCaml/Haskell) and go straight to deep learning, BERT, GPT and what not.

- hard mode: mathematicians would love a helpful computing system that'll make them more productive mathematicians. When they attempt to explore stuff like Coq, HoTT, etc., the first thing that's thrown in their face is that they have to give up the law of excluded middle, non-constructive logic and what not. I'm sorry, but the tail does not wag the dog. If your theorem assistants and checkers require the mathematicians to turn their world upside down, most won't give a rat's ass about what you have to say about propositions-as-types and programs-as-proofs.

There is a medium mode, or shall we say mediocre mode, whereby you fall in love with type theory and type-based languages, you want to live in your own bubble and not be bothered by what's happening in the outside world. And that's where the users of these systems live.


This would be an interesting statement of an opinion if the language were toned down a little bit from flame-broiled mode to medium well.

You took all this time to write out your thoughts, why not put a little extra effort into massaging them in a way that will maximize their impact rather than just turning people off with flame-bait?


>that's thrown at their face is that they have to give up law of excluded middle, non-constructive logic and what not

This is just outright false. You can introduce it as an Axiom. In Coq this is literally just

  Axiom classic : forall P:Prop, P \/ ~ P.
I wouldn't make such patronizing statements without being very sure that I knew what I was talking about.

Also, mathematicians aren't the be-all and end-all. Theorem proving is also very relevant for CS, so it's not the end of the world if mathematicians keep using pen and paper for all eternity.


Plus, it's a theorem in Lean: https://github.com/leanprover/lean/blob/master/library/init/...

(Lean has a somewhat different type theory from Coq, where propositions in Prop have the property that all proofs of a proposition are defined to be equal. It turns out this, along with functional extensionality (that two functions are equal iff their evaluations are all equal), is enough to get LEM, but only in Prop. Outside Prop there's a stronger commitment to constructibility. Still, every definition and lemma has some program backing it.)
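If you want to see it in use rather than in the library source, a couple of Lean 3 one-liners (these use the stock classical.em theorem, nothing user-postulated):

  -- Excluded middle is already a theorem in Lean's library.
  example (p : Prop) : p ∨ ¬p := classical.em p

  -- And it supports the usual classical moves, e.g. double-negation elimination:
  example (p : Prop) (h : ¬¬p) : p :=
  or.elim (classical.em p) (λ hp, hp) (λ hnp, absurd hnp h)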


I wouldn't really say that it has a different type theory, just that it has proof irrelevance and functional extensionality built into the kernel as axioms. But yeah, it's a good way to show that you definitely don't have to give up LEM.


I do not get it. HM lets you do things like f x = fold g x without writing the types; how is that not easy mode compared to C or Java, for example?
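For instance (a small sketch; the names twice, pairUp, mapTwice are mine), in GHC these definitions carry no annotations at all and still get fully general types:

  -- No type signatures anywhere; GHC infers the most general types.
  twice f x = f (f x)          -- inferred:  twice :: (a -> a) -> a -> a
  pairUp x y = (x, y)          -- inferred:  pairUp :: a -> b -> (a, b)
  mapTwice f = map (twice f)   -- inferred:  mapTwice :: (a -> a) -> [a] -> [a]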


It's easy mode when writing, not when reading.


I disagree, you can have your IDE display the types if you so wish.

Even then, would you say that python for example is hard mode?


In Python, using single-letter function names is not encouraged though. In general, I found that there is a tipping point above which more advanced abstractions actually decrease the readability of the code, because the ratio of implied to explicit code gets unbalanced. The verbosity of C/Golang/Java is actually a benefit here, since they remain readable even to new team members, hungover team members, new parents on 2 hours of sleep, and generally anyone without a good model of all the implicit meaning behind the abstraction in their head at this moment. This may or may not include "future me" scenarios where you come back to some overly clever code in 5 years and have to relearn 3 separate DSLs that were hip at the time.

The IDE experience for Haskell also... leaves something to be desired atm. HIE and ghcide are cool projects, but I regularly see even experienced Haskellers struggle to install them (and/or keep them working after updates).


On the subject of the tail wagging the dog, mathematicians have been fighting about non-constructive proofs and the law of excluded middle far longer than they've been writing proof assistants, and deriding a direction of active and fruitful research just because you're not personally interested in it seems rather myopic.


>mathematicians have been fighting about non-constructive proofs and the law of excluded middle far longer than they've been writing proof assistants

Yes, and Brouwer lost that fight. Classical logic is an implicit assumption for professional mathematicians. LEM is simply taught as the truth in university courses. My SO, who is a math PhD student, hadn't even heard of intuitionistic logic before.

If you want working mathematicians to use theorem provers, they need LEM. But this isn't actually all that much of an issue, since you can just introduce LEM as an axiom. It breaks the computational interpretation, but it's not like proof assistants using classical logic would have this either.


> LEM is simply taught as the truth in university courses.

So is[0] the notion that quantum systems magically undergo non-local, non-linear, non-unitary, CPT-violating, Liouville's-Theorem-violating, non-deterministic evolutions whenever a macroscopic system that happens to implement a sapient intelligence looks at something[1]. So that's not a very good yardstick for "this is not obviously false".

0: sample size of one university (circa 2010) for "simply taught as the truth", but I think it's recent enough and widespread enough to make a good example. See [2] if you want to investigate further.

1: aka the collapse postulate or Copenhagen interpretation

2: https://en.wikipedia.org/wiki/Copenhagen_interpretation#Acce...


I would argue that trying to describe our physical world and reality has only one "axiom", which is: your description has to be consistent with our observations of the world. Mathematics doesn't have this limitation, so I don't think your example is valid. If you don't like LEM, fine, maybe there is something interesting in systems without LEM, but most math is done with LEM. I don't think it is a matter of "right" and "wrong".


> that they have to give up law of excluded middle

They have to give up the law of the excluded middle because the law of the excluded middle is false. Counterexample: "This statement is false." You can pretend it's true by trying to outlaw self-reference, but Gödel proved that that doesn't actually work if your logic is useful enough to support arithmetic.


I'm not sure what you're talking about, unless you're saying truth is provability. (Classical logic would disagree.)

The Lean proof assistant, which has a pretty strong commitment to constructability, has the law of the excluded middle just fine: https://github.com/leanprover/lean/blob/master/library/init/...

The caveat is that it only exists in Prop, and it's a consequence of proof irrelevance and functional extensionality. Types outside Prop don't have it, for the simple reason that there's no sensible way to construct an element of Either A (A -> Void) in general (this is quasi-Haskell notation, where Void is a type with no terms). I'm not sure there's any need to bring Gödel into this...
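To spell that out in the quasi-Haskell notation (a small sketch; Not and nnLem are names I'm making up here): the excluded-middle type has no general inhabitant, but its double negation does, which is the standard constructive compromise:

  import Data.Void (Void)

  type Not a = a -> Void

  -- There is no total, general definition of   lem :: Either a (Not a)
  -- but the double negation of excluded middle is constructively provable:
  nnLem :: Not (Not (Either a (Not a)))
  nnLem k = k (Right (\a -> k (Left a)))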


I don't understand what "LEM is false" means. What are your starting assumptions that lead to the "LEM is false" conclusion? You can use/assume LEM in your system or not, most mathematicans assume LEM (without even thinking about it).



