
Just curious, how would one handle something like this in OCaml:

  x = function() {
    if (rand() > 0.5) {
      return "abc"
    } else {
      return 2.01
    }
  }

So far, union types and boxing/unboxing are possible answers. Would type inference work in this case? (It would for boxing/unboxing, but would it for union types?)


In Standard ML, it would be a type error to do that specifically:

    - if true then "abc" else 2.01;
    stdIn:1.2-1.30 Error: types of if branches do not agree [tycon mismatch]
      then branch: string
      else branch: real
      in expression:
        if true then "abc" else 2.01

You would need to define a new data type to wrap them:

    - datatype string_or_real = r of real | s of string;
    - if true then s "abc" else r 2.01;
    val it = s "abc" : string_or_real


Type error. Your lambda doesn't have a valid static type signature. You would wrap the result in a union (variant) type:

type wtf = A of string | B of float

And then you'd return A("abc") or B(2.01), and your function would have type unit -> wtf.
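
For readers more at home in TypeScript, roughly the same wrapping can be sketched as a tagged union. This is only an illustration: the names `Wtf`, `tag`, `pick` and `showWtf` are made up, and the random source is passed in as a parameter purely so the example is deterministic.

```typescript
// Hypothetical TypeScript analogue of `type wtf = A of string | B of float`.
type Wtf =
  | { tag: "A"; value: string }
  | { tag: "B"; value: number };

// The random source is a parameter (an assumption, for testability only).
function pick(rand: () => number = Math.random): Wtf {
  if (rand() > 0.5) {
    return { tag: "A", value: "abc" };
  } else {
    return { tag: "B", value: 2.01 };
  }
}

// Callers have to inspect the tag before they can use the payload.
function showWtf(w: Wtf): string {
  switch (w.tag) {
    case "A":
      return `string: ${w.value}`;
    case "B":
      return `float: ${w.value}`;
  }
}
```

Both branches of `pick` now return the same type, which is the property the type checker was asking for.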


Why's it a type error rather than inferring a sum type?


You can get it to infer the sum type by giving tags to your values with backticks, like `A 1.23 and `B "foo". This will be inferred as "[> `A of float | `B of string ]" without you having to declare any type. Without those tags, the system assumes you have made a mistake in returning values of separate types. If it didn't do that, you would not get type errors in many/most (all?) of the situations where you want the type system to help by indicating the inconsistency to you.


Giving tags seems like declaring a type though?


It's not though. If we think about Result as an example, `1`, `Success(1)` and `Failure(1)` are different values, not just different types.

I think the point you're making is that Standard ML has certain restrictions in its type system that are necessary to make type inference possible: there are no union types, no subtypes (more recent work has found a way to make subtyping compatible with full H-M), polymorphism has to be introduced explicitly with `let`, and there are no higher-kinded types. And in some cases you'd argue those restrictions might incur more code overhead compared to a language that loosens those restrictions at the cost of less reliable type inference (e.g. Scala).

That's sort-of-but-not-really true: in my experience union types and subtypes are always bad ideas and there are better alternatives for any use case where you would use them. You never actually want "String | Int", you want a type with meaningful semantics (like `Result<String, Int>`).


If you use the "polymorphic variants" feature with backticks it automatically infers the union. However, most of the time the regular non-polymorphic variants are easier to work with.

A good discussion of this can be found in https://dev.realworldocaml.org/variants.html#scrollNav-4


Because "ssdf" is a string, but B("asd") is a value of a user-defined sum type.

So in the original example, you can't unify "ssdf" and the int 33, but in the re-written version B("sfd") and A(33) have the same type and can be unified.


You might like the language Crystal, it works this way.

https://crystal-lang.org/


But Crystal recently had to walk back from global type inference as they found it was not realistic in practice, didn't they?


Why is it not realistic? Is it too slow or too unreadable?


You could dig up the mailing list messages at the time, but yeah it was slow and I think it produced error messages that were extremely hard to use - it would decide the type of functions was something nonsensical and then report that as an error at all the call sites instead.


Makes sense, bad errors are also why I don't use type inference in my code.


Don't take my word for it, but I don't think you can implicitly unify[0] two base types in HM.

[0]: https://www.cs.cornell.edu/courses/cs3110/2011sp/Lectures/le...


You cannot unify a string and an integer. 'Base type' is just an abstraction over the enumerable primitive types (bool, unit, char, etc). It's generally understood that these are simple cases to unify, so they aren't even mentioned most of the time.
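
To make that concrete, here is a toy first-order unifier sketched in TypeScript. It is not taken from any real compiler: the `Ty` representation and the names `resolve`, `occurs` and `unify` are assumptions for illustration, and it only covers base types, type variables and function types.

```typescript
// Type terms for a toy unifier: base types, type variables, function types.
type Ty =
  | { kind: "base"; name: string }
  | { kind: "var"; id: number }
  | { kind: "fn"; from: Ty; to: Ty };

// A substitution maps type-variable ids to the types they are bound to.
type Subst = Map<number, Ty>;

// Follow variable bindings until reaching an unbound variable or a non-variable.
function resolve(t: Ty, s: Subst): Ty {
  while (t.kind === "var") {
    const bound = s.get(t.id);
    if (bound === undefined) return t;
    t = bound;
  }
  return t;
}

// True if variable `id` occurs inside `t` (this rules out infinite types).
function occurs(id: number, t: Ty, s: Subst): boolean {
  const r = resolve(t, s);
  if (r.kind === "var") return r.id === id;
  if (r.kind === "fn") return occurs(id, r.from, s) || occurs(id, r.to, s);
  return false;
}

// Returns an extended substitution, or null when the types cannot be unified.
function unify(a: Ty, b: Ty, s: Subst): Subst | null {
  const x = resolve(a, s);
  const y = resolve(b, s);
  if (x.kind === "var" && y.kind === "var" && x.id === y.id) return s;
  if (x.kind === "var") {
    if (occurs(x.id, y, s)) return null;
    return new Map(s).set(x.id, y);
  }
  if (y.kind === "var") return unify(y, x, s);
  if (x.kind === "base" && y.kind === "base") {
    // Two base types unify only when they are the same base type.
    return x.name === y.name ? s : null;
  }
  if (x.kind === "fn" && y.kind === "fn") {
    const s1 = unify(x.from, y.from, s);
    return s1 === null ? null : unify(x.to, y.to, s1);
  }
  return null; // a base type never unifies with a function type
}
```

Calling `unify` on the base types `string` and `int` returns `null`, while unifying a type variable with `string` succeeds by binding the variable. There is no rule that invents a union when two base types disagree.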


If you are interested in a language that does something like that, then check out https://archive.is/Op8Mf or the other works by Stephen Dolan.


In Haskell you would do it like this:

    import System.Random
    x = do y <- randomIO :: IO Float -- note: this is before there are instances of randomIO for IO Int and other types
           if y > 0.5 then
             return (Left "abc")
           else
             return (Right 2.01)


I think another interesting question is, how would you handle the return value of such function in your code?

You would have to either:

- test for the type of that value in order to handle it properly,

- rely on the implicit coercion rules of JS, which wouldn't be very useful here I suppose

As others have said, in OCaml (as well as in many other strongly typed languages) you can use a sum type to solve that problem.
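
The first option, testing the type at runtime, might look like this in TypeScript. `handle` is a hypothetical name, and what each branch does with the value is arbitrary:

```typescript
// Untagged union: the only way to tell the cases apart is a runtime type test.
function handle(v: string | number): string {
  if (typeof v === "string") {
    return v.toUpperCase(); // v is narrowed to string here
  }
  return v.toFixed(1); // v is narrowed to number here
}
```

Every caller of the original function ends up needing a test like this before it can do anything type-specific with the result.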

Edit: In some languages, there is also the concept of "intersection types" which, if I understand correctly, let one also handle that sort of situation. The corresponding Wikipedia entry [1] gives a list of languages supporting that concept, and provides examples.

[1]: https://en.wikipedia.org/wiki/Intersection_type


I've found this type of function useful for duck typing, and I've generally found duck typing to be a really useful tool to add polymorphism to code.


The point I was trying to make is that if you have to rely on runtime type information, you might as well use sum types in a more strongly typed language, where the scope of the code gets limited to the type itself.

With "duck typing", you need to have an extra test for types which your code is not supposed to work with, whereas with sum types, the type checker will help you verify that it cannot happen.
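
As a rough illustration of the checker doing that verification, here is a TypeScript sketch using a tagged union and the common `never` trick. The names `StringOrReal` and `toText` are made up for the example:

```typescript
// A closed set of cases, loosely modelled on the string_or_real datatype above.
type StringOrReal =
  | { kind: "s"; value: string }
  | { kind: "r"; value: number };

function toText(v: StringOrReal): string {
  switch (v.kind) {
    case "s":
      return v.value;
    case "r":
      return v.value.toString();
    default: {
      // If a new variant is added to StringOrReal and not handled above,
      // `v` is no longer `never` here and this assignment stops compiling.
      const unhandled: never = v;
      return unhandled;
    }
  }
}
```

No extra runtime test for "some other type" is needed, because no other type can reach `toText` in the first place.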

That being said, there are situations which a strong type system will not be able to model.

Anyway, at the end of the day, what matters is that you (and your coworkers) feel comfortable with your code. I know from experience that it's not always the case.


I find unions without a tag hard to handle.

To expand on another commenter's example, suppose you have a sum type that models either a successful computation with a return value or a failure with an error message.

If you have tags / constructors, that's all easy. But if you just use naked unions, you cannot write code that returns an error message (or any string) as the success value, because it would be indistinguishable from the failure case.
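
A small TypeScript sketch of the difference, with made-up names (`Result`, `NakedResult`, `isError`) and a made-up "not found" payload:

```typescript
// Tagged: success and failure stay distinguishable even if both carry strings.
type Result =
  | { tag: "ok"; value: string }
  | { tag: "error"; message: string };

// Naked union: `string | string` collapses to plain `string`, so a
// successful string result looks exactly like an error message.
type NakedResult = string | string;

function isError(r: Result): boolean {
  return r.tag === "error";
}

const okTagged: Result = { tag: "ok", value: "not found" };
const errTagged: Result = { tag: "error", message: "not found" };

// With the naked union there is nothing left to inspect: these two are equal.
const okNaked: NakedResult = "not found";
const errNaked: NakedResult = "not found";
```

`isError` can tell `okTagged` from `errTagged`, but no function can tell `okNaked` from `errNaked`.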


I can’t speak for OCaml, but in Elm, every branch of an if expression must produce the same type, and a function has a single return type. Therefore the code you posted would be invalid.


To expand slightly, the Elm compiler will help you out here and show you what you need to fix. Given this function body:

  randomReturner n =
      if n < 0.5 then
          123
      else
          "abc"

The compiler will give you the following info:

  The 2nd branch of this `if` does not match all the previous branches:

  22|     if n < 0.5 then
  23|         123
  24|     else
  25|         "abc"
            ^^^^^
  The 2nd branch is a string of type:

      String

  But all the previous branches result in:

      number

  Hint: All branches in an `if` must produce the same type of values. This way, no
  matter which branch we take, the result is always a consistent shape. Read
  <https://elm-lang.org/0.19.1/custom-types> to learn how to “mix” types.

  Hint: Try using String.toInt to convert it to an integer?

You can try it out on Ellie here: https://ellie-app.com/9nXxfKSStG4a1


In TypeScript, x would be of type () => "abc" | 2.01.


Wouldn't it be string|number unless using the `as const` modifier?


It used to be. But more recent versions of TypeScript (I believe since 3.5) are much stricter with strings in implicit unions. It's really quite helpful.

Here's an example where the feature can save you:

```
let x = () => {
    if (Math.random() > 0.5) {
        return "abc"
    } else {
        return 4
    }
}

// ERROR: This condition will always return 'false' since the types '"abc" | 4' and '"sdf"' have no overlap.
if (x() === "sdf") {
    do_work()
}
```


In F# (which is very similar to OCaml), you’d have to explicitly cast the return value to “obj” to get this to compile.


Just to clarify for people not familiar with the two languages and their differences, this is a point where the two languages diverge greatly. OCaml does not have an "obj" type that we can cast to (and in general casts are quite rare).


You would need to declare that the function returned a sum type (string | int).



