Skip to content
This repository has been archived by the owner on Dec 6, 2023. It is now read-only.

optionalField Decoder #23

Open
ncthbrt opened this issue Feb 14, 2018 · 30 comments
Open

optionalField Decoder #23

ncthbrt opened this issue Feb 14, 2018 · 30 comments

Comments

@ncthbrt
Copy link

ncthbrt commented Feb 14, 2018

When parsing user input it's important to be able to give good error messages. For example in a http PATCH request, the date_of_birth field might only accept a UTC conformant string when the field is present. If a user passes in a number, we need to return with a 400 BAD REQUEST.

When using the optional decoder in combination with the field decoder, if the field exists but the inner decoder fails, the final result will be None.

In the previous example { dateOfBirth: optional(field("date_of_birth", utcString)) } when fed with the payload { "date_of_birth": 1234 } would simply be { dateOfBirth: None }. The actual semantics that needed to be captured in this example was:

  • { } becomes { dateOfBirth: None }
  • { "date_of_birth": "Wed, 14 Feb 2018 10:22:19 GMT" } becomes { dateOfBirth: Some("Wed, 14 Feb 2018 10:22:19 GMT")
  • { "date_of_birth": 123457 } becomes Json.Decoder.DecoderError("Expected a UTC string, got a number")

An optionalField decoder (or a decoder with a similar name) would solve this problem by returning None if the field is undefined, else would bubble up the error from inner decoder.

@glennsl
Copy link
Owner

glennsl commented Feb 14, 2018

Hmm, that's an interesting case. I can definitely see how this can be generally useful, but I think optionalField might be too specific.

I'd like to make errors more information-rich, but I'm not entirely sure how to best do so. Something like exception DecodeError(string, [>]) perhaps, so you could distinguish between DecodeError(_, `MissingField) and DecoderError(_, `IncompatibleType). but in your case you might instead want to distinguish between errors emitted by the given function from those emitted by nested functions. That's more difficult to generalize in a usable way.

In the meantime, could you do something like this instead?

exception Fatal(exn);

let fatal = decoder =>
  json => try decoder(json) {
    | DecodeError(_) as e => Fatal(e)
  };

...

{ dateOfBirth: optional(field("date_of_birth", fatal(utcString))) }

@ncthbrt
Copy link
Author

ncthbrt commented Feb 14, 2018

That's a good approach. We took a more manual work around, but this seems much cleaner.

@RasmusKlett
Copy link

I think the definition of optional is very problematic. There are three distinct use cases where I would want to produce an option:

  1. Return None if the field doesn't exist
  2. Return None if the value of the field is null
  3. Return None if the inner parsing fails for some reason

Personally I would expect a decoder named optional to be version 1, 2, or possibly a combination.
Version 3 is a very heavy-handed tool - you need to be extremely certain of which errors the inner decoder may throw, or you end up unexpectedly swallowing errors. I would expect such a decoder to have a name that stands out like failSilentlyOptional or decodeOrNone.
The current situation will definitely lead users of the library to swallow decoding errors by accident.

Apart from the above, I'm really enjoying using this library ;)

@RasmusKlett
Copy link

@glennsl Would you be open to a pull request switching optional to be strict, and adding a decodeOrNone implementing the current behaviour? This would be a slightly breaking change, I guess.

@ncthbrt
Copy link
Author

ncthbrt commented Mar 22, 2018

@RasmusKlett Why not think of a new name? And mark the current one as deprecated?

@glennsl
Copy link
Owner

glennsl commented Mar 22, 2018

I do agree that the current situation is far from optimal, and that it will cause errors, but I think this needs to be thought through on a deeper level and be planned for a future 2.0 release. What do you mean by "switching optional to be strict", though?

Currently optional behaves like Elms Json.Decode.maybe, which I think is a reasonable choice in the short term since it's at least familiar to some.

@RasmusKlett
Copy link

RasmusKlett commented Mar 23, 2018

@ncthbrt You're right, that's probably a better strategy.

@glennsl In my opinion the default optional decoder should match the JSON concept of a value which might be present. In some cases this is a field that might be null, in other cases it is a field which might be undefined. In no cases is it a field which might throw an error when decoding is attempted.

I guess my suggestion is to have three separate decoders for these cases, so the decoders can be more strict in what they accept, warning the developer earlier when something is not as expected.

@glennsl
Copy link
Owner

glennsl commented Mar 23, 2018

@RasmusKlett Sounds good in theory, but how would you implement it? optional doesn't operate on just fields, and field is just a generic decoder, it doesn't carry any information about it having anything to do with fields.

@RasmusKlett
Copy link

RasmusKlett commented Mar 23, 2018

Warning: untested code ahead, I hope you get the point of it. Also I might be mixing Reason and OCaml syntax, sorry.

The most often used I think would be the null-as-None decoder:

let nullToNone = (decoder, json) =>
  json == Js.null ? None : decoder(json);

I guess to handle an undefined field, you would need a separate decoder:
optionalField : sting -> 'a decoder -> 'a option decoder
Which returns None if the field is undefined, and returns Some of the given decoder otherwise.

Lastly, we could have:
exceptionToNone : 'a decoder -> 'a option decoder
which swallows any errors during decoding, and turns them into None.

This can be composed into

  1. Handling value which may be null: field("x" nullToNone(int))
  2. Field which may be undefined, but always has a value when present: optionalField("x" int)
  3. Field which may be undefined, and value may be null: optionalField("x", nullToNone(int))
  4. Field which may be malformed, and we want to ignore decoding error: exceptionToNone(field("x", int))

In my opinion this doesn't add too much syntactic noise, and it allows a much more granular specification of what we want, reducing surprising behaviour in the future by throwing early.

@ncthbrt
Copy link
Author

ncthbrt commented Mar 23, 2018

@RasmusKlett Why not:

  json == Js.null ? None : Some(decoder(json));

Seems like automatically wrapping in some would make it more composable

@glennsl
Copy link
Owner

glennsl commented Mar 23, 2018

nullToNone already exists, it's called nullable (edit: actually nullable returns Js.null('a), so you'd have to then convert it to an option, but basically the same thing). There's also nullAs which can be used with either or oneOf.

exceptionToNone is of course optional. I don't think exceptionToNone is a better name though, for several reasons: It should only catch specific exceptions, not all of them, as this name suggests, and it only describes the error path (as does nullToNone). Something like try would perhaps be good, but that's of course a reserved keyword.

optionalField is what @ncthbrt proposed, and is definitely an option, but as I said earlier I'd really like a more general solution if possible.

@RasmusKlett
Copy link

RasmusKlett commented Mar 23, 2018

nullToNone already exists, it's called nullable (edit: actually nullable returns Js.null('a), so you'd have to then convert it to an option, but basically the same thing). There's also nullAs which can be used with either or oneOf.

Well yes, but I think returning Option instead of Js.null is crucial. After all, this library is for getting rid of the Js types, and getting nice OCaml types instead. Whether it is called nullable or nullToNone doesn't matter to me.

exceptionToNone is of course optional. I don't think exceptionToNone is a better name though, for several reasons: It should only catch specific exceptions, not all of them, as this name suggests, and it only describes the error path (as does nullToNone). Something like try would perhaps be good, but that's of course a reserved keyword.

Good point that it only catches DecodeError, that should also be expressed. But my main gripe is exactly that the name is optional, leading to overuse because it seems like the sensible thing to use for producing Option types. In my opinion the name should be verbose, to indicate its niche use. Maybe tryDecode or decodeOrNone?

optionalField is what @ncthbrt proposed, and is definitely an option, but as I said earlier I'd really like a more general solution if possible.

I'm unsure what direction you want to generalize it in? I think if you want to generalize checking for undefinedness, you will have to change the definition of field to make it more composable. I guess you could have field : string -> Js.Json.t, making my earlier examples something like:

  1. Handling value which may be null: field("x") |> nulllToNone(int)
  2. Field which may be undefined, but always has a value when present: field("x") |> undefinedToNone(int)
  3. Field which may be undefined, and value may be null field("x") |> nullToNone(undefinedToNone(int))

I'm not sure if this is an improvement? In this way undefinedToNone is a normal decoder, and may be used for other undefined values. But field no longer produces a decoder.
On a side note, I think my code in example 3 in both cases actually returns a nested option, which should be flattened.

@glennsl
Copy link
Owner

glennsl commented Mar 25, 2018

Well yes, but I think returning Option instead of Js.null is crucial.

null does not necessarily have the same meaning as None. If it isn't distinguished, None could mean either null, undefined or field missing.

After all, this library is for getting rid of the Js types, and getting nice OCaml types instead.

No...? It's for decoding json into a usefully typed data structure. Whether that structure should include js types or not is a choice left to the consumer.

In my opinion the name should be verbose, to indicate its niche use. Maybe tryDecode or decodeOrNone?

I'm not a fan of redundant verbosity. That it's a decode operation is obvious from the context. decodeOrNone is also not accurate, as it wraps the success case too.

I'm unsure what direction you want to generalize it in?

As it is now, every decoder follows the same simple "shape". They either decode successfully or they fail with an exception. By "general solution" I mean one which preferably conforms to this shape, or alternatively changes the overall shape into something that better captures the properties we need, but still remains uniform. optionalField does not, as it can fail in several different ways.

I like field : string -> Js.Json.t -> Js.Json.t. It does kind of fit the existing shape (it just "decodes" to a Js.Json.t instead of something more specific), is more flexible, and there isn't any issues I can see immediately at least. It does break the current API, of course, but this along with more detailed exceptions could make for a good 2.0 API. I'd like to play with this a bit when I get some time.

@RasmusKlett
Copy link

@glennsl Well, I don't agree with a lot of your response. I guess I have different ideas for how a library like this should look. I'm glad you like the field idea; I'm gonna stop trying to impose my ideas on your library now ;)

@snird
Copy link
Contributor

snird commented Apr 14, 2018

Hi, wanted to note here I just bumped my head against this issue too.
I generally didn't expect optional to swallow all underlying exceptions.
I too think that optional should be None only for undefined and null values, and for everything else throwing the exception.
But anyway, even if it is decided to keep it as it is, I think a documentation update to reflect more clearly what optional is doing is needed.

@glennsl
Copy link
Owner

glennsl commented Apr 15, 2018

If I could perform miracles I would, but unfortunately I cannot just will something like this into existence. If optional were to check for undefined or null before passing the json value onto its inner decoder, then if used with field would return None if the object containing the requested field is null or undefined, not if the field is. I doubt that's what you want or expect.

The field idea described above, which seemed promising, has some severe issues. It requires much more advanced function composition techniques and primitives for anything mroe than the simplest uses, and since it doesn't nest decoders we lose the ability to add context and "breadcrumbs" to error messages. That's a pretty high price to pay.

So optionalField might still be the best alternative...

The documentation for optional seems pretty clear to me (but then I wrote it, of course, so I might be a teensy bit biased):

This decoder will never raise a [DecodeError]. Its purpose is to catch and
transform [DecodeError]'s of a given decoder into [None]s by mapping its
[result] into an [option]. This prevents a decoder error from terminating
a composite decoder, and is useful to decode optional JSON object fields.

I'd happily consider suggestions for better formulations of course, but the underlying problem seems to be poor understanding of the primitives and basics of the library, or perhaps even language semantics, which is a non-trivial educational task.

@glennsl
Copy link
Owner

glennsl commented Apr 19, 2018

I've explored a bit more and found a solution that's promising, but probably can be teeaked a bit still, and might still have some unforeseen shortcomings.

Interface:

type field_decoder = {
  optional : 'a. string -> 'a decoder -> 'a option;
  required : 'a. string -> 'a decoder -> 'a
}
val obj : (field: field_decoder -> 'b) -> 'b decoder

Usage:

Json.Decode.(
  obj (fun ~field -> {
    start     = field.required "start" point;
    end_      = field.required "end" point;
    thickness = field.optional "thickness" int
  })
)

Pros:

  • Distinguishes between missing field and incorrect field type
  • Distinguishes between missing field and the given json fragment not being an object at all (no current or previously proposed solution has supported this)
  • Removes the awkward and repetitive json |> part, which also means object decoders can be constructed using partial application, like any other decoder

Cons(-ish):

  • Definitely more complicated and harder to understand, but still less room for error I think
  • field.optional and field.required aren't proper decoders since they're already bound to a json fragment, which means they can't be used with either and other decoder "operators". There's not a lot of use cases for doing so though, I think, and if you absolutely need to it your favorite standard library's Option module should have what you need. This also means they can't be used with the error-prone optional though, and perhaps it can even be removed in favor of nullable.

@ncthbrt
Copy link
Author

ncthbrt commented Apr 19, 2018

@glennsl This looks like a promising approach. Looking at our decoder usage, it isn't very far removed from how your provisional design looks, but with a bit of extra verbosity introduced from the repeated field decoders and the application of the json object.

@glennsl
Copy link
Owner

glennsl commented Apr 19, 2018

Here's a few more considerations:

  • at won't fit in as well as a shortcut for nested fields anymore, since there's more explicitness and ceremony around the latter now
  • This might enable us to move back to a result-based API. field.* could still throw exceptions and allow direct assignment to record fields, but these exceptions could be caught by obj and turned into an Error result, thereby containing it to only where it's really necessary.

@cknitt
Copy link

cknitt commented Apr 25, 2018

I just ran into this issue, too, as I was using Json.Decode.optional to decode stuff encoded using Json.Encode.nullable.

The encoder

val nullable : 'a encoder -> 'a option -> Js.Json.t

turns an option into a JSON value, mapping None to null.

So I would expect the nullable decoder to perform the inverse operation, mapping null back to None.

But that's not the case, as the signature of the decoder is

val nullable : 'a decoder -> 'a Js.null decoder

@ncthbrt
Copy link
Author

ncthbrt commented Apr 25, 2018

As a point of interest, here is what we're using to solve the problem in our api endpoints:

let optionField = (fieldName, innerDecoder, json) =>
      Json.Decode.(
        (
          try (Some(field(fieldName, identity, json))) {
          | Json.Decode.DecodeError(_) => None
          }
        )
        |. Belt.Option.flatMap(nullable(innerDecoder) >> Js.Null.toOption)
      );

@glennsl
Copy link
Owner

glennsl commented Apr 26, 2018

@cknitt Indeed, that does seem pretty inconsistent. Json.Encode.nullable probably should accept an 'a Js.null instead. Or perhaps it would be better to remove these "enhancers" entirely, to make the module as a whole simpler. It'll be breaking either way though, so I'll look into it for 2.0.

@glennsl
Copy link
Owner

glennsl commented May 30, 2018

I've now made a "next" branch which has the proposed obj decoder with some improvements over the earlier proposal: It now has at getters too and passes the getters in using a record rather than labeled arguments.

In OCaml:

      obj (fun {field} -> {
        static    = field.required "static" string;
        dynamics  = field.required "dynamics" (dict int)
      })

and Reason:

      obj (({field}) => {
        static:   field.required("static", string),
        dynamics: field.required("dynamics", dict(int)),
      })

My main concern now is that since the getters aren't proper decoders, you can't compose them using combinators such as etiher, but have to resort to composing options instead. A real example from redex-scripts:

  license       : json |> optional(either(
                            at(["license", "type"], string),
                            field("license", string))),

now becomes

    license       : at.optional(["license", "type"], string)
                    |> Option.or_(field.optional("type", string)),

which is confusingly different (and significantly less readable too). Other than that I'm still unsure about the ergonomics, and would love some feedback from real world usage and beginners trying to grok it.

@bsr203
Copy link

bsr203 commented Sep 27, 2018

After reading the thread, it seems I can read null to an optional field (as None). Can anyone in this thread please give a pointer how to do it.

  let message = json =>
    Json.Decode.{
      id: json |> field("id", string),
      text: Some(json |> field("text", string))
       ..
};

when I parse, it works if text is not null, but get error Expected string but got null for field text otherwise. I tried like

      text: Some(json |> field("text", nullable(string))),

but, that gave type error

This has type:
    Js.null(string)
  But somewhere wanted:
    string

@ncthbrt what is identity in your optionField function. Is there a decoder like that?

thanks.

@glennsl
Copy link
Owner

glennsl commented Sep 28, 2018

@bsr203 What you likely want is this:

text: json |> field("text", optional(string))

optional(string) will return an option(string) with Some(...) if the string decoder succeeds and None if it fails, which it will if the value is null, but also if it's anything else other than a string. However, because optional only surrounds string, not field, message as a whole will fail if the field is missing.

If you really want it to be None only if it's null but fail otherwise, you can do this:

  text: json |> field("text", nullable(string)) |> Js.Null.toOption,

This thread is about accomplishing the opposite, to have it return None if the field is missing and fail if it exists but is not a string (and optionally null).

identity is just a function x => x, that returns exactly what it's given without doing anything to it. And >>, which is also non-standard in Reason, is function composition, i.e. f >> g == x => g(f(x)).

@praveenperera
Copy link

praveenperera commented Oct 1, 2019

and Reason:

  obj (({field}) => {
    static:   field.required("static", string),
    dynamics: field.required("dynamics", dict(int)),
  })

I like this solution, I was having the same issue today.

I have a field that's optional, but if that field is present, I don't want decode errors to swallowed up.

@praveenperera
Copy link

praveenperera commented Oct 1, 2019

This is my solution

let optionalField = (fieldName, decoder, json) => {
  let field = Util.Json.toDict(json)->Js.Dict.get(fieldName);

  switch (field) {
  | None => None
  | Some(field) => Some(decoder(field))
  };
};

let optionalWithDefault = (fieldName, decoder, json, default) => {
  fieldName
  ->optionalField(decoder, json)
  ->Belt.Option.getWithDefault(default);
};
module Json = {
  external toDict: Js.Json.t => Js.Dict.t('a) = "%identity";
}

So in this case with object of

{
  "first_name": "Praveen",
  "last_name": "Perera"
}

Where the field last_name may or may not be present my decoder would look like this:

let decode = (json) => 
  Json.Decode.{
    firstName: field("first_name", string, json),
    lastName: optionalField("last_name", string, json)
  }

In this case if last_name was present but the value was null the decoder would fail, which is what I want:

{
  "first_name": "Praveen",
  "last_name": null
}

If I didn't want it to fail my decoder would look like this:

let decode = (json) => 
  Json.Decode.{
     firstName: field("first_name", string, json),
     lastName: optionalField("last_name", optional(string), json)
  }

@glennsl
Copy link
Owner

glennsl commented Oct 2, 2019

I like this solution

There are some downsides to it. Since these decoders aren't ordinary decoders - they're essentially partially applied with the json argument - they don't compose well with other decoders. This is usually not a big deal, but there are some cases where this is a bit of a sore thumb.

I've tried the next branch out a bit in a few of my own projects, but it really needs some more people to test and give feedback on it.

external toDict: Js.Json.t => Js.Dict.t('a) = "%identity";

This is very unsafe. If the json is not an object, this might cause a crash or propagate garbage data. Also, since you don't have any type annotations on the functions that use this, you can pass any 1-ary function off as a decoder. You might want to consider using Json.Decode.dict instead, or at least replace the 'a with Js.Json.t and check that it actually is what you assume it is. This is the check used in bs-json:

    Js.typeof json = "object" && 
    not (Js.Array.isArray json) && 
    not ((Obj.magic json : 'a Js.null) == Js.null)

Otherwise your solution looks good!

@praveenperera
Copy link

praveenperera commented Oct 2, 2019

HI @glennsl thanks a lot for the feedback, I've changed my code to:

module Json = {
  module Private = {
    external toDict: Js.Json.t => Js.Dict.t(Js.Json.t) = "%identity";

    let isJsonObject = json =>
      Js.typeof(json) == "object"
      && !Js.Array.isArray(json)
      && !((Obj.magic(json): Js.null('a)) === Js.null);
  };

  let toDict = (json: Js.Json.t): Js.Dict.t(Js.Json.t) => {
    Private.isJsonObject(json)
      ? Private.toDict(json) : Js.Dict.empty();
  };
};

Let me know if you see any problems with this.

I didn't see how I could use the dict decoder in this case.

There are some downsides to it. Since these decoders aren't ordinary decoders - they're essentially partially applied with the json argument - they don't compose well with other decoders. This is usually not a big deal, but there are some cases where this is a bit of a sore thumb.

I think I prefer this solution because its just adding a new decoder and not introducing a breaking change. Thoughts? Should I do a PR?

@glennsl
Copy link
Owner

glennsl commented Oct 3, 2019

toDict is basically just dict (fun x -> x), that is. dict passed a decoder that doesn't decode, it just returns the value as Js.Json.t and dict will therefore return a Js.Dict.t(Js.Json.t).

I think I prefer this solution because its just adding a new decoder and not introducing a breaking change. Thoughts? Should I do a PR?

The obj decoder isn't a breaking change either. field and at are still there, but I think it's the wrong abstraction to use to decode entire objects, and it seems like optionalField is really only useful in the context of decoding objects as a wiole.

My line of thinking here has evolved to the better approach being to move this logic outside the object decoder. I think that might make ti simpler, if a bit more boilerplatey.

For example, instead of

let coordinate = json +.
  Json.Decode.{
    x: field("x", int, json)
    y: field("y", int, json)
    z: optionalField("z", int, json)
  }

maybe we should do

let coordinate2 = json +.
  Json.Decode.{
    x: field("x", int, json),
    y: field("y", int, json),
    z: None
};

let coordinate3 = json +.
  Json.Decode.{
    x: field("x", int, json)
    y: field("y", int, json)
    z: Some(field("z", int, json))
  };

let coordinate =
  Json.Decode(
    either(coordinate3, coordinate2)
  );

The latter has significantly more code, but also offers significantly more flexibility an is much easier to maintain over time I think, when more optional fields are added for different shapes. Maybe it should be modeled as a variant instead, to avoid invalid representations. That's much simpler to do with this.

But then maybe other things are much harder. Again, I think we need more experience to figure out these idioms, and would love to get some more feedback on the next branch. I'm not doing anything with BuckleScript at all these days, so I'm not gaining any experience with it on my own.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants