Question on your approach to strict mode #200
Comments
Hey @ryanhiebert - Thanks so much for the thoughtful post. It's clear you have read through my documentation, and you have hit the nail on the head, as it were, regarding the fundamental conflict in this library to date.
A Brief History of Time
Typical began as a part of my first foray into typed Python. At the time, I was a new convert to not only type hints, but Python 3. I was struggling with the concept of strongly typed code, but I wanted the guarantees it provided. Too many of my functions and methods at the time were devoted to boilerplate validation and coercion of inputs. Thus came the
Speaking frankly, these two features were definitely good learning experiences, but I regret them. You can see from the design of the library that my view on how to use type hints changed from a means to save developers from lazily-typed upstream code to a means to describe protocols for serialization, deserialization, and runtime validation of types described by the Python type system. If you look at code written in the v1 era vs the v2 era, you can start to see this evolution. This has to do with my own experience with SerDes libraries in statically-typed languages like Java, Go, etc. I also gained a critical understanding of how to write strongly-typed Python and realized that the "magic" of auto-coercion was largely unnecessary if I was just very careful and explicit about the types I was passing around.
Today, my production code is still a heavy user of typical, but basically only at network boundaries. Within my applications, my type hints and mypy do the rest of the work and give me much greater peace of mind.
Moving Forward
I've been hard at work on v3 for the last 6 months. In v3, which you can take a look at here: https://github.com/seandstewart/typical/tree/v3-routine-factories, you can see I now view this library as a SerDes library first and foremost.
I've given very little thought to those two areas of the typical API, but I'm a fan of your thinking and like the idea of essentially "quarantining" them behind a magic sub-package. They do have their use, especially in larger code-bases where a developer may have less control over how well-typed external callers may be, so I don't want to get rid of them entirely. Some notable changes in v3:
Things I've considered:
Thus far I have done neither. There is even a core
Things You've Made Me Consider
I want to close this comment out by saying - your submission has opened my mind to a middle way. Typical can ship two isolated packages. The first can maintain the cutesy
WRT "strict" mode - yes... it's honestly quite nasty to wrestle with, and I'm not even sure what typical looks like without it at this point! The constraints engine has its own limitations which make it less than desirable for SerDes. Perhaps the solution is to simply do away with the juggling. When you invoke
A major caveat: Currently, the constraints engine allows for validating mappings against user types (e.g., dataclasses). So
What do you think about all this?
Wow, thank you for taking the time to respond so thoroughly to my inquiry! Thank you for telling me a bit more about the history, and about how your thinking has changed since then. One thing I'll point out is that less-than-ideal APIs made while learning are inevitable and shouldn't be regretted. Instead, it is better to think about what the legacy and future of the API is, and how we can make those movements most effectively.
I think that JSON Schema generation is a really neat feature. I also, at least currently, define JSON schemas directly, but I see great independent value in being able to validate that the schemas match or are compatible. This is a challenging problem in its own right, dealing with what kinds of interface changes are breaking and which are not. But I can also see it being tangential to the focus of a small package.
This is really the question: what is the scope of the package? How much is too much to expect to all be done well in the same package? I agree that the serialization and deserialization aspect is the central aspect of Typical. I think this is necessary, because (a) it's a hard, large problem at the center of everything Typical does, and (b) as Python typing and other language features grow, I think that the explicit principles of Typical encourage you to actually discourage or even remove now-redundant interfaces, and that serialization and deserialization are the ones least likely to be soon added to the language.
I see, and I think you do as well, serialization, validation, and coercion as different things. In the spirit of explicit being better than implicit, I think that it's wise to separate them as much as possible. You have some neat interfaces for the validation. It's neat how you're working them into types, and I wonder how much of that Python will do for itself in the long run. It sure seems like it's doing more and more. The serialization piece is what I'm most focused on, followed by validation.
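To make the separation between deserialization, coercion, and validation concrete, here is a minimal sketch using only the standard library. It is not typical's API: `Account`, `deserialize`, and `coerce` are hypothetical names chosen for illustration, and the sketch assumes plain class annotations (no `from __future__ import annotations`).

```python
from dataclasses import dataclass, fields


@dataclass
class Account:
    name: str
    balance: int

    def __post_init__(self) -> None:
        # Validation: a constraint beyond the field's type,
        # enforced by the destination type itself.
        if self.balance < 0:
            raise ValueError("balance must be non-negative")


def deserialize(data: dict) -> Account:
    """Strict deserialization: each value must already have the annotated type."""
    for f in fields(Account):
        if not isinstance(data[f.name], f.type):
            raise TypeError(f"{f.name}: expected {f.type.__name__}")
    return Account(**data)


def coerce(data: dict) -> dict:
    """Explicit, opt-in coercion: convert each value to the annotated type."""
    return {f.name: f.type(data[f.name]) for f in fields(Account)}


raw = {"name": "ada", "balance": "10"}

# Strict deserialization rejects the string where an int is expected.
try:
    deserialize(raw)
    strict_rejected = False
except TypeError:
    strict_rejected = True

# Coercion is a separate step that the caller has to ask for.
account = deserialize(coerce(raw))
```

Each concern lives in one place here: the caller decides whether coercion happens at all, and the dataclass owns its own constraints.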
Like you, I'm using this for network boundaries primarily. I think there is room for multiple approaches to all these problems, but it's the first-class support of Python primitives that really drives me. I want to be able to start with native Python features, sprinkle in some hints about how they should work in different contexts, and have the redundant parts of serialization be reduced and simplified to reduce human error.
I think that validation is best done in the destination type. In fact I'd probably define validation this way. Deserialization and coercion deal with putting things into the right types, while validation enforces further constraints. Your validation is interesting in that it often uses subclasses to implement it. In that sense, it rather blurs the line between serialization and validation. And I think that's actually a good thing. I expect that over time more and more validation will be able to be analyzed by type checkers. I suspect that your approach to validation is relatively less likely to stand the test of time than a strict serialization library, largely because new ways of writing these constraints are likely to be added to the type checker.
For deserialization, the rule that I want to enforce is that the type coming in matches the type that I expect the serializer to produce. Anything outside of that would fall under coercion, and I think can be left as a different concern.
A good many features of typical, due to its principles, are likely to be redundant and therefore counterproductive over time. It is wise to think about how even good features should come to be discouraged when language-preferred alternatives are available.
Ok, depending how in sync we are with those thoughts, here's what I might suggest:
Let's see if I can distill it down to a shorter call to action. If you agree with me that nailing a serialization and deserialization protocol and extensibility approach is critical, is it better to (a) do that under the
It's scary, but I think if it were me, I'd document that there's going to be a hard pivot in the focus of this package for the purpose of nailing down this core API. By using the
JSON Schema feels like it would, ultimately, fit really well as a separate extension package. Hugely useful, but tangential to the core mission. Much of the validation feels like it would fit in a similar category as well. But that I'm less sure of. Validation and JSON schema feel like they're likely to be more tightly coupled together, and less generic, just because the space of solutions already becomes very large, and being generic at that level is probably too much work.
OK, time for you to gather your thoughts before I keep going. I keep getting the feeling that I'm being too handwavy about how this can work, and that I'm missing important details like how the serialization format (e.g. JSON) is a critical piece of knowledge to how the serialization and deserialization work, and that even that might ultimately not be something we can unify into a single protocol effectively.
Thank you so much for the work that you've done with this library. I haven't gotten to actually use it yet (so I may be missing important details, and would appreciate knowing that), and I'm extremely impressed. It takes the right approach to design in exactly the areas where other typing and serialization libraries have caused me real trouble, or have left me not happy enough to want to use them.
The principles you lay out as guiding principles are spot on. By leaning into the standard tooling to provide extensibility over those tools rather than working a different parallel path, or even in opposition to them, we're better able to gradually improve our code. Bit by bit, making it better, as we learn and grow, without having to risk an entire rewrite to get the benefits.
When I think about working with Python instead of against it, which is a key strength of your work in typical for data classes and other built-in types and one I'm very impressed with, I find the approach of the default non-strict mode to be a very notable departure from that principle. Python doesn't do implicit type coercion, and that has been an intentional and deliberate design choice of the language for as long as I've known it, and I think probably since its inception.
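As a small illustration of that design choice, the sketch below uses only the standard library (no typical involved). `Point` is a hypothetical example class.

```python
from dataclasses import dataclass


@dataclass
class Point:
    x: int
    y: int


# Annotations are neither enforced nor coerced at runtime:
# the string is stored as a string.
p = Point(x="1", y=2)
stored_as_str = isinstance(p.x, str)

# Mixing str and int raises instead of silently converting.
try:
    total = p.x + p.y
except TypeError:
    total = None

# Conversion happens only when it is requested explicitly.
explicit_total = int(p.x) + p.y
```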
The non-strict mode default turns that design principle absolutely on its head. You've provided strict mode, and that helps us work around the concern, but the philosophy is still reversed from the Pythonic first principles. If I care to preserve this principle, I'm left with a variety of (easy to use) options, where I have to constantly re-affirm that I agree with this Pythonic first principle.
The global strict mode to solve this is a non-starter for non-trivial applications, because it breaks assumptions other libraries might very reasonably be making about the state of that. I want everyone to use typical, so I want that concern to be a common thing to encounter. IMO, it is a misfeature to even offer that API, it's far too powerful of a foot gun.
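The hazard can be sketched with a toy module-level flag. This is an assumption-laden illustration, not typical's implementation: `set_global_strict`, `convert`, and `library_parse_port` are hypothetical names.

```python
# Hypothetical stand-in for a process-wide strict flag; not typical's API.
_STRICT = False


def set_global_strict(value: bool) -> None:
    global _STRICT
    _STRICT = value


def convert(value, target: type):
    """Return value as target, coercing only while the global flag is off."""
    if isinstance(value, target):
        return value
    if _STRICT:
        raise TypeError(f"expected {target.__name__}, got {type(value).__name__}")
    return target(value)


# A third-party library written against the default (non-strict) behavior.
def library_parse_port(raw: str) -> int:
    return convert(raw, int)


before = library_parse_port("8080")  # coerces the string to an int

# The application opts in to strictness for its own code...
set_global_strict(True)

# ...and the library, which never asked for it, now fails.
try:
    library_parse_port("8080")
    library_broke = False
except TypeError:
    library_broke = True
```

Passing strictness as an explicit argument, or choosing it per API, would keep the decision local to the caller that made it.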
All that said, easy and loose type coercion is extremely valuable. You had an eye toward this with your initial definition of @typic.al. That is a great tool, and I don't wish for you to take it out of the toolbox. However, I think it should be a different tool than the typing and serialization layer's defaults.
Personally, I find this core philosophical difference between typical as it currently exists and the way I think of the principles that have guided your design of typical to be so significant that it's worth having an entirely separate API if needed that defaults to strict mode. There are a few approaches I can see to doing this, depending on what you want typical to be.
Gut reactions to which of these is best might be further informed by these additional considerations:
typical, I can see myself considering releasing another distribution package to PyPI, that perhaps works with typical under the hood, but exposes the strict mode by default.
typical, matching the distribution package name. This could leave the cute typic.al shortcut names defaulting to the non-strict mode, which may be preferable to many people, and allowing others to choose to use the strict APIs that might have more no-frills business-mode names like attrs ended up adding.
magic=True, maybe? Magic is cool, as long as you've asked for it.
friendly?
autoconvert?
If you got through all that, I hope that it came through in the intended spirit of gratitude and deference. I greatly appreciate that you've released this to the world, and that I get to see it. Still, this is your project, and it is and should be what you say it is, and I respect that.
What would you like to be the future of strict mode in typical? Do you agree with me that it's critical enough to warrant one of these significant options to allow for really changing the default mode? Or is that just not what you want typical to be? Have I perhaps missed something important?