-
-
Notifications
You must be signed in to change notification settings - Fork 218
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CEP] The use of type hints #31160
Comments
From a cursory glance at the rationale behind type-hints, I come to the opposite conclusion. While it seems that 'static analysis' can be extended to include IDE code-completion and refactoring tools, it has always meant, primarily, offline analysis intended to prevent errors. If we are not planning on retrofitting commcare with type hints, we lose out on the primary value that type hints were intended to provide.
Can you link to a conversation suggesting that type hints should be used selectively? My reading of the PEPs is that this is intended for developers who want static analysis throughout their codebase. It's optional because they don't want to disrupt users who enjoy Python as is. Gradual Typing is referenced in PEP 483, writing: "Large software systems are often developed in multiple languages partly because dynamically typed languages are better for some tasks and statically typed languages are better for others", and goes on to list benefits for dynamic typing, and benefits for static typing. I read these benefits as suggesting that this decision is meant to be decided per system, not per function. For example: "Static type checking catches bugs earlier, thereby removing the greater cost of fixing bugs later in the development cycle or the even greater cost of a bug that occurs in the field." If this was a priority, it would be a priority for all of a system's code, not just one function. How I've seen this actually materialize, is writing a server in C++ or Java, where we care about static analysis, and then writing testing hooks in Ruby, because that code wanted faster iteration and more expressiveness. I feel like this is a similar scenario to Javascript frameworks -- many allow you to port a small segment of your site over, but that is done mainly to aid adoption, with the long-term goal of having all code migrated over to a single framework. Mixing and matching, long-term, is a recipe for confusion and inconsistency.
There are exceptions to every rule, but I'm not even sure this is a valid exception, so much as just not wanting to spend the time to create a proper subclass for a test fake (as would be needed in a statically typed language). Either way, this is an exceptional situation, and I'd definitely be concerned seeing types ignored in production code. It seems more beneficial to me to treat type hints as static type declarations, and have to write more complicated test fakes, than the other way around.
I strongly believe in Uncle Bob's quote: "A comment is an apology for not choosing a more clear name, or a more reasonable set of parameters, or for the failure to use explanatory variables and explanatory functions. Apologies for making the code unmaintainable, apologies for not using well-known algorithms, apologies for writing 'clever' code, apologies for not having a good version control system, apologies for not having finished the job of writing the code, or for leaving vulnerabilities or flaws in the code." Code is for the 'how', comments are for the 'why'. As far as I can tell, type hints can never be the 'why', only the 'how'. While I can see the appeal of being able to click a variable and go to the definition, the fact that the variable is vague in the first place is what I'd prefer to focus on.
Do you mean that this CEP does not cover how to write type hints? By the wording, I thought the PEPs might have provided guidance on when type hints were, or were not, appropriate, but was unable to do so. |
Personally, I don't want type hints in our codebase. If we were starting new greenfield development, I would be on-board with declaring that project a 'type hints' project. But as it stands, with a codebase as large and old as commcare is, I don't feel the positives outweigh the negatives. Negatives:
|
Personally I like using type hints for the benefits I get in terms of documentation and IDE support. In terms of static type checking, I haven't actually found that to be very useful. I have used it on one project and it resulted in me spending a bunch of time just figuring out make the type checker happy or adding exclusions to the rules. Having said that I have once caught a bug because of it (refactor that changed a type but not all places that used it). In terms of type hints vs variable names, I don't think you can really compare them unless you plan on including type information in the var names: (types taken from HQ code) # naming gives you decent info here but doesn't help with IDE features.
# Names can get quite long.
form_datum: SessionDatum
stack_datum: StackDatum
form_datum_meta: FormDatumMeta
workflow_datum_meta: WorkflowDatumMeta
# variable naming doesn't tell you much here unless you give a very long explicit name.
# And still you don't get any of the type hint benefits
datums: List[Union[SessionDatum, StackDatum]]
session_or_stack_datums In terms of readability I have not found to it be an issue except in rare circumstances which usually means a function is taking in too many args or you should create a type alias.
That's true, there is no guarantee, but I it is more likely that type hints will be maintained than comments since they do give tangible value to developers and are 'part of the code' in a way that comments aren't. Overall I'm pro using them (and we already are in parts of HQ). |
TL,DR: I don't want type hints in HQ. Functions in Python are typing fences. All type checkers that I am aware of do not infer the types of function arguments. If you want autocomplete on arguments inside a function you must declare their types. I am concerned that this incentivizes full type declarations for every function except for the most trivial. There are several reasons why this feels problematic, the most salient being that HQ would likely have a very long if not indefinite period of having type hints scattered inconsistently throughout the code. As long as that is true it will probably be difficult to attain the most important goal listed in PEP 484: static analysis. As stated previously, I want us to implement a static type checker as part of our build process before we start using type annotations in HQ code. It should be obvious to any contributor when they have broken the code, regardless of what editor they use to make the change. Personally, I don't think the amount work required to do that, including long term maintenance needed to keep a type checker happy, is worth the benefit we will get from it. We have other priorities such as keeping dependencies up to date that seem more important and yet have been hard to resource. I am concerned that type annotations will make developers feel falsely confident about the types of arguments passed to a function. Without static analysis errors may be hard to detect. Consider this code def itemize(items: dict) -> Iterable[str]:
for key, value in items.items():
yield f"{key}: {value}" Is it safe to pass in a value that is not a Is it safe to refactor this function in a way that would call On one hand, declaring On another hand, someone refactoring that function would feel more confident about using other methods of Finally, I think type hints are ugly. They add boilerplate that clutters the code and sometimes makes it take more lines. Typing verbosity (or lack thereof) is the main reason why I prefer Python to languages like Java and C#. Type hints make Python feel more like Java. Edit: minor wording improvements and added thoughts about why I don't think it's worth adding a static type checker. |
I think there are two main things happening here: (In reverse order) the second thing is that our perceptions of the value of type hints are different. I can address this one. The first thing is the aesthetic. I can't do much here. I can only observe that aesthetics are strongly influenced by what we are accustomed to. It seems as if the people who have used type hints the most are the ones who find the highest value of type hints to be at the level of the function. And the people who have used them less perceive their value to be at the level of the codebase. We seem to be reading the same few sentences, and understanding them differently. Quoting PEP 484 again, "Rationale and Goals":
I think the value from offline type checkers is dwarfed by the value of "code completion and refactoring". These are everyday tasks that we do all the time. Shaving off a small amount of effort here compounds into big wins over time and multiple developers.
Offline type checking may prevent some errors, but it does not avoid the need for unit tests, and unit tests can and should be responsible for catching those errors in Python. Offline type checking can catch type errors at a lower cost than unit tests, but I think good unit tests diminish that value, because they should exist whether type checking is happening or not. But I don't think that diminished value is relevant to us. Automated type checking might be a non-starter, because it looks like mypy will take a lot of effort to configure, and pytype will take a long time to run. I'd love someone with more experience with either of those tools to prove me wrong, but I worry this is a distraction from the greater benefit to us: In the case of commcare-hq, the longer-term benefits of type hints to refactoring, and working with other developers' code, outweigh offline type checking.
It sounds like you're suggesting that someone else's opinion of what the HQ codebase should look like is more authoritative than our own. pydocstyle (via flake8) thinks that all public modules, classes, methods, and functions should have docstrings. We don't think so, and I think we have every right to disagree. In the same way, it's up to us to decide whether we use type hints universally.
In general, that is a truism I agree with. But when it applies to actual examples of type hints, I don't find the following confusing: from __future__ import annotations
from typing import Any, Protocol
def inc(x):
return x + 1
def add_one(x: int) -> int:
return x + 1
def increment(x: Addable) -> Any:
return x + 1
class Addable(Protocol):
def __add__(self, other):
... There was a time when I found two-variable list comprehensions confusing. Maybe it just takes time for the pattern recognition to kick in. Over time, I have found that type hints transition from being ugly and confusing, through being unremarkable, to finally being informative -- particularly informative about the author's intention. It is clear to me from the example above that the intention of the author of And if we are willing to accept that, just like some functions don't need docstrings, some functions don't need type hints, then the example above is also not inconsistent.
That is a good point. Similarly, I would be concerned seeing production code accessing the private attributes of a class. But I still believe it would be counter the spirit or philosophy of Python for the language to prevent developers from doing it. I happily agree that we wouldn't want to see types ignored in our production code. But, in my mind, dynamic typing is core to Python's nature, and type hints are not intended to change that.
I strongly agree.
I'm not certain they fit neatly into either. Maybe they are more accurately described as the "what". So, again referring to my example above, if I were to read, incremented = increment(not_a_number) and my first response might be "wtf?", when I notice the type hint I might realise "oh, that's what the author had in mind all along!"
This sentence jumps out at me. Since September 2015,
I once worked in an old Perl codebase, where some devs had used tabs, some used two-space indents, some three-space indents, and others four-space indents. They also used multiple different ways of passing function parameters ... in the same, 65,000 line file: An undeniable horror show. But I don't think that's really analogous to type hints. Some functions having type hints and others not having type hints is more like some functions having docstrings and others not having docstrings. That feels very different to me than mixing tabs and spaces.
Yeah. That would be sad. It would be wonderful if others were to track the same journey I took, from "urgh" to "meh" to "oh!" But we already work with code written by "devs who love classes" and "devs who don't love classes". It's not perfect. But it's better than having no freedom, where every minute detail of our work is subject to tyrannical dictate. I am a big fan of the idea that "There should be one-- and preferably only one --obvious way to do it." But the correct application of "one way to do it" might be realised better at a higher level than the minutiae of "every public function must have a docstring", or "everything should be a class", or "every function must have type hints". I would say that sometimes "one way to do it" refers to a principle that should be applied differently in different circumstances.
No IDE, or human, can tell what a function will be passed in the future. But some IDEs are better than others at figuring out the type of a variable. We should explore to see what tools work best for us.
I am struggling to understand why it is bad to have some but not all functions with type hints, but it's OK to have some but not all functions with docstrings, or the idea that not everything should be a class. I don't see this as an inconsistency. It's about recognising that not everything is a nail, and we should use the language features that best suit each circumstance.
An incorrect type declaration in Java breaks the code. In Python it's like an out-of-date comment (although, as we've discussed, it's not a comment). And we already have a process to deal with that.
Hard agree. I don't think that is where the value of type hints is for our codebase.
Of course, dropping the type hints would not answer any of those questions. What the type hints give us is an indication that the author considered The answer to "Should you update the type annotation on In the context of commcare-hq, that might also allow the original author to weigh in, and confirm my change, or explain why they chose differently. If the next developer, keeping with your example, wants to refactor this function in a way that would call And, finally, back to the "ugly" objection, and particularly, "Type hints make Python feel more like Java." I hear you, especially when it comes to long lists of parameters. Some mitigating thoughts:
I find that after using type hints in my own projects for a while, they don't grate me any more. The test fakes might have helped. |
Since it seems a strong argument is being made that type hints, at least in our code base, should be used selectively (presumably to increase readability/documentation?). I came across this block while doing other work: def metrics_counter(name: str, value: float = 1, tags: Dict[str, str] = None, documentation: str = ''):
provider = _get_metrics_provider()
provider.counter(name, value, tags=tags, documentation=documentation) and I felt it was illustrative on how type hints are often unnecessary and hurt readability, and how more readability can be gained with better names than typing. What would be lost if this was instead just: def metrics_counter(name, value=1.0, tags={}, documentation='')
I think the main value gained here was showing that I struggle to see how the above type hints would aid in development -- I don't think autocomplete support for string, integers, and dictionaries does much. As to the specifics of why it hurts readability:
void metricsCounter(String name, float value=1.0f, HashMap<String, String> tags=null, String documentation='') relatively readable -- but I made the choice to use Java, and I get all the benefits static typing has to offer. When I'm writing Python, I don't write Python to be Java. Python places a much higher emphasis on human readability than Java does. That's the reason Python has I'm sure there are other examples I could use that would show a more tangible benefit to typing, but the point I'm trying to make is that there are generally better alternatives. Even if we can agree the typing was unnecessary here, it's still in our code, which tells me that, were we to advocate for type hints going forward, we'd see more unnecessary declarations and less readability. I need more than autocompletion to sell me on that. |
I like this example. And I agree with a lot of the conclusions you draw from it. It could be that this is a useful data point of a function where we don't want to use type hints, because they reduce readability, and they will not help with refactoring in the future. But before we get there ... I think your point about A Dimagi developer reading this code is likely (but not certainly) familiar with Datadog. So "tags" might be more familiar to them than "property_map". (I tried to think of a name that means "maps filter names to filter value strings" but "filter_name_to_value_str" seemed very clumsy.) I agree with you that either a docstring, or a I see the value of a docstring for a developer who is unfamiliar with this function. But these type hints were added by someone who was refactoring an existing
Let's go with that. I personally like shorter lines (because I split my screen vertically and have tests in the right pane) so I might style this a bit differently: def metrics_counter(
name: str,
value: float = 1.0,
tags: Optional[dict[str, str]] = None,
documentation: str = '',
):
"""
Wraps the delegated metrics provider, and adds ``value`` to the
counter named ``name``.
``tags`` maps tag names to values, which the metrics provider can
use to allow metrics to be filtered by tag values. Use this to allow
filtering by domain, or error, for example.
Use ``documentation`` for additional documentation.
"""
provider = _get_metrics_provider()
provider.counter(name, value, tags=tags, documentation=documentation) I read the parameters slightly differently: " Alternatively, I like your function signature, but we can't use def metrics_counter(name, value, property_map=None, documentation=''):
"""
Wraps the delegated metrics provider, and adds ``value`` to the
counter named ``name``.
``property_map`` maps tag names to values, which the metrics
provider can use to allow metrics to be filtered by tag values. Use
this to allow filtering by domain, or error, for example.
Use ``documentation`` for additional documentation.
"""
if property_map is None:
property_map = {}
provider = _get_metrics_provider()
provider.counter(name, value, tags=property_map, documentation=documentation) While it's definitely more readable, I see a problem here that prevents any IDE from providing the author what they really needed: The name "property_map" does suggest that the function expects a If this refactor was your task, what would you do? Maybe use the type hint, complete the refactor, and then delete the type hint afterwards? I've done that before (a few days ago, in fact). Maybe that approach is the best of both worlds; practical and readable. I'm not sure about it though. It feels bad to me. The next time someone needs to refactor my code, it makes it their problem, all over again. Readability matters. But how many hours is my slightly less verbose function signature going to cost someone else later? Maybe it would be fair if the next developer who needed to refactor my function could call me up, and ask me to put the type hints back in for them, just temporarily, so that they don't have to figure them out themselves. And in our hypothetical world where the original author adds type hints each time someone wants to refactor, and then deletes them again afterwards, how many times would that happen before the author decided they should rather leave the type hints in? ... Zooming out a bit, I really appreciate this discussion: Collectively thinking about the trade-offs, and where we, as the developers who have to live with our codebase, feel the balance lies. This is what I was hoping to get out of this CEP. (As an added bonus, we also get to mention type hint styling along the way.) |
Besides some perspective from a "new developer to the codebase", I don't think there is much for me to add that hasn't already been said. @mjriley's comment resonates particularly strongly with me, especially:
I am particularly worried about the inevitable reality of adding type hints to HQ: inconsistency. @millerdev described this well:
As an experienced developer coming on board and working in the codebase, I can say that dealing with code inconsistencies has been one of the biggest contributors to the steep learning curve that is "understanding how to effectively contribute to commcare-hq". I already spend a sigificant amount of time trying to understand "the '[best] HQ' way to do it", and I'm not enthusiastic about adding more constraints to that already non-trivial task. From the perspective of a developer new to the codebase, it is frustrating to grep around the code looking for the answer to "how this is done in other parts of HQ?" and finding two or more different answers. HQ is a large, production codebase that is constantly under heavy development by several separate teams. I'd prefer to see existing inconsistent code get refactored to use fewer alternative conventions rather than introducing more inconsistencies. Adding type hints will put further burden on growing HQ dev teams, and I don't think the pros justify adding (more of) them. |
Thank you for the input @joxl. I have been worrying that I'm not giving enough weight to the "it will make things (even) messier" side of the conversation. So I started working on a "style guide" to specify how I think type hints should be written. This contradicts my earlier statement, "This CEP does not specify how we should use type hints." But after some consideration, if we do decide to use type hints, it will improve readability and pattern recognition if they look the same. I hope a style guide can also improve their value to developers wanting to refactor or understand code. e.g. If you are going to the effort of stating argument types, then you should state the return type too.
Could you clarify what you mean by "inconsistency"? Do you mean that one function definition looks different from another function definition, and all function definitions should look the same? Or do you mean that there exists a principle of when a function definition should have type hints and when it should not, and you are worried that we would apply that principle inconsistently. There is a principle of when code should have a comment: @mjriley said, "comments are for the 'why'" and I agree with that. We should not delete all comments because they make some code look different. Rather, we should consistently apply the principle: We should add them where they add value, and remove them where they don't. The same is true for docstrings. Is the same true for type hints? When do type hints add value? What is that value? What is the principle? Some of us have been working in codebases that include type hints, and have developed a feel for what they're good for, and when. It seems like these are the people who have expressed support for them to me. Am I right in thinking that those of us who are worried that their value is so low that we should rather never use them at all are the same people who have not worked much with them? "They're ugly and they smell like Java" is not completely wrong! But are we being like dogs barking at a new postman, and in a few months we could be wagging our tails every time his ugly, Java-smelling face comes to bring us things, because we've learned to appreciate those things? Maybe we should fire up PyCharm, or VS Code with I am spending my evening writing way-too-long responses (sorry!), because I am worried that we might rule out a feature that could be really beneficial, only because we can't figure out how to use it properly. |
I don't want to suggest that your concerns about inconsistency are invalid. The real problem here could be that Dimagi devs don't apply principles consistently. That's outside the scope of this CEP, but something that should get attention. Improving this won't just affect type hints; it will benefit our velocity, our job satisfaction, and all decisions we make about our codebase. A recent question on Slack is a great example: Older Django apps import I see two possibilities:
But for this CEP, I'm just hoping we can figure out when to use type hints. |
I'm not speaking for @joxl, but with the current proposal I am concerned that we will end up with even more inconsistency than we have now. An example discussed above highlights one of my consistency concerns: def metrics_counter(name: str, value: float = 1, tags: Dict[str, str] = None, documentation: str = ''): This is real code written by one of our most experienced developers (and reviewed by others), and the type hints are inconsistent. I mean no disrespect to the developer who wrote that code. It's very easy to make mistakes like that. I expect we would get a fair amount of this type of inconsistency without an additional tool to do automated static analysis as part of each test run. The other area where I'm worried about (even more widespread) inconsistency (than we have now) is in the application of type hints to new and existing code, were we to agree to adopt them. What would the process of adoption look like? Certainly no one is going to apply type hints to every function in HQ in a single PR, or even as part of a single resourced migration effort. I imagine the most practical approach would be to roll it out in voluntary iterative steps, as developers have time and feel compelled to do so. This sounds like a very long, possibly indefinite, period of inconsistency. This is not appealing to me, especially since I am not a fan of type hints, and therefore would not frequently feel compelled to add them to code I'm working on. For consistency, I would have preferred that we never allowed type hints in HQ at all, but I also did not want to force anyone to curb their excitement around learning a new language feature by saying "You can't do that here." That is why I agreed to allow developers to use their judgement to apply them consistently within sub-components they maintain and own over an extended period of time. I thought we had agreed that they would not be sprinkled randomly throughout HQ code, and I hope we will stick to that agreement until we can reach some other consensus.
I don't think so, but I only speak for myself. I have played around with them in a side project. Probably not as extensively as some other developers on the team. I have also read and worked with code written with TypeScript, which has many similarities.
I'm not going to suggest what you do with your spare time, but I have other priorities that feel more important than spending my free time playing with something I am not excited about. A couple days ago I was listening to an interview with Peter Wang, co-founder of Anaconda, and he said something important:
He wasn't talking specifically about type hints in Python, but he was talking about how Python is easy and beautiful because it does not force you to declare types. The utility of type hints is low without automated static analysis (a kind of compiler). A static analysis tool could possibly catch errors that otherwise would only be caught with slow integration tests. Until we have that, I don't see much value for HQ. |
Yeah. That was a little presumptuous of me.
I don't think that commcare-hq is the best codebase for learning new language features. In the past I have first tried things out in a toy codebase or a side project. I hope other devs do that too.
I agree. I don't think type hints are for everywhere. (The principle I use is If other modules call this code, then type hints can help whoever writes those other modules (including myself). I don't use type hints in scripts, and I don't find them useful in top-level modules like views or commands.)
OK. What do you all think of the following?
|
I see three options:
#1 is obviously my preferred option. #2 I'm on board with, provided we agree that the value proposition is high enough to undertake that work. #3 I firmly believe is the worst of all possible worlds, and yet it seems to be the path that is being pushed for. @joxl had an excellent comment pointing out the issues surrounding inconsistency. It seems somewhat consistent to restrict them to modules, but that seems like we're opting in to a bus factor issue in the future, when inevitable turnover happens, and the person picked to maintain the type hints code isn't particularly comfortable using them. I also am not sure if, within the type-hinted module, we are still recommending judgment on what is or is not type hinted. Researching type hints, I watched the excellent Type-Checked Python in the real world and Putting Type Hints to Work, as well as reading Diminishing returns of static typing. Every presentation or writeup I came across stressed the main value type hints bring is static analysis, i.e. running Previously, you mentioned:
Yes, I do believe in relying on the collective wisdom of larger companies for best practices. I'm not saying we don't have some specific oddities to our company, but in general, best practices are best practices for a reason. The talks I posted above are from Instagram and Coffee Meets Bagel, and there are many other large, established companies that apply type hints consistently. What about CommCare is so unique that we need to do things our own way here? However, let's return to option #2. Recently, I had a bit of egg on my face as I added parameters to an existing method, not realizing that it was the base method in an inheritance hierarchy. I ended up having to make a series of staging deploys as I fixed one issue, only for the very same issue to happen with the next child class. This was an area of code that was untested, but to have prevented this issue from happening would have required a fairly specific and unlikely test to be written. Because the failure was in calling an expensive method that ordinarily would have been mocked out, we would have either needed complicated integration tests for all of these child features, or have written tests that verified the specific mocked call parameters. This is a situation that tests are fairly lousy at. On the other hand, had type hints been used consistently throughout our codebase, this is an issue static analysis would have solved trivially. I can see value there. To @millerdev 's point, the issue of migrating our code base into one that uses type hints appears intimidating, and based on other migrations, would likely take years. That's certainly not something worth doing. However, MonkeyType exists, and that seems like something that could realistically cover most of our code base. My understanding is it uses debugger hooks so that as you run your program as normal, it records the types of parameters in a separate Python Interface (.pyi) file. A separate utility can later be run to apply those interface definitions to your actual code. That said, I still find them ugly and would prefer to just not use type hints. Foregoing type hints will not leave us in the dust. Typescript has been around for 10 years now, compared to Type Hints' 7. A 2021 language popularity rankings has Typescript as the 8th most popular language, ahead of even Ruby. Yet, the most popular language is still overwhelmingly plain Javascript. I believe Django has been talking about type hints for the last 5 years, but still hasn't put them into the codebase.
Type hints aren't something like a new framework. Statically typed languages, offline static analysis tools to work with those languages, and IDE support for those languages, have existed for a very long time. What else does a C# developer, for example, need to know to have a fair opinion on type hints? I definitely appreciate the amount of effort everyone has put in to contributing to this (and I learned something about default mutable parameters!), but I do feel like we're liable to continue arguing in circles. I think we should just put this to a vote among anyone on the team who is interested, asking which of the 3 options listed above is preferable, and go from there. |
+1 for a vote. Is ranked choice too complicated to implement for this? |
A vote? Already? I've really been enjoying this steady back-and-forth. I'd be up for a ranked choice vote.
Yes! And yes!
Do not use type hints at all: Django Use type hints (almost) everywhere: Quantlane, Dropbox (former employer of Guido van Rossum) Leave type hint usage up to the developer: Instagram:
The collective wisdom has chosen each of the three options.
When I say, "It sounds like you're suggesting that someone else's opinion of what the HQ codebase should look like is more authoritative than our own," I do not mean that we should go rogue (or even go Frank Sinatra). I just don't want us to parrot someone else's approach unthinkingly. We should do what we're doing right now: Evaluate the choices made by other companies; read their blogs, watch their YouTube videos. And then have a conversation among ourselves. We do our best to hire smart people, with valid opinions, including you. Listening is part of Dimagi's culture. We shouldn't make decisions like this without properly listening to each other. Only then can we make the wisest choice we can, for ourselves. You, @millerdev and @joxl have all expressed strong concerns about consistency. You say that Instagram applies type hints consistently, and Instagram says that their type coverage has grown organically as their developers have chosen to use type annotations. It is easy to imagine that you and I agree that "consistent" operates on a higher level than simply "everything looks the same". I want a consistent code base. I think we can achieve that by following these steps (not just for type hints, but for all aspects of our codebase, including currently inconsistent things):
I am strongly opposed to mandating that all code must have type hints:
The reason I opened this CEP was to establish a common starting position and begin a process that could result in people, each, happily, reaching a shared conclusion. But in practice that process can take years, if it ever happens at all. And comparing where we are now with our starting position, it looks like nobody has really shifted at all. I have learned a bunch of things about type hints that I didn't know a week ago, but I still think they are worth using, and I still don't want other people to be forced to use them. So, if we've had enough of all this reading (and frankly I can't spend another evening writing responses), how should we go about voting? A bunch of "1: foo, 2: bar, 3: baz" comments in a thread in #gtd-architecture-team? |
I think it's interesting we interpreted the same line differently, again. I've reached out to Carl to hopefully clarify, but I think it's unlikely that they didn't have the goal of moving to 100% type annotations in the future. One reason would be that, with hundreds of developers, they most likely don't have specific code silos where one developer can say 'This is my code, I do/do not want type annotations on it'. For another, Carl's writeup on MonkeyType here ends with:
And then there's Instagram's PyCon keynote about moving to Python 3 here where they cite type hints as one of the prime motivations to move to Python 3. The issue with devolved authority is that we aren't a bunch of a federated states -- we are a single codebase. While you may be the most likely person to work with motech code, for example, I have to work with it when I'm doing some cross-cutting work (like the XSS issues), and SaaS developers end up having to dig into modules outside their expertise as part of the Genie rotation. What makes me unhappy isn't having to use type hints when I would prefer not to -- there are plenty of practices that I follow professionally that I don't elect to use personally. What makes me unhappy (and frankly, less productive) is seeing code written one way in place A, and another way in place B. It's harder to understand, leads to abuse as an inexperienced developer copy/pastes your code from the place 'it belongs' into another module that doesn't follow those standards, and leads to confusion when code reviewing. Apparently Guido never expected people to write Python programs containing more than a few thousand lines of code. It seems Instagram and Dropbox, having codebases well beyond that, turned to Type Hints as a way to ease the maintenance burden Python gets when it scales to that size. If we feel we're there (or have plans to be there), then let's use Type Hints. If not, let's not. And if we do use type hints, please, just KISS -- new code gets type hints, do your best to update old code you touch with hints, and encourage people to use MonkeyType. That's a sustainable model that follows how we treat testing. It doesn't put an undue burden on developers, is easy to follow, and, at least for myself, would not make me unhappy. |
Hi all, Have been reading through the conversation without jumping in but someone said the word 'vote,' which is my codeword on questions on standardization, so I wanted to clarify a few points. One super important thing I wanted to make sure to communicate: Everyone should consider the default decision-making mechanism of the codebase as a shared common good as consensus, not democracy. Ideally the result of robust conversations about ambiguous / no-right answer conversations about style is a clear consensus among affected stakeholders, but in the event that there is not a consensus reached there is not currently a clear path for pushing through a standards change. As has been mentioned a few times in this thread, there's no credible outcome of this situation which creates a "finger snapped" perfect world where we choose a new approach and all of the sudden the existing code follows it. As a result, any changes we make to using different development standards to do things "the right way" results in a new way that the code is fragmented, and a long term commitment to shouldering that new fragmentation while also agreeing to a path for incrementally improving the consistency of the code until we are at a clean point. Right now the consensus standards are essentially "New code should pass linting, and changed code should be updated to pass linting," "python should roughly be pythonic," and "write code that looks like the code around it. "If we want to create a mechanism for decision-making in non-consensus situations to force different code standards on contributors that's something we can pursue, but I think it'll involve a few steps in between figuring out this particular issue. I don't have strong opinions about Type Hints in Python, but I need to be bought in to any changes to how we are making decisions about standards in the codebase. If we are going to start adopting new standards by force (including voting) rather than consensus, I need to have someone (or a body) who can be accountable for setting the objectives for prioritizing what standards work is most important and establishing a viable approach to choosing and adopting new standards which works towards that objective. Otherwise, I think it's fine to continue the larger debate and discussion and continue trying to reach a consensus (or reach an agreement that a consensus isn't possible). I don't want to chill the conversation and effort to improve practice, but I wanted to make sure to communicate clearly that 50% of some quorum of contributors don't have the authority to make substantive changes to the project's approach over the well qualified objections of other stakeholders. |
I've had a couple chats with folks on this topic and wanted to clarify three points from what I said above. 1 - I realize now that it's important to clarify the word 'consensus,' and especially the notion that a voted outcome is not a consensus. A consensus among the team is a much stronger state where the team is either aligned with a decision, including a weighting of the impact and the vehemence of agreement, disagreement, or neutrality. The bigger the change, the more an enthusiastic agreement is important. 2 - I was perceiving the core debate here to be "adopt a project-wide standard for Type Hints," not "can some people use type hints and others not," which is something else that has been mentioned. We don't have strong consistency standards, but some people starting to adopt semantic annotation-level type hints into the code isn't something that passes the baseline "code should look like the code around it" check across the codebase. I'll admit this can be a fuzzy line, but I think Type Hints don't fall into the category of something that some people can adopt in-part only. If people want to "try out" Type Hints in a python codebase, we could pick one of the smaller ecosystem projects (DET, CCSync, etc) and test it somewhere that folks can agree is ok to apply comprehensively as a project standard. I'll keep making time to chat with folks about this, and about how we can best structure the team towards different ways of making these kinds of decisions if they are a priority. |
Dropping in to say I haven't had time to follow up since my comment. I'm not ignoring this, just haven't had the luxury of time read and process the updates. I'll try to review this next week. |
I noticed some interesting code in a recent PR which seems relevant because it adds type hints in the doc strings of functions. Here's an example: def index_exists(self, index):
"""Check if ``index`` refers to a valid index identifier (index name or
alias).
:param name: ``str`` index name or alias
:returns: ``bool``
""" This seems like a worse world to be in than if type hints had been used since it has all the same issues but none of the benefits. It is not verifiable with static checking, does not provide any IDE support and is just as likely to become stale as type hints. I also think it is more verbose and less readable. Judging by suggestions in this thread, either better naming is in order (thought that still doesn't help with indicating a return type): def index_exists(self, index_name_or_alias):
"""Check if ``index_name_or_alias`` refers to a valid index identifier (index name or
alias).
:returns: ``bool``
""" Or type hints could be used: def index_exists(self, index: str) -> bool:
"""Check if ``index`` refers to a valid index identifier (index name or
alias).
""" Or perhaps a combination is the best of both: def index_exists(self, index_name_or_alias: str) -> bool:
"""Check if ``index`` refers to a valid index identifier (index name or
alias).
""" I'm curious what others think. |
In the above example, I think both type hints and a doc string are superfluous. def index_exists(self, index_name) If the choice is between type hints in docstrings or formal type hints, I agree that type hints are the better approach. But I think the example here highlights how little value they tend to add, such that the real decision is: do we want to complicate our codebase with sporadic annotations, or just apply a consistent approach? |
I motion to close this CEP, not because I personally disagree with it, but because there is still strong disagreement and there is no apparent trend toward reaching a consensus after a lengthy discussion. This implies that we will return to the status quo. Unfortunately, this discussion has surfaced that we may not have consensus on what the status quo is. In the original proposal to use type annotations a consensus was never established about how and when they (as a particular point of style) may or may not be used in HQ code. My understanding (which is not necessarily correct or authoritative) was that since new sub-components are starting more-or-less “fresh” in their code style, type hints were an acceptable style choice that code owners could choose to adopt for those sub-components. The assumption based on our “code should look like the code around it” standard would be that type hints should be applied consistently within a sub-component if they are used at all. This status quo implies that use of type hints would be inconsistent across the larger HQ code base. Several individuals in this discussion have articulated that for Type Hints in particular this is not an ideal outcome, regardless of their opinion on adopting them. One broader consensus we may agree on is that the value of type hints is most likely to be realized across large code bases which implement them in full, so if we can’t reach a consensus about adopting that approach I want to propose that we also reset the status quo towards not further increasing inconsistency in the code base more broadly, and thus not adopting type hints as new style in new modules without a very specific reason (like mandatory usage in Data Classes). Reading back over that original discussion, I also advocated to have an automated type checker in place before we started using type hints anywhere. At least one attempt was made to implement something like that as part of our CI tool-chain, but it was ultimately unsuccessful and use of type hints in HQ proceeded without it. This post was written with collaboration and input from @ctsims. |
Type checking
After a little wrangling with configuring the GitHub workflow, we now have a working automated type checker: #31225 It runs alongside our tests. Modules to be checked are added to mypy_typed_modules.txt as
Style guide
Our definitions of "inconsistent" are, well, inconsistent, but we do agree that inconsistency is undesirable. Where type hints are used, I would like them to follow PEP 484 with consistent, easily readable formatting. I have put a style guide in a Google Doc because this CEP has become a historical record of our thought process, and didn't feel like the right place for a reference for styling the type hints that already exist in our codebase. Only one value?
A value. Not the value. It seems obvious to me that if one were to plot the value of type hints on a graph, it would not flatline at 0 until the moment one reaches 100% coverage. The blog posts and videos linked above present pretty solid arguments that the value of type hints is proportional to their coverage, and that automated static analysis is an important value, but it is certainly not the only important value.
I find this definition of "inconsistency" superficial. Yet again, why type hints, and not docstrings? I'm being serious. Can someone please explain to me why a function definition that looks different because it has a docstring is not inconsistent, but a function definition that looks different because it has type hints is inconsistent?
A reason which excludes any of the following?
All ... or nothing?If Alice wants automated static type checking of code that she owns, and that helps her to write safer code that is cheaper to maintain, is it fair for Bob to deny her, because he likes function definitions to look different? I mean, maybe it is fair, if both Alice and Bob agree that it is?! If it is more important that all function definitions look the same across the codebase than any of the reasons above, should we not rather strip all |
Thank you for explaining this. I'm going to start off sounding like I disagree, but by the end I'm going to agree, mostly. 📜 Docstrings
Yes. I am trying to understand why "code should look like the code around it" does not apply to docstrings in the same way that we are thinking about type hints.
We do that, universally. But I am happy that we don't do this bit:
Instead, we consistently apply the principle you talked about regarding comments: Use docstrings when they add value and don't use them when they don't. In fact, going against that principle would be inconsistent. 📐 Type hints
This is the crux of everything! I'm pretty confident that this is why this topic has generated so much discussion! And David Beasley's tweet is such a great example, because it's a perfect illustration of what people hate about type hints. It looks so bad, it's hilarious! But it's also a strawman, because that's not what anyone wants, and it's not what I'm advocating. Here is the same function, styled in line with our (comments-and-contributions-welcome) Type Hints Style Guide: If you see those colors, in that layout, often enough, your eyes can skip right over them. They become so recognizable, that they are like fence posts along the side of the highway.
It's not that it's no longer true. What changed was the realization (of Guido and others) that being untyped is suited to some situations, and being typed is suited to others. (Like docstrings!) 🎶 Where we agreeWhat exactly are the situations that Guido thinks are suited to type hints? In Guido's words,
It seems that "write code quickly and be more productive" is great for the first few years of a product. That moment when Twitter switched from Ruby to Scala? Maybe that's the same moment in the life of Dropbox when Guido thought they needed something like type hints, or when Instagram thought they should use them. (To be fair, Twitter's motivation was different, but I think there is a reflection point in the life of any product where the nature of the codebase changes.) So maybe we're on the same page: As with docstrings, we are applying a principle. But with type hints, the principle applies at the scale of a codebase. 💡 Pattern recognitionDeciding to halt, or roll back, type hints isn't going to help with pattern recognition. Someone asked me today, "Will our codebase have type hints in five years' time?" And I guess my follow-up question would be, "How good will our Python skills be in five years time, as a consequence of whatever we decide?" Will our eyes still be stumbling over the red text that appears after those colons? 🙈 Make it stopNever, ever, ever want to see a type hint? That might also be a possibility. We can go and upvote this feature request for an option in PyCharm to hide type hints. I've upvoted it already. If your eyes will always get stuck on that red text, this feature could unlock the best of both worlds: Gradual typing ... and never even noticing it. |
The Curtiss-Wright Corporation ("Wright" as in "the Wright Brothers") shut down its Aeroplane Division in 1948, after failing to make the transition from propellers to jets. Yesterday I fixed a bug, caused by a typing issue that would have been caught by type checking. (The result of a method that returns a datetime or a string was passed to a constructor that expects a datetime not a string.) Where we are at with typing:
I had thought that we could mull this over for a month or two, or many; why push to reach a consensus? But encountering trivially avoidable issues is frustrating. The way forward:
|
I've been largely absent from this discussion because I don't have particularly strong opinions, but I'll put my voice behind the in-between perspective here. I really liked Simon's framing above (or at least my interpretation of it). That is, type hints are sometimes useful as documentation, and are better than including lines like Two recent specifics that come to mind for me:
Using type hints for documentation is beneficial with or without type checking, and it shouldn't reach the level of clutter as with universal type hints. |
I'll be honest -- I tapped out of this discussion when a prominent python author and speaker's difficulties with type hints were characterized as a strawman. I think its very reasonable that those of us who haven't been in the industry for 35 years, haven't written multiple books on Python, haven't given conference talks on Python, aren't teaching courses on Python, and don't regularly tweet about Python are liable to at least produce the same type of jumbled mess of type hints he produced above, if not worse. If we really want to run a pilot program, it should be outside of the HQ codebase, but I don't think a pilot program is a solution to essentially a religious argument -- someone who likes code without semi-colons isn't going to suddenly see a program with semi-colons and change his mind. In my mind, type hints were introduced to prevent errors that could be caught at compile time. If it were obvious that our codebase was struggling with these types of errors, I don't think there would be much resistance to moving to type hints. That's the type of evidence I would want collected. By contrast, replacing one form of type hints (params) with another doesn't really move the needle. The only benefit is that people can now use their IDE to navigate to the dependency? At the cost of introducing more inconsistency? Isn't it likely that someone sees type hints used in this limited area, determines that our project supports type hints generally, and then gets code through PR that uses general type hints? That sounds like the situation that triggered this CEP in the first place. Good python code without type hints has been written for decades now, and it's still being written without type hints. Doing a quick survey of some of our most widely used dependencies, for example, and looking at the most recent code available on GitHub: Does not use type hints:
Uses verbose docstrings, sometimes limited to just public APIs:
Moving towards type-hints:
Full type-hints:
|
I am a big fan of Dave Beazley. I bought (and even read!) two of his three books. I loved and followed his series on generators and coroutines. He is undoubtedly much smarter and more experienced than me. I think it's fair to say that his difficulties with type hints are not for a lack of intellect, but still real and valid. It's not clear to me whether the code in his tweet is his own. But it is clear that he chose not to lint it or to show syntax highlighting. I don't think it's controversial, or disrespectful, or even insensitive, to say that he did not go out of his way to "steelman" type hints and represent them at their most readable. I think he is just trying to make a point. And if he is, that's fine! We don't all have to like every feature of Python. I'm perfectly OK with the idea that Dave Beazley doesn't like type hints. Type hints are not for him. He's not working in a large codebase. By my count 1, curio has ~5K lines. SLY has ~15K lines. (Django has 651K. SQLAlchemy has 486K.) HQ has ~3.5M lines of Python. In his own words, regarding his latest book, Python Distilled:
If we think that 3,488,462 lines of Python is a large codebase, then proponents of type hints like Guido, and detractors like Dave, both seem to think that our codebase fits the use case of typing. I would like to have an unemotional conversation about whether typing would benefit our codebase and improve the long-term productivity of Dimagi. I am not up for a religious debate about a language feature.
Of course. That's why we write style guides, and linters, and code formatters. More importantly though, none of us were born with Python skills and experience. We accumulated them through our own perseverance, and the patience and kindness of others. I feel an affinity for the Python community. Working in Python makes me happy. "It fits my brain". It's also a culture. PEP 572 and its aftermath showed that, for some, "fits my brain" more accurately means that Python 2 is embedded in our sense of self, and Python 3 features like asyncio, typing, pattern matching, and the walrus operator, at a minimum are deeply distressing, if not violate our ability to be the best version of that self. In the context of "Impact, Team, Profit" I think type hints move Impact and Profit in the right direction, but if they necessarily have to sacrifice Team in order to do so, then maybe they are not for the HQ codebase. If it has taken me too long for that to sink in, I am sorry for being insensitive. I have benefited from the patience and kindness of others. I've been told that I am a patient person. I have not meant to be unkind. Footnotes
|
Just to re-iterate that my stance on this is that I would like to be able to use type hints in CommCare and I think the broader team would benefit from having them. It is also my opinion that using them with a type checker can improve the quality of the software. I have no ideological stance on them and while the formatting can be irritating at times I don't see that as a blocker. I see them as another tool in the belt that provides a developer with more information about the code they are using and has the potential to provide another layer of automated quality checking. I do agree that they are not necessary everywhere in the code base. I would like to propose doing a trial on a subset of code. The 1.. Set up type checking for the app We can treat it as a hypothetical exercise (meaning that for the purpose of the trial we assume that we will have some use of type hints in CommCare). At the end we make a final decision on whether to proceed with the changes or to revert back. |
I like this idea, because the I took a look at it last night. I completed typing the normal modules. I just need to finish the tests. I'll open a PR once they are done. The exercise has already been beneficial. Type hints exposed a bug here -- |
Hey @kaapstorm / @snopoke - Wanted to jump back in with another gut check on where we are on things.
@kaapstorm - Just to level set on expectations: I haven't seen buy-in from the people who aren't in favor of type hints that they specifically are bought into the process of adopting a full module under some hypothetical approach and reviewing the results. If you are updating an entire module outside of the main branch for your own learning and to inform how you intend to re-approach you are welcome to do so. However, I want to reinforce that the baseline right now is that there is no consensus for change, and that people aren't obliged to participate deeply in new proposals and continue to object for that to remain true. If the result of the work you're doing is a big PR and a new proposal for adoption and people don't have the time or energy to continue debating or objecting, I don't want the lack of active objection to be mistaken for a new consensus. |
Thank you for making that explicit, @ctsims. Those are my expectations, and I should have made that clear. I don't want anyone to think that I'm trying to bulldoze this through. I think there are a few things that I ought to have been more explicit about. I took a different approach with this CEP than other CEPs take: Generally CEPs present a proposal whose details have already been worked out, and it's in its final state. I used this as a channel for discussing typing. In hindsight, before I take this approach again, I will definitely try to find better ways to discuss and explore an idea. Along the way I've learned a lot. And my position has shifted. Coming into this, I had used type hints in a FastAPI project (where they are used to validate requests and responses) and when refactoring or working in other people's code. I find it hugely valuable to be able to Ctrl-click and immediately see a class definition, what methods and properties are available to me, and how best to solve whatever problem I'm tackling. And coming from that point of view I have been perplexed that not everyone else wants this; at times confused, and frustrated. I had never used a type checker like mypy to check the typing I was setting. I didn't have a clear idea of what a type checker can know, and what it can't know. Implementing automated type checking has taught me a lot, and shifted my perspective. I think things are a little more nuanced that I originally thought. It's easy to add type hints. It's not as easy to add typing in a way that can be validated. And if type hints are going to have the value that we (or I) am hoping for, we need to be able to validate them. Adding typing to someone else's code is definitely not trivial. Adding typing to unit tests can be very time consuming, without much payoff. We would need to watch out for a few things that we do all the time. For example: class Foo:
def __init__(self) -> None:
self.bar = None
def set_bar(self, value: int) -> None:
self.bar = value
if self.bar:
do_something() This is not OK. A type checker thinks that bar is def __init__(self) -> None:
self.bar: Optional[int] = None Something else I noticed is how we extend lists: class Duck:
def quack() -> None:
print('Quack')
class Quacker:
def quack() -> None:
print('QUACK!')
class DuckProto(Protocol):
def quack() -> None:
...
def everywhere_a_quack_quack() -> list[DuckProto]:
ducks = [Duck()]
quackers = [Quacker()]
return ducks + quackers This is wrong in several ways. To make it work you need ... def everywhere_a_quack_quack() -> list[DuckProto]:
ducks: list[DuckProto] = [Duck()]
quackers: list[DuckProto] = [Quacker()]
return ducks + quackers (And PyCharm will complain. PyCharm is not yet good with Protocols. (There is an issue logged for this, but I can't find it at the moment.)) Tests are worse. We often use complicated data structures for passing values, and asserting expectations. Typing these tests can take a while to unpack and understand the data structures they use. And for what payoff? We never call test code from outside the tests. They are never executed in production. The bottom line is that doing typing correctly is not simple for Pythonic code. I will open a PR (and mark it as "Do not merge"!) so that those who are interested can follow my working as I added typing to the metrics app. Here are some of the things I learned by doing that:
Based on these lessons, I think my ideal scenario would be to identify a small number of apps that get used widely (like metrics?), or where data types are significant and getting them wrong is easy (maybe motech?), or where there is above average churn and changes would be faster if data type were explicit and definitions were one click away. We guestimate the effort to type it. And if we decide it's worth it, we type the whole app (excl. tests). I hear the position that this is too much effort, and we should just remove existing type hints. I still find that position confusing. Maybe people who are smarter than me find it easy to modify code or add features in areas they are unfamiliar with. But for dummies like me, I would advocate typing apps where the people working in them would get the most value ... and leaving (or adding tests and/or removing existing type hints) in apps where the people working in them aren't going to find them valuable, or the effort is higher than the benefit. I appreciate that our discussion has ruled out this ideal. At this stage I'm willing to compromise to either an "all the code must have type hints", or a "none of the code should have type hints", but I think that both those stances either miss out on substantial benefits for some developers, or is too clumsy for what seems to me to be a more nuanced language feature. (That PR: #31552 ) Edit: Thanks for pointing out the missing |
I found this issue on https://ovio.org/projects and would love to contribute! i'm Arby |
@Arbythecoder I'm delighted you found us. Maybe the two best places to start are
The community forum is a good place to ask questions, and get directions on what you might want to do and how to do it. |
Hi Norman,
Thank you for reaching out and providing the resources. I appreciate the
guidance and will definitely check out the document on contributing and the
developer community forum. I'm looking forward to getting involved and
contributing to the community.
Best regards,
Arby
…On Tue, Aug 27, 2024 at 10:23 AM Norman Hooper ***@***.***> wrote:
@Arbythecoder <https://github.com/Arbythecoder> I'm delighted you found
us. Maybe the two best places to start are
- Our document on contributing
<https://github.com/dimagi/commcare-hq/blob/master/CONTRIBUTING.rst>
- Our developer community forum
<https://forum.dimagi.com/c/developers/5>
The community forum is a good place to ask questions, and get directions
on what you might want to do and how to do it.
—
Reply to this email directly, view it on GitHub
<#31160 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/A2WG7NTSBXI7IDGTGFZE6F3ZTRAPXAVCNFSM6AAAAABNFFU2T2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDGMJSGAYDIMRZGU>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
Abstract
The use of type hints in the CommCare HQ codebase has been discussed in the past. It was decided that we would not adopt them until they can be verified automatically, as part of a build process.
At the time, we evaluated mypy and pytype as tools for doing that verification, and realised that both of them would require an unreasonable amount of effort to configure for use with the commcare-hq repo. Progress stopped there.
Since then, I've been using type hints in my own projects, and not only have I found them to be valuable, but I have also come to believe that verifying them misses a subtle but important aspect of their design, and I no longer think that automatic verification is desirable. More on that under "Specification".
Motivation
Why do we care? Type hints have been controversial. Surely it's easier simply to avoid the controversy and not use them.
We care because the goals of type hints offer tangible value to us. From PEP 484:
I am trying to move this initiative forward because I find type hints valuable in my personal projects, and I would like Dimagi to be able to benefit from them too.
Being able to Ctrl + Left-Click to jump straight to the definition of a variable's class can save a lot of time, trying to search for and trace back through function and method calls to find where a variable is first set, in order to figure out its class, and go to that class's definition. Sometimes debugging can make this easier, but not always. Good variable names can also help, but not as often or as much as one might hope.
Type hints are also really valuable when refactoring, to ensure that changes continue to pass parameters in the types that functions and methods expect. If the original author sets those type hints, it saves other developers (or the same developer much later) a lot more time trying to figure what those variable types ought to be. When it comes to the longer-term maintainability of code, the time trade-off that we make for unit tests is applicable to type hints too.
(Type hints do not fulfill the role of unit tests.)
Because they are hints, not declarations, they are a form of documentation. As documentation, they are not necessary or desirable everywhere, and like comments and docstrings, they bring a maintenance burden of their own -- they need to be updated when their variables are refactored. It is the responsibility of the developer to navigate the balance of benefit and burden when determining where to use comments and docstrings, and also type hints.
Specification
Python is designed, and is intended to be used, as a dynamically typed language. Type hints are not meant to encroach on that principle. The authors of PEP 484, Guido van Rossum, Jukka Lehtosalo, and Łukasz Langa, emphasize:
It seems clear to me that type hints are not meant to be used the way that type declarations are in statically typed languages. In fact, the clue is in the name: they are not declarations or mandates or obligations; they are only hints.
This is the subtle but important aspect of their design that I had initially missed: Developers should be able to set a type-hinted variable with a value of another type.
The obvious use case is when using test fakes in unit tests.
But the practice is based on original Python design philosophies:
Python has always treated its developers like adults: "Private" attributes and methods are not private; Python trusts that the developer knows what they are doing, and allows them to access their values. It merely discourages it by making it awkward.
Duck-typing is an established Python idiom. It is encouraged, and reinforced by recent language features, like protocols, and structural pattern matching.
These are the reasons why I think that treating type hints as static type declarations is undesirable; maybe even a mistake that we would come to regret.
This CEP does not specify how we should use type hints. (That is covered by existing PEPs.) This CEP specifies the principles to guide when we should use type hints. We have mentioned all of them already. To summarize:
Type hints are documentation. They add value, and come with a cost. It is the responsibility of the developer to navigate the balance of value and cost.
Automatic verification of type hints is undesirable and impractical. It is the responsibility of the developer to check that type hints, comments, and docstrings that pertain to their changes are correct. We already do it for comments and docstrings. We can also do it for type hints. (Developers who use PyCharm and VS Code even get a little help.)
Type hints are not type declarations. Python grants the developer the freedom to assign values of different types, and expects the developer to know when it is and is not best to do so.
Dimagi has a strong track record of hiring responsible developers, and follows a development process that ensures that poor code choices are addressed before they are released.
I believe that observing these principles will result in practices that increase the value to our codebase, improve its maintainability, and improve our skills as developers, without compromising the quality of the code.
Impact on users
N/A
Impact on hosting
N/A
Backwards compatibility
N/A
Release Timeline
N/A
Open questions and issues
Earlier documents:
The text was updated successfully, but these errors were encountered: