-
-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ensure proper log line/filename in warning/errors #3038
base: main
Are you sure you want to change the base?
Conversation
Looking at this more, I'm seeing that the tests fail due to xdist: there's now race conditions on the global root logger... I can get around this by testing
This of course changes the interface...which I'm hesitant to just throw out there. Although, as far as I can tell, nothing in this codebase uses them outside of Any thoughts? |
That's the point :). import logging
logger = logging.getLogger(__name__) and things will work. I would agree that some things could be cleaned away.
|
@copperchin: Any chance you could take another look at your PR here and decide how you'd like to carry it forward? |
Yes, I'll take another look at this. I don't actually remember why this stalled but, reviewing the notes here, it looks like this might have been blocked by a deeper issue. If that's the case I'll open a separate issue for that and link it here. |
So I think this stalled because I became mired in a lot of unrelated test failures. At first, I was getting a very large number of test failures due to timezones. Turns out this is an issue in the main branch even without any additional changes. Installing The other things I'm working through is a number of remaining test failures in what feels like unrelated code. These failures arise because rather than testing the direct result of an operation, they count the number of warnings logged. Now, in theory, the changes I'm putting forward shouldn't change the actual logging behavior, so I'm looking into why these other tests no longer pass, but I am a bit concerned about leaving tests to rely on specific logging behavior to pass/fail. Anyway. I'll keep kicking at this for now, but wanted to leave an update. |
@copperchin If you need any assistance or guidance, I'd be happy to lend a hand. |
@avaris Thanks! I think I had a breakthrough that resolved most of the issues when I last looked at it. I haven't had much time at my machine for the last week and a bit but I'm hoping to be able to spend a bit of time tonight on it; I don't think an updated PR is far off. |
Just working on adding some tests, doing some cleanup, and adding docstrings to my changes. I have a couple of questions regarding style preferences for pelican code:
|
e21bdc7
to
fce597d
Compare
@copperchin sorry, I totally missed your questions
I have a love/hate relationship with type hints. If you start your project with type hints in mind, they are usually great and helpful. But adding type hints to existing and dated projects was most of the time frustrating (unless you are willing to refactor it to the point of almost rewriting).
I don't think we have a docstring style. As long as it is accurate I am fine with mostly anything :). |
@avaris Yeah I see what you mean; I initially started adding type hints for new content but it felt so out of place that it seemed more distracting than helpful. I also balked at how long it made lines >_< haha. I see the latest surge of merges (a great thing!) has left this PR with some conflicts -- I'll try to get those resolved this weekend. |
@copperchin is it possible to incorporate #3108 into this? Once this is merged, that PR effectively needs to be rewritten :). But if it will delay this one further, don't worry about it. I can deal with all the conflicts, refactoring after this is handled. |
Funny you should bring that up; I came across #2893 yesterday and commented there -- missed the PR #3108 though. My suggestion was to keep the 2-tuple form and just use regex objects where regex is desired -- this would preserve backwards compatibility. I'm happy to implement that change but I don't want to include it in the scope of this change; I'll work on it after this gets merged. |
Overview ======== Motivated by issue getpelican#3037 (Ensure proper log line/filename in warning/errors). ``pelican.log`` is no longer required to be early in the import order, as the package no longer relies on a custom logging class. Logging setup is now uncoupled so that format and filters can be applied independently, which makes both setup and testing more flexible. Behavior of commandline calls have not changed, except for some modification to help texts. ``LOG_FATAL`` and ``LOG_ONCE_LEVEL`` are new configuration keys. Summary of changes ================== :pelican/__init__.py: * Parsing commandline: * Unify type-converstion from string to logging levels. * Tweak some help text in ``parse_arguments``. * Preserve ``log`` namespace throughout (for clarity). * FIX: ``pelican.listen`` verbosity: avoid assumption that the logger's level will be NOTSET. :pelican/log.py: * FIX (getpelican#3037): warnings & errors now display the correct line numbers. * Split logging filters with focus on a specific aspect of behavior. This allows for more a) independence in testing and b) powerful logger configuration, with fewer side effects. * Isolate configuration *reading* from configuration *application* (see ``pelican/settings.py``). :pelican/settings.py: * Remove ``pelican.log`` dependency. * Isolate configuration *reading* from configuration *application*; to remove dependency on a global state when modifying log filters. * Add validation for LOG_LIMIT values. :pelican/tests/__init__.py: * Remove dependency on ``pelican.log`` module as test runners setup their own console handler anyway. * Only add ``NullHandler`` in absence of a configured logger. :pelican/tests/support.py: * Remove ``LoggedTestCase`` class; tests requiring logging analysis use a contextmanager (``LogCountHandler.examine``) instead. :pelican/tests/test_log.py: * Rewrite to reflect changes in pelican's logging module. * Increase code coverage. :pelican/tests/test_contents.py: :pelican/tests/test_pelican.py: :pelican/tests/test_utils.py: * Import ``unittest`` directly, rather than from other modules. * Switch to using contextmanager (``LogCountHandler.examine``) for test-specific log inspection. * Remove dependency on a custom TestCase (used in logging analysis). * Remove code involved with temporarily disabling log filters -- the filters in question are now only enabled in tests directly testing them (eg. in ``test_log.py``) :pelican/tests/test_settings.py: * Add test for LOG_FILTER values from configs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@copperchin This is from an initial pass. I need to play with it to see how it behaves, depending on that more might come :).
try: | ||
return logging.getLevelNamesMapping()[level_name] | ||
except AttributeError: # Python < 3.11 | ||
return logging._nameToLevel.copy()[level_name] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although unintended, I'd rather rely on a semi-official method (getLevelName
) than an internal interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem with getLevelName
is that it's intended to take an int
and return a str
, but I need the opposite conversion.
While its technically possible to muscle this behavior out of this logging.getLevelName
- the name is misleading and counter to the intended effect
- the documentation basically says this behavior was accidental, removed, and then added back because it was breaking old code. In short -- using it is relying on a hack.
The preferred way of doing the conversion (logging.getLevelNamesMapping
) is used instead -- but as this isn't available in Python < 3.11 I'm using the alternative as shown above, when necessary.
The other reason I don't use getLevelName
is that it can potentially return a string, and I need to guarantee an int
.
I'd rather keep this as-is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it is accidental (hence why I said "unintended") and maaybe considered a hack... But it is documented and is far more likely to remain as is in a minor patch than an undocumented internal dict. That's why I suggested :). I don't feel that strongly about it, thus I can live with it as is. Why do you .copy()
it though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm just imitating what the standard lib does:
# logging.py
def getLevelNamesMapping():
return _nameToLevel.copy()
I'm guessing its there to future proof against an accidental pass by reference?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah OK. Yeah, it's probably copying so that any modification on the returned dict
doesn't affect the original version. It doesn't seem necessary here.
""" | ||
try: | ||
logger.addFilter(flt) | ||
yield flt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't actually use it anywhere but seems like yielding logger
would be the more useful case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logger doesn't need to be yielded: the caller has already provided it, and it isn't even always referenced within the context.
The filter, however, is intended to be temporary; yielding the filter allows the caller to generate the filter on entering the context and have it garbage collected on exiting the context. It essentially allows this:
from pelican import log
import logging
with log.temp_filter(logging.getLogger("test_module", MyCustomFilter()) as tmp_filter:
test_module.do_stuff()
# outer scope needs not keep reference to filter; it can be garbage collected
This is an interface; while not used called within this module, it is actually used in several places in testing, and can be useful in scenarios where a filter is needed only for a subsection of code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What you yield has nothing to do with references. You don't need a reference to either of them outside with
block regardless of what is yielded. Context manager is already doing that during with
. You can yield None
or bare yield
if you wanted. yield
only defines the value that will be assigned to the name after as
.
What I'm saying is, you're far more likely to do something with the logger inside the with
block than a filter or a handler. i.e. your example could become:
with log.temp_filter(logging.getLogger("test_module"), MyCustomFilter()) as logger:
logger.warning(...) #
test_module.do_stuff()
""" | ||
try: | ||
logger.addHandler(hnd) | ||
yield hnd |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above, yield logger
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above ;)
The logger is already provided, it's more useful to yield the temporary handler to circumvent the need to keep reference to it outside the context.
|
||
""" | ||
super().__init__() | ||
self.limit_threshold = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This implies it can be None
but setting it None
will make it error in the comparisons below. e.g.:
if counter > self.limit_threshold:
I assume None
disables it, but then the filter
method needs to include that logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! This needs to be changed, and I should add the None
case to my tests.
def ignore(self, level, message): | ||
self._ignored.add((level, message)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unused
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While this might have been an artifact of one of the refactors I did in this branch earlier...I think I decided to leave that in as an interface.
Looking at this again now, I'd like to keep it -- I can use it to condense the implementation of OnceFilter.filter()
which is essentially duplicating the same work as part of its process.
self.assertEqual(log.severity_from_name('trace'), 5) | ||
|
||
with self.assertRaises(KeyError): | ||
self.assertEqual(log.severity_from_name("wrong"), logging.NOTSET) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assertEqual
is misleading. It looks like this is expected to return NOTSET
, but it's expected to raise KeyError
. assertEqual
will not run anyway. This is enough:
with self.assertRaises(KeyError):
log.severity_from_name("wrong")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
100% agree, thanks for spotting that!
|
||
def test_levels(self): | ||
logger = self.logger | ||
with log.temp_handler(self.logger, log.AbortHandler(logging.ERROR)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could maybe use another test for logging.WARNING
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I can add another test to confirm it works for not just logging.ERROR
.
def test_alt_logger(self): | ||
self.assertEqual(log.set_filters(self.logger), self.logger) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is tested here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's confirming that when a logger is specified in set_filters
the returned logger matches the specified one. If the logger is not specified, the PKG_LOGGER is expected to return. I can add a comment here to clarify, and better still, a check that the PKG_LOGGER is returned when no logger is provided.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, comments would be good. While writing the comment it took me a minute to (maybe) figure out the intent and that was a guess at best :).
count_msgs(len(LOG_LVLS * 2), msg) | ||
|
||
|
||
class classTestOnceFilter(_LoggingTestCase): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo in name
with LogCountHandler.examine(logger) as count_msgs: | ||
for level in LOG_LVLS: | ||
logger.log(level, msg) | ||
count_msgs(5, msg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
5
-> len(LOG_LVLS)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gah! I missed one! Thanks! ':)
Thanks for the in-depth review @avaris ! This is a pretty big diff, so I really appreciate the effort. I'm hoping to be able to get revisions up here in the next fews days at most, to keep the momentum :) |
I also have a sneaking suspicion that filters will not be active for anything outside |
The topline issue can be solved with #3257, which changes only 2 lines. (I didn't realize this was underway, and in your defense, my fix relies on a Python 3.8 feature, which may not have been an acceptable solution a year ago when you started.) I realize this has grown in scope since...
Hopefully this was fixed by #3246. I suspect it might be a Windows-only issue, which would lower the number of people reporting it. |
Ooooh, this is nice. Can this also be applied toward the unittest/pytest unit test script files in One example is to have a test case that actually verifies not only the STDOUT but the logger as well in ensuring that the line number and column offset of a syntax error really does appear. Right now, I've marked it as |
pytest registers custom handlers to the root logger to be able to capture log records emitted in your code, so you can test whether your program logging behaviour is correct. So |
Fix #3037
Pull Request Checklist
Resolves:
Warning and Error logs messages show wrong source
#3037