Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Attribute Errors raised in Parser Silently Ignored #9433

Open
james-mchugh opened this issue Jun 10, 2024 Discussed in #9426 · 10 comments · May be fixed by #9455
Open

Attribute Errors raised in Parser Silently Ignored #9433

james-mchugh opened this issue Jun 10, 2024 Discussed in #9426 · 10 comments · May be fixed by #9455

Comments

@james-mchugh
Copy link

Discussed in #9426

Originally posted by james-mchugh June 3, 2024
Hello.

First, thank you for the hard work on DRF. It has been great workin with it.

There seems to be an issue where an AttributeError raised in a parser is being swallowed by the Request class's __getattr__ method. I've noticed a similar issue reported previously (#4896), but it was closed as the reporter associated with an error from a separate package.

I encountered this because I wrote the server and tested using Python 3.11, and another developer attempted to test using 3.10. My custom parser used a function that was not available in Python 3.10 (hashlib.file_digest), which caused an AttributeError when hitting certain endpoints. Instead of seeing a 500 response and a traceback in the server logs, we were getting a 200 response with an empty dictionary in the response body. We were scratching our heads for a while, as there was no indication anything was going wrong.

Environment

OS: Darwin 23.4.0 Darwin Kernel Version 23.4.0 x86_64
Python: 3.10 + 3.11
DRF: 3.15.1
Django: 5.0.6

Expected Behavior

500 Internal server error response and traceback in server logs.

Actual Behavior

200 response and no errors in server logs.

$ curl --json '{"foo": "bar"}' http://localhost:8000/foos/
{}
[04/Jun/2024 02:59:15] "POST /foos/ HTTP/1.1" 200 2

Code to Reproduce

from rest_framework import viewsets, response, parsers, routers


class BrokenParser(parsers.JSONParser):

    def parse(self, stream, media_type=None, parser_context=None):
        raise AttributeError


class TestViewSet(viewsets.ViewSet):

    parser_classes = (BrokenParser,)

    def create(self, request, **kwargs):
        return response.Response(request.data)

router = routers.DefaultRouter()
router.register("foos", TestViewSet, "foo")

urlpatterns = router.urls

Investigation

This appears to be happening because accessing the data property lazily parses data. If the parsing raises an AttributeError, this is raised up to https://github.com/encode/django-rest-framework/blob/master/rest_framework/request.py#L359 where self._full_data is set to an empty dictionary before the error is re-raised.

This error then raises up and causes the attribute access to fallback to the Request.__getattr__ method via Python's data model. Here, the data attribute is attempted to be retrieved from the WSGIRequest stored in Request._request, and once again, this raises an AttributeError which is caught by https://github.com/encode/django-rest-framework/blob/master/rest_framework/request.py#L423. From here, the original Request.__getattribute__ handling runs again. Now when the data getter runs, self._full_data is already set to an empty dictionary which is then returned.

In the end, an empty response is returned and the error is silently ignored.

Noob question, but why does Request.__getattr__ call self.__getattribute__ when an AttributeError occurs? Based on the Python Data model linked above, __getattr__ should only run if __getattribute__ fails in the first place, so by the time __getattr__ runs, __getattribute__ already tried and failed.

@sevdog
Copy link
Contributor

sevdog commented Jun 28, 2024

Noob question, but why does Request.__getattr__ call self.__getattribute__ when an AttributeError occurs? Based on the Python Data model linked above, __getattr__ should only run if __getattribute__ fails in the first place, so by the time __getattr__ runs, __getattribute__ already tried and failed

This is the behaviour of __getattr__.

The issue is with this method overridden it intercepts and hides every AttributeError which may be raised in any other method/attribute.

That wrapper was previously implemented using __getattribute__ in d13c807 and then re-ported to __getattr__ in #5617 (see also #5576).

It seems that also this implementation has its own problems since it is going to mask every AttributeError even if not raised by the Request instance.

@sevdog
Copy link
Contributor

sevdog commented Jun 28, 2024

either getattribute() raises an AttributeError because name is not an instance attribute or an attribute in the class tree for self; or get() of a name property raises AttributeError

Maybe it is also a problem with python itself, because it is invoking this method when the AttributeError.obj is not the Request object. 🤔

@james-mchugh
Copy link
Author

james-mchugh commented Jun 28, 2024

Noob question, but why does Request.__getattr__ call self.__getattribute__ when an AttributeError occurs? Based on the Python Data model linked above, __getattr__ should only run if __getattribute__ fails in the first place, so by the time __getattr__ runs, __getattribute__ already tried and failed

This is the behaviour of __getattr__.

The issue is with this method overridden it intercepts and hides every AttributeError which may be raised in any other method/attribute.

That wrapper was previously implemented using __getattribute__ in d13c807 and then re-ported to __getattr__ in #5617 (see also #5576).

It seems that also this implementation has its own problems since it is going to mask every AttributeError even if not raised by the Request instance.

I'm still not sure that this explains why __getattr__ calls __getattribute__ though. That makes the attribute access flow as follows:

  1. Access attribute (drf.Request().foo)
  2. drf.Request.__getattribute__ runs to retrieve value for attribute.
  3. If it fails with an AttributeError, drf.Request.__getattr__ runs to get the attribute from the underlying django.Request object (per Python's attribute access handling).
  4. If this fails with an AttributeError, drf.Request.__getattribute__ is called again, even though it already failed when first trying to access the attribute.

(edited to clarify since there are technically two Request classes in play)

@sevdog
Copy link
Contributor

sevdog commented Jun 28, 2024

Found the bug https://bugs.python.org/issue45985

The reason is that .data is a property, thus as of now it is catching the AttributeError and thus calling __getattr__.

@james-mchugh
Copy link
Author

Found the bug https://bugs.python.org/issue45985

The reason is that .data is a property, thus as of now it is catching the AttributeError and thus calling __getattr__.

I do not believe that bug is related. The issue is that a Python property is being used to lazily parse the data, but that parsing can be pretty advanced and AttributeErrors raised from the parsing are not properly handled.

@sevdog
Copy link
Contributor

sevdog commented Jun 28, 2024

Just try it!

class TestBrokenPerser(SimpleTestCase):
    def setUp(self):
        self.factory = APIRequestFactory()

    def test_post_accessed_in_post_method(self):
        django_request = self.factory.post('/', {'foo': 'bar'}, content_type='application/json')
        request = Request(django_request, parsers=[BrokenParser()])
        with self.assertRaises(AttributeError, msg='no in parse'):
            request._parse()

        with self.assertRaises(AttributeError, msg='no in data'):
            request.data

FAILED tests/test_parsers.py::TestBrokenPerser::test_post_accessed_in_post_method - AssertionError: AttributeError not raised : no in data

If you follow the call-stack:

  • Request.data (property)
  • Request._load_data_and_files (does not catch)
  • Request._parse (does not catch)
  • BrokenParser.parse (raises AttributeError)

The AttributeError is raised and reaches a property and then __getattr__ is called, just as explained in that bug.

@james-mchugh
Copy link
Author

james-mchugh commented Jun 28, 2024

To be fair, it is somewhat related, but it's not clear to me if the Python team even sees that as a bug (opened over 2 years ago with no feedback). It seems to me to be more of just how Python handles attribute accesses. Properties are attributes too, so it makes sense that the __getattr__ will be called the same way other attributes are. I would expect that a decision to change this isn't simply a bug, but a design decision that would have to be introduced through a PEP.

The core issue here also isn't a confusing error message, it's that DRF's handling of the error completely suppresses the errors (no error messages at all). I would gladly accept a confusing error message due to the weird interplay between propertys and __getattr__ over that.

@sevdog
Copy link
Contributor

sevdog commented Jun 28, 2024

The bug with the parser may be resolved just by wrapping any exception raised by parser in a ParseError:

--- a/rest_framework/request.py
+++ b/rest_framework/request.py
@@ -356,7 +356,7 @@ class Request:
 
         try:
             parsed = parser.parse(stream, media_type, self.parser_context)
-        except Exception:
+        except Exception as exc:
             # If we get an exception during parsing, fill in empty data and
             # re-raise.  Ensures we don't simply repeat the error when
             # attempting to render the browsable renderer response, or when
@@ -364,7 +364,9 @@ class Request:
             self._data = QueryDict('', encoding=self._request._encoding)
             self._files = MultiValueDict()
             self._full_data = self._data
-            raise
+            if not isinstance(exc, ParseError):
+                raise ParseError(str(exc))
+            raise exc
 
         # Parser classes may return the raw data, or a
         # DataAndFiles object.  Unpack the result as required.

This will not address the bug of Python itself (I also put here the reference to the github issue python/cpython#90143 for better linking).

The problem may also be fixed by using __getattribute__ but that implementation was abandoned in #5617.

@james-mchugh
Copy link
Author

james-mchugh commented Jun 28, 2024

The bug with the parser may be resolved just by wrapping any exception raised by parser in a ParseError:

I mostly agree here. I think the bug can be fixed easily by adding error handling to the property in the data property to re-raise AttributeError as another type of error (the issue you link suggests RuntimeErrors).

@property
def data(self):
        if not _hasattr(self, '_full_data'):
            try:
                self._load_data_and_files()
            except AttributeError as exc:
                raise RuntimeError(exc) from exc
        return self._full_data

It could be handled in _parse as well.

This will not address the bug of Python itself (I also put here the reference to the github issue python/cpython#90143 for better linking).

I don't necessarily agree that this a bug with Python, as there is no confirmation from the Python team that it is a bug. It hasn't been triaged on for over 2 years. Just because someone opens an issue with a project doesn't mean that there is a confirmed bug.

The problem may also be fixed by using getattribute but that implementation was abandoned in #5617.

I think you may be misunderstanding me here. I'm not suggesting that the implementation moves back to using __getattribute__ instead of __getattr__. I'm pointing out that there appears to be a bug with how __getattr__ is implemented because it calls __getattribute__ at https://github.com/encode/django-rest-framework/blob/master/rest_framework/request.py#L423. On the surface, this doesn't make sense given how Python's attribute access works, but maybe there is some unclear inner-workings of DRF that requires this type of handling. At best, it's intentional but very unclear why it's happening. At worst, it's a bug introduced by a misunderstanding with how attribute access works.

If __getattr__ did not call __getattribute__, an error would have been raised when the parser failed with an AttributeError, albeit it wouldn't have been immediately clear what the real problem is.

@james-mchugh
Copy link
Author

I've been digging into this some more as I have some time over the weekend.

First, I'm convinced that python/cpython#90143 is not a bug, as it is documented right here in the Python Docs that an AttributeError raised in a property should be handled by __getattr__.

Called when the default attribute access fails with an AttributeError (either getattribute() raises an AttributeError because name is not an instance attribute or an attribute in the class tree for self; or get() of a name property raises AttributeError).

Second, I experimented with swapping the re-calling of __getattribute__ at https://github.com/encode/django-rest-framework/blob/master/rest_framework/request.py#L423 out with explicitly re-raising an AttributeError, and that passed all of the unit tests. This would make it so AttributeErrors raised in the Parser are longer suppressed, but the error message would not be very clear.

Looking closer at the code in request.py, I noticed the wrap_attributeerror context manager which is used to provide the exact behavior needed (wrap AttributeErrors raised by properties). It's used in several properties, but not all of them. I added a safe_property decorator that utilizes this context manager and updated all the properties in Request to use the safe_property decorator instead. This should ensure that all properties in the Request class have more informative error handling in the event they raise an AttributeError. I've opened the PR #9455 with these updates.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants