Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve "basics" docs #152

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
157 changes: 93 additions & 64 deletions docs/quickstart/basics.rst
Original file line number Diff line number Diff line change
Expand Up @@ -8,46 +8,64 @@ Let's take a look at a very basic API using Flask-Rebar:

.. code-block:: python

# todos.py

from flask import Flask
from flask_rebar import Rebar
from flask_rebar import ResponseSchema
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I've never used ResponseSchema because I've always been the one in full control of my data, and have automated tests that make sure the data I keep in my database stays consistent (validation-wise) with the data my API promises to return. In the interest of keeping this example as basic as possible, while still being as useful as possible, I thought it's better to save Rebar's special RequestSchema/ResponseSchema for a later section, in return for demonstrating the other stuff I added here.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to agree on that one but I think we should move toward #161

from marshmallow import fields
from marshmallow import Schema, fields


rebar = Rebar()
registry = rebar.create_handler_registry()


class TodoSchema(ResponseSchema):
id = fields.Integer()
class TodoSchema(Schema):
# The task we have to do.
description = fields.String(required=True)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally like having different schema for create and get, but this pattern is also nice so I am not sure what we should recommend to beginners.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check out the latest revision (which actually includes a little more explanation in the docs) and let me know what you think.

When the data for the same type of object is for some reason very different in request bodies compared to response bodies, it makes sense to use different schemas. But in the simpler case (which is maybe more common?) where the only difference is that some fields should only be read-only by users, since they're managed entirely by the database or application, then I think using a single Schema with those fields marked as dump_only is a perfect fit. Common examples in my experience are autogenerated ids and automatically maintained updated_at and created_at timestamps, which can all be maintained by a SQL database automatically.

FWIW, this is the pattern I use in the class I've been teaching to beginners and it's been working really well. (For the past year I've been teaching a monthly class at Two Sigma on how to build REST APIs with Flask-Rebar, and by the time they leave the class, students have built a full working CRUD app to manage (you guessed it) their Todo list. 🤣)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely much better with the new comments, I am not sure about the form (explanation before/after code vs inline comments) but that can be improved later.


# The unique id for this Todo. Determined by whatever database we're using.
id = fields.Integer(
# We'll explain the dump_only=True shortly.
dump_only=True
)


@registry.handles(
rule='/todos/<id>',
rule='/todos/<int:id>',
method='GET',
response_body_schema=TodoSchema() # for versions <= 1.7.0, use marshal_schema
# Marshal bodies of 200 (default) responses using the TodoSchema.
response_body_schema=TodoSchema()
)
def get_todo(id):
...
return {'id': id}
todo = _get_todo_or_404(id) # Some helper function that queries our database.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

# If we got here, it means we were able to find a Todo with the given id.
# As with Flask, a 200 response code is assumed when we don't say otherwise.
# Return a dictionary with the data for this Todo, which Rebar will marshal
# into the response body using our TodoSchema, as we specified above.
return {'id': todo.id, 'description': todo.description}

app = Flask(__name__)
rebar.init_app(app)

if __name__ == '__main__':
app.run()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(app.run() has been deprecated/discouraged for several years, and it's just as easy to show best practice here instead, so I fixed that too while I was in here.)


Now run your Flask app `as usual <https://flask.palletsprojects.com/en/1.1.x/quickstart/>`__,
e.g.:

.. code-block:: bash

export FLASK_APP=todos.py
flask run

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if we should split this into two sections, "Creating a Flask App with Rebar" and "Registering a Handler"? As is this portion on running the app feels out of place and I think makes the lines that follow a bit confusing.


We first create a ``Rebar`` instance. This is a Flask extension and takes care of attaching all the Flask-Rebar goodies to the Flask application.

We then create a handler registry that we will use to declare handlers for the service.

``ResponseSchema`` is an extension of ``marshmallow.Schema`` that throws an error if additional fields not specified in the schema are included in the request parameters. It's usage is optional - a normal Marshmallow schema will also work.

``rule`` is the same as Flask's `rule <http://flask.pocoo.org/docs/latest/api/#url-route-registrations>`_, and is the URL rule for the handler as a string.

``method`` is the HTTP method that the handler will accept. To register multiple methods for a single handler function, decorate the function multiple times.

``response_body_schema`` is a Marshmallow schema that will be used marshal the return value of the function. `marshmallow.Schema.dump <http://marshmallow.readthedocs.io/en/latest/api_reference.html#marshmallow.Schema.dump>`_ will be called on the return value. ``response_body_schema`` can also be a dictionary mapping status codes to Marshmallow schemas - see :ref:`Marshaling`. *NOTE: In Flask-Rebar 1.0-1.7.0, this was referred to as ``marshal_schema``. It is being renamed and both names will function until version 2.0*
``response_body_schema`` is a Marshmallow schema that will be used to marshal the return value of the function for 200 responses, which (as with Flask) is the default response code when we don't say otherwise. `marshmallow.Schema.dump <http://marshmallow.readthedocs.io/en/latest/api_reference.html#marshmallow.Schema.dump>`_ will be called on the return value to marshal it. ``response_body_schema`` can also be a dictionary mapping different status codes to Marshmallow schemas - see :ref:`Marshaling`. *NOTE: In Flask-Rebar 1.0-1.7.0, this was referred to as ``marshal_schema``. It is being renamed and both names will function until version 2.0*

The handler function should accept any arguments specified in ``rule``, just like a Flask view function.

Expand All @@ -60,44 +78,60 @@ Request Body Validation

.. code-block:: python

from flask_rebar import RequestSchema
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(FWIW, I haven't yet hit a case where I've wanted Rebar's RequestSchema; in all my use cases so far, I've preferred "being liberal in what I accept and conservative in what I send", at least as far as unrecognized fields are concerned.)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put it everywhere since I prefer a break fast strategy with a REST API. For example, if for some reason you have a breaking change in the API, you don't want to have clients sending requests thinking that all went well and it applied some changes but in reality it didn't (and no alert since its a 200).

Copy link
Contributor Author

@twosigmajab twosigmajab Jan 31, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(To be clear, I'm by no means saying it's always wrong to fail hard for unrecognized fields, just that it's definitely not always right. Maybe more citation needed, so just did a quick search and immediately turned up something interesting: https://opensource.zalando.com/restful-api-guidelines/#109 – this topic is deep!)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like this website, thanks for sharing. I think we should try to encourage some of it's principles while allowing divergences because they are quite strict and hard to meet for a lot of people.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So looking with fresh eyes today... That API Guidelines link I posted last night is saying that the Robustness Principle applies on the client side, but on the server side, it doesn't!:

Unknown input fields in payload or URL should not be ignored; servers should provide error feedback to clients via an HTTP 400 response code.

This is not what I thought was best practice, so I'll definitely be doing more research on this! If this API Guidelines doc is correct, then it's quite a validation of flask_rebar's RequestSchema, and I'll correct my understanding, my code, and my teaching accordingly!

In the meantime, the current revision of this PR currently is still removing the use of RequestSchema for the example request_body_schemas. That means going from conforming to these guidelines to not conforming to them. I really don't want to be responsible for, in the process of trying to improve the docs' demonstration of best practices, actually making them worse in this particular way.

At the same time, I'm very keen to see #161 happen, because I do think that having two different schemas (request and response) for the same type of object is not always necessary, and when it isn't, as in this Todo example, it adds unnecessary complexity to our docs. Shall we wait for #161 to be resolved so we can incorporate the outcome into these changes before we merge them?

Thank you @Sytten for prompting this!

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can wait, I will try to prioritize this weekend of monday. Though I am not sure if we want to only do that in V2 or backport it to V1.



class CreateTodoSchema(RequestSchema):
description = fields.String(required=True)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Now reusing TodoSchema for both requests and responses)



@registry.handles(
rule='/todos',
rule='/todos/',
method='POST',
request_body_schema=CreateTodoSchema(),
request_body_schema=TodoSchema(),
response_body_schema={201: TodoSchema()}
)
def create_todo():
body = rebar.validated_body
description = body['description']
. . .
# If we got here, it means we have valid Todo data in the request body.
# Thanks to specifying the request_body_schema above, Rebar takes care
# of sending nice 400 responses (with human- and machine-friendly bodies)
# in response to invalid request data for us.
description = rebar.validated_body['description']
new_todo = _insert_todo(description) # Insert a Todo in our db and return it.
# We'll want to return a 201 Created response with a Location header, so calculate
# the url of the new Todo. We use flask.url_for rather than hard-coding this so
# that if we change the get_todo endpoint's url rule in the future, the url here
# will stay up-to-date.
new_todo_url = flask.url_for(
f'{registry.prefix}.{get_todo.__name__}',
id=new_todo.id
)
response_data = {"id": new_todo.id, "description": new_todo.description}
return (response_data, 201, {"Location": new_todo_url})


``RequestSchema`` is an extension of ``marshmallow.Schema`` that throws an internal server error if an object is missing a required field. It's usage is optional - a normal Marshmallow schema will also work.

This request schema is passed to ``request_body_schema``, and the handler will now call `marshmallow.Schema.load <http://marshmallow.readthedocs.io/en/latest/api_reference.html#marshmallow.Schema.load>`_ on the request body decoded as JSON. A 400 error with a descriptive error will be returned if validation fails.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RequestSchema is still important for marshmallow 2 if I am not mistaken, otherwise additional fields are not rejected. What I think we should do in V2 is a single Schema class that includes the RequestSchema and ResponseSchema, but I will open an issue for that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


The validated parameters are available as a dictionary via the ``rebar.validated_body`` proxy.

Remember when we passed ``dump_only=True`` when defining ``TodoSchema``'s ``id`` field above?
This lets us ignore the ``id`` field when unmarshaling (loading) data,
and only look at it when marshaling (dumping) data.
This allows this schema to be used not just to marshal response bodies,
but also to unmarshal request bodies, where the request either won't know the id
ahead of time, as when creating a new Todo, or otherwise where the id is specified
in the URL path rather than in the body, as when updating a Todo (see below).


Query String Validation
=======================

.. code-block:: python

class GetTodosSchema(RequestSchema):
exclude_completed = fields.String(missing=False)
class ExcludeCompletedSchema(Schema):
exclude_completed = fields.String(
# When this param is not provided, use False as its default value.
missing=False
)


@registry.handles(
rule='/todos',
rule='/todos/',
method='GET',
query_string_schema=GetTodosSchema(),
query_string_schema=ExcludeCompletedSchema(),
)
def get_todos():
args = rebar.validated_args
Expand All @@ -117,27 +151,31 @@ Header Parameters

.. code-block:: python

from marshmallow import Schema


class HeadersSchema(Schema):
class UserIdSchema(Schema):
user_id = fields.String(required=True, load_from='X-MyApp-UserId')


@registry.handles(
rule='/todos/<id>',
rule='/todos/<int:id>',
method='PUT',
headers_schema=HeadersSchema(),
request_body_schema=TodoSchema(),
response_body_schema={204: None},
# Assume we can trust a special header to contain the authenticated user (e.g.
# it can only have been set by a gateway that rejects unauthenticated requests).
headers_schema=UserIdSchema(),
)
def update_todo(id):
headers = rebar.validated_headers
user_id = headers['user_id']
. . .
user_id = rebar.validated_headers['user_id']
# Make sure this user is authorized to update this Todo.
_authorized_or_403(user_id, ...)
_update_todo(id, rebar.validated_body['description']) # Update our database.
# Return a 204 No Content response to indicate operation completed successfully
# and we have no additional data to return.
return None, 204


.. note:: In version 3 of Marshmallow, The `load_from` parameter of fields changes to `data_key`

In this case we use a regular Marshmallow schema, since there will almost certainly be other HTTP headers in the request that we don't want to validate against.
.. note:: This example assumes Marshmallow v2. In version 3 of Marshmallow, The `load_from` parameter of fields changes to `data_key`.

This schema is passed to ``headers_schema``, and the handler will now call `marshmallow.Schema.load <http://marshmallow.readthedocs.io/en/latest/api_reference.html#marshmallow.Schema.load>`_ on the header values retrieved from Flask's ``request.headers``. A 400 error with a descriptive error will be returned if validation fails.

Expand All @@ -155,40 +193,31 @@ This default can be overriden in any particular handler by setting ``headers_sch
Marshaling
==========

The ``response_body_schema`` (previously ``marshal_schema``) argument of ``HandlerRegistry.handles`` can be one of three types: a ``marshmallow.Schema``, a dictionary mapping integers to ``marshmallow.Schema``, or ``None``.

In the case of a ``marshmallow.Schema``, that schema is used to ``dump`` the return value of the handler function.

In the case of a dictionary mapping integers to ``marshmallow.Schemas``, the integers are interpreted as status codes, and the handler function must return a tuple of ``(response_body, status_code)``:
The ``response_body_schema`` argument of ``HandlerRegistry.handles`` can be one of three types: a ``marshmallow.Schema``, a dictionary mapping integers to ``marshmallow.Schema``, or ``None``.

.. code-block:: python
In the case of a ``marshmallow.Schema``, that schema is used to ``dump`` the return value of the handler function for 200 responses.

@registry.handles(
rule='/todos',
method='POST',
response_body_schema={
201: TodoSchema()
}
)
def create_todo():
...
return {'id': id}, 201
In the case of a dictionary mapping integers to ``marshmallow.Schemas``, the integers are interpreted as status codes, and the handler function must return a tuple like ``(response_body, status_code)``
(or like ``(response_body, status_code, headers)`` to also include custom headers),
as in the example ``create_todo`` handler function above.
The Schema in the dictionary corresponding to the returned status code will be used to marshal the response.
So ``response_body_schema=Foo()`` is just shorthand for ``response_body_schema={200: Foo()}``.

The schema to use for marshaling will be retrieved based on the status code the handler function returns. This isn't the prettiest part of Flask-Rebar, but it's done this way to help with the automatic Swagger generation.
A status code in the dictionary may also map to ``None``, in which case Rebar will pass whatever value was returned in the body for this status code straight through to Flask. Note, the value must be a return value that Flask supports, e.g. a string, dictionary (as of Flask 1.1.0), or a ``Flask.Response`` object.

In the case of ``None`` (which is also the default), no marshaling takes place, and the return value is passed directly through to Flask. This means the if ``response_body_schema`` is ``None``, the return value must be a return value that Flask supports, e.g. a string or a ``Flask.Response`` object.
Finally, ``response_body_schema`` may be ``None``, which is the default, and works just like ``{200: None}``.

.. code-block:: python


@registry.handles(
rule='/todos',
rule='/foo',
method='GET',
response_body_schema=None
)
def get_todos():
def foo():
...
return 'Hello World!'
return 'This string gets returned directly without marshaling'

This is a handy escape hatch when handlers don't fit the Swagger/REST mold very well, but it the swagger generation won't know how to describe this handler's response and should be avoided.

Expand All @@ -203,7 +232,7 @@ Flask-Rebar includes a set of error classes that can be raised to produce HTTP e
from flask_rebar import errors

@registry.handles(
rule='/todos/<id>',
rule='/todos/<int:id>',
method='GET',
)
def get_todo(id):
Expand Down