Project Scope #1

shawnbrown · 2019-08-24T20:43:36Z

Now that get_reader is its own project, it would be useful to explicitly define its scope and goals. We can always redefine these terms in the future but a working definition can help guide development and prevent scope creep.

The initial motivation for get_reader was to provide a common interface for reading Unicode CSV data across different versions of Python. Reading Unicode CSV data is very different in Python 3 than it was in Python 2.
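To make that difference concrete, here is a minimal sketch of the kind of branching a common interface has to hide. It is not get_reader's actual implementation; `read_unicode_csv` is a hypothetical name, and the Python 2 branch assumes an ASCII-compatible encoding such as UTF-8.

```python
import csv
import io
import sys


def read_unicode_csv(path, encoding='utf-8'):
    """Yield rows of text strings from a CSV file (hypothetical helper)."""
    if sys.version_info[0] >= 3:
        # Python 3: csv works on text, so the file is decoded while reading.
        with io.open(path, 'r', encoding=encoding, newline='') as f:
            for row in csv.reader(f):
                yield row
    else:
        # Python 2: csv works on bytes, so each cell is decoded afterwards.
        # Assumption: the encoding is ASCII-compatible (e.g., UTF-8).
        with open(path, 'rb') as f:
            for row in csv.reader(f):
                yield [cell.decode(encoding) for cell in row]
```

The caller sees the same rows of text on either version and never touches the version check.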

Here's what I'm thinking for this working definition:

Essential Properties

The get_reader project should:

Provide a common interface for reading tabular data across different versions of Python.

Provide simplified interfaces to multiple data sources that might otherwise have unfamiliar APIs (like a simplified version of the IO tools sub-package in pandas except without the overhead of a dependency as large as pandas).

Be easily vendorable by simply copying it into the other project's directory (no hard third-party dependencies and no modifications to get_reader's source code).

Provide broad support for many different versions of Python.

Read data using memory-efficient iteration (unless explicitly directed to do otherwise) so that it can read from sources that are larger than available memory.
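As a rough illustration of the memory-efficient iteration property, the sketch below parses rows lazily and only pulls as many input lines as the caller asks for. The names `iter_rows` and `counting_lines` are made up for this example and are not part of get_reader.

```python
import csv
import itertools


def iter_rows(lines):
    """Lazily parse an iterable of CSV text lines into rows."""
    return csv.reader(lines)


def counting_lines(n, log):
    """Generate n CSV lines, recording each one that is actually produced."""
    for i in range(n):
        log.append(i)
        yield '{0},{1}\n'.format(i, i * i)


consumed = []
rows = iter_rows(counting_lines(1000000, consumed))

# Take only the first three rows; the remaining lines are never generated.
first_three = list(itertools.islice(rows, 3))
```

Loading everything at once stays possible, but only as an explicit choice by the caller (e.g., `list(rows)`).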

Non-essential Properties

Provide tools for working with reader and reader-like objects (e.g., ReaderLike for type checking).
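One possible shape for such a type-checking tool is an abstract base class with the `csv.reader` type registered as a virtual subclass. This is only a sketch under that assumption: `ReaderLikeSketch` and `is_reader_like` are hypothetical names, and `abc.ABC` requires Python 3.4 or newer, so a version meant for broad support would need a different base.

```python
import abc
import csv


class ReaderLikeSketch(abc.ABC):
    """Hypothetical marker type for objects that behave like csv.reader."""


# Register the concrete type returned by csv.reader() as a virtual subclass.
ReaderLikeSketch.register(type(csv.reader([])))


def is_reader_like(obj):
    """Return True if obj is a registered reader-like object."""
    return isinstance(obj, ReaderLikeSketch)
```

With this in place, `isinstance()` checks accept real reader objects but reject a plain list of rows.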

Adding an Interface

Before adding an interface (e.g., from_sql(), from_excel(), etc.) it is useful to ask the following questions:

PROs:

Does the interface unify differences across multiple versions of Python? Bonus points if it unifies differences between Python 2 and 3.

Can the interface reduce the number of objects a user would otherwise need to manage explicitly (automatically closing files or database cursors)?

Does using the interface take fewer lines of boilerplate code than it would require to read the data directly? How many lines of boilerplate code does it save? Can it do this reliably without introducing ambiguity or unpredictability?

Does the interface simplify reading data from sources that might otherwise have an unfamiliar API (e.g., DBF, Excel)?
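To show what "fewer objects to manage" could look like in practice, here is a sketch of a SQL interface that owns its cursor and closes it once iteration finishes. It uses the standard library's `sqlite3` module; `from_sql_sketch` is a hypothetical stand-in, and yielding a header row first is an assumption of this example rather than documented get_reader behavior.

```python
import sqlite3


def from_sql_sketch(connection, query):
    """Yield a header row, then data rows; close the cursor when done."""
    cursor = connection.cursor()
    try:
        cursor.execute(query)
        # Column names are the first item of each description entry.
        yield [column[0] for column in cursor.description]
        for row in cursor:
            yield list(row)
    finally:
        cursor.close()  # the caller never has to manage the cursor


conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE t (a TEXT, b INTEGER)')
conn.execute("INSERT INTO t VALUES ('x', 1)")
rows = list(from_sql_sketch(conn, 'SELECT a, b FROM t'))
```

Here the user handles one connection and one iterable of rows. Whether that saving is worth a new interface is what the questions in this section are meant to settle.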

CONs:

Does the interface obfuscate a standard or otherwise well-known API?

Would the feature introduce an API or behavior that is inconsistent with existing interfaces?

Would including the feature compromise the get_reader project's status as a light-weight, easy-to-include dependency?


shawnbrown pinned this issue Aug 24, 2019

shawnbrown changed the title ~~Define Project Scope~~ Project Scope Sep 2, 2019

shawnbrown closed this as completed Jul 24, 2020
