Add fugue API #396

Merged
merged 31 commits into from
Dec 30, 2022
Changes from 27 commits
f6199d7
Add fugue interfaceless util functions
goodwanghan Dec 18, 2022
b1243f5
update tests
goodwanghan Dec 18, 2022
f7be978
fix test coverage
goodwanghan Dec 18, 2022
610f413
fix numpy breaking change
goodwanghan Dec 18, 2022
3665ef6
update backends for utils functions
goodwanghan Dec 19, 2022
4c9055a
fix
goodwanghan Dec 19, 2022
2889830
update qpd
goodwanghan Dec 19, 2022
98b04e1
refactor code
goodwanghan Dec 20, 2022
3de983e
Add test suite for express functions
goodwanghan Dec 21, 2022
2a52b79
add engine level utils
goodwanghan Dec 22, 2022
fdd4998
refactor code
goodwanghan Dec 23, 2022
df86955
add engine operations
goodwanghan Dec 23, 2022
58a6b9a
update type annotations and docs
goodwanghan Dec 23, 2022
c283a24
lint
goodwanghan Dec 23, 2022
cb89e0f
top api docs
goodwanghan Dec 23, 2022
b3e4b50
Refactor ibis, add fugue sql api
goodwanghan Dec 24, 2022
e9ebb23
make duckdb columns encoded
goodwanghan Dec 24, 2022
3d23b6e
merge
goodwanghan Dec 24, 2022
9465601
improve test coverage
goodwanghan Dec 24, 2022
ce24c9f
fix tests
goodwanghan Dec 26, 2022
3412839
refactor SQLEngine
goodwanghan Dec 28, 2022
49d3724
fix ray tests and coverage
goodwanghan Dec 28, 2022
d79d914
fix tests, add fugue.default.partitions
goodwanghan Dec 29, 2022
f0f0468
update docs
goodwanghan Dec 29, 2022
6db7048
fix tests
goodwanghan Dec 29, 2022
51bd5eb
fix test coverage
goodwanghan Dec 29, 2022
3621015
update docs
goodwanghan Dec 29, 2022
6623c1f
add all sql api functions
goodwanghan Dec 30, 2022
ba64362
lint
goodwanghan Dec 30, 2022
0cdf0f0
add join functions
goodwanghan Dec 30, 2022
dfdc264
Make PartitionSpec more flexible
goodwanghan Dec 30, 2022
1 change: 1 addition & 0 deletions .gitignore
@@ -120,6 +120,7 @@ pythonenv*

# mkdocs documentation
/site
.virtual_documents

# mypy
.mypy_cache
18 changes: 15 additions & 3 deletions RELEASE.md
@@ -1,8 +1,20 @@
# Release Notes

-## 0.7.4
-
-- [340](https://github.com/fugue-project/fugue/issues/340) Migrate to plugin mode (DataFrames & Extensions)
+## 0.8.0
+
+- [384](https://github.com/fugue-project/fugue/issues/384) Expanding Fugue API
+- [396](https://github.com/fugue-project/fugue/issues/396) Ray/Dask engines guess optimal default partitions
+- [403](https://github.com/fugue-project/fugue/issues/403) Deprecate register_raw_df_type
+- [392](https://github.com/fugue-project/fugue/issues/392) Aggregations on Spark dataframes fail intermittently
+- [398](https://github.com/fugue-project/fugue/issues/398) Rework API Docs and Favicon
+- [393](https://github.com/fugue-project/fugue/issues/393) ExecutionEngine as_context
+- [385](https://github.com/fugue-project/fugue/issues/385) Remove DataFrame metadata
+- [381](https://github.com/fugue-project/fugue/issues/381) Change SparkExecutionEngine to use pandas udf by default
+- [380](https://github.com/fugue-project/fugue/issues/380) Refactor ExecutionEngine (Separate out MapEngine)
+- [378](https://github.com/fugue-project/fugue/issues/378) Refactor DataFrame show
+- [377](https://github.com/fugue-project/fugue/issues/377) Create bag
+- [372](https://github.com/fugue-project/fugue/issues/372) Infer execution engine from input
+- [340](https://github.com/fugue-project/fugue/issues/340) Migrate to plugin mode
- [369](https://github.com/fugue-project/fugue/issues/369) Remove execution from FugueWorkflow context manager, remove engine from FugueWorkflow
- [373](https://github.com/fugue-project/fugue/issues/373) Fixed Spark engine rename slowness when there are a lot of columns

8 changes: 8 additions & 0 deletions docs/api/fugue.dataframe.rst
@@ -27,6 +27,14 @@ fugue.dataframe
.. |FugueDataTypes| replace:: :doc:`Fugue Data Types <tutorial:tutorials/appendix/generate_types>`


fugue.dataframe.api
-------------------

.. automodule:: fugue.dataframe.api
   :members:
   :undoc-members:
   :show-inheritance:

fugue.dataframe.array\_dataframe
--------------------------------

45 changes: 45 additions & 0 deletions docs/api/fugue.dataset.rst
@@ -0,0 +1,45 @@
fugue.dataset
==============

.. |SchemaLikeObject| replace:: :ref:`Schema like object <tutorial:tutorials/advanced/x-like:schema>`
.. |ParamsLikeObject| replace:: :ref:`Parameters like object <tutorial:tutorials/advanced/x-like:parameters>`
.. |DataFrameLikeObject| replace:: :ref:`DataFrame like object <tutorial:tutorials/advanced/x-like:dataframe>`
.. |DataFramesLikeObject| replace:: :ref:`DataFrames like object <tutorial:tutorials/advanced/x-like:dataframes>`
.. |PartitionLikeObject| replace:: :ref:`Partition like object <tutorial:tutorials/advanced/x-like:partition>`
.. |RPCHandlerLikeObject| replace:: :ref:`RPChandler like object <tutorial:tutorials/advanced/x-like:rpc>`

.. |ExecutionEngine| replace:: :class:`~fugue.execution.execution_engine.ExecutionEngine`
.. |NativeExecutionEngine| replace:: :class:`~fugue.execution.native_execution_engine.NativeExecutionEngine`
.. |FugueWorkflow| replace:: :class:`~fugue.workflow.workflow.FugueWorkflow`

.. |ReadJoin| replace:: Read Join tutorials on :ref:`workflow <tutorial:tutorials/advanced/dag:join>` and :ref:`engine <tutorial:tutorials/advanced/execution_engine:join>` for details
.. |FugueConfig| replace:: :doc:`the Fugue Configuration Tutorial <tutorial:tutorials/advanced/useful_config>`
.. |PartitionTutorial| replace:: :doc:`the Partition Tutorial <tutorial:tutorials/advanced/partition>`
.. |FugueSQLTutorial| replace:: :doc:`the Fugue SQL Tutorial <tutorial:tutorials/fugue_sql/index>`
.. |DataFrameTutorial| replace:: :ref:`the DataFrame Tutorial <tutorial:tutorials/advanced/schema_dataframes:dataframe>`
.. |ExecutionEngineTutorial| replace:: :doc:`the ExecutionEngine Tutorial <tutorial:tutorials/advanced/execution_engine>`
.. |ZipComap| replace:: :ref:`Zip & Comap <tutorial:tutorials/advanced/execution_engine:zip & comap>`
.. |LoadSave| replace:: :ref:`Load & Save <tutorial:tutorials/advanced/execution_engine:load & save>`
.. |AutoPersist| replace:: :ref:`Auto Persist <tutorial:tutorials/advanced/useful_config:auto persist>`
.. |TransformerTutorial| replace:: :doc:`the Transformer Tutorial <tutorial:tutorials/extensions/transformer>`
.. |CoTransformer| replace:: :ref:`CoTransformer <tutorial:tutorials/advanced/dag:cotransformer>`
.. |CoTransformerTutorial| replace:: :doc:`the CoTransformer Tutorial <tutorial:tutorials/extensions/cotransformer>`
.. |FugueDataTypes| replace:: :doc:`Fugue Data Types <tutorial:tutorials/appendix/generate_types>`


fugue.dataset.api
-----------------

.. automodule:: fugue.dataset.api
   :members:
   :undoc-members:
   :show-inheritance:

fugue.dataset.dataset
---------------------

.. automodule:: fugue.dataset.dataset
   :members:
   :undoc-members:
   :show-inheritance:

8 changes: 8 additions & 0 deletions docs/api/fugue.execution.rst
@@ -27,6 +27,14 @@ fugue.execution
.. |FugueDataTypes| replace:: :doc:`Fugue Data Types <tutorial:tutorials/appendix/generate_types>`


fugue.execution.api
-------------------

.. automodule:: fugue.execution.api
   :members:
   :undoc-members:
   :show-inheritance:

fugue.execution.execution\_engine
---------------------------------

19 changes: 10 additions & 9 deletions docs/api/fugue.rst
@@ -8,6 +8,7 @@ fugue
   fugue.collections
   fugue.column
   fugue.dataframe
+   fugue.dataset
   fugue.execution
   fugue.extensions
   fugue.rpc
@@ -40,18 +41,18 @@ fugue
.. |FugueDataTypes| replace:: :doc:`Fugue Data Types <tutorial:tutorials/appendix/generate_types>`


-fugue.constants
----------------
+fugue.api
+---------

-.. automodule:: fugue.constants
+.. automodule:: fugue.api
   :members:
   :undoc-members:
   :show-inheritance:

-fugue.dataset
--------------
+fugue.constants
+---------------

-.. automodule:: fugue.dataset
+.. automodule:: fugue.constants
   :members:
   :undoc-members:
   :show-inheritance:
@@ -64,10 +65,10 @@ fugue.exceptions
   :undoc-members:
   :show-inheritance:

-fugue.interfaceless
--------------------
+fugue.plugins
+-------------

-.. automodule:: fugue.interfaceless
+.. automodule:: fugue.plugins
   :members:
   :undoc-members:
   :show-inheritance:
8 changes: 8 additions & 0 deletions docs/api/fugue.sql.rst
@@ -27,6 +27,14 @@ fugue.sql
.. |FugueDataTypes| replace:: :doc:`Fugue Data Types <tutorial:tutorials/appendix/generate_types>`


fugue.sql.api
-------------

.. automodule:: fugue.sql.api
   :members:
   :undoc-members:
   :show-inheritance:

fugue.sql.workflow
------------------

8 changes: 8 additions & 0 deletions docs/api/fugue.workflow.rst
@@ -27,6 +27,14 @@ fugue.workflow
.. |FugueDataTypes| replace:: :doc:`Fugue Data Types <tutorial:tutorials/appendix/generate_types>`


fugue.workflow.api
------------------

.. automodule:: fugue.workflow.api
   :members:
   :undoc-members:
   :show-inheritance:

fugue.workflow.input
--------------------

2 changes: 2 additions & 0 deletions docs/index.rst
@@ -33,5 +33,7 @@ For contributing, start with the `contributing guide <https://github.com/fugue-p
   :maxdepth: 3
   :hidden:

   tutorials
   top_api
   api

125 changes: 125 additions & 0 deletions docs/top_api.rst
@@ -0,0 +1,125 @@
Top Level User API Reference
============================

.. |SchemaLikeObject| replace:: :ref:`Schema like object <tutorial:tutorials/advanced/x-like:schema>`
.. |ParamsLikeObject| replace:: :ref:`Parameters like object <tutorial:tutorials/advanced/x-like:parameters>`
.. |DataFrameLikeObject| replace:: :ref:`DataFrame like object <tutorial:tutorials/advanced/x-like:dataframe>`
.. |DataFramesLikeObject| replace:: :ref:`DataFrames like object <tutorial:tutorials/advanced/x-like:dataframes>`
.. |PartitionLikeObject| replace:: :ref:`Partition like object <tutorial:tutorials/advanced/x-like:partition>`
.. |RPCHandlerLikeObject| replace:: :ref:`RPChandler like object <tutorial:tutorials/advanced/x-like:rpc>`

.. |ExecutionEngine| replace:: :class:`~fugue.execution.execution_engine.ExecutionEngine`
.. |NativeExecutionEngine| replace:: :class:`~fugue.execution.native_execution_engine.NativeExecutionEngine`
.. |FugueWorkflow| replace:: :class:`~fugue.workflow.workflow.FugueWorkflow`

.. |ReadJoin| replace:: Read Join tutorials on :ref:`workflow <tutorial:tutorials/advanced/dag:join>` and :ref:`engine <tutorial:tutorials/advanced/execution_engine:join>` for details
.. |FugueConfig| replace:: :doc:`the Fugue Configuration Tutorial <tutorial:tutorials/advanced/useful_config>`
.. |PartitionTutorial| replace:: :doc:`the Partition Tutorial <tutorial:tutorials/advanced/partition>`
.. |FugueSQLTutorial| replace:: :doc:`the Fugue SQL Tutorial <tutorial:tutorials/fugue_sql/index>`
.. |DataFrameTutorial| replace:: :ref:`the DataFrame Tutorial <tutorial:tutorials/advanced/schema_dataframes:dataframe>`
.. |ExecutionEngineTutorial| replace:: :doc:`the ExecutionEngine Tutorial <tutorial:tutorials/advanced/execution_engine>`
.. |ZipComap| replace:: :ref:`Zip & Comap <tutorial:tutorials/advanced/execution_engine:zip & comap>`
.. |LoadSave| replace:: :ref:`Load & Save <tutorial:tutorials/advanced/execution_engine:load & save>`
.. |AutoPersist| replace:: :ref:`Auto Persist <tutorial:tutorials/advanced/useful_config:auto persist>`
.. |TransformerTutorial| replace:: :doc:`the Transformer Tutorial <tutorial:tutorials/extensions/transformer>`
.. |CoTransformer| replace:: :ref:`CoTransformer <tutorial:tutorials/advanced/dag:cotransformer>`
.. |CoTransformerTutorial| replace:: :doc:`the CoTransformer Tutorial <tutorial:tutorials/extensions/cotransformer>`
.. |FugueDataTypes| replace:: :doc:`Fugue Data Types <tutorial:tutorials/appendix/generate_types>`

IO
~~

.. autofunction:: fugue.api.as_fugue_dataset

.. autofunction:: fugue.api.as_fugue_df
.. autofunction:: fugue.api.load
.. autofunction:: fugue.api.save



Information
~~~~~~~~~~~

.. autofunction:: fugue.api.count
.. autofunction:: fugue.api.is_bounded
.. autofunction:: fugue.api.is_empty
.. autofunction:: fugue.api.is_local
.. autofunction:: fugue.api.show

.. autofunction:: fugue.api.get_column_names
.. autofunction:: fugue.api.get_num_partitions
.. autofunction:: fugue.api.get_schema
.. autofunction:: fugue.api.is_df
.. autofunction:: fugue.api.peek_array
.. autofunction:: fugue.api.peek_dict


Transformation
~~~~~~~~~~~~~~

.. autofunction:: fugue.api.alter_columns
.. autofunction:: fugue.api.drop_columns
.. autofunction:: fugue.api.head
.. autofunction:: fugue.api.normalize_column_names
.. autofunction:: fugue.api.rename
.. autofunction:: fugue.api.select_columns

.. autofunction:: fugue.api.distinct
.. autofunction:: fugue.api.dropna
.. autofunction:: fugue.api.fillna
.. autofunction:: fugue.api.sample
.. autofunction:: fugue.api.take

.. autofunction:: fugue.api.join
.. autofunction:: fugue.api.union
.. autofunction:: fugue.api.intersect
.. autofunction:: fugue.api.subtract

.. autofunction:: fugue.api.transform
.. autofunction:: fugue.api.out_transform

SQL
~~~

.. autofunction:: fugue.api.fugue_sql
.. autofunction:: fugue.api.fugue_sql_flow
.. autofunction:: fugue.api.raw_sql

Conversion
~~~~~~~~~~

.. autofunction:: fugue.api.as_local
.. autofunction:: fugue.api.as_local_bounded
.. autofunction:: fugue.api.as_array
.. autofunction:: fugue.api.as_array_iterable
.. autofunction:: fugue.api.as_arrow
.. autofunction:: fugue.api.as_dict_iterable
.. autofunction:: fugue.api.as_pandas
.. autofunction:: fugue.api.get_native_as_df

ExecutionEngine
~~~~~~~~~~~~~~~

.. autofunction:: fugue.api.engine_context
.. autofunction:: fugue.api.set_global_engine
.. autofunction:: fugue.api.clear_global_engine
.. autofunction:: fugue.api.get_current_engine
.. autofunction:: fugue.api.get_current_parallelism


Big Data Operations
~~~~~~~~~~~~~~~~~~~
.. autofunction:: fugue.api.broadcast
.. autofunction:: fugue.api.persist
.. autofunction:: fugue.api.repartition


Development
~~~~~~~~~~~

.. autofunction:: fugue.api.run_engine_function




