Add fugue API (#396)
* Add fugue interfaceless util functions

* update tests

* fix test coverage

* fix numpy breaking change

* update backends for utils functions

* fix

* update qpd

* refactor code

* Add test suite for express functions

* add engine level utils

* refactor code

* add engine operations

* update type annotations and docs

* lint

* top api docs

* Refactor ibis, add fugue sql api

* make duckdb columns encoded

* improve test coverage

* fix tests

* refactor SQLEngine

* fix ray tests and coverage

* fix tests, add fugue.default.partitions

* update docs

* fix tests

* fix test coverage

* update docs

* add all sql api functions

* lint

* add join functions

* Make PartitionSpec more flexible
goodwanghan authored Dec 30, 2022
1 parent b8b7ace commit 348d081
Showing 108 changed files with 5,300 additions and 1,890 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -120,6 +120,7 @@ pythonenv*

# mkdocs documentation
/site
.virtual_documents

# mypy
.mypy_cache
2 changes: 1 addition & 1 deletion .pylintrc
@@ -1,3 +1,3 @@
[MESSAGES CONTROL]
-disable = C0103,C0114,C0115,C0116,C0122,C0200,C0201,C0302,C0411,C0415,E0401,E0712,E1130,E5110,R0201,R0205,R0801,R0902,R0903,R0904,R0911,R0912,R0913,R0914,R0915,R1705,R1710,R1718,R1720,R1724,W0102,W0107,W0108,W0201,W0212,W0221,W0223,W0237,W0511,W0613,W0631,W0640,W0703,W0707,W1116
+disable = C0103,C0114,C0115,C0116,C0122,C0200,C0201,C0302,C0411,C0415,E0401,E0712,E1130,E5110,R0201,R0205,R0801,R0902,R0903,R0904,R0911,R0912,R0913,R0914,R0915,R1705,R1710,R1718,R1720,R1724,W0102,W0107,W0108,W0201,W0212,W0221,W0223,W0237,W0511,W0613,W0622,W0631,W0640,W0703,W0707,W1116
# TODO: R0205: inherits from object, can be safely removed
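
Note on the `.pylintrc` change above: the only difference between the two `disable` lines is the added code `W0622`, pylint's `redefined-builtin` check. The commit does not state the motivation. One plausible reading (an assumption, not confirmed by the source) is that the new top-level API exposes functions such as `filter`, whose names shadow Python builtins. The sketch below uses a hypothetical helper, not Fugue code, to show the pattern that `W0622` would otherwise flag.

```python
import builtins


def filter(rows, condition):
    """Hypothetical module-level helper; its name shadows the builtin ``filter``.

    With W0622 enabled, pylint reports this definition as ``redefined-builtin``.
    """
    return [row for row in rows if condition(row)]


rows = [{"a": 1}, {"a": 2}, {"a": 3}]

# Inside this module, the bare name now resolves to the helper defined above.
kept = filter(rows, lambda row: row["a"] > 1)

# The original builtin stays reachable through the ``builtins`` module.
lazy = builtins.filter(lambda row: row["a"] > 1, rows)
```

Shadowing like this is harmless when callers use the function through a module alias, but inside the defining module the bare name no longer refers to the builtin.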
18 changes: 15 additions & 3 deletions RELEASE.md
@@ -1,8 +1,20 @@
# Release Notes

## 0.7.4

- [340](https://github.com/fugue-project/fugue/issues/340) Migrate to plugin mode (DataFrames & Extensions)
## 0.8.0

- [384](https://github.com/fugue-project/fugue/issues/384) Expanding Fugue API
- [396](https://github.com/fugue-project/fugue/issues/396) Ray/Dask engines guess optimal default partitions
- [403](https://github.com/fugue-project/fugue/issues/403) Deprecate register_raw_df_type
- [392](https://github.com/fugue-project/fugue/issues/392) Aggregations on Spark dataframes fail intermittently
- [398](https://github.com/fugue-project/fugue/issues/398) Rework API Docs and Favicon
- [393](https://github.com/fugue-project/fugue/issues/393) ExecutionEngine as_context
- [385](https://github.com/fugue-project/fugue/issues/385) Remove DataFrame metadata
- [381](https://github.com/fugue-project/fugue/issues/381) Change SparkExecutionEngine to use pandas udf by default
- [380](https://github.com/fugue-project/fugue/issues/380) Refactor ExecutionEngine (Separate out MapEngine)
- [378](https://github.com/fugue-project/fugue/issues/378) Refactor DataFrame show
- [377](https://github.com/fugue-project/fugue/issues/377) Create bag
- [372](https://github.com/fugue-project/fugue/issues/372) Infer execution engine from input
- [340](https://github.com/fugue-project/fugue/issues/340) Migrate to plugin mode
- [369](https://github.com/fugue-project/fugue/issues/369) Remove execution from FugueWorkflow context manager, remove engine from FugueWorkflow
- [373](https://github.com/fugue-project/fugue/issues/373) Fixed Spark engine rename slowness when there are a lot of columns

8 changes: 8 additions & 0 deletions docs/api/fugue.dataframe.rst
@@ -27,6 +27,14 @@ fugue.dataframe
.. |FugueDataTypes| replace:: :doc:`Fugue Data Types <tutorial:tutorials/appendix/generate_types>`


fugue.dataframe.api
-------------------

.. automodule:: fugue.dataframe.api
:members:
:undoc-members:
:show-inheritance:

fugue.dataframe.array\_dataframe
--------------------------------

45 changes: 45 additions & 0 deletions docs/api/fugue.dataset.rst
@@ -0,0 +1,45 @@
fugue.dataset
==============

.. |SchemaLikeObject| replace:: :ref:`Schema like object <tutorial:tutorials/advanced/x-like:schema>`
.. |ParamsLikeObject| replace:: :ref:`Parameters like object <tutorial:tutorials/advanced/x-like:parameters>`
.. |DataFrameLikeObject| replace:: :ref:`DataFrame like object <tutorial:tutorials/advanced/x-like:dataframe>`
.. |DataFramesLikeObject| replace:: :ref:`DataFrames like object <tutorial:tutorials/advanced/x-like:dataframes>`
.. |PartitionLikeObject| replace:: :ref:`Partition like object <tutorial:tutorials/advanced/x-like:partition>`
.. |RPCHandlerLikeObject| replace:: :ref:`RPChandler like object <tutorial:tutorials/advanced/x-like:rpc>`

.. |ExecutionEngine| replace:: :class:`~fugue.execution.execution_engine.ExecutionEngine`
.. |NativeExecutionEngine| replace:: :class:`~fugue.execution.native_execution_engine.NativeExecutionEngine`
.. |FugueWorkflow| replace:: :class:`~fugue.workflow.workflow.FugueWorkflow`

.. |ReadJoin| replace:: Read Join tutorials on :ref:`workflow <tutorial:tutorials/advanced/dag:join>` and :ref:`engine <tutorial:tutorials/advanced/execution_engine:join>` for details
.. |FugueConfig| replace:: :doc:`the Fugue Configuration Tutorial <tutorial:tutorials/advanced/useful_config>`
.. |PartitionTutorial| replace:: :doc:`the Partition Tutorial <tutorial:tutorials/advanced/partition>`
.. |FugueSQLTutorial| replace:: :doc:`the Fugue SQL Tutorial <tutorial:tutorials/fugue_sql/index>`
.. |DataFrameTutorial| replace:: :ref:`the DataFrame Tutorial <tutorial:tutorials/advanced/schema_dataframes:dataframe>`
.. |ExecutionEngineTutorial| replace:: :doc:`the ExecutionEngine Tutorial <tutorial:tutorials/advanced/execution_engine>`
.. |ZipComap| replace:: :ref:`Zip & Comap <tutorial:tutorials/advanced/execution_engine:zip & comap>`
.. |LoadSave| replace:: :ref:`Load & Save <tutorial:tutorials/advanced/execution_engine:load & save>`
.. |AutoPersist| replace:: :ref:`Auto Persist <tutorial:tutorials/advanced/useful_config:auto persist>`
.. |TransformerTutorial| replace:: :doc:`the Transformer Tutorial <tutorial:tutorials/extensions/transformer>`
.. |CoTransformer| replace:: :ref:`CoTransformer <tutorial:tutorials/advanced/dag:cotransformer>`
.. |CoTransformerTutorial| replace:: :doc:`the CoTransformer Tutorial <tutorial:tutorials/extensions/cotransformer>`
.. |FugueDataTypes| replace:: :doc:`Fugue Data Types <tutorial:tutorials/appendix/generate_types>`


fugue.dataset.api
-----------------

.. automodule:: fugue.dataset.api
:members:
:undoc-members:
:show-inheritance:

fugue.dataset.dataset
---------------------

.. automodule:: fugue.dataset.dataset
:members:
:undoc-members:
:show-inheritance:

8 changes: 8 additions & 0 deletions docs/api/fugue.execution.rst
@@ -27,6 +27,14 @@ fugue.execution
.. |FugueDataTypes| replace:: :doc:`Fugue Data Types <tutorial:tutorials/appendix/generate_types>`


fugue.execution.api
-------------------

.. automodule:: fugue.execution.api
:members:
:undoc-members:
:show-inheritance:

fugue.execution.execution\_engine
---------------------------------

19 changes: 10 additions & 9 deletions docs/api/fugue.rst
@@ -8,6 +8,7 @@ fugue
fugue.collections
fugue.column
fugue.dataframe
fugue.dataset
fugue.execution
fugue.extensions
fugue.rpc
@@ -40,18 +41,18 @@ fugue
.. |FugueDataTypes| replace:: :doc:`Fugue Data Types <tutorial:tutorials/appendix/generate_types>`


fugue.constants
---------------
fugue.api
---------

.. automodule:: fugue.constants
.. automodule:: fugue.api
:members:
:undoc-members:
:show-inheritance:

fugue.dataset
-------------
fugue.constants
---------------

.. automodule:: fugue.dataset
.. automodule:: fugue.constants
:members:
:undoc-members:
:show-inheritance:
@@ -64,10 +65,10 @@ fugue.exceptions
:undoc-members:
:show-inheritance:

fugue.interfaceless
-------------------
fugue.plugins
-------------

.. automodule:: fugue.interfaceless
.. automodule:: fugue.plugins
:members:
:undoc-members:
:show-inheritance:
8 changes: 8 additions & 0 deletions docs/api/fugue.sql.rst
@@ -27,6 +27,14 @@ fugue.sql
.. |FugueDataTypes| replace:: :doc:`Fugue Data Types <tutorial:tutorials/appendix/generate_types>`


fugue.sql.api
-------------

.. automodule:: fugue.sql.api
:members:
:undoc-members:
:show-inheritance:

fugue.sql.workflow
------------------

8 changes: 8 additions & 0 deletions docs/api/fugue.workflow.rst
@@ -27,6 +27,14 @@ fugue.workflow
.. |FugueDataTypes| replace:: :doc:`Fugue Data Types <tutorial:tutorials/appendix/generate_types>`


fugue.workflow.api
------------------

.. automodule:: fugue.workflow.api
:members:
:undoc-members:
:show-inheritance:

fugue.workflow.input
--------------------

2 changes: 2 additions & 0 deletions docs/index.rst
@@ -33,5 +33,7 @@ For contributing, start with the `contributing guide <https://github.com/fugue-p
:maxdepth: 3
:hidden:

tutorials
top_api
api

138 changes: 138 additions & 0 deletions docs/top_api.rst
@@ -0,0 +1,138 @@
Top Level User API Reference
============================

.. |SchemaLikeObject| replace:: :ref:`Schema like object <tutorial:tutorials/advanced/x-like:schema>`
.. |ParamsLikeObject| replace:: :ref:`Parameters like object <tutorial:tutorials/advanced/x-like:parameters>`
.. |DataFrameLikeObject| replace:: :ref:`DataFrame like object <tutorial:tutorials/advanced/x-like:dataframe>`
.. |DataFramesLikeObject| replace:: :ref:`DataFrames like object <tutorial:tutorials/advanced/x-like:dataframes>`
.. |PartitionLikeObject| replace:: :ref:`Partition like object <tutorial:tutorials/advanced/x-like:partition>`
.. |RPCHandlerLikeObject| replace:: :ref:`RPChandler like object <tutorial:tutorials/advanced/x-like:rpc>`

.. |ExecutionEngine| replace:: :class:`~fugue.execution.execution_engine.ExecutionEngine`
.. |NativeExecutionEngine| replace:: :class:`~fugue.execution.native_execution_engine.NativeExecutionEngine`
.. |FugueWorkflow| replace:: :class:`~fugue.workflow.workflow.FugueWorkflow`

.. |ReadJoin| replace:: Read Join tutorials on :ref:`workflow <tutorial:tutorials/advanced/dag:join>` and :ref:`engine <tutorial:tutorials/advanced/execution_engine:join>` for details
.. |FugueConfig| replace:: :doc:`the Fugue Configuration Tutorial <tutorial:tutorials/advanced/useful_config>`
.. |PartitionTutorial| replace:: :doc:`the Partition Tutorial <tutorial:tutorials/advanced/partition>`
.. |FugueSQLTutorial| replace:: :doc:`the Fugue SQL Tutorial <tutorial:tutorials/fugue_sql/index>`
.. |DataFrameTutorial| replace:: :ref:`the DataFrame Tutorial <tutorial:tutorials/advanced/schema_dataframes:dataframe>`
.. |ExecutionEngineTutorial| replace:: :doc:`the ExecutionEngine Tutorial <tutorial:tutorials/advanced/execution_engine>`
.. |ZipComap| replace:: :ref:`Zip & Comap <tutorial:tutorials/advanced/execution_engine:zip & comap>`
.. |LoadSave| replace:: :ref:`Load & Save <tutorial:tutorials/advanced/execution_engine:load & save>`
.. |AutoPersist| replace:: :ref:`Auto Persist <tutorial:tutorials/advanced/useful_config:auto persist>`
.. |TransformerTutorial| replace:: :doc:`the Transformer Tutorial <tutorial:tutorials/extensions/transformer>`
.. |CoTransformer| replace:: :ref:`CoTransformer <tutorial:tutorials/advanced/dag:cotransformer>`
.. |CoTransformerTutorial| replace:: :doc:`the CoTransformer Tutorial <tutorial:tutorials/extensions/cotransformer>`
.. |FugueDataTypes| replace:: :doc:`Fugue Data Types <tutorial:tutorials/appendix/generate_types>`

IO
~~

.. autofunction:: fugue.api.as_fugue_dataset

.. autofunction:: fugue.api.as_fugue_df
.. autofunction:: fugue.api.load
.. autofunction:: fugue.api.save



Information
~~~~~~~~~~~

.. autofunction:: fugue.api.count
.. autofunction:: fugue.api.is_bounded
.. autofunction:: fugue.api.is_empty
.. autofunction:: fugue.api.is_local
.. autofunction:: fugue.api.show

.. autofunction:: fugue.api.get_column_names
.. autofunction:: fugue.api.get_num_partitions
.. autofunction:: fugue.api.get_schema
.. autofunction:: fugue.api.is_df
.. autofunction:: fugue.api.peek_array
.. autofunction:: fugue.api.peek_dict


Transformation
~~~~~~~~~~~~~~

.. autofunction:: fugue.api.transform
.. autofunction:: fugue.api.out_transform

.. autofunction:: fugue.api.alter_columns
.. autofunction:: fugue.api.drop_columns
.. autofunction:: fugue.api.head
.. autofunction:: fugue.api.normalize_column_names
.. autofunction:: fugue.api.rename
.. autofunction:: fugue.api.select_columns

.. autofunction:: fugue.api.distinct
.. autofunction:: fugue.api.dropna
.. autofunction:: fugue.api.fillna
.. autofunction:: fugue.api.sample
.. autofunction:: fugue.api.take

SQL
~~~

.. autofunction:: fugue.api.fugue_sql
.. autofunction:: fugue.api.fugue_sql_flow
.. autofunction:: fugue.api.raw_sql

.. autofunction:: fugue.api.join
.. autofunction:: fugue.api.semi_join
.. autofunction:: fugue.api.anti_join
.. autofunction:: fugue.api.inner_join
.. autofunction:: fugue.api.left_outer_join
.. autofunction:: fugue.api.right_outer_join
.. autofunction:: fugue.api.full_outer_join
.. autofunction:: fugue.api.cross_join

.. autofunction:: fugue.api.union
.. autofunction:: fugue.api.intersect
.. autofunction:: fugue.api.subtract

.. autofunction:: fugue.api.assign
.. autofunction:: fugue.api.select
.. autofunction:: fugue.api.filter
.. autofunction:: fugue.api.aggregate

Conversion
~~~~~~~~~~

.. autofunction:: fugue.api.as_local
.. autofunction:: fugue.api.as_local_bounded
.. autofunction:: fugue.api.as_array
.. autofunction:: fugue.api.as_array_iterable
.. autofunction:: fugue.api.as_arrow
.. autofunction:: fugue.api.as_dict_iterable
.. autofunction:: fugue.api.as_pandas
.. autofunction:: fugue.api.get_native_as_df

ExecutionEngine
~~~~~~~~~~~~~~~

.. autofunction:: fugue.api.engine_context
.. autofunction:: fugue.api.set_global_engine
.. autofunction:: fugue.api.clear_global_engine
.. autofunction:: fugue.api.get_current_engine
.. autofunction:: fugue.api.get_current_parallelism
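
The ExecutionEngine entries above name a context manager, a global setter and clearer, and a getter for the current engine. The diff does not say how these interact. The sketch below is a toy model, not Fugue's implementation, of one common design: an engine set by a context manager takes precedence over a global engine, which in turn takes precedence over a built-in default. All `toy_*` names, the string engine labels, and the precedence order are assumptions made for illustration.

```python
from contextlib import contextmanager
from contextvars import ContextVar
from typing import Iterator, Optional

# Toy state: engines are represented by plain strings (an assumption).
_global_engine: Optional[str] = None
_context_engine: ContextVar[Optional[str]] = ContextVar("toy_engine", default=None)


def toy_set_global_engine(name: str) -> None:
    """Register a process-wide fallback engine."""
    global _global_engine
    _global_engine = name


def toy_clear_global_engine() -> None:
    """Remove the process-wide fallback engine."""
    global _global_engine
    _global_engine = None


@contextmanager
def toy_engine_context(name: str) -> Iterator[str]:
    """Make ``name`` the current engine inside the ``with`` block only."""
    token = _context_engine.set(name)
    try:
        yield name
    finally:
        # Restore whatever engine was active before entering the block.
        _context_engine.reset(token)


def toy_get_current_engine() -> str:
    """Resolve the engine: context first, then global, then a default."""
    return _context_engine.get() or _global_engine or "native"


resolved = [toy_get_current_engine()]        # nothing set yet
toy_set_global_engine("dask")
resolved.append(toy_get_current_engine())    # global engine applies
with toy_engine_context("spark"):
    resolved.append(toy_get_current_engine())  # context overrides global
resolved.append(toy_get_current_engine())    # context exited, back to global
toy_clear_global_engine()
resolved.append(toy_get_current_engine())    # back to the default
```

Using `contextvars` rather than a module-level stack keeps nested and concurrent contexts isolated; whether Fugue does the same is not stated in this commit.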


Big Data Operations
~~~~~~~~~~~~~~~~~~~
.. autofunction:: fugue.api.broadcast
.. autofunction:: fugue.api.persist
.. autofunction:: fugue.api.repartition


Development
~~~~~~~~~~~

.. autofunction:: fugue.api.run_engine_function




