feat: skip ocr (#36)
ciur committed Feb 22, 2024
1 parent c93c0b3 commit 498d665
Showing 7 changed files with 48 additions and 6 deletions.
6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -9,6 +9,12 @@ can be found in [changelog.d folder](https://github.com/papermerge/papermerge-cl

<!-- towncrier release notes start -->

+## 0.8.0 - 2024-02-22
+
+### Added
+
+- `--skip-ocr` flag. Works only with Papermerge REST API >= 3.1
+
## 0.7.1 - 2024-02-20

### Fixed
18 changes: 18 additions & 0 deletions README.md
@@ -116,6 +116,14 @@ successful import - add `--delete` flag:
PLEASE BE CAREFUL WITH THE `--delete` FLAG, AS IT WILL IRREVERSIBLY DELETE THE LOCAL
COPY OF THE UPLOADED DOCUMENT!

+To skip OCR for imported documents, use the `--skip-ocr` flag:
+
+$ papermerge-cli import --skip-ocr /path/to/folder/
+
+The `--skip-ocr` flag can be used with folders (it applies to all documents in the folder)
+or with individual documents.
+The `--skip-ocr` flag works only with Papermerge REST API >= 3.1.
+
### search

Search for node (document or folder) by text or by tags:
@@ -169,3 +177,13 @@ or in case of uuid is a folder:
You can also specify the format/type of the downloaded archive (e.g. when the node is a folder):

$ papermerge-cli download --uuid <folder-uuid> -f /path/to/file-system/folder.targz -t targz


+## Version Compatibility
+
+
+| CLI Version | REST API version | Remarks |
+|-------------|------------------|---------|
+| 0.7.0 | 3.0.x | |
+| 0.7.1 | 3.0.x | |
+| 0.8.0 | 3.1.x | Skip OCR feature introduced |
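
The compatibility table boils down to one rule: `--skip-ocr` needs a REST API of at least 3.1. A minimal sketch of such a check is shown here. The helper names `parse_version` and `supports_skip_ocr`, and the idea of comparing version strings on the client side, are assumptions for illustration; nothing in this commit performs this check.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn a numeric version such as '3.1.2' or 'v3.1' into a tuple of ints."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def supports_skip_ocr(rest_api_version: str) -> bool:
    """Hypothetical helper: True when the REST API version is >= 3.1."""
    # Compare only major and minor; the patch level does not matter here.
    return parse_version(rest_api_version)[:2] >= (3, 1)


print(supports_skip_ocr("3.0.4"))
print(supports_skip_ocr("3.1.0"))
```

Comparing tuples of integers avoids the pitfall of comparing version strings lexically, where "3.10" would sort before "3.9".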
5 changes: 4 additions & 1 deletion papermerge_cli/lib/importer.py
@@ -14,7 +14,8 @@ def upload_file_or_folder(
token: str,
file_or_folder: Path,
parent_id=None,
-delete: bool = False
+delete: bool = False,
+skip_ocr: bool = False,
) -> None:
user: User = get_me(host=host, token=token)

@@ -27,6 +28,7 @@ def upload_file_or_folder(
host=host,
token=token,
file_path=file_or_folder,
+skip_ocr=skip_ocr,
parent_id=parent_id
)
if delete:
@@ -40,6 +42,7 @@ def upload_file_or_folder(
host=host,
token=token,
file_path=Path(entry.path),
+skip_ocr=skip_ocr,
parent_id=parent_id
)

14 changes: 12 additions & 2 deletions papermerge_cli/main.py
@@ -92,11 +92,19 @@
help='Delete local file(s) after successful upload.'
)
]
+SkipOCR = Annotated[
+bool,
+typer.Option(
+is_flag=True,
+help='Skip OCR i.e. do not trigger OCR operation on upload.'
+' Works only with REST API >= 3.1'
+)
+]
TargetNodeID = Annotated[
uuid.UUID,
typer.Option(
-help="UUID of the target/destination folder. "
-"Default value is user's Inbox folder's UUID."
+is_flag=True,
+help="Trigger OCR"
)
]
OrderBy = Annotated[
@@ -138,6 +146,7 @@ def import_command(
ctx: typer.Context,
file_or_folder: FileOrFolderPath,
delete: DeleteAfterImport = False,
+skip_ocr: SkipOCR = False,
target_id: TargetNodeID | None = None
):
"""Import recursively folders and documents from local filesystem
@@ -150,6 +159,7 @@ def import_command(
host=ctx.obj['HOST'],
token=ctx.obj['TOKEN'],
file_or_folder=Path(file_or_folder),
+skip_ocr=skip_ocr,
parent_id=target_id,
delete=delete
)
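
In the main.py hunks, `--skip-ocr` is a boolean flag that defaults to `False` and is handed to the importer as `skip_ocr`. The same behavior can be sketched with the standard library's `argparse` standing in for Typer, so that the example runs without third-party packages. `build_parser` is a hypothetical name and this is not code from the commit.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical argparse stand-in for the Typer `import` command."""
    parser = argparse.ArgumentParser(prog="papermerge-cli")
    subcommands = parser.add_subparsers(dest="command", required=True)
    import_cmd = subcommands.add_parser("import")
    import_cmd.add_argument("file_or_folder")
    # Both flags are False unless given on the command line.
    import_cmd.add_argument("--delete", action="store_true")
    import_cmd.add_argument("--skip-ocr", action="store_true")
    return parser


# Parse an explicit argument list instead of sys.argv to keep the sketch self-contained.
args = build_parser().parse_args(["import", "--skip-ocr", "/path/to/folder/"])
print(args.skip_ocr, args.delete)
```

Because the flag is opt-in, existing invocations without `--skip-ocr` keep triggering OCR as before.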
6 changes: 4 additions & 2 deletions papermerge_cli/rest/documents.py
@@ -9,14 +9,16 @@ def upload(
host: str,
token: str,
file_path: Path,
-parent_id: UUID
+parent_id: UUID,
+skip_ocr: bool = False,
) -> Document:
api_client = ApiClient[Document](token=token, host=host)

doc_to_create = CreateDocument(
title=file_path.name,
file_name=file_path.name,
-parent_id=parent_id
+parent_id=parent_id,
+ocr=not skip_ocr
)

response_doc: Document = api_client.post(
3 changes: 3 additions & 0 deletions papermerge_cli/schema/documents.py
@@ -13,6 +13,9 @@ class CreateDocument(BaseModel):
parent_id: UUID | None
lang: str | None = None
file_name: str | None = None
+# if true then OCR the document
+# if false then skip OCR part
+ocr: bool = True


class Page(BaseModel):
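
Taken together, the documents.py and schema hunks invert the CLI flag before it reaches the request body: `skip_ocr=True` becomes `ocr=False`, and `ocr` defaults to `True`. A minimal sketch of that mapping follows. It uses a standard-library dataclass as a stand-in for the Pydantic model, lists only the fields visible in this diff, and `build_payload` is a hypothetical helper rather than the commit's `upload` function.

```python
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Optional
from uuid import UUID, uuid4


@dataclass
class CreateDocument:
    """Stand-in for the Pydantic model; only fields shown in the diff."""
    title: str
    parent_id: Optional[UUID]
    lang: Optional[str] = None
    file_name: Optional[str] = None
    ocr: bool = True  # True: OCR the document; False: skip OCR


def build_payload(file_path: Path, parent_id: UUID, skip_ocr: bool = False) -> dict:
    """Hypothetical helper mirroring how upload() fills CreateDocument."""
    doc = CreateDocument(
        title=file_path.name,
        file_name=file_path.name,
        parent_id=parent_id,
        ocr=not skip_ocr,  # the CLI flag is negated here
    )
    return asdict(doc)


payload = build_payload(Path("/path/to/invoice.pdf"), uuid4(), skip_ocr=True)
print(payload["ocr"])  # False
```

Defaulting `ocr` to `True` on the schema keeps callers that never pass `skip_ocr` on the previous behavior.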
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -1,6 +1,6 @@
[tool.poetry]
name = "papermerge-cli"
-version = "0.7.1"
+version = "0.8.0"
description = "Command line utility for your Papermerge DMS instance"
authors = ["Eugen Ciur <[email protected]>"]
license = "Apache 2.0"