Merge pull request #2 from haiwen/add_deploy_by_docker_and_opt_struct…
…ures opt_introductions
Showing 4 changed files with 109 additions and 56 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,21 @@ | ||
# Introduction | ||
|
||
ZincSearch is a full-text search server implemented in Go that provides an API compatible with the ElasticSearch DSL. It uses Bluge as its indexing engine. Bluge is a fork of Bleve (developed by Couchbase), a widely used Go full-text indexing library, that refactors the code to make it more modern and flexible. | ||
**SeaSearch** is a lightweight search engine that will replace ElasticSearch as the default search engine. It is built on the open-source search engine [ZincSearch](https://zincsearch-docs.zinc.dev/), which is implemented in Go. | ||
|
||
Why use **SeaSearch**: | ||
|
||
- **Problems of ElasticSearch**: | ||
- Not designed for a large number of indexes (such as one index per library) | ||
- Needs to search the entire storage when searching inside a library | ||
- Needs to filter results down to those the user has permission to access | ||
- Can become slow when there are billions of files to search | ||
- Heavyweight Java program | ||
- Upgrade often requires rebuilding index | ||
|
||
- **How about SeaSearch**: | ||
- Lightweight and can support one index per library | ||
- API compatible with ElasticSearch | ||
- Architecture Highlights | ||
- Cloud-native: can use S3 as storage (for single-node or cluster) | ||
- Shared storage: in a cluster, nodes use S3 as shared storage and store index metadata in etcd | ||
- Failover: switching nodes is easy thanks to the shared storage architecture. ElasticSearch, by contrast, replicates data between nodes, which makes consistency more complex. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,73 +1,80 @@ | ||
# SeaSearch Configuration | ||
|
||
## Original Configurations | ||
## Single-Node Configurations | ||
|
||
For the original environment variable configurations, see: [https://zincsearch-docs.zinc.dev/environment-variables/](https://zincsearch-docs.zinc.dev/environment-variables/) | ||
### Basic Configurations | ||
|
||
The following configuration instructions are for our extended configuration items. All configurations are set in the form of environment variables. | ||
```shell | ||
# log mode of the gin framework, default release | ||
GIN_MODE=release | ||
# whether to enable WAL, default enabled | ||
ZINC_WAL_ENABLE=true | ||
|
||
## Extended Configurations in SeaSearch | ||
# storage engine type, e.g., s3 | ||
ZINC_STORAGE_TYPE= | ||
|
||
### Single-Node Configurations | ||
# the number of shards; since SeaSearch has one index per database, the default value is changed to 1 to improve loading efficiency | ||
ZINC_SHARD_NUM=1 | ||
``` | ||
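Since every setting is passed as an environment variable, one way to apply a group of them is to keep them in a file and export them before starting the server. The following is a minimal sketch under that assumption; the helper name `load_env_file` and the temporary file are hypothetical and are not part of SeaSearch.

```shell
# Hypothetical helper (not part of SeaSearch): export every KEY=VALUE
# assignment found in an env file into the current shell.
load_env_file() {
    env_file="$1"
    [ -r "$env_file" ] || { echo "cannot read $env_file" >&2; return 1; }
    set -a              # auto-export every variable assigned while sourcing
    . "$env_file"
    set +a
}

# Demo with a throwaway file; the values mirror the basic configuration above.
demo_env="$(mktemp)"
cat > "$demo_env" <<'EOF'
ZINC_WAL_ENABLE=true
ZINC_STORAGE_TYPE=s3
ZINC_SHARD_NUM=1
EOF
load_env_file "$demo_env"
rm -f "$demo_env"
echo "storage type: $ZINC_STORAGE_TYPE, shards: $ZINC_SHARD_NUM"
```

Because the file is sourced as shell, trailing `# comment` annotations on a line are ignored, which matches the commented style used in this document.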
|
||
### S3 Storage Configurations | ||
|
||
To enable S3 storage, set `ZINC_STORAGE_TYPE=s3`. | ||
|
||
```shell | ||
# the maximum local cache file size | ||
ZINC_MAX_OBJ_CACHE_SIZE= | ||
|
||
# S3-related settings | ||
ZINC_S3_ACCESS_ID=<your s3 access id> | ||
ZINC_S3_USE_V4_SIGNATURE=<whether to use v4 signatures> | ||
ZINC_S3_ACCESS_SECRET=<your s3 access secret> | ||
ZINC_S3_ENDPOINT=<your s3 endpoint> | ||
ZINC_S3_USE_HTTPS=<whether to use https> | ||
ZINC_S3_PATH_STYLE_REQUEST=<whether to use path-style requests> | ||
ZINC_S3_AWS_REGION=<your s3 AWS region> | ||
``` | ||
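A pre-flight check can catch an incomplete S3 setup before the server starts. The sketch below is illustrative only: the function name `check_s3_env` is hypothetical, and treating the access id, access secret, and endpoint as the mandatory subset is an assumption, not something the SeaSearch documentation states.

```shell
# Hypothetical pre-flight check (not part of SeaSearch).
# Assumption: access id, access secret, and endpoint are the mandatory subset.
check_s3_env() {
    # Nothing to verify unless S3 storage is selected.
    [ "${ZINC_STORAGE_TYPE:-}" = "s3" ] || return 0
    missing=""
    for name in ZINC_S3_ACCESS_ID ZINC_S3_ACCESS_SECRET ZINC_S3_ENDPOINT; do
        eval "value=\${$name:-}"    # POSIX-compatible indirect lookup
        [ -n "$value" ] || missing="$missing $name"
    done
    if [ -n "$missing" ]; then
        echo "missing S3 settings:$missing" >&2
        return 1
    fi
}

# Demo in a subshell so the placeholder values do not leak into the session.
(
    ZINC_STORAGE_TYPE=s3
    ZINC_S3_ACCESS_ID=demo-id
    ZINC_S3_ACCESS_SECRET=demo-secret
    ZINC_S3_ENDPOINT=s3.example.com
    check_s3_env && echo "S3 settings look complete"
)
```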
GIN_MODE, log mode of the gin framework, default release | ||
ZINC_WAL_ENABLE, whether to enable WAL, default enabled | ||
ZINC_STORAGE_TYPE | ||
ZINC_MAX_OBJ_CACHE_SIZE, the maximum local cache file size when S3 or OSS is enabled | ||
ZINC_SHARD_LOAD_OBJS_GOROUTINE_NUM, index loading parallelism; when S3 or OSS is enabled, it can improve index loading speed | ||
ZINC_SHARD_NUM, the original ZincSearch default is 3. Since SeaSearch has one index per database, the default is changed to 1 to improve loading efficiency | ||
S3 related, only valid when ZINC_STORAGE_TYPE=s3 | ||
ZINC_S3_ACCESS_ID | ||
ZINC_S3_USE_V4_SIGNATURE | ||
ZINC_S3_ACCESS_SECRET | ||
ZINC_S3_ENDPOINT | ||
ZINC_S3_USE_HTTPS | ||
ZINC_S3_PATH_STYLE_REQUEST | ||
ZINC_S3_AWS_REGION | ||
OSS related, only valid when ZINC_STORAGE_TYPE=oss | ||
ZINC_OSS_ACCESS_ID | ||
ZINC_OSS_ACCESS_SECRET | ||
ZINC_OSS_BUCKET | ||
ZINC_OSS_ENDPOINT | ||
cluster related | ||
ZINC_SERVER_MODE, defaults to none (standalone deployment); must be set to cluster for cluster deployment | ||
ZINC_CLUSTER_ID, cluster id, must be globally unique | ||
ZINC_ETCD_ENDPOINTS, etcd address | ||
ZINC_ETCD_PREFIX, etcd key prefix, default /zinc | ||
ZINC_ETCD_USERNAME, etcd username | ||
ZINC_ETCD_PASSWORD, etcd password | ||
log related | ||
ZINC_LOG_OUTPUT, whether to output logs to files, default yes | ||
ZINC_LOG_DIR, log directory, recommended configuration, default is the log subdirectory under the current directory | ||
ZINC_LOG_LEVEL, log level, default debug | ||
|
||
## Cluster Configurations | ||
|
||
### Basic Configurations | ||
|
||
```shell | ||
# defaults to none (standalone deployment); must be set to cluster for cluster deployment | ||
ZINC_SERVER_MODE= | ||
|
||
# cluster id, must be globally unique | ||
ZINC_CLUSTER_ID=<your cluster id> | ||
|
||
ZINC_ETCD_ENDPOINTS=<your etcd address> | ||
ZINC_ETCD_USERNAME=<your etcd username> | ||
ZINC_ETCD_PASSWORD=<your etcd password> | ||
``` | ||
|
||
### Proxy Configurations | ||
If the current node is a proxy node, `ZINC_SERVER_MODE` must be set to **proxy** and `ZINC_ETCD_ENDPOINTS` must point to the etcd address (e.g., `ZINC_ETCD_ENDPOINTS=127.0.0.1:2379`). | ||
|
||
``` | ||
ZINC_CLUSTER_PROXY_LOG_DIR=./log | ||
```shell | ||
ZINC_CLUSTER_PROXY_LOG_DIR=/opt/seasearch/data/log | ||
ZINC_CLUSTER_PROXY_HOST=0.0.0.0 | ||
ZINC_CLUSTER_PROXY_PORT=4082 | ||
ZINC_SERVER_MODE=proxy # must be proxy | ||
ZINC_ETCD_ENDPOINTS=127.0.0.1:2379 | ||
ZINC_ETCD_PREFIX=/zinc | ||
ZINC_ETCD_PREFIX=<your etcd prefix, default /zinc> | ||
ZINC_MAX_DOCUMENT_SIZE=1m # maximum size of a single document in bulk and multisearch requests, default 1m | ||
ZINC_CLUSTER_MANAGER_ADDR=127.0.0.1:4081 # manager address | ||
``` | ||
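Several proxy settings, such as `ZINC_CLUSTER_MANAGER_ADDR` and `ZINC_ETCD_ENDPOINTS`, are written as `host:port`. The sketch below shows one way to sanity-check such a value before use. The helper `split_host_port` is hypothetical, handles a single `host:port` pair only, and does not cover IPv6 literals or comma-separated endpoint lists.

```shell
# Hypothetical helper (not part of SeaSearch): split a single host:port value.
split_host_port() {
    addr="$1"
    case "$addr" in
        *:*) ;;
        *) echo "expected host:port, got '$addr'" >&2; return 1 ;;
    esac
    host="${addr%:*}"     # everything before the last colon
    port="${addr##*:}"    # everything after the last colon
    case "$port" in
        ''|*[!0-9]*) echo "port is not numeric: '$port'" >&2; return 1 ;;
    esac
    [ -n "$host" ] || { echo "empty host in '$addr'" >&2; return 1; }
    printf '%s %s\n' "$host" "$port"
}

split_host_port "127.0.0.1:4081"   # prints "127.0.0.1 4081"
```

With the values in this document, the port extracted from `ZINC_CLUSTER_MANAGER_ADDR` (4081) is the same as the manager's `ZINC_CLUSTER_MANAGER_PORT`.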
|
||
### Cluster-manager Configurations | ||
|
||
``` | ||
ZINC_CLUSTER_MANAGER_LOG_DIR=./log | ||
```shell | ||
ZINC_CLUSTER_MANAGER_LOG_DIR=/opt/seasearch/data/log | ||
ZINC_CLUSTER_MANAGER_HOST=0.0.0.0 | ||
ZINC_CLUSTER_MANAGER_PORT=4081 | ||
ZINC_CLUSTER_MANAGER_ETCD_ENDPOINTS=127.0.0.1:2379 | ||
ZINC_CLUSTER_MANAGER_ETCD_PREFIX=/zinc | ||
ZINC_CLUSTER_MANAGER_ETCD_ENDPOINTS=<your etcd endpoints> | ||
ZINC_CLUSTER_MANAGER_ETCD_PREFIX=<your etcd prefix, default /zinc> | ||
``` | ||
|
||
## Log Configurations | ||
|
||
```shell | ||
ZINC_LOG_OUTPUT=true # whether to output logs to files, default yes | ||
ZINC_LOG_DIR=/opt/seasearch/data/log # log directory | ||
ZINC_LOG_LEVEL=debug # log level, default debug | ||
``` |
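When file logging is enabled, it can help to make sure the log directory exists and is writable before the server starts. The sketch below assumes that `true` is the value that enables `ZINC_LOG_OUTPUT` and falls back to `./log`, the default log subdirectory under the current directory. The helper `prepare_log_dir` is hypothetical, and SeaSearch may already create the directory itself.

```shell
# Hypothetical helper (not part of SeaSearch): create the log directory
# when file logging is enabled and confirm that it is writable.
prepare_log_dir() {
    # Assumption: "true" is the value that enables file logging.
    [ "${ZINC_LOG_OUTPUT:-true}" = "true" ] || return 0
    log_dir="${ZINC_LOG_DIR:-./log}"
    mkdir -p "$log_dir" && [ -w "$log_dir" ]
}

# Demo against a temporary location instead of /opt/seasearch/data/log.
ZINC_LOG_OUTPUT=true
ZINC_LOG_DIR="$(mktemp -d)/log"
prepare_log_dir && echo "log directory ready: $ZINC_LOG_DIR"
```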