refactoring
ynqa committed Apr 25, 2020
1 parent c036d20 commit dc8f2d3
Showing 104 changed files with 3,365 additions and 5,101 deletions.
2 changes: 1 addition & 1 deletion .gitignore
@@ -1,6 +1,6 @@
 .idea/
 vendor/
-example/*.txt
+*.txt
 
 text8
 text8.zip
41 changes: 14 additions & 27 deletions README.md
@@ -68,7 +68,7 @@ Wego outputs a .txt file that is described word vector is subject to the followi
 <word> <value1> <value2> ...
 ```
 
-## Example
+## API
 
 Word vectors can also be trained through the wego API. An example follows.

@@ -78,41 +78,28 @@ package main
 import (
 	"os"
 
-	"github.com/ynqa/wego/pkg/builder"
+	"github.com/ynqa/wego/pkg/model/modelutil/save"
 	"github.com/ynqa/wego/pkg/model/word2vec"
 )
 
 func main() {
-	b := builder.NewWord2vecBuilder()
-
-	b.Dimension(10).
-		Window(5).
-		Model(word2vec.CBOW).
-		Optimizer(word2vec.NEGATIVE_SAMPLING).
-		NegativeSampleSize(5).
-		Verbose()
-
-	m, err := b.Build()
+	model, err := word2vec.New(
+		word2vec.WithWindow(5),
+		word2vec.WithModel(word2vec.Cbow),
+		word2vec.WithOptimizer(word2vec.NegativeSampling),
+		word2vec.WithNegativeSampleSize(5),
+		word2vec.Verbose(),
+	)
 	if err != nil {
-		// Failed to build word2vec.
+		// failed to create word2vec.
 	}
 
 	input, _ := os.Open("text8")
-
-	// Start to Train.
-	if err = m.Train(input); err != nil {
-		// Failed to train by word2vec.
+	if err = model.Train(input); err != nil {
+		// failed to train.
 	}
 
-	output, err := os.Create("example.txt")
-	if err != nil {
-		// Failed to create output file.
-	}
-
-	defer func() {
-		output.Close()
-	}()
-
-	m.Save(output)
+	// write word vector.
+	model.Save(os.Stdin, save.AggregatedVector)
 }
```
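
The README hunk above swaps a chained builder (`b.Dimension(10).Window(5)...Build()`) for a single constructor call that takes `With...` arguments, which reads like Go's functional options pattern. The following is a minimal, self-contained sketch of that pattern and not wego's implementation: `Options`, `Option`, `WithDim`, `WithWindow`, `Verbose`, and `New` are hypothetical names chosen for illustration, and the default values are assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// Options holds hypothetical training settings; the field names are
// illustrative and are not taken from wego's source.
type Options struct {
	Dim     int
	Window  int
	Verbose bool
}

// Option mutates an Options value; each helper below returns one.
type Option func(*Options)

func WithDim(d int) Option    { return func(o *Options) { o.Dim = d } }
func WithWindow(w int) Option { return func(o *Options) { o.Window = w } }
func Verbose() Option         { return func(o *Options) { o.Verbose = true } }

// New starts from assumed defaults, applies the options in order, and
// validates the result once, so configuration errors surface here.
func New(opts ...Option) (*Options, error) {
	o := &Options{Dim: 10, Window: 5}
	for _, apply := range opts {
		apply(o)
	}
	if o.Dim <= 0 || o.Window <= 0 {
		return nil, errors.New("dim and window must be positive")
	}
	return o, nil
}

func main() {
	o, err := New(WithWindow(3), Verbose())
	if err != nil {
		fmt.Println("failed to create:", err)
		return
	}
	fmt.Printf("%+v\n", *o)
}
```

Compared with a builder, this style keeps defaults and validation in one constructor and lets callers pass only the settings they want to override.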
83 changes: 83 additions & 0 deletions pkg/model/README.md → cmd/README.md
@@ -102,3 +102,86 @@ Flags:
--verbose verbose mode
-w, --window int context window size (default 5)
```

# Search

Similarity search between word vectors.

## Usage

```
Search similar words
Usage:
wego search [flags]
Examples:
wego search -i example/word_vectors.txt microsoft
Flags:
-h, --help help for search
-i, --inputFile string input file path for trained word vector (default "example/input.txt")
-r, --rank int how many the most similar words will be displayed (default 10)
```

## Example

```
$ go run wego.go search -i example/word_vectors_sg.txt microsoft
RANK | WORD | SIMILARITY
+------+------------+------------+
1 | apple | 0.994008
2 | operating | 0.992855
3 | versions | 0.992800
4 | ibm | 0.992232
5 | os | 0.989174
6 | computers | 0.988998
7 | machines | 0.988804
8 | dvd | 0.988732
9 | cd | 0.988259
10 | compatible | 0.988200
```
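
The ranking shown above can be illustrated with a small sketch. It assumes the SIMILARITY column is cosine similarity, which the README does not state, and it uses made-up toy vectors rather than trained ones. The names `cosine`, `hit`, and `mostSimilar` are hypothetical and do not come from wego.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// cosine returns the cosine similarity of two equal-length vectors,
// or 0 when either vector has zero norm.
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// hit pairs a word with its similarity to the query.
type hit struct {
	Word string
	Sim  float64
}

// mostSimilar ranks every word except the query by cosine similarity
// and returns at most k hits, best first.
func mostSimilar(vecs map[string][]float64, query string, k int) []hit {
	q, ok := vecs[query]
	if !ok {
		return nil
	}
	hits := make([]hit, 0, len(vecs))
	for w, v := range vecs {
		if w != query {
			hits = append(hits, hit{w, cosine(q, v)})
		}
	}
	sort.Slice(hits, func(i, j int) bool { return hits[i].Sim > hits[j].Sim })
	if len(hits) > k {
		hits = hits[:k]
	}
	return hits
}

func main() {
	// Toy vectors, made up for illustration only.
	vecs := map[string][]float64{
		"microsoft": {1, 0.9, 0},
		"apple":     {0.9, 1, 0.1},
		"banana":    {0, 0.1, 1},
	}
	for i, h := range mostSimilar(vecs, "microsoft", 2) {
		fmt.Printf("%d | %s | %.6f\n", i+1, h.Word, h.Sim)
	}
}
```

The `k` argument plays the role of the `--rank` flag: it caps how many of the best matches are returned.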

# REPL for search

Similarity search between word vectors in REPL mode.

## Usage

```
Search similar words with REPL mode
Usage:
wego repl [flags]
Examples:
wego repl -i example/word_vectors.txt
>> apple + banana
...
Flags:
-h, --help help for repl
-i, --inputFile string input file path for trained word vector (default "example/word_vectors.txt")
-r, --rank int how many the most similar words will be displayed (default 10)
```

## Example

The REPL currently supports `+` and `-` for arithmetic operations on word vectors.

```
$ go run wego.go repl -i example/word_vectors_sg.txt
>> a + b
RANK | WORD | SIMILARITY
+------+---------+------------+
1 | phi | 0.907975
2 | q | 0.904593
3 | mathbf | 0.903066
4 | cdot | 0.902205
5 | b | 0.901952
6 | becomes | 0.900346
7 | int | 0.898680
8 | z | 0.897895
9 | named | 0.896480
10 | v | 0.895456
```
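
One way to read a query such as `a + b` is as element-wise arithmetic on the word vectors, with the result then used as the search query. The sketch below shows only the arithmetic step under that assumption. The `eval` function and its space-separated toy grammar are hypothetical and are not taken from wego's REPL.

```go
package main

import (
	"fmt"
	"strings"
)

// eval computes expressions such as "a + b - c" over word vectors.
// Tokens must be separated by spaces; this is a toy grammar.
func eval(vecs map[string][]float64, expr string) ([]float64, error) {
	tokens := strings.Fields(expr)
	if len(tokens) == 0 || len(tokens)%2 == 0 {
		return nil, fmt.Errorf("malformed expression %q", expr)
	}
	var out []float64
	sign := 1.0
	for i, tok := range tokens {
		if i%2 == 1 { // odd positions hold operators
			switch tok {
			case "+":
				sign = 1
			case "-":
				sign = -1
			default:
				return nil, fmt.Errorf("unknown operator %q", tok)
			}
			continue
		}
		v, ok := vecs[tok]
		if !ok {
			return nil, fmt.Errorf("unknown word %q", tok)
		}
		if out == nil {
			out = make([]float64, len(v))
		}
		if len(v) != len(out) {
			return nil, fmt.Errorf("dimension mismatch for %q", tok)
		}
		for j := range v {
			out[j] += sign * v[j]
		}
	}
	return out, nil
}

func main() {
	// Toy vectors, made up for illustration only.
	vecs := map[string][]float64{"a": {1, 2}, "b": {3, 4}, "c": {1, 1}}
	v, err := eval(vecs, "a + b - c")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(v) // prints [3 5]
}
```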
98 changes: 0 additions & 98 deletions cmd/glove.go

This file was deleted.

46 changes: 0 additions & 46 deletions cmd/glove_test.go

This file was deleted.
