Morse

Morse is the morphological analysis model described in:

Akyürek, Ekin, Erenay Dayanık, and Deniz Yuret. "Morphological Analysis Using a Sequence Decoder." Transactions of the Association for Computational Linguistics 7 (2019): 567-579. (TACL, arXiv).

Dependencies

Julia 1.x
Network connection

Installation

   git clone https://github.com/ai-ku/Morse.jl
   cd Morse.jl

Note: The Setup and Data steps are optional, because running an experiment from the scripts directory automatically sets up the environment and downloads the required data when needed. However, if you are working on a cluster node that has no internet connection, you may need to perform these steps manually. To get the pkg> prompt for package commands, press the ']' key at the julia> prompt. Backspace returns to the julia> prompt.

Setup (Optional)

   julia> # Press the `]` key to get the `pkg>` prompt
   (v1.1) pkg> activate .
   (Morse) pkg> instantiate # only needed the first time
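
The same setup can be done without the interactive pkg> prompt, which may be convenient in batch jobs. The sketch below uses the standard `Pkg` API; the helper name `setup_environment` is hypothetical and not part of Morse.

```julia
using Pkg

# Hypothetical helper (not part of Morse): set up the project environment at `path`.
function setup_environment(path::AbstractString=".")
    Pkg.activate(path)   # equivalent to `pkg> activate .`
    Pkg.instantiate()    # equivalent to `pkg> instantiate`; needs network access the first time
end
```

Calling `setup_environment()` from the root of the cloned repository should have the same effect as the two pkg> commands above.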

Data (Optional)

   julia> using Morse
   julia> download(TRDataSet)
   julia> download(UDDataSet)

Experiments

To verify the results presented in the paper, you can run the scripts that train the models and ablations. During training, logs are written to the logs/ folder.

Detailed information about the experiments can be found in scripts/.

Note: An Nvidia GPU is required to train the models in a reasonable amount of time.

Tagging

Available Pre-Trained Models

trained(MorseModel, TRDataSet);
trained(MorseModel, UDDataSet, lang="ru"); # Russian
trained(MorseModel, UDDataSet, lang="da"); # Danish
trained(MorseModel, UDDataSet, lang="fi"); # Finnish
trained(MorseModel, UDDataSet, lang="pt"); # Portuguese
trained(MorseModel, UDDataSet, lang="es"); # Spanish
trained(MorseModel, UDDataSet, lang="hu"); # Hungarian
trained(MorseModel, UDDataSet, lang="bg"); # Bulgarian
trained(MorseModel, UDDataSet, lang="sv"); # Swedish

How To Use

Note: Please use lowercased and tokenized inputs.
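
A minimal sketch of one way to prepare raw text, assuming that splitting punctuation off as separate tokens and lowercasing is close enough to the conventions of the training data (the exact tokenization rules are not specified here). The helper `simple_preprocess` is hypothetical and not part of Morse. Note that Julia's `lowercase` is not locale-aware, so the Turkish capital letters I and İ may need special handling.

```julia
# Hypothetical helper (not part of Morse): lowercase the text and split off punctuation.
function simple_preprocess(text::AbstractString)
    # Put spaces around every punctuation mark so each becomes its own token.
    spaced = replace(text, r"([[:punct:]])" => s" \1 ")
    # Lowercase, then collapse runs of whitespace into single spaces.
    return join(split(lowercase(spaced)), " ")
end

println(simple_preprocess("Annem sana yardım edemez."))
# prints: annem sana yardım edemez .
```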

   julia> using Knet, KnetLayers, Morse
   julia> model, vocabulary, parser = trained(MorseModel, TRDataSet);
   julia> predictions = model("annem sana yardım edemez .", v=vocabulary, p=parser)
   annem anne+Noun+A3sg+P1sg+Nom
   sana sen+Pron+Pers+A2sg+Pnon+Dat
   yardım yardım+Noun+A3sg+Pnon+Nom
   edemez et+Verb^DB+Verb+Able+Neg+Aor+A3sg
   . .+Punct
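
Each printed line pairs a word with its analysis in the form lemma+Tag+Tag. The sketch below splits one such line into its parts. It works on the printed text only, since the type of the `predictions` value is not documented here, and the helper `parse_analysis` is hypothetical and not part of Morse. It assumes that `^DB` marks a derivational boundary and that the lemma itself contains no `+` character.

```julia
# Hypothetical helper (not part of Morse): split a printed line such as
# "sana sen+Pron+Pers+A2sg+Pnon+Dat" into the word, the lemma, and the tag list.
function parse_analysis(line::AbstractString)
    # The first space separates the surface word from its analysis.
    word, analysis = split(strip(line), ' '; limit=2)
    # Treat the "^DB" marker as a tag of its own, then split on '+'.
    parts = split(replace(analysis, "^DB" => "+^DB"), '+')
    return (word=String(word), lemma=String(parts[1]), tags=String.(parts[2:end]))
end

result = parse_analysis("sana sen+Pron+Pers+A2sg+Pnon+Dat")
println(result.lemma, " => ", join(result.tags, ", "))
# prints: sen => Pron, Pers, A2sg, Pnon, Dat
```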

Customized Training

Note: An Nvidia GPU is required to train in a reasonable amount of time.

   julia> using Knet, KnetLayers, Morse
   julia> config = Morse.intro(split("--logFile nothing --lemma --dataSet TRDataSet")) # you can modify the program arguments
   julia> dataFiles = ["train.txt", "test.txt"] # make sure these files exist at the given paths
   julia> data, vocab, parser = prepareData(dataFiles,TRDataSet) # or UDDataSet
   julia> data = miniBatch(data,vocab) # sentence minibatching is required for processing a sentence correctly
   julia> model = MorseModel(config,vocab)
   julia> setoptim!(model, SGD(;lr=1.6,gclip=60.0))
   julia> trainmodel!(model,data,config,vocab,parser) # can take hours or more, depending on your data
   julia> predictions = model("annem sana yardım edemez .", v=vocab, p=parser)
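
Because training can run for hours, it may be worth checking that the data files exist before starting. The sketch below uses only Julia's standard `isfile`; the helper name `missing_files` is hypothetical and not part of Morse.

```julia
# Hypothetical helper (not part of Morse): return the paths that do not point to an existing file.
missing_files(paths) = filter(!isfile, paths)

absent = missing_files(["train.txt", "test.txt"])
isempty(absent) || println("Missing data files: ", join(absent, ", "))
```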
