Using custom model for input #2

Open
chiehminwei opened this issue Dec 9, 2018 · 5 comments

@chiehminwei

I have trained a supertagger using BERT. It takes in a sentence and outputs a softmax for each word in the sentence.
I want to use your A* parsing method to parse the sentence. How can I use my trained supertagger and combine it with your A* parser? Thank you so much.

@masashi-y
Owner

masashi-y commented Dec 10, 2018

Hi. I agree that making it easier to use external supertaggers is an important feature that should be implemented. As you may know, my parser requires probabilities for the dependency structure in addition to those for the supertags. Recently, I updated the Python version of depccg (found at https://github.com/masashi-y/depccg; sorry for the complexity...) so that it can accept an input file in jsonl format, where each line is a json dictionary describing one sentence and looks like the following:

{
    "words": "this is an example sentence .",
    "head_tags": [0.0, ...],        # flattened matrix (list) containing # words * # supertags elements
    "head_tags_shape": [# words, # supertags],
    "heads": [0.0, ...],            # flattened matrix containing # words * (# words + 1) elements
    "heads_shape": [# words, # words + 1],
    "categories": ["S", "N", ...]   # a list of supertags
}

Optionally, you can omit the heads and heads_shape entries, in which case the parser uses its built-in default dependency parser to assign the dependency-structure probabilities.
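
For concreteness, here is a minimal sketch (my own helper, not part of depccg's API) of how such a line could be produced, assuming you already have the two log-probability matrices as numpy arrays:

import json
import numpy as np

def make_jsonl_line(words, log_tag_probs, log_dep_probs, categories):
    # words         : list of tokens, e.g. ["this", "is", ...]
    # log_tag_probs : (# words, # supertags) array of log probabilities
    # log_dep_probs : (# words, # words + 1) array of log probabilities
    # categories    : list of supertag names, length # supertags
    entry = {
        "words": " ".join(words),
        "head_tags": log_tag_probs.flatten().tolist(),
        "head_tags_shape": list(log_tag_probs.shape),
        "heads": log_dep_probs.flatten().tolist(),
        "heads_shape": list(log_dep_probs.shape),
        "categories": list(categories),
    }
    return json.dumps(entry)

Writing one such line per sentence gives you a jsonl file that you can feed to the parser as shown below.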

Please try the following scripts and check the attached json file to see if it works.

git clone https://github.com/masashi-y/depccg
cd depccg
git checkout refactor
python setup.py build_ext --inplace  # or you can install by  "python setup.py install"
sh bin/depccg_en download     # install default tagger
cat test.json | sh bin/depccg_en --input-format json

Please be careful to use log probabilities (e.g. log_softmax). A scale mismatch may cause unexpected behavior in the parser.
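
For example, if your model produces logits, it is safer to compute the log probabilities directly rather than taking np.log of an already-softmaxed output, which can underflow. A numpy sketch:

import numpy as np

def log_softmax(logits, axis=-1):
    # numerically stable log-softmax: shift by the max before exponentiating
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))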
I'm happy to help if you run into any problems.
I hope this meets your needs.
test.json.gz

@chiehminwei
Author

I tested the json file and it worked! Thank you so much!

I just re-read your paper and some of your code (ja_lstm_parser_bi.py) and seem to have a better understanding now. You jointly trained the P_tag and P_dep terms using a biLSTM. I still have some questions, though.

How does the default dependency parser assign dependency probabilities if I don't include the "heads" and "heads_shape" fields in the json file? I'm guessing it loads the pre-trained biLSTM and does a forward pass to obtain the P_dep terms? In that case, if I wanted to parse Chinese, I guess I would need to train my own model to get the P_dep terms? Alternatively, if I just assigned equal probabilities to each P_dep (say, set "heads" to a [# words, # words + 1] matrix of all -1's) as a quick hack, would this be equivalent to using Mike Lewis's EasySRL parser? Also, why does "heads" have shape [# words, # words + 1]?

I'm trying to parse the Chinese CCGBank, so I would highly appreciate your tips on any challenges you faced (Japanese is strictly head-final, no tri-training data was available, you had to convert to bunsetsu dependencies for evaluation... could you still use EasyCCG for Japanese evaluation?) or any references that might be helpful (for example, how can I use the script for converting from CCGBank to dependencies? Is there any code in this repo / the Python repo that I should look at more closely?).

Sorry I have so many questions. I'm so glad there's someone working on Japanese CCG though. Thank you so much again for your help!

@masashi-y
Owner

I'm glad, too, that you are working on Chinese CCG parsing!

How does the default dependency parser assign dependency probabilities if I don't include the "heads" and "heads_shape" fields in the json file? I'm guessing it loads the pre-trained biLSTM and does a forward pass to obtain the P_dep terms?

Exactly. It loads a pre-trained biLSTM trained with tri-training; it is the best-performing model in my CCG paper.

In that case, if I wanted to parse Chinese, I guess I would need to train my own model to get the P_dep terms?

Ideally yes.

Alternatively, if I just assigned equal probabilities to each P_dep (say, set "heads" to a [# words, # words + 1] matrix of all -1's) as a quick hack, would this be equivalent to using Mike Lewis's EasySRL parser?

If you set P_dep in that way, it becomes very close to EasySRL, but it lacks a heuristic introduced in Mike Lewis's paper (what he calls the attach-low heuristic), so the performance will be poorer for English parsing.
(One of the claims in my paper is that for languages such as Japanese, which have relatively free word order, this kind of heuristic does not help, and we found that it is important to model higher, non-terminal-level syntax (dependencies) in addition to supertags. I don't know whether this is the case for Chinese CCGbank parsing.)
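
For what it's worth, a normalized uniform distribution keeps the scores on the same scale as real log probabilities; and since every complete derivation assigns exactly one head per word, a per-word constant like -1 should shift all derivation scores equally anyway. A sketch (the helper is hypothetical, not part of depccg):

import numpy as np

def uniform_heads(n_words):
    # every word assigns equal probability to each of its
    # n_words + 1 possible heads (the extra one is the dummy root),
    # expressed as log probabilities as the parser expects
    return np.full((n_words, n_words + 1), -np.log(n_words + 1))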

Also, why does "heads" have shape [# words, # words + 1]?

heads[i, 0] is the log probability that the (i+1)-th word is attached to the dummy root node of the dependency tree.
heads[i, j] (j > 0) is the log probability that the (i+1)-th word is a dependency child of the j-th word.
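
Concretely, for a three-word sentence (heads_shape == [3, 4]):

# heads[0][0] : log P(word 1 attaches to the dummy root)
# heads[0][2] : log P(word 1's head is word 2)
# heads[2][1] : log P(word 3's head is word 1)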

I'm trying to parse the Chinese CCGBank, so I would highly appreciate your tips on any challenges you faced (Japanese is strictly head-final, no tri-training data was available, you had to convert to bunsetsu dependencies for evaluation... could you still use EasyCCG for Japanese evaluation?) or any references that might be helpful (for example, how can I use the script for converting from CCGBank to dependencies? Is there any code in this repo / the Python repo that I should look at more closely?).

Regarding tri-training, I did not do it for Japanese simply because I did not have enough time. You download some huge corpus (e.g. Wikipedia) and assign supertags and dependency trees to it using existing taggers and dependency parsers, which is just engineering work.
Converting a CCG tree to a dependency tree is very simple if you know how the Stanford converter turns a constituency tree into a dependency tree; googling e.g. "converting constituency tree to dependency tree" will be helpful 😃 In my code, https://github.com/masashi-y/depccg/blob/master/src/py/ja_lstm_parser_bi.py contains a function that does this: TrainingDataCreator._get_dependencies. We also use a similar algorithm in the conversion to bunsetsu dependencies.
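
The core idea, independently of depccg's actual implementation, is head percolation: each non-terminal designates one child as its head, the head word percolates up, and the head word of every non-head child becomes a dependent of it (for CCG, the head child is typically determined by the combinatory rule, e.g. the functor in an application). A minimal sketch, with an assumed toy tree representation:

def get_dependencies(tree):
    # tree is ("leaf", word_index) or ("node", head_child_position, children)
    # returns the root word index and a list of (child, parent) arcs,
    # where parent == -1 marks the dummy root
    deps = []

    def percolate(node):
        if node[0] == "leaf":
            return node[1]
        _, head_pos, children = node
        head_words = [percolate(child) for child in children]
        for pos, word in enumerate(head_words):
            if pos != head_pos:
                deps.append((word, head_words[head_pos]))
        return head_words[head_pos]

    root = percolate(tree)
    deps.append((root, -1))
    return root, deps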

@chiehminwei
Author

Thank you so much for your thorough explanations. Really appreciate your help!
One more question: if I provided the probabilities in json, would the parsing results vary depending on whether the parser is using the Japanese model or the English model? Or is it the case that once I have the probabilities, parsing is deterministic regardless of the input language?
Thank you so much again!

@masashi-y
Owner

Hi. Sorry for the late reply. Unfortunately, there are some language-specific configurations. Most importantly, you must define a set of combinatory rules and a set of unary rules for the specific language. You can find the combinatory rules implemented for Japanese and English at https://github.com/masashi-y/depccg.ml/blob/master/lib/grammar.ml, and the unary rules in the zipped model files (the unary_rules.txt file). Additionally, in the zip archive you will see other files such as seen_rules.txt and cat_dict.txt, which are used to reduce the search space when running the A* algorithm. You should configure the parser not to use them, or, for efficiency's sake, it would be nice if you created Chinese versions.
