Problem converting the model to PyTorch #162

Open
dalinvip opened this issue May 14, 2021 · 5 comments

Comments

@dalinvip

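Is converting the model to a PyTorch version still unsupported? I fine-tuned a model on my own domain data with this code, but converting it to torch with the script still fails: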

AssertionError: ('Pointer shape torch.Size([128]) and array shape (312,) mismatched', torch.Size([128]), (312,))
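
For readers unfamiliar with this message: it says that the PyTorch parameter being filled has shape (128,) while the array read from the TensorFlow checkpoint has shape (312,). A minimal sketch of that kind of check, in plain Python, is shown below. The function name `check_shapes` is hypothetical, and the sketch only assumes that the conversion code compares the two shapes before copying weights; it is not the script's actual code.

```python
def check_shapes(pointer_shape, array_shape):
    """Raise AssertionError if a model parameter and a checkpoint array differ in shape.

    `pointer_shape` stands for the shape of the PyTorch parameter and
    `array_shape` for the shape of the array loaded from the checkpoint.
    Both names are illustrative assumptions.
    """
    pointer_shape = tuple(pointer_shape)
    array_shape = tuple(array_shape)
    if pointer_shape != array_shape:
        raise AssertionError(
            "Pointer shape {} and array shape {} mismatched".format(
                pointer_shape, array_shape
            ),
            pointer_shape,
            array_shape,
        )


# Matching shapes pass silently; mismatched shapes raise.
check_shapes((312,), (312,))
try:
    check_shapes((128,), (312,))
except AssertionError as err:
    print("conversion would stop here:", err.args[0])
```

A mismatch like this usually means the model definition or config used on the PyTorch side does not describe the same architecture as the checkpoint.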

@Gamelife311

Did you manage to solve this?

@msclock

msclock commented Jun 20, 2023

@DoverDW It can be converted. See the GLUE or albert_pytorch repositories for reference.

@Gamelife311

@msclock Is it enough to follow albert_pytorch/convert_albert_tf_checkpoint_to_pytorch.py?

@msclock

msclock commented Jun 21, 2023

@DoverDW
albert_zh seems to produce two kinds of checkpoints: one from https://github.com/brightmart/albert_zh/blob/master/modeling.py and one from https://github.com/brightmart/albert_zh/blob/master/modeling_google.py.
The modeling_google version can apparently be converted directly with huggingface transformers albert. In my project, a classification subtask, the ckpt I was given had been generated and saved by https://github.com/brightmart/albert_zh/blob/master/modeling.py. The matching conversion script is /workspaces/ai-serving-solution/CLUE/baselines/models_pytorch/classifier_pytorch/convert_albert_original_tf_checkpoint_to_pytorch.py, but it still needs one change: in the function load_tf_weights_in_albert in /workspaces/ai-serving-solution/CLUE/baselines/models_pytorch/classifier_pytorch/transformers/modeling_albert.py, add a step that binds the classification subtask's output weights to the corresponding model attributes:

    for name, array in zip(names, arrays):
        name = name.split("/")
        # adam_v and adam_m are variables used in AdamWeightDecayOptimizer to calculated m and v
        # which are not required for using pretrained model
        if any(n in ["adam_v", "adam_m", "global_step"] for n in name):
            logger.info("Skipping {}".format("/".join(name)))
            continue

        # Classifier: prefix the albert_zh output variables so the code below can bind them to the matching model attribute weights
        if len(name) == 1 and ("output_bias" in name or "output_weights" in name):
            name = ["classifier"] + name

        pointer = model
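
The renaming rule in the snippet above can be tried in isolation. The following is a minimal sketch, not part of the CLUE code: the function name `remap_variable_name` is hypothetical, and it only reproduces the two name-based decisions (skip optimizer variables, prefix top-level classifier variables) so they can be checked without a checkpoint.

```python
def remap_variable_name(tf_name):
    """Return the split variable name, or None if the variable should be skipped.

    Mirrors the two checks from the loop above on a single TF variable name.
    """
    parts = tf_name.split("/")
    # Optimizer slots and the step counter are not needed to use the model.
    if any(p in ("adam_v", "adam_m", "global_step") for p in parts):
        return None
    # Top-level output_bias / output_weights get a "classifier" prefix so the
    # later attribute lookup starts from the model's classifier attribute.
    if len(parts) == 1 and parts[0] in ("output_bias", "output_weights"):
        parts = ["classifier"] + parts
    return parts


for example in ("output_weights", "bert/pooler/dense/kernel", "output_bias/adam_m"):
    print(example, "->", remap_variable_name(example))
```

Only names with a single path component are prefixed, so variables nested under other scopes are left untouched.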

Then load the model:

from transformers.modeling_albert import AlbertForSequenceClassification
from transformers.tokenization_bert import BertTokenizer
from transformers.configuration_bert import BertConfig
import torch

news_categories = [
    "other",
    "drawing_name",
    "draing_number",
]
idx2cate = {i: item for i, item in enumerate(news_categories)}

# Directory holding the config, vocab, and TF checkpoint
model_dir = "/workspaces/ai-serving-solution/deploy/ai_recognition/analysis/albert/multiclass_output/signature1.1/"

config = BertConfig.from_pretrained(
    model_dir,
    num_labels=len(news_categories),
)
tokenizer = BertTokenizer.from_pretrained(
    model_dir,
    padding=True,
)
# from_tf=True loads the weights from the TF checkpoint in model_dir
model = AlbertForSequenceClassification.from_pretrained(
    model_dir,
    from_tf=True,
    config=config,
)
# Save the converted weights next to the original checkpoint
pytorch_dump_path = model_dir + "pytorch_model.bin"
print("Save PyTorch model to {}".format(pytorch_dump_path))
torch.save(model.state_dict(), pytorch_dump_path)
token_codes = tokenizer.encode("主体结构中板梁配筋图", max_length=24)
input_ids = torch.tensor(token_codes).unsqueeze(0)  # Batch size 1
outputs = model(input_ids)
# get output probabilities by doing softmax
probs = outputs[0].softmax(1)
# executing argmax function to get the candidate label index
label_index = probs.argmax(dim=1)[0].tolist()
# get the label name
label = idx2cate[label_index]
# get the label probability
proba = probs.tolist()[0][label_index]
print({"label": label, "proba": proba})
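
The last few lines (softmax, argmax, label lookup) can be illustrated without torch. This is a minimal sketch in plain Python under stated assumptions: the function name `predict_label` and the example logits are hypothetical, and the label list is copied from the script above.

```python
import math


def predict_label(logits, labels):
    """Turn one row of raw classifier scores into a label and its probability."""
    # Softmax, shifted by the maximum for numerical stability.
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Argmax: index of the most probable class.
    label_index = max(range(len(probs)), key=probs.__getitem__)
    return {"label": labels[label_index], "proba": probs[label_index]}


labels = ["other", "drawing_name", "draing_number"]
# Made-up logits for illustration only, not output from a real model.
print(predict_label([0.1, 2.0, -1.0], labels))
```

The result has the same structure as the dictionary printed by the script: the label name plus the softmax probability of that class.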

Finally, I recommend simply using the pretrained ALBERT models already available through huggingface transformers. The steps above are far too roundabout.

@Gamelife311

Thank you!!
