Problem converting the model to PyTorch #162

dalinvip · 2021-05-14T12:25:40Z

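Is converting the model to a PyTorch version still unsupported? I fine-tuned a model on my own domain data with this code, and converting it to torch with the script still fails: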

AssertionError: ('Pointer shape torch.Size([128]) and array shape (312,) mismatched', torch.Size([128]), (312,))
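The assertion is raised by a shape check that runs before a checkpoint array is copied into the matching model parameter. A minimal sketch of that kind of check in plain Python follows. The helper `find_shape_mismatches` and the parameter names are hypothetical, used only for illustration; the two shapes are the ones reported in the error above.

```python
def find_shape_mismatches(model_shapes, checkpoint_shapes):
    """Return (name, model_shape, checkpoint_shape) for shared names whose shapes differ."""
    mismatches = []
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        if model_shape is not None and tuple(model_shape) != tuple(ckpt_shape):
            mismatches.append((name, tuple(model_shape), tuple(ckpt_shape)))
    return mismatches


# Hypothetical parameter names; the shapes are the two from the error message.
model_shapes = {"example/bias": (128,), "example/other": (312,)}
checkpoint_shapes = {"example/bias": (312,), "example/other": (312,)}

for name, model_shape, ckpt_shape in find_shape_mismatches(model_shapes, checkpoint_shapes):
    print(f"{name}: model expects {model_shape}, checkpoint has {ckpt_shape}")
```

Listing every mismatch at once, instead of stopping at the first assertion, can show whether the model definition and the checkpoint disagree on a single parameter or throughout.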

Gamelife311 · 2023-06-01T02:47:14Z

Did you manage to solve this?

msclock · 2023-06-20T04:04:33Z

@DoverDW It can be converted. See the GLUE or albert_pytorch repositories for reference.

Gamelife311 · 2023-06-20T08:10:14Z

@msclock Is it enough to follow albert_pytorch/convert_albert_tf_checkpoint_to_pytorch.py?

msclock · 2023-06-21T04:29:36Z

@DoverDW
albert_zh seems to produce two kinds of checkpoints: one from https://github.com/brightmart/albert_zh/blob/master/modeling.py and one from https://github.com/brightmart/albert_zh/blob/master/modeling_google.py.
The modeling_google version can apparently be converted directly with huggingface transformers albert. In my project, for a classification subtask, the ckpt I received was generated and saved by https://github.com/brightmart/albert_zh/blob/master/modeling.py. The matching conversion script is /workspaces/ai-serving-solution/CLUE/baselines/models_pytorch/classifier_pytorch/convert_albert_original_tf_checkpoint_to_pytorch.py, but it needs one more change: in the function load_tf_weights_in_albert in /workspaces/ai-serving-solution/CLUE/baselines/models_pytorch/classifier_pytorch/transformers/modeling_albert.py, add handling that binds the classification subtask's output weights to the corresponding model attributes:

    for name, array in zip(names, arrays):
        name = name.split("/")
        # adam_v and adam_m are variables used in AdamWeightDecayOptimizer to calculate m and v
        # which are not required for using pretrained model
        if any(n in ["adam_v", "adam_m", "global_step"] for n in name):
            logger.info("Skipping {}".format("/".join(name)))
            continue

        # Classifier: prefix the albert_zh output variables so the code below can bind them to the matching model attribute weights
        if len(name) == 1 and ("output_bias" in name or "output_weights" in name):
            name = ["classifier"] + name

        pointer = model
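The renaming step in the excerpt above can be tried in isolation. The sketch below is plain Python with no TensorFlow or PyTorch dependency. The helper name `remap_variable_name` and the sample variable names are illustrative assumptions, and it assumes that the rest of the loader (not shown in the thread) resolves the returned path parts against model attributes.

```python
SKIPPED = {"adam_v", "adam_m", "global_step"}
CLASSIFIER_VARS = {"output_bias", "output_weights"}


def remap_variable_name(tf_name):
    """Split a TF variable name into path parts, or return None if it is skipped."""
    parts = tf_name.split("/")
    # Optimizer slots and the step counter are not needed for inference.
    if any(part in SKIPPED for part in parts):
        return None
    # Top-level classifier variables get a "classifier" prefix.
    if len(parts) == 1 and parts[0] in CLASSIFIER_VARS:
        parts = ["classifier"] + parts
    return parts


print(remap_variable_name("output_bias"))  # ['classifier', 'output_bias']
print(remap_variable_name("bert/embeddings/LayerNorm/adam_m"))  # None
print(remap_variable_name("bert/pooler/dense/kernel"))  # unchanged path parts
```

Only single-segment names are prefixed, so variables that already live under a scope are left as they are.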

Then load the model:

from transformers.modeling_albert import AlbertForSequenceClassification
from transformers.tokenization_bert import BertTokenizer
from transformers.configuration_bert import BertConfig
import torch

news_categories = [
    "other",
    "drawing_name",
    "drawing_number",
]
idx2cate = {i: item for i, item in enumerate(news_categories)}

config = BertConfig.from_pretrained(
    "/workspaces/ai-serving-solution/deploy/ai_recognition/analysis/albert/multiclass_output/signature1.1/",
    num_labels=len(news_categories),
)
tokenizer = BertTokenizer.from_pretrained(
    "/workspaces/ai-serving-solution/deploy/ai_recognition/analysis/albert/multiclass_output/signature1.1/",
    padding=True,
)
model = AlbertForSequenceClassification.from_pretrained(
    "/workspaces/ai-serving-solution/deploy/ai_recognition/analysis/albert/multiclass_output/signature1.1/",
    from_tf=True,
    config=config,
)
pytorch_dump_path = "/workspaces/ai-serving-solution/deploy/ai_recognition/analysis/albert/multiclass_output/signature1.1/pytorch_model.bin"
print("Save PyTorch model to {}".format(pytorch_dump_path))
torch.save(model.state_dict(), pytorch_dump_path)
token_codes = tokenizer.encode("主体结构中板梁配筋图", max_length=24)
input_ids = torch.tensor(token_codes).unsqueeze(0)  # Batch size 1
outputs = model(input_ids)
# get output probabilities by doing softmax
probs = outputs[0].softmax(1)
# executing argmax function to get the candidate label index
label_index = probs.argmax(dim=1)[0].tolist()
# get the label name
label = idx2cate[label_index]
# get the label probability
proba = probs.tolist()[0][label_index]
print({"label": label, "proba": proba})
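The post-processing at the end of the script (softmax, argmax, label lookup) can be checked without a model. The sketch below does the same three steps in plain Python. The logits and the label names are made-up placeholders, and `predict_label` is a hypothetical helper, not part of transformers.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_label(logits, labels):
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    index = max(range(len(probs)), key=probs.__getitem__)
    return {"label": labels[index], "proba": probs[index]}


# Placeholder labels and logits, standing in for one row of outputs[0].
labels = ["class_a", "class_b", "class_c"]
result = predict_label([0.1, 2.3, -1.0], labels)
print(result)
```

Subtracting the maximum logit before exponentiating does not change the result but avoids overflow for large logits.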

Finally, I recommend simply using the pretrained ALBERT models already available through huggingface transformers; the steps above are too cumbersome.

Gamelife311 · 2023-07-10T09:08:12Z

Thank you!!
