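skywork.cpp | A C/C++ implementation of the Skywork (天工) large language model

This project is for personal research and is still a work in progress.
It is a runtime for the Skywork model written in C/C++ on top of llama.cpp, and it can run the model directly on a CPU. It is aimed at Apple Silicon Mac laptops and at anyone who does not have a GPU at hand.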

Usage

Clone this repository

git clone https://github.com/yxq321/skywork.cpp
cd skywork.cpp

Install dependencies and build

On Linux or macOS, run make:

make

Download the model from Hugging Face

Install the dependency:

pip3 install huggingface_hub

Run download-skywork.py to download the Skywork model. By default the files are stored under ~/.cache/huggingface/hub/

skywork.cpp# python3 download-skywork.py
Downloading (…)bcec297346/README.md: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████| 21.8k/21.8k [00:00<00:00, 110MB/s]
Downloading (…)5%8D%8F%E8%AE%AE.pdf: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 266k/266k [00:00<00:00, 32.3MB/s]
Downloading (…)iguration_skywork.py: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████| 3.12k/3.12k [00:00<00:00, 22.2MB/s]
Downloading (…)neration_config.json: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 182/182 [00:00<00:00, 1.79MB/s]
Downloading (…)ec297346/config.json: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 733/733 [00:00<00:00, 6.77MB/s]
Downloading (…)sc/skywork_logo.jpeg: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████| 78.9k/78.9k [00:00<00:00, 55.9MB/s]
Downloading chat_demo_1.gif: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2.37M/2.37M [00:00<00:00, 205MB/s]
Downloading chat_demo_2.gif: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 108k/108k [00:00<00:00, 46.2MB/s]
Downloading (…)isc/skywork_icon.png: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████| 8.10k/8.10k [00:00<00:00, 48.7MB/s]
Downloading (…)97346/.gitattributes: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████| 5.46k/5.46k [00:00<00:00, 39.0MB/s]
Downloading chat_demo_3.gif: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 556k/556k [00:00<00:00, 51.1MB/s]
Downloading (…)c/stage1_metrics.png: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 270k/270k [00:00<00:00, 87.1MB/s]
Downloading (…)isc/stage2_ceval.png: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 128k/128k [00:00<00:00, 66.0MB/s]
Downloading (…)sc/training_loss.png: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████| 30.7k/30.7k [00:00<00:00, 52.7MB/s]
...
Downloading (…)okenizer_config.json: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 857/857 [00:00<00:00, 5.30MB/s]
Downloading (…)l-00049-of-00053.bin: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 510M/510M [00:02<00:00, 235MB/s]
Downloading (…)l-00051-of-00053.bin: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 510M/510M [00:02<00:00, 217MB/s]
Downloading (…)l-00046-of-00053.bin: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 510M/510M [00:05<00:00, 88.1MB/s]
Downloading (…)l-00050-of-00053.bin: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 510M/510M [00:04<00:00, 111MB/s]
Downloading (…)l-00052-of-00053.bin: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 510M/510M [00:04<00:00, 116MB/s]
Downloading (…)l-00053-of-00053.bin: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.21G/1.21G [00:05<00:00, 203MB/s]
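
The conversion step needs the full path of the downloaded snapshot, which ends in a commit hash that is awkward to type by hand. The following is a minimal sketch for locating it, assuming the standard hub cache layout `models--<org>--<name>/snapshots/<commit hash>/`. The helper name `find_snapshot_dir` is hypothetical and not part of this repository, and the repo ID `Skywork/Skywork-13B-Base` is inferred from the cache path shown in the conversion command.

```python
from pathlib import Path
from typing import Optional


def find_snapshot_dir(repo_id: str, cache_root: Optional[Path] = None) -> Path:
    """Return the most recently modified snapshot directory for repo_id.

    Hypothetical helper: assumes the hub cache layout
    <cache_root>/models--<org>--<name>/snapshots/<commit hash>/
    """
    if cache_root is None:
        cache_root = Path.home() / ".cache" / "huggingface" / "hub"
    repo_dir = cache_root / ("models--" + repo_id.replace("/", "--"))
    snap_root = repo_dir / "snapshots"
    if not snap_root.is_dir():
        raise FileNotFoundError(f"no snapshots directory under {repo_dir}")
    snapshots = [p for p in snap_root.iterdir() if p.is_dir()]
    if not snapshots:
        raise FileNotFoundError(f"no snapshot found under {snap_root}")
    # If several revisions were downloaded, pick the newest one.
    return max(snapshots, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    # Repo ID inferred from the cache path; adjust if yours differs.
    print(find_snapshot_dir("Skywork/Skywork-13B-Base"))
```

The printed path can be passed as the first argument to convert-skywork-hf-to-gguf.py.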

Convert to GGUF format

Run convert-skywork-hf-to-gguf.py as shown below, replacing the snapshot path with the actual path on your machine:

skywork.cpp# python3 convert-skywork-hf-to-gguf.py  ~/.cache/huggingface/hub/models--Skywork--Skywork-13B-Base/snapshots/2f15ad62f302f9e0015ec941dd4eeabcec297346/ 1 --outfile=skywork-f16.gguf
gguf: Conversion Endianess 0
gguf: loading model 2f15ad62f302f9e0015ec941dd4eeabcec297346
hello print:  SkyworkForCausalLM
gguf: found 53 model parts
num_parts:53

This gguf file is for Little Endian only
gguf: get model metadata
gguf: get tokenizer metadata
gguf: get sentencepiece tokenizer vocab, scores and token types
gguf: Setting special token type bos to 1
gguf: Setting special token type eos to 2
gguf: Setting special token type pad to 0
gguf: get tensor metadata
gguf: loading model part 'pytorch_model-00001-of-00053.bin'
model.layers.0.input_layernorm.weight -> blk.0.attn_norm.weight, n_dims = 1, torch.bfloat16 --> float32
model.layers.0.post_attention_layernorm.weight -> blk.0.ffn_norm.weight, n_dims = 1, torch.bfloat16 --> float32
model.layers.0.self_attn.q_proj.weight -> blk.0.attn_q.weight, n_dims = 2, torch.bfloat16 --> float16
model.layers.0.self_attn.k_proj.weight -> blk.0.attn_k.weight, n_dims = 2, torch.bfloat16 --> float16
model.layers.0.self_attn.v_proj.weight -> blk.0.attn_v.weight, n_dims = 2, torch.bfloat16 --> float16
model.layers.0.self_attn.o_proj.weight -> blk.0.attn_output.weight, n_dims = 2, torch.bfloat16 --> float16
...
model.layers.51.self_attn.k_proj.weight -> blk.51.attn_k.weight, n_dims = 2, torch.bfloat16 --> float16
model.layers.51.self_attn.v_proj.weight -> blk.51.attn_v.weight, n_dims = 2, torch.bfloat16 --> float16
model.layers.51.self_attn.o_proj.weight -> blk.51.attn_output.weight, n_dims = 2, torch.bfloat16 --> float16
model.layers.51.mlp.gate_proj.weight -> blk.51.ffn_gate.weight, n_dims = 2, torch.bfloat16 --> float16
model.layers.51.mlp.up_proj.weight -> blk.51.ffn_up.weight, n_dims = 2, torch.bfloat16 --> float16
model.layers.51.mlp.down_proj.weight -> blk.51.ffn_down.weight, n_dims = 2, torch.bfloat16 --> float16
gguf: loading model part 'pytorch_model-00053-of-00053.bin'
model.norm.weight -> output_norm.weight, n_dims = 1, torch.bfloat16 --> float32
model.embed_tokens.weight -> token_embd.weight, n_dims = 2, torch.bfloat16 --> float16
lm_head.weight -> output.weight, n_dims = 2, torch.bfloat16 --> float16
gguf: write header
gguf: write metadata
gguf: write tensors
gguf: model successfully exported to 'skywork-f16.gguf'
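
The log above shows two things happening to every tensor: its Hugging Face name is rewritten to a GGUF name, and its dtype is changed (1-D tensors become float32, 2-D tensors become float16). The sketch below reproduces only that renaming and dtype rule, using the pairs visible in the log. It is an illustration, not the code of convert-skywork-hf-to-gguf.py; the function names are hypothetical.

```python
import re

# Per-layer name pairs, copied from the conversion log.
LAYER_MAP = {
    "input_layernorm": "attn_norm",
    "post_attention_layernorm": "ffn_norm",
    "self_attn.q_proj": "attn_q",
    "self_attn.k_proj": "attn_k",
    "self_attn.v_proj": "attn_v",
    "self_attn.o_proj": "attn_output",
    "mlp.gate_proj": "ffn_gate",
    "mlp.up_proj": "ffn_up",
    "mlp.down_proj": "ffn_down",
}

# Tensors that are not part of a numbered layer.
TOP_LEVEL_MAP = {
    "model.norm": "output_norm",
    "model.embed_tokens": "token_embd",
    "lm_head": "output",
}

_LAYER_RE = re.compile(r"^model\.layers\.(\d+)\.(.+)\.weight$")


def map_tensor_name(hf_name: str) -> str:
    """Translate a Hugging Face tensor name to the GGUF name seen in the log."""
    match = _LAYER_RE.match(hf_name)
    if match:
        layer, suffix = match.group(1), match.group(2)
        return f"blk.{layer}.{LAYER_MAP[suffix]}.weight"
    if not hf_name.endswith(".weight"):
        raise KeyError(hf_name)
    stem = hf_name[: -len(".weight")]
    return f"{TOP_LEVEL_MAP[stem]}.weight"


def target_dtype(n_dims: int) -> str:
    """1-D tensors (norm weights) stay in float32; 2-D matrices go to float16."""
    return "float32" if n_dims == 1 else "float16"


if __name__ == "__main__":
    name = "model.layers.0.self_attn.q_proj.weight"
    print(name, "->", map_tensor_name(name), target_dtype(2))
```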

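(Optional) Quantization

If your machine does not have enough memory, you can quantize the converted model further, from 16-bit to 4-bit, as follows: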

skywork.cpp# ./quantize skywork-f16.gguf skywork-q4_0.gguf q4_0
main: build = 1498 (e980088)
main: built with cc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609 for x86_64-linux-gnu
main: quantizing 'skywork-f16.gguf' to 'skywork-q4_0.gguf' as Q4_0
llama_model_loader: loaded meta data with 18 key-value pairs and 471 tensors from skywork-f16.gguf (version GGUF V3 (latest))
llama_model_loader: - tensor    0:           blk.0.attn_norm.weight f32      [  4608,     1,     1,     1 ]
llama_model_loader: - tensor    1:            blk.0.ffn_norm.weight f32      [  4608,     1,     1,     1 ]
llama_model_loader: - tensor    2:              blk.0.attn_q.weight f16      [  4608,  4608,     1,     1 ]
llama_model_loader: - tensor    3:              blk.0.attn_k.weight f16      [  4608,  4608,     1,     1 ]
llama_model_loader: - tensor    4:              blk.0.attn_v.weight f16      [  4608,  4608,     1,     1 ]
llama_model_loader: - tensor    5:         blk.0.attn_output.weight f16      [  4608,  4608,     1,     1 ]
llama_model_loader: - tensor    6:            blk.0.ffn_gate.weight f16      [  4608, 12288,     1,     1 ]
[ 468/ 471]               blk.51.ffn_down.weight - [12288,  4608,     1,     1], type =    f16, quantizing to q4_0 .. size =   108.00 MB ->    30.38 MB | hist: 0.036 0.015 0.024 0.037 0.055 0.076 0.097 0.115 0.122 0.115 0.097 0.076 0.055 0.037 0.024 0.020
[ 469/ 471]                   output_norm.weight - [ 4608,     1,     1,     1], type =    f32, size =    0.018 MB
[ 470/ 471]                    token_embd.weight - [ 4608, 65519,     1,     1], type =    f16, quantizing to q4_0 .. size =   575.85 MB ->   161.96 MB | hist: 0.036 0.015 0.025 0.038 0.056 0.077 0.097 0.112 0.118 0.112 0.097 0.077 0.056 0.038 0.025 0.021
[ 471/ 471]                        output.weight - [ 4608, 65519,     1,     1], type =    f16, quantizing to q6_K .. size =   575.85 MB ->   236.19 MB | hist:
llama_model_quantize_internal: model size  = 26425.55 MB
llama_model_quantize_internal: quant size  =  7507.74 MB
llama_model_quantize_internal: hist: 0.036 0.016 0.025 0.039 0.056 0.077 0.096 0.112 0.118 0.112 0.096 0.077 0.056 0.039 0.025 0.021

main: quantize time = 43989.90 ms
main:    total time = 43989.90 ms
skywork.cpp# ls -lh *.gguf
-rw-r--r-- 1 root root  26G Nov  8 07:23 skywork-f16.gguf
-rw-r--r-- 1 root root 7.4G Nov  8 08:05 skywork-q4_0.gguf
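
The per-tensor sizes in the quantize log can be checked by hand. F16 stores 2 bytes per weight. Q4_0 packs 32 weights into an 18-byte block (one 2-byte fp16 scale plus 16 bytes of 4-bit values), which works out to 4.5 bits per weight. The sketch below applies that arithmetic to two tensor shapes taken from the log; the function names are made up for illustration. The log's "MB" figures are computed with 1024 * 1024 bytes.

```python
MIB = 1024 * 1024  # the log's "MB" figures use this divisor


def f16_bytes(n_weights: int) -> int:
    """F16 stores each weight in 2 bytes."""
    return n_weights * 2


def q4_0_bytes(n_weights: int) -> int:
    """Q4_0: 32 weights per block, 18 bytes per block (fp16 scale + 16 nibble bytes)."""
    if n_weights % 32 != 0:
        raise ValueError("Q4_0 needs a multiple of 32 weights")
    return n_weights // 32 * 18


if __name__ == "__main__":
    # Shapes copied from the quantize log.
    shapes = {
        "blk.51.ffn_down.weight": (12288, 4608),
        "token_embd.weight": (4608, 65519),
    }
    for name, (rows, cols) in shapes.items():
        n = rows * cols
        print(f"{name}: {f16_bytes(n) / MIB:.2f} MB -> {q4_0_bytes(n) / MIB:.2f} MB")
```

The whole-file ratio (26425.55 MB to 7507.74 MB) is slightly lower than 16 / 4.5 because, as the log shows, the norm weights stay in f32 and output.weight is quantized to q6_K rather than q4_0.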

Run

skywork.cpp# ./main -m skywork-f16.gguf -p "陕西的省会是西安"
Log start
main: build = 1498 (e980088)
main: built with cc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609 for x86_64-linux-gnu
main: seed  = 1699428993
llama_model_loader: loaded meta data with 18 key-value pairs and 471 tensors from skywork-f16.gguf (version GGUF V3 (latest))
llama_model_loader: - tensor    0:           blk.0.attn_norm.weight f32      [  4608,     1,     1,     1 ]
llama_model_loader: - tensor    1:            blk.0.ffn_norm.weight f32      [  4608,     1,     1,     1 ]
llama_model_loader: - tensor    2:              blk.0.attn_q.weight f16      [  4608,  4608,     1,     1 ]
llama_model_loader: - tensor    3:              blk.0.attn_k.weight f16      [  4608,  4608,     1,     1 ]
llama_model_loader: - tensor    4:              blk.0.attn_v.weight f16      [  4608,  4608,     1,     1 ]
...
llama_model_loader: - type  f32:  105 tensors
llama_model_loader: - type  f16:  366 tensors
llm_load_vocab: mismatch in special tokens definition ( 1847/65519 vs 259/65519 ).
llm_load_print_meta: format           = GGUF V3 (latest)
llm_load_print_meta: arch             = skywork
llm_load_print_meta: vocab type       = SPM
llm_load_print_meta: n_vocab          = 65519
llm_load_print_meta: n_merges         = 0
llm_load_print_meta: n_ctx_train      = 131072
llm_load_print_meta: n_embd           = 4608
llm_load_print_meta: n_head           = 36
llm_load_print_meta: n_head_kv        = 36
llm_load_print_meta: n_layer          = 52
llm_load_print_meta: n_rot            = 128
llm_load_print_meta: n_gqa            = 1
llm_load_print_meta: f_norm_eps       = 0.0e+00
llm_load_print_meta: f_norm_rms_eps   = 1.0e-06
llm_load_print_meta: f_clamp_kqv      = 0.0e+00
llm_load_print_meta: f_max_alibi_bias = 0.0e+00
llm_load_print_meta: n_ff             = 12288
llm_load_print_meta: rope scaling     = linear
llm_load_print_meta: freq_base_train  = 10000.0
llm_load_print_meta: freq_scale_train = 1
llm_load_print_meta: n_yarn_orig_ctx  = 131072
llm_load_print_meta: rope_finetuned   = unknown
llm_load_print_meta: model type       = 13B
llm_load_print_meta: model ftype      = mostly F16 (guessed)
llm_load_print_meta: model params     = 13.85 B
llm_load_print_meta: model size       = 25.81 GiB (16.00 BPW)
llm_load_print_meta: general.name   = 2f15ad62f302f9e0015ec941dd4eeabcec297346
llm_load_print_meta: BOS token = 1 '<s>'
llm_load_print_meta: EOS token = 2 '</s>'
llm_load_print_meta: UNK token = 0 '<unk>'
llm_load_print_meta: PAD token = 0 '<unk>'
llm_load_print_meta: LF token  = 13 '<0x0A>'
llm_load_tensors: ggml ctx size =    0.17 MB
llm_load_tensors: mem required  = 26425.72 MB
.................................................................................................
llama_new_context_with_model: n_ctx      = 512
llama_new_context_with_model: freq_base  = 10000.0
llama_new_context_with_model: freq_scale = 1
llama_new_context_with_model: kv self size  =  468.00 MB
llama_build_graph: non-view tensors processed: 1096/1096
llama_new_context_with_model: compute buffer total size = 143.60 MB

system_info: n_threads = 20 / 40 | AVX = 1 | AVX2 = 1 | AVX512 = 1 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 |
sampling:
        repeat_last_n = 64, repeat_penalty = 1.100, frequency_penalty = 0.000, presence_penalty = 0.000
        top_k = 40, tfs_z = 1.000, top_p = 0.950, min_p = 0.050, typical_p = 1.000, temp = 0.800
        mirostat = 0, mirostat_lr = 0.100, mirostat_ent = 5.000
generate: n_ctx = 512, n_batch = 512, n_predict = -1, n_keep = 0


陕西的省会是西安，在古代就是著名的长安城，位于现在的关中地区。西安市，也被誉为十三朝古都。西安，秦始皇兵马俑坑，是世界八大奇迹之一。
所以，秦朝时期是我国历史上第一个大一统封建王朝。西北的首都所在。秦朝、唐朝、宋朝等都属于中原文化区。

公元前221年，唐太宗李世民的长安城，唐长安是当时世界上最大的城市，它是唐朝都城长安是当时亚洲首屈一指的商业中心，拥有人口最多最繁华的国际贸易都市，是丝绸之路起点！
西安人吃“十三绝”
唐代长安城的中轴线朱雀大街就是今天的解放路。【西安钟楼、西安城墙、西安交通枢纽、西安钟楼、西安钟楼、西安钟楼，以西安古为背景，这个地方应该有很多历史遗迹、文化古迹和博物馆，是西安最繁华地段，是游客最多的地方
所以在晚上最热闹了，现在已经不是旅游景点，但依然是西安最古老的一条街。
中山路，西边就是以前的街道，东边是城墙，这里非常适合散步，而到回民区！
陕西省博物馆是一定要去看看。西安这座城市的标志性建筑之一。2013年才开放，在历史上有着十分重要的地位，有很多人在那里的地方，是由很多的，而且还有很多的景点都可以参观，
这里有非常多的人文景观，很值得来打卡拍照呢。
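
When planning memory, the "kv self size" line in the log above can be reproduced from the model metadata. The sketch below assumes the KV cache stores one key vector and one value vector of n_embd elements per layer per context position, in 16-bit floats; that assumption is consistent with n_head_kv = n_head in the log, and it reproduces the logged 468.00 MB. The function name is hypothetical.

```python
MIB = 1024 * 1024


def kv_cache_bytes(n_ctx: int, n_layer: int, n_embd: int, bytes_per_elem: int = 2) -> int:
    """Estimate KV cache size: keys plus values, per layer, per context position.

    Assumes an f16 cache (2 bytes per element) and no grouped-query attention.
    """
    return 2 * n_ctx * n_layer * n_embd * bytes_per_elem


if __name__ == "__main__":
    # n_layer and n_embd are taken from the llm_load_print_meta lines.
    for n_ctx in (512, 4096):
        size = kv_cache_bytes(n_ctx=n_ctx, n_layer=52, n_embd=4608)
        print(f"n_ctx = {n_ctx}: {size / MIB:.2f} MB")
```

The cache grows linearly with n_ctx, so a longer context adds to the roughly 26 GB (F16) or 7.4 GB (Q4_0) needed for the weights themselves.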

TODO

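The output quality is not very good, as the sample above shows. The cause may be a problem in the model conversion or an incorrect prompt setup. To be investigated when time permits.
Submit the changes upstream to llama.cpp so that the maintainers can support the model?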
