deepseek-coder模型量化

2024-03-17 07:04
文章标签 模型 量化 coder deepseek

本文主要是介绍deepseek-coder模型量化,希望对大家解决编程问题提供一定的参考价值,需要的开发者们随着小编来一起学习吧!

1 简介

DeepSeek-Coder在多种编程语言和各种基准测试中取得了开源代码模型中最先进的性能。

为尝试在开发板进行部署,首先利用llama.cpp对其进行量化。

2 llama.cpp安装

git clone之后进入文件夹make即可,再将依赖补全pip install -r requirements.txt

3 量化

按照GitHub上DeepSeek和llama.cpp官方的信息,后者对deepseek模型的量化目前的支持(进度)还不是很完善。
下面记录一下目前量化出现的问题。

3.1 DeepSeek官方tutorial

依照官方md

git clone https://github.com/DOGEwbx/llama.cpp.git
cd llama.cpp
git checkout regex_gpt2_preprocess

出现error: pathspec 'regex_gpt2_preprocess' did not match any file(s) known to git


# set up the environment according to README
make
python3 -m pip install -r requirements.txt
# generate GGUF model
python convert-hf-to-gguf.py <MODEL_PATH> --outfile <GGUF_PATH> --model-name deepseekcoder

出现convert-hf-to-gguf.py: error: unrecognized arguments: --model-name deepseekcoder

去掉--model-name参数,出现NotImplementedError: Architecture 'LlamaForCausalLM' not supported!,解释。


3.2 convert.py转换

参考这个comment和这个comment,使用convert.py进行转换。
看起来这个修改已经被合并了,浅浅试一下。

python convert.py <MODEL_PATH> --outfile <GGUF_PATH>

出现错误: Exception: Vocab size mismatch (model has 32256, but ../DeepSeek-Coder/models/deepseek-coder-1.3b-instruct has 32022). Add the --pad-vocab option and try again.

详细的log如下

Loading model file ../DeepSeek-Coder/models/deepseek-coder-1.3b-instruct/model.safetensors
params = Params(n_vocab=32256, n_embd=2048, n_layer=24, n_ctx=16384, n_ff=5504, n_head=16, n_head_kv=16, n_experts=None, n_experts_used=None, f_norm_eps=1e-06, rope_scaling_type=<RopeScalingType.LINEAR: 'linear'>, f_rope_freq_base=100000, f_rope_scale=4.0, n_orig_ctx=None, rope_finetuned=None, ftype=None, path_model=PosixPath('../DeepSeek-Coder/models/deepseek-coder-1.3b-instruct'))
Found vocab files: {'spm': None, 'bpe': None, 'hfft': PosixPath('../DeepSeek-Coder/models/deepseek-coder-1.3b-instruct/tokenizer.json')}
Loading vocab file PosixPath('../DeepSeek-Coder/models/deepseek-coder-1.3b-instruct/tokenizer.json'), type 'hfft'
fname_tokenizer: ../DeepSeek-Coder/models/deepseek-coder-1.3b-instruct
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Vocab info: <HfVocab with 32000 base tokens and 22 added tokens>
Special vocab info: <SpecialVocab with 0 merges, special tokens {'bos': 32013, 'eos': 32021, 'pad': 32014}, add special tokens {'bos': True, 'eos': False}>
Permuting layer 0
Permuting layer 1
Permuting layer 2
...省略部分
Permuting layer 22
Permuting layer 23
lm_head.weight                                   -> output.weight                            | BF16   | [32256, 2048]
model.embed_tokens.weight                        -> token_embd.weight                        | BF16   | [32256, 2048]
model.layers.0.input_layernorm.weight            -> blk.0.attn_norm.weight                   | BF16   | [2048]
model.layers.0.mlp.down_proj.weight              -> blk.0.ffn_down.weight                    | BF16   | [2048, 5504]
model.layers.0.mlp.gate_proj.weight              -> blk.0.ffn_gate.weight                    | BF16   | [5504, 2048]
...
model.layers.18.self_attn.v_proj.weight          -> blk.18.attn_v.weight                     | BF16   | [2048, 2048]
model.layers.19.input_layernorm.weight           -> blk.19.attn_norm.weight                  | BF16   | [2048]
...
model.layers.9.input_layernorm.weight            -> blk.9.attn_norm.weight                   | BF16   | [2048]
model.layers.9.mlp.down_proj.weight              -> blk.9.ffn_down.weight                    | BF16   | [2048, 5504]
model.layers.9.mlp.gate_proj.weight              -> blk.9.ffn_gate.weight                    | BF16   | [5504, 2048]
model.layers.9.mlp.up_proj.weight                -> blk.9.ffn_up.weight                      | BF16   | [5504, 2048]
model.layers.9.post_attention_layernorm.weight   -> blk.9.ffn_norm.weight                    | BF16   | [2048]
model.layers.9.self_attn.k_proj.weight           -> blk.9.attn_k.weight                      | BF16   | [2048, 2048]
model.layers.9.self_attn.o_proj.weight           -> blk.9.attn_output.weight                 | BF16   | [2048, 2048]
model.layers.9.self_attn.q_proj.weight           -> blk.9.attn_q.weight                      | BF16   | [2048, 2048]
model.layers.9.self_attn.v_proj.weight           -> blk.9.attn_v.weight                      | BF16   | [2048, 2048]
model.norm.weight                                -> output_norm.weight                       | BF16   | [2048]
Writing ../DeepSeek-Coder/models/1.3b.gguf, format 1
Traceback (most recent call last):File "/home/stlinpeiyang/lpy22/LLM/llama.cpp/convert.py", line 1479, in <module>main()File "/home/stlinpeiyang/lpy22/LLM/llama.cpp/convert.py", line 1473, in mainOutputFile.write_all(outfile, ftype, params, model, vocab, special_vocab,File "/home/stlinpeiyang/lpy22/LLM/llama.cpp/convert.py", line 1117, in write_allcheck_vocab_size(params, vocab, pad_vocab=pad_vocab)File "/home/stlinpeiyang/lpy22/LLM/llama.cpp/convert.py", line 963, in check_vocab_sizeraise Exception(msg)
Exception: Vocab size mismatch (model has 32256, but ../DeepSeek-Coder/models/deepseek-coder-1.3b-instruct has 32022). Add the --pad-vocab option and try again.

3.2.1 添加--pad-vocab

首先,显然提示添加参数,根据提示加上--pad-vocab参数后,成功运行并可以成功量化,但是在测试时,会出现以下错误

terminate called after throwing an instance of 'std::out_of_range'what():  _Map_base::at
Aborted (core dumped)

这种情况有相关的issue comment&这个。

llama.cpp的pull request和issue来看,应该是还没处理好。菜鸡只能嗷嗷待哺了
😥。不知道TheBloke大佬是怎么处理的👍。
(表情网站)


3.2.2 修改vocab_size

其次,根据错误的前半段的model has 32256, but ... has 32022,有类似的issue.
根据comment,对vocal_size进行修改。相应地,打开deepseek-coder-1.3b-instruct中的config.json文件,试将"vocab_size": 32256修改为"vocal_size": 32022。再次运行

python convert.py <MODEL_PATH> --outfile <GGUF_PATH>

输出的log如下

Loading model file ../DeepSeek-Coder/models/deepseek-coder-1.3b-instruct/model.safetensors
params = Params(n_vocab=32022, n_embd=2048, n_layer=24, n_ctx=16384, n_ff=5504, n_head=16, n_head_kv=16, n_experts=None, n_experts_used=None, f_norm_eps=1e-06, rope_scaling_type=<RopeScalingType.LINEAR: 'linear'>, f_rope_freq_base=100000, f_rope_scale=4.0, n_orig_ctx=None, rope_finetuned=None, ftype=None, path_model=PosixPath('../DeepSeek-Coder/models/deepseek-coder-1.3b-instruct'))
Found vocab files: {'spm': None, 'bpe': None, 'hfft': PosixPath('../DeepSeek-Coder/models/deepseek-coder-1.3b-instruct/tokenizer.json')}
Loading vocab file PosixPath('../DeepSeek-Coder/models/deepseek-coder-1.3b-instruct/tokenizer.json'), type 'hfft'
fname_tokenizer: ../DeepSeek-Coder/models/deepseek-coder-1.3b-instruct
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Vocab info: <HfVocab with 32000 base tokens and 22 added tokens>
Special vocab info: <SpecialVocab with 0 merges, special tokens {'bos': 32013, 'eos': 32021, 'pad': 32014}, add special tokens {'bos': True, 'eos': False}>
Permuting layer 0
Permuting layer 1
Permuting layer 2
...省略部分
lm_head.weight                                   -> output.weight                            | BF16   | [32256, 2048]
model.embed_tokens.weight                        -> token_embd.weight                        | BF16   | [32256, 2048]
model.layers.0.input_layernorm.weight            -> blk.0.attn_norm.weight                   | BF16   | [2048]
model.layers.0.mlp.down_proj.weight              -> blk.0.ffn_down.weight                    | BF16   | [2048, 5504]
model.layers.0.mlp.gate_proj.weight              -> blk.0.ffn_gate.weight                    | BF16   | [5504, 2048]
model.layers.0.mlp.up_proj.weight                -> blk.0.ffn_up.weight                      | BF16   | [5504, 2048]
model.layers.0.post_attention_layernorm.weight   -> blk.0.ffn_norm.weight                    | BF16   | [2048]
model.layers.0.self_attn.k_proj.weight           -> blk.0.attn_k.weight                      | BF16   | [2048, 2048]
model.layers.0.self_attn.o_proj.weight           -> blk.0.attn_output.weight                 | BF16   | [2048, 2048]
model.layers.0.self_attn.q_proj.weight           -> blk.0.attn_q.weight                      | BF16   | [2048, 2048]
model.layers.0.self_attn.v_proj.weight           -> blk.0.attn_v.weight     
...省略部分
model.layers.9.self_attn.q_proj.weight           -> blk.9.attn_q.weight                      | BF16   | [2048, 2048]
model.layers.9.self_attn.v_proj.weight           -> blk.9.attn_v.weight                      | BF16   | [2048, 2048]
model.norm.weight                                -> output_norm.weight                       | BF16   | [2048]
Writing ../DeepSeek-Coder/models/1.3b.gguf, format 1
Ignoring added_tokens.json since model matches vocab size without it.
gguf: This GGUF file is for Little Endian only
gguf: Setting special token type bos to 32013
gguf: Setting special token type eos to 32021
gguf: Setting special token type pad to 32014
gguf: Setting add_bos_token to True
gguf: Setting add_eos_token to False
gguf: Setting chat_template to {% if not add_generation_prompt is defined %}
{% set add_generation_prompt = false %}
{% endif %}
{%- set ns = namespace(found=false) -%}
{%- for message in messages -%}{%- if message['role'] == 'system' -%}{%- set ns.found = true -%}{%- endif -%}
{%- endfor -%}
{{bos_token}}{%- if not ns.found -%}
{{'You are an AI programming assistant, utilizing the Deepseek Coder model, developed by Deepseek Company, and you only answer questions related to computer science. For politically sensitive questions, security and privacy issues, and other non-computer science questions, you will refuse to answer\n'}}
{%- endif %}
{%- for message in messages %}{%- if message['role'] == 'system' %}
{{ message['content'] }}{%- else %}{%- if message['role'] == 'user' %}
{{'### Instruction:\n' + message['content'] + '\n'}}{%- else %}
{{'### Response:\n' + message['content'] + '\n<|EOT|>\n'}}{%- endif %}{%- endif %}
{%- endfor %}
{% if add_generation_prompt %}
{{'### Response:'}}
{% endif %}
[  1/219] Writing tensor output.weight                          | size  32256 x   2048  | type F16  | T+   0
[  2/219] Writing tensor token_embd.weight                      | size  32256 x   2048  | type F16  | T+   0
...省略部分
[216/219] Writing tensor blk.9.attn_output.weight               | size   2048 x   2048  | type F16  | T+   2
[217/219] Writing tensor blk.9.attn_q.weight                    | size   2048 x   2048  | type F16  | T+   2
[218/219] Writing tensor blk.9.attn_v.weight                    | size   2048 x   2048  | type F16  | T+   2
[219/219] Writing tensor output_norm.weight                     | size   2048           | type F32  | T+   2
Wrote ../DeepSeek-Coder/models/1.3b.gguf

成功生成gguf文件。下一步进行量化

./quantize ${out_model.gguf} ${out_model-q5_0.gguf} q5_0

输出log如下

main: build = 1 (231ae28)
main: built with cc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0 for x86_64-linux-gnu
main: quantizing '../DeepSeek-Coder/models/1.3b.gguf' to '../DeepSeek-Coder/models/1.3b-q5_0.gguf' as Q5_0
llama_model_loader: loaded meta data with 24 key-value pairs and 219 tensors from ../DeepSeek-Coder/models/1.3b.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = llama
llama_model_loader: - kv   1:                               general.name str              = models
llama_model_loader: - kv   2:                       llama.context_length u32              = 16384
llama_model_loader: - kv   3:                     llama.embedding_length u32              = 2048
llama_model_loader: - kv   4:                          llama.block_count u32              = 24
llama_model_loader: - kv   5:                  llama.feed_forward_length u32              = 5504
llama_model_loader: - kv   6:                 llama.rope.dimension_count u32              = 128
llama_model_loader: - kv   7:                 llama.attention.head_count u32              = 16
llama_model_loader: - kv   8:              llama.attention.head_count_kv u32              = 16
llama_model_loader: - kv   9:     llama.attention.layer_norm_rms_epsilon f32              = 0.000001
llama_model_loader: - kv  10:                       llama.rope.freq_base f32              = 100000.000000
llama_model_loader: - kv  11:                    llama.rope.scaling.type str              = linear
llama_model_loader: - kv  12:                  llama.rope.scaling.factor f32              = 4.000000
llama_model_loader: - kv  13:                          general.file_type u32              = 1
llama_model_loader: - kv  14:                       tokenizer.ggml.model str              = llama
llama_model_loader: - kv  15:                      tokenizer.ggml.tokens arr[str,32022]   = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv  16:                      tokenizer.ggml.scores arr[f32,32022]   = [-1000.000000, -1000.000000, -1000.00...
llama_model_loader: - kv  17:                  tokenizer.ggml.token_type arr[i32,32022]   = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv  18:                tokenizer.ggml.bos_token_id u32              = 32013
llama_model_loader: - kv  19:                tokenizer.ggml.eos_token_id u32              = 32021
llama_model_loader: - kv  20:            tokenizer.ggml.padding_token_id u32              = 32014
llama_model_loader: - kv  21:               tokenizer.ggml.add_bos_token bool             = true
llama_model_loader: - kv  22:               tokenizer.ggml.add_eos_token bool             = false
llama_model_loader: - kv  23:                    tokenizer.chat_template str              = {% if not add_generation_prompt is de...
llama_model_loader: - type  f32:   49 tensors
llama_model_loader: - type  f16:  170 tensors
llama_model_quantize_internal: meta size = 767616 bytes
[   1/ 219]                        output.weight - [ 2048, 32256,     1,     1], type =    f16, quantizing to q6_K .. size =   126.00 MiB ->    51.68 MiB
[   2/ 219]                    token_embd.weight - [ 2048, 32256,     1,     1], type =    f16, quantizing to q5_0 .. size =   126.00 MiB ->    43.31 MiB | hist: 0.040 0.018 0.028 0.043 0.061 0.082 0.101 0.114 0.117 0.109 0.092 0.072 0.052 0.035 0.022 0.016
...
[ 218/ 219]                  blk.9.attn_v.weight - [ 2048,  2048,     1,     1], type =    f16, quantizing to q5_0 .. size =     8.00 MiB ->     2.75 MiB | hist: 0.040 0.017 0.028 0.042 0.060 0.081 0.101 0.116 0.121 0.109 0.091 0.071 0.051 0.034 0.022 0.016
[ 219/ 219]                   output_norm.weight - [ 2048,     1,     1,     1], type =    f32, size =    0.008 MB
llama_model_quantize_internal: model size  =  2568.38 MB
llama_model_quantize_internal: quant size  =   891.50 MB
llama_model_quantize_internal: hist: 0.040 0.017 0.028 0.043 0.061 0.082 0.101 0.114 0.118 0.109 0.092 0.071 0.051 0.035 0.022 0.016main: quantize time =  9300.54 ms
main:    total time =  9300.54 ms

进行测试

./main -m ../DeepSeek-Coder/models/1.3b-q5_0.gguf  -n 256 -t 18 --repeat_penalty 1.0 --color -i -r "User:" -f ./prompts/chat-with-bob.txt -ngl 20

加载模型失败.

warning: not compiled with GPU offload support, --n-gpu-layers option will be ignored
warning: see main README.md for information on enabling GPU BLAS support
Log start
main: build = 1 (231ae28)
main: built with cc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0 for x86_64-linux-gnu
main: seed  = 1710571501
llama_model_loader: loaded meta data with 25 key-value pairs and 219 tensors from ../DeepSeek-Coder/models/1.3b-q5_0.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = llama
llama_model_loader: - kv   1:                               general.name str              = models
llama_model_loader: - kv   2:                       llama.context_length u32              = 16384
llama_model_loader: - kv   3:                     llama.embedding_length u32              = 2048
llama_model_loader: - kv   4:                          llama.block_count u32              = 24
llama_model_loader: - kv   5:                  llama.feed_forward_length u32              = 5504
llama_model_loader: - kv   6:                 llama.rope.dimension_count u32              = 128
llama_model_loader: - kv   7:                 llama.attention.head_count u32              = 16
llama_model_loader: - kv   8:              llama.attention.head_count_kv u32              = 16
llama_model_loader: - kv   9:     llama.attention.layer_norm_rms_epsilon f32              = 0.000001
llama_model_loader: - kv  10:                       llama.rope.freq_base f32              = 100000.000000
llama_model_loader: - kv  11:                    llama.rope.scaling.type str              = linear
llama_model_loader: - kv  12:                  llama.rope.scaling.factor f32              = 4.000000
llama_model_loader: - kv  13:                          general.file_type u32              = 8
llama_model_loader: - kv  14:                       tokenizer.ggml.model str              = llama
llama_model_loader: - kv  15:                      tokenizer.ggml.tokens arr[str,32022]   = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv  16:                      tokenizer.ggml.scores arr[f32,32022]   = [-1000.000000, -1000.000000, -1000.00...
llama_model_loader: - kv  17:                  tokenizer.ggml.token_type arr[i32,32022]   = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv  18:                tokenizer.ggml.bos_token_id u32              = 32013
llama_model_loader: - kv  19:                tokenizer.ggml.eos_token_id u32              = 32021
llama_model_loader: - kv  20:            tokenizer.ggml.padding_token_id u32              = 32014
llama_model_loader: - kv  21:               tokenizer.ggml.add_bos_token bool             = true
llama_model_loader: - kv  22:               tokenizer.ggml.add_eos_token bool             = false
llama_model_loader: - kv  23:                    tokenizer.chat_template str              = {% if not add_generation_prompt is de...
llama_model_loader: - kv  24:               general.quantization_version u32              = 2
llama_model_loader: - type  f32:   49 tensors
llama_model_loader: - type q5_0:  169 tensors
llama_model_loader: - type q6_K:    1 tensors
llm_load_vocab: SPM vocabulary, but newline token not found: _Map_base::at! Using special_pad_id instead.llm_load_vocab: mismatch in special tokens definition ( 9/32022 vs 22/32022 ).
llm_load_print_meta: format           = GGUF V3 (latest)
llm_load_print_meta: arch             = llama
llm_load_print_meta: vocab type       = SPM
llm_load_print_meta: n_vocab          = 32022
llm_load_print_meta: n_merges         = 0
llm_load_print_meta: n_ctx_train      = 16384
llm_load_print_meta: n_embd           = 2048
llm_load_print_meta: n_head           = 16
llm_load_print_meta: n_head_kv        = 16
llm_load_print_meta: n_layer          = 24
llm_load_print_meta: n_rot            = 128
llm_load_print_meta: n_embd_head_k    = 128
llm_load_print_meta: n_embd_head_v    = 128
llm_load_print_meta: n_gqa            = 1
llm_load_print_meta: n_embd_k_gqa     = 2048
llm_load_print_meta: n_embd_v_gqa     = 2048
llm_load_print_meta: f_norm_eps       = 0.0e+00
llm_load_print_meta: f_norm_rms_eps   = 1.0e-06
llm_load_print_meta: f_clamp_kqv      = 0.0e+00
llm_load_print_meta: f_max_alibi_bias = 0.0e+00
llm_load_print_meta: n_ff             = 5504
llm_load_print_meta: n_expert         = 0
llm_load_print_meta: n_expert_used    = 0
llm_load_print_meta: pooling type     = 0
llm_load_print_meta: rope type        = 0
llm_load_print_meta: rope scaling     = linear
llm_load_print_meta: freq_base_train  = 100000.0
llm_load_print_meta: freq_scale_train = 0.25
llm_load_print_meta: n_yarn_orig_ctx  = 16384
llm_load_print_meta: rope_finetuned   = unknown
llm_load_print_meta: model type       = ?B
llm_load_print_meta: model ftype      = Q5_0
llm_load_print_meta: model params     = 1.35 B
llm_load_print_meta: model size       = 891.50 MiB (5.55 BPW)
llm_load_print_meta: general.name     = models
llm_load_print_meta: BOS token        = 32013 '<|begin▁of▁sentence|>'
llm_load_print_meta: EOS token        = 32021 '<|EOT|>'
llm_load_print_meta: UNK token        = 0 '!'
llm_load_print_meta: PAD token        = 32014 '<|end▁of▁sentence|>'
llm_load_tensors: ggml ctx size =    0.08 MiB
llama_model_load: error loading model: create_tensor: tensor 'token_embd.weight' has wrong shape; expected  2048, 32022, got  2048, 32256,     1,     1
llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model '../DeepSeek-Coder/models/1.3b-q5_0.gguf'
main: error: unable to load model

看错误llama_model_load: error loading model: create_tensor: tensor 'token_embd.weight' has wrong shape; expected 2048, 32022, got 2048, 32256, 1, 1应该是跟前面修改的vocab-size有关。


这篇关于deepseek-coder模型量化的文章就介绍到这儿,希望我们推荐的文章对编程师们有所帮助!



http://www.chinasem.cn/article/818179

相关文章

Golang的CSP模型简介(最新推荐)

《Golang的CSP模型简介(最新推荐)》Golang采用了CSP(CommunicatingSequentialProcesses,通信顺序进程)并发模型,通过goroutine和channe... 目录前言一、介绍1. 什么是 CSP 模型2. Goroutine3. Channel4. Channe

Python基于火山引擎豆包大模型搭建QQ机器人详细教程(2024年最新)

《Python基于火山引擎豆包大模型搭建QQ机器人详细教程(2024年最新)》:本文主要介绍Python基于火山引擎豆包大模型搭建QQ机器人详细的相关资料,包括开通模型、配置APIKEY鉴权和SD... 目录豆包大模型概述开通模型付费安装 SDK 环境配置 API KEY 鉴权Ark 模型接口Prompt

大模型研发全揭秘:客服工单数据标注的完整攻略

在人工智能(AI)领域,数据标注是模型训练过程中至关重要的一步。无论你是新手还是有经验的从业者,掌握数据标注的技术细节和常见问题的解决方案都能为你的AI项目增添不少价值。在电信运营商的客服系统中,工单数据是客户问题和解决方案的重要记录。通过对这些工单数据进行有效标注,不仅能够帮助提升客服自动化系统的智能化水平,还能优化客户服务流程,提高客户满意度。本文将详细介绍如何在电信运营商客服工单的背景下进行

Andrej Karpathy最新采访:认知核心模型10亿参数就够了,AI会打破教育不公的僵局

夕小瑶科技说 原创  作者 | 海野 AI圈子的红人,AI大神Andrej Karpathy,曾是OpenAI联合创始人之一,特斯拉AI总监。上一次的动态是官宣创办一家名为 Eureka Labs 的人工智能+教育公司 ,宣布将长期致力于AI原生教育。 近日,Andrej Karpathy接受了No Priors(投资博客)的采访,与硅谷知名投资人 Sara Guo 和 Elad G

Retrieval-based-Voice-Conversion-WebUI模型构建指南

一、模型介绍 Retrieval-based-Voice-Conversion-WebUI(简称 RVC)模型是一个基于 VITS(Variational Inference with adversarial learning for end-to-end Text-to-Speech)的简单易用的语音转换框架。 具有以下特点 简单易用:RVC 模型通过简单易用的网页界面,使得用户无需深入了

透彻!驯服大型语言模型(LLMs)的五种方法,及具体方法选择思路

引言 随着时间的发展,大型语言模型不再停留在演示阶段而是逐步面向生产系统的应用,随着人们期望的不断增加,目标也发生了巨大的变化。在短短的几个月的时间里,人们对大模型的认识已经从对其zero-shot能力感到惊讶,转变为考虑改进模型质量、提高模型可用性。 「大语言模型(LLMs)其实就是利用高容量的模型架构(例如Transformer)对海量的、多种多样的数据分布进行建模得到,它包含了大量的先验

图神经网络模型介绍(1)

我们将图神经网络分为基于谱域的模型和基于空域的模型,并按照发展顺序详解每个类别中的重要模型。 1.1基于谱域的图神经网络         谱域上的图卷积在图学习迈向深度学习的发展历程中起到了关键的作用。本节主要介绍三个具有代表性的谱域图神经网络:谱图卷积网络、切比雪夫网络和图卷积网络。 (1)谱图卷积网络 卷积定理:函数卷积的傅里叶变换是函数傅里叶变换的乘积,即F{f*g}

秋招最新大模型算法面试,熬夜都要肝完它

💥大家在面试大模型LLM这个板块的时候,不知道面试完会不会复盘、总结,做笔记的习惯,这份大模型算法岗面试八股笔记也帮助不少人拿到过offer ✨对于面试大模型算法工程师会有一定的帮助,都附有完整答案,熬夜也要看完,祝大家一臂之力 这份《大模型算法工程师面试题》已经上传CSDN,还有完整版的大模型 AI 学习资料,朋友们如果需要可以微信扫描下方CSDN官方认证二维码免费领取【保证100%免费

【生成模型系列(初级)】嵌入(Embedding)方程——自然语言处理的数学灵魂【通俗理解】

【通俗理解】嵌入(Embedding)方程——自然语言处理的数学灵魂 关键词提炼 #嵌入方程 #自然语言处理 #词向量 #机器学习 #神经网络 #向量空间模型 #Siri #Google翻译 #AlexNet 第一节:嵌入方程的类比与核心概念【尽可能通俗】 嵌入方程可以被看作是自然语言处理中的“翻译机”,它将文本中的单词或短语转换成计算机能够理解的数学形式,即向量。 正如翻译机将一种语言

AI Toolkit + H100 GPU,一小时内微调最新热门文生图模型 FLUX

上个月,FLUX 席卷了互联网,这并非没有原因。他们声称优于 DALLE 3、Ideogram 和 Stable Diffusion 3 等模型,而这一点已被证明是有依据的。随着越来越多的流行图像生成工具(如 Stable Diffusion Web UI Forge 和 ComyUI)开始支持这些模型,FLUX 在 Stable Diffusion 领域的扩展将会持续下去。 自 FLU