Reference: Hugging Face documentation ☘

This document mainly serves as a record, so I can reproduce or roll back the setup later 🤪.

Installation

I use conda:

conda install -c conda-forge huggingface_hub

But I also want the cli and torch extras, and it's annoying not knowing whether this will pull in the GPU build of torch:

conda create -n env4hf
conda activate env4hf
conda install -c conda-forge huggingface_hub[cli,torch]
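One way to settle the GPU-build worry is to probe the installed torch directly. A minimal sketch (it degrades gracefully if torch ended up not being installed at all):

```python
import importlib.util

# Probe for torch before importing it, so this also runs in envs without it.
if importlib.util.find_spec("torch") is None:
    print("torch is not installed in this env")
else:
    import torch
    print(torch.__version__)          # a "+cpu" suffix means a CPU-only build
    print(torch.cuda.is_available())  # True only for a GPU build with a usable driver
```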

Test

python -c "from huggingface_hub import model_info; print(model_info('gpt2'))"

The `-c` flag is an option of the Python command-line interpreter: the string you pass after `-c` is executed as Python code. This errored:

  File "C:\Users\A\miniconda3\envs\env4hf\Lib\site-packages\requests\adapters.py", line 507, in send
    raise ConnectTimeout(e, request=request)
requests.exceptions.ConnectTimeout: (MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /api/models/gpt2 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x00000160A995BBC0>, 'Connection to huggingface.co timed out. (connect timeout=None)'))"), '(Request ID: acf1c7b1-92ac-46e1-b79a-287e1394669b)')

As expected, just a network problem; going through a proxy made it succeed:

Model Name: gpt2, Tags: ['transformers', 'pytorch', 'tf', 'jax', 'tflite', 'rust', 'onnx', 'safetensors', 'gpt2', 'text-generation', 'exbert', 'en', 'doi:10.57967/hf/0039', 'license:mit', 'endpoints_compatible', 'has_space', 'text-generation-inference', 'region:us'], Task: text-generation
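For reference, the proxy can also be set from inside Python before any hub call is made. The address below is a hypothetical placeholder for whatever proxy you actually use:

```python
import os

# Hypothetical local proxy address; substitute your own host and port.
PROXY = "http://127.0.0.1:7890"
os.environ["HTTP_PROXY"] = PROXY
os.environ["HTTPS_PROXY"] = PROXY

# With these exported, the earlier one-liner should reach huggingface.co:
#   python -c "from huggingface_hub import model_info; print(model_info('gpt2'))"
```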

Problem: transformers is not installed

ModuleNotFoundError: No module named 'transformers'
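A small guard makes this failure mode more actionable. `require` below is a hypothetical helper of my own, not part of any library:

```python
import importlib

def require(module_name: str, hint: str):
    """Import module_name, or exit with an actionable install hint."""
    try:
        return importlib.import_module(module_name)
    except ModuleNotFoundError:
        raise SystemExit(f"{module_name} is not installed; {hint}")

# Usage (commented out so the sketch runs even without transformers):
# transformers = require("transformers",
#                        "run: conda install -c huggingface transformers")
```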

Install 🤗 Transformers (from the huggingface channel):

conda install -c huggingface transformers

The end of the docs describes caching and running offline. The install failed with:

Solving environment: unsuccessful initial attempt using frozen solve. Retrying with flexible solve.
Solving environment: unsuccessful attempt using repodata from current_repodata.json, retrying with next repodata source.
Solving environment: unsuccessful initial attempt using frozen solve. Retrying with flexible solve.
Found conflicts! Looking for incompatible packages.
This can take several minutes. Press CTRL-C to abort.
failed
UnsatisfiableError: The following specifications were found to be incompatible with each other:
Output in format: Requested package -> Available versions

Problem: the torch I just installed may be conflicting with the torch in base, but neither pip nor conda can remove it (both report that the package is not installed). It might also be that Python 3.12 is too new.

Reinstalling fixed it. New approach: clone an env that already has GPU torch, then install huggingface_hub and transformers into the clone (first go to /user/miniconda/env and delete the old env4hf environment):

conda create --clone env4dl --name env4hf
conda activate env4hf
conda install -c huggingface transformers

Running offline

HF_DATASETS_OFFLINE=1 TRANSFORMERS_OFFLINE=1 \
python examples/pytorch/translation/run_translation.py --model_name_or_path t5-small --dataset_name wmt16 --dataset_config ro-en ...
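The same two switches can also be set from Python, as long as that happens before the libraries are imported; a minimal sketch:

```python
import os

# Offline switches must be in the environment before importing the libraries.
os.environ["HF_DATASETS_OFFLINE"] = "1"
os.environ["TRANSFORMERS_OFFLINE"] = "1"

# From here on, from_pretrained()/load_dataset() read only the local cache
# (default: ~/.cache/huggingface) and fail fast instead of hitting the network.
```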

The following is copied verbatim from the transformers docs.

Getting started with our git and git-lfs interface

If you need to create a repo from the command line (skip if you created a repo from the website)

pip install huggingface_hub

You already have it if you installed transformers or datasets

Log in using a token from huggingface.co/settings/tokens:

huggingface-cli login

Create a model or dataset repo from the CLI if needed:

huggingface-cli repo create repo_name --type {model, dataset, space}

Clone your model or dataset locally. Make sure you have git-lfs installed (https://git-lfs.github.com):

git lfs install
git clone https://huggingface.co/username/repo_name

Then add, commit and push any file you want, including large files.

save files via .save_pretrained() or move them here

git add .
git commit -m "commit from $USER"
git push

In most cases, if you’re using one of the compatible libraries, your repo will then be accessible from code, through its identifier: username/repo_name

For example, for a transformers model, anyone can load it with:

from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("username/repo_name")
model = AutoModel.from_pretrained("username/repo_name")