
🐍 Python & library/HuggingFace 5

Creating a HuggingFace Space

HuggingFace Spaces is a place where you can build ML demo apps quickly and easily. Examples of Spaces include: 1: text-to-image (Rich Text To Image, a Hugging Face Space by songweig); 2: chat-with-GPT4 (Chat-with-GPT4, a Hugging Face Space by ysharma). This post briefly introduces how to create such a HuggingFace Space. For a more detailed explanation, the official..

[HuggingFace] How to use Trainer

Official Docs: https://huggingface.co/docs/transformers/v4.19.2/en/main_classes/trainer The Trainer class provides an API that covers everything from model training to evaluation in one place. The following usage example makes this easy to grasp. f..

[HuggingFace] The role and function of the Tokenizer: Token ID, Input ID, Token type ID, Attention Mask

Calling a HuggingFace Tokenizer returns a BatchEncoding that contains the Token (Input) IDs and the Attention Mask. This post summarizes these HuggingFace model inputs. The post on the Tokenizer class can be found here. Reference: Official Docs (Glossary). Tokenizer: first, define a HuggingFace Tokenizer as follows. This example uses BertTokenizer. from transformers import BertTokenizer tokenizer = BertToken..
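To make the three model inputs named in the title concrete, here is a minimal pure-Python sketch. It does not call the transformers library: the vocabulary `TOY_VOCAB` and the function `toy_encode` are hypothetical names made up for illustration, and whitespace splitting stands in for the subword tokenization a real BertTokenizer performs. It only mimics the general BERT-style layout of input IDs, token type IDs, and attention mask.

```python
# Toy illustration only: a hypothetical vocabulary and whitespace splitting,
# not the real BertTokenizer (which uses a trained subword vocabulary).
TOY_VOCAB = {
    "[PAD]": 0, "[UNK]": 1, "[CLS]": 2, "[SEP]": 3,
    "hello": 4, "world": 5, "how": 6, "are": 7, "you": 8,
}


def toy_encode(text_a, text_b=None, max_length=10):
    """Return input_ids, token_type_ids and attention_mask for one example."""

    def to_ids(text):
        # Unknown words map to the [UNK] id.
        return [TOY_VOCAB.get(tok, TOY_VOCAB["[UNK]"]) for tok in text.lower().split()]

    # Segment A: [CLS] tokens [SEP], all marked with token type 0.
    input_ids = [TOY_VOCAB["[CLS]"]] + to_ids(text_a) + [TOY_VOCAB["[SEP]"]]
    token_type_ids = [0] * len(input_ids)

    # Optional segment B: tokens [SEP], marked with token type 1.
    if text_b is not None:
        segment_b = to_ids(text_b) + [TOY_VOCAB["[SEP]"]]
        input_ids += segment_b
        token_type_ids += [1] * len(segment_b)

    # Real tokens get attention 1; padding added below gets attention 0.
    attention_mask = [1] * len(input_ids)

    pad_count = max_length - len(input_ids)
    if pad_count < 0:
        raise ValueError("sequence longer than max_length (truncation not sketched)")
    input_ids += [TOY_VOCAB["[PAD]"]] * pad_count
    token_type_ids += [0] * pad_count
    attention_mask += [0] * pad_count

    return {
        "input_ids": input_ids,
        "token_type_ids": token_type_ids,
        "attention_mask": attention_mask,
    }


encoding = toy_encode("hello world", "how are you", max_length=10)
for name, values in encoding.items():
    print(name, values)
```

The point of the sketch is the alignment: all three lists have the same length, the token type IDs separate the first and second sentence, and the attention mask tells the model which positions are padding.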

[HuggingFace] A look at the Tokenizer class

Official Docs: https://huggingface.co/docs/transformers/v4.19.2/en/main_classes/tokenizer Github: https://github.com/huggingface/tokenizers A Tokenizer is needed to prepare the inputs that go into a model. The Tokenizer classes provided by Hugging Face make this easy. The base classes are PreTrainedTokenizer and PreTrainedTokeni..

[HuggingFace] Pipeline & AutoClass

A post organized around usage with PyTorch. Reference: Quick tour (huggingface.co). This post introduces pipeline() and AutoClass, the most basic features of HuggingFace. pipeline() can be used for fast inference, and Au..
