
A Collection of Korean Open-Source Multimodal Models (image-text)

복만 · 2024. 1. 13. 20:59

ํ˜น์€ awesome-korean-multimodal ๊ฐ™์€๊ฒƒ

 

In fact, there aren't many Korean LLMs to begin with, and openly released Korean multimodal LLMs (MLLMs) seem to be truly scarce.

(See also: the collection of Korean LLMs - awesome-korean-llm)

 

GitHub - NomaDamas/awesome-korean-llm: Awesome list of Korean Large Language Models.


 

This list covers not only Korean multimodal LLMs but also multimodal embedding models.

To be precise, it only includes image-text (vision-language) models among multimodal models.

If you know of a model missing here, or a new one, please let me know in the comments.

 

 

Table of Contents

  • Multimodal LLM (MLLM)
    • tabtoyou/KoLLaVA
    • etri-vilab/Ko-LLaVA
  • Vision-Language Pretraining (VLP) - Multimodal embedding
    • jaketae/KoCLIP
    • Bingsu/clip-vit-large-patch14-ko
    • SeanForHim/KoBEiT3

 

 

Multimodal LLM

tabtoyou/KoLLaVA
 

GitHub - tabtoyou/KoLLaVA: KoLLaVA: Korean Large Language-and-Vision Assistant (feat.LLaVA)

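For reference, below is a minimal sketch of how a LLaVA-style checkpoint is typically queried with Hugging Face transformers (LlavaForConditionalGeneration). The repo id below is the upstream English LLaVA checkpoint as a stand-in, not a confirmed KoLLaVA checkpoint; the actual KoLLaVA weights may require the inference scripts shipped in the KoLLaVA repo instead, so check its README first.

```python
# Hedged sketch: querying a LLaVA-style vision-language model via transformers.
# NOTE: the repo id is the upstream English LLaVA checkpoint used as a placeholder;
# swap in a KoLLaVA checkpoint (see the KoLLaVA README) if one is available in
# Hugging Face LLaVA format.
import requests
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

repo = "llava-hf/llava-1.5-7b-hf"  # placeholder; replace with a KoLLaVA checkpoint
model = LlavaForConditionalGeneration.from_pretrained(repo)
processor = AutoProcessor.from_pretrained(repo)

image = Image.open(
    requests.get("http://images.cocodataset.org/val2017/000000039769.jpg", stream=True).raw
)
# LLaVA-1.5 style chat template; "이 사진을 설명해줘." = "Describe this picture."
prompt = "USER: <image>\n이 사진을 설명해줘. ASSISTANT:"

inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(output[0], skip_special_tokens=True))
```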

 

 

 

etri-vilab/Ko-LLaVA
 

Ko-LLaVA - a Hugging Face Space by etri-vilab

 


  • Demo
  • Creator: ETRI Visual Intelligence Research Section
  • backbone: LLaMA (13B)
  • Only the Space is public; neither the model weights nor the dataset has been released. In a few quick tests, it seemed to perform considerably worse than KoLLaVA above.
  • Examples:

 

 

 

Appendix: comparison with LLaVA

 

LLaVA๋„ ํ•œ๊ตญ์–ด๋ฅผ ์–ด๋Š์ •๋„ ํ•  ์ˆ˜ ์žˆ๋‹ค. (demo)

"in Korean" ํ‚ค์›Œ๋“œ๋Š” ์ž˜ ์•ˆ๋˜๋Š”๊ฒƒ ๊ฐ™๊ณ , ํ•œ๊ตญ์–ด๋กœ ๋ฌผ์—ˆ์„ ๋•Œ ์˜์–ด๋กœ ๋Œ€๋‹ตํ•˜๋Š” ๋‹จ์ ์ด ์žˆ๋‹ค.

์˜์™ธ๋กœ ์˜์–ด๋กœ ์งˆ๋ฌธํ•œ ๋‹ค์Œ ๋‹ต๋ณ€์„ ํ•œ๊ตญ์–ด๋กœ ๋ฐ”๊ฟ”๋‹ฌ๋ผ๊ณ  ํ•˜๋ฉด ๊ดœ์ฐฎ๋‹ค.

 

 

 

VLP

jaketae/KoCLIP
 

GitHub - jaketae/koclip: KoCLIP: Korean port of OpenAI CLIP, in Flax


 

 

Bingsu/clip-vit-large-patch14-ko
 

Bingsu/clip-vit-large-patch14-ko · Hugging Face


  • HuggingFace
  • backbone: CLIP (ViT-L/14)
  • dataset: all Korean-English parallel corpora available on AIHub
  • license: MIT
  • training: knowledge distillation, following Making Monolingual Sentence Embeddings Multilingual using Knowledge Distillation (a usage sketch follows below)
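Below is a minimal usage sketch for Korean zero-shot image-text matching, assuming the checkpoint exposes the standard transformers CLIP interface (AutoModel / AutoProcessor with logits_per_image); the sample image URL and Korean captions are only illustrative.

```python
# Hedged sketch: Korean zero-shot image-text matching with Bingsu/clip-vit-large-patch14-ko,
# assuming the checkpoint follows the standard transformers CLIP interface.
import requests
import torch
from PIL import Image
from transformers import AutoModel, AutoProcessor

repo = "Bingsu/clip-vit-large-patch14-ko"
model = AutoModel.from_pretrained(repo)
processor = AutoProcessor.from_pretrained(repo)

image = Image.open(
    requests.get("http://images.cocodataset.org/val2017/000000039769.jpg", stream=True).raw
)
texts = ["고양이 두 마리", "강아지 한 마리", "피자 한 판"]  # "two cats", "a dog", "a pizza"

inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# Image-text similarity logits -> probabilities over the candidate captions.
probs = outputs.logits_per_image.softmax(dim=-1)
print(dict(zip(texts, probs[0].tolist())))
```

If the training setup follows the cited distillation paper, only the text encoder is replaced with a Korean student while the image tower stays the original CLIP ViT-L/14, so image embeddings should remain compatible with the English CLIP model.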

 

 

SeanForHim/KoBEiT3
 

SeanForHim/KoBEiT3 · Hugging Face

 


 

๋ฐ˜์‘ํ˜•