CSDN Global

stable-diffusion-depth2img@jagilley

Create variations of an image while preserving shape and depth

llava-v1.6-mistral-7b@yorickvp

LLaVA v1.6: Large Language and Vision Assistant (Mistral-7B)

sdxl-hiroshinagai@bhavnicksm

An SDXL model trained on images from another SDXL-hiroshinagai model

sdxl-inpainting@sepal

Stable Diffusion XL trained specifically for inpainting by Hugging Face

honeybee@jyoung105

Locality-enhanced Projector for Multimodal LLM

siglip@lucataco

SigLIP proposes replacing the loss function used in CLIP with a simple pairwise sigmoid loss

nanosam@jimothyjohn

Fully distilled and accelerated MobileSAM

deforum-kandinsky-2-2@adirik

Generate videos from text prompts with Kandinsky-2.2

star-trek-flan@fofr

flan-t5-xl trained on the Memory Alpha Star Trek Wiki

chatglm3-6b-32k@nomagick

A 6B-parameter open bilingual chat LLM (optimized for 8k+ context) | Open-source bilingual conversational language model

insanely-fast-whisper-with-video@turian

whisper-large-v3, incredibly fast, with video transcription

lucid-sonic-dreams-xl@pollinations

Lucid Sonic Dreams syncs StyleGAN XL-generated visuals to music

speaker-diarization@lucataco

Segments an audio recording based on who is speaking (on A100)

sdxl-tombraider@jbilcke

An SDXL LoRA inspired by Tomb Raider (1996)

image-captioning-with-visual-attention@nohamoamary

Dataset: Flickr8k

image-prompts@fofr

Generate image prompts for Midjourney. Prefix inputs with "Image: "

towerinstruct-7b-v0.1@tomasmcm

Source: Unbabel/TowerInstruct-7B-v0.1 ✦ Quant: TheBloke/TowerInstruct-7B-v0.1-AWQ ✦ This model is trained to handle several translation-related tasks, such as general machine translation, grammatical error correction, and paraphrase generation

sdxl-finland@pwntus

An SDXL finetune of photos I took while riding a train in Finland (Helsinki-Vaasa)

whisperx@victor-upmeet

Accelerated transcription, word-level timestamps and diarization with whisperX large-v3