Create variations of an image while preserving shape and depth
LLaVA v1.6: Large Language and Vision Assistant (Mistral-7B)
An SDXL model trained on images from another SDXL-hiroshinagai model
7B base version of Google’s Gemma model
Stable Diffusion XL trained specifically for inpainting, by Hugging Face
Locality-enhanced Projector for Multimodal LLM
SigLIP proposes replacing the loss function used in CLIP with a simple pairwise sigmoid loss
Fully distilled and accelerated MobileSAM
Generate videos from text prompts with Kandinsky-2.2
flan-t5-xl trained on the Memory Alpha Star Trek Wiki
A 6B-parameter open-source bilingual chat LLM (optimized for 8k+ context)
whisper-large-v3, incredibly fast, with video transcription
Lucid Sonic Dreams syncs StyleGAN XL-generated visuals to music
Segments an audio recording based on who is speaking (on A100)
An SDXL LoRA inspired by Tomb Raider (1996)
datasets: Flickr8k
Generate image prompts for Midjourney. Prefix inputs with "Image: "
Source: Unbabel/TowerInstruct-7B-v0.1 ✦ Quant: TheBloke/TowerInstruct-7B-v0.1-AWQ ✦ This model is trained to handle several translation-related tasks, such as general machine translation, grammatical error correction, and paraphrase generation
An SDXL fine-tune on photos I took while riding a train in Finland (Helsinki-Vaasa)
Accelerated transcription, word-level timestamps and diarization with whisperX large-v3