CSDN Global

streaming-t2v@camenduru

StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text

Generate a collection of logos from your text input. Longer, more detailed inputs give better results. The first run takes a few minutes to load the model; subsequent generations are much faster.

sdxl-family-guy@ghostofpokemon

Model trained on the Family Guy animation style

blip-3@zsxkib

BLIP-3 / XGen-MM: answers questions about images ({blip3,xgen-mm}-phi3-mini-base-r-v1)

supir@cjwbw

Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild. This version uses LLaVA-13b for captioning.

deep-image-diffusion-prior@laion-ai

Generate an image using text by visualizing CLIP features.

ocr-receipt@sulthonmb

Converts receipt images into JSON using OCR

pets@visoar

A cat LoRA model

llava-13b@yorickvp

Visual instruction tuning towards large language and vision models with GPT-4 level capabilities

deepfspis@lime-j

Deep Flexible Structure-preserving Image Smoothing

hairfastgan@camenduru

HairFastGAN: Realistic and Robust Hair Transfer with a Fast Encoder-Based Approach

controlnet-multi@mikelyndon

A multi-input ControlNet model. Pass in control images and set the weights.

wavyfusion@cjwbw

DreamBooth model trained on a very diverse dataset ranging from photographs to paintings

prometheus-13b-v1.0@tomasmcm

Source: kaist-ai/prometheus-13b-v1.0 ✦ Quant: TheBloke/prometheus-13B-v1.0-AWQ ✦ An alternative to GPT-4 for evaluating LLMs and reward models for RLHF