Choose from world-class AI generation models. Video, image, audio, LLM, and 3D, all in one place.
49 models available
| Model | Description | Category | Provider | Cost | Supported Params |
|---|---|---|---|---|---|
| ace-step | Lightweight music generation for quick prototyping and background music | audio | music_generation_v3_5 | 2 credits | `prompt`, `duration` |
| claude | Anthropic Claude with powerful reasoning and long-context understanding | llm | claude-sonnet-4-6 | 2 credits | `text`, `temperature`, `max_tokens` |
| claude-code | Claude Code optimized for software development and engineering tasks | llm | claude-sonnet-4-6 | 2 credits | `text`, `temperature`, `max_tokens` |
| deep-research | Autonomous research agent capable of executing multi-step investigative research | llm | deepseek-reasoner | 5 credits | `text`, `temperature`, `max_tokens` |
| deepseek | Powerful reasoning model with exceptional math and coding capabilities | llm | deepseek-chat | 1 credit | `text`, `temperature`, `max_tokens` |
| diffrhythm | Generate full songs from text prompts or reference audio | audio | music_generation_v4_5 | 3 credits | `prompt`, `duration` |
| f5-tts | Fast high-fidelity text-to-speech supporting multiple languages and voice cloning | audio | qwen3-tts-vd | 2 credits | `text`, `voice`, `speed`, `language` |
| flux-api | High-quality text-to-image generation with exceptional detail | image | flux-2-flex | 4 credits | `prompt`, `negative_prompt`, `aspect_ratio`, `seed`, `count`, `quality` |
| flux-kontext | Flux with contextual understanding for more coherent outputs | image | flux-kontext-pro | 5 credits | `prompt`, `image_url`, `aspect_ratio`, `seed`, `count`, `quality` |
| framepack-api | Professional video frame packing and enhancement with visual effects | video | framepack-api | 9 credits | `prompt`, `negative_prompt`, `aspect_ratio`, `duration`, `seed`, `quality` |
| gpt-4o | OpenAI's most capable multimodal model | llm | gpt-4o | 2 credits | `text`, `temperature`, `max_tokens` |
| gpt-image-1 | OpenAI's GPT-powered image generation | image | gpt-image-1 | 6 credits | `prompt`, `aspect_ratio`, `seed`, `count`, `quality` |
| gpt-image-1-5 | GPT Image 1 upgraded version with further improved generation quality | image | gpt-image-1.5 | 6 credits | `prompt`, `aspect_ratio`, `seed`, `count`, `quality` |
| gpt-image-2-0 | OpenAI's latest GPT Image 2.0 generation with higher resolution support | image | gpt-image-2 | 7 credits | `prompt`, `aspect_ratio`, `seed`, `count`, `quality` |
| hailuo-api | Fast and efficient video generation optimized for creators and marketers | video | MiniMax-Hailuo-2.3 | 10 credits | `prompt`, `negative_prompt`, `aspect_ratio`, `duration`, `seed`, `quality` |
| hunyuan-api | Tencent's powerful video generation model with excellent Chinese text understanding | video | hunyuan-api | 12 credits | `prompt`, `negative_prompt`, `aspect_ratio`, `duration`, `seed`, `quality` |
| kling-2-5 | Balanced performance and quality for professional video creation | video | kling-o3-text-to-video | 11 credits | `prompt`, `negative_prompt`, `aspect_ratio`, `duration`, `seed`, `quality` |
| kling-2-6 | Enhanced Kling 2.x with improved detail and motion smoothness | video | kling-v2-6 | 13 credits | `prompt`, `image_url`, `negative_prompt`, `aspect_ratio`, `duration`, `seed`, `quality` |
| kling-3 | High-quality cinematic video generation with realistic motion and physics | video | kling-v3-text-to-video | 15 credits | `prompt`, `image_url`, `negative_prompt`, `aspect_ratio`, `duration`, `seed`, `quality` |
| kling-3-omni | Omni-modal Kling 3.0 supporting multiple input types for video generation | video | kling-v3-omni | 18 credits | `prompt`, `image_url`, `negative_prompt`, `aspect_ratio`, `duration`, `seed`, `quality` |
| kling-ai-avatar | Generate personalized AI avatar videos with consistent character identity | video | kling-o3-reference-to-video | 14 credits | `prompt`, `image_url`, `video_url`, `aspect_ratio`, `duration` |
| kling-api | Kuaishou's flagship video generation API with high-quality output | video | kling-v3-text-to-video | 12 credits | `prompt`, `image_url`, `negative_prompt`, `aspect_ratio`, `duration`, `seed`, `quality` |
| kling-o1 | Optimized Kling model with faster inference and improved consistency | video | kling-video-o1 | 10 credits | `prompt`, `image_url`, `negative_prompt`, `aspect_ratio`, `duration`, `seed`, `quality` |
| luma-api | Fast video generation with photorealistic quality and rapid inference | video | ltx-2.3-text-video | 11 credits | `prompt`, `negative_prompt`, `aspect_ratio`, `duration`, `seed`, `quality` |
| mmaudio | Multimodal audio generation for automatic sound effects and soundtrack matching | audio | music_generation_v4_5 | 3 credits | `prompt`, `duration` |
| moshi | Open-source full-duplex voice conversation model supporting real-time voice interaction | audio | moshi | 3 credits | `prompt`, `duration` |
| nano-banana | Fast and affordable image generation | image | nano-banana-2-lite | 2 credits | `prompt`, `aspect_ratio`, `seed`, `quality` |
| nano-banana-2 | Second-generation Nano Banana with stronger semantic understanding | image | nano-banana-2-lite | 2 credits | `prompt`, `aspect_ratio`, `seed`, `quality` |
| nano-banana-pro | Nano Banana Pro with significantly improved quality and detail | image | nano-banana-pro-beta | 3 credits | `prompt`, `aspect_ratio`, `seed`, `quality` |
| omniavatar | Create customizable 3D avatars with multiple actions and expressions | video | omniavatar | 15 credits | `prompt`, `image_url`, `aspect_ratio`, `duration` |
| omnihuman-1-5 | Upgraded OmniHuman with more accurate lip-sync and natural body movement | video | omnihuman-1.5 | 18 credits | `prompt`, `image_url`, `aspect_ratio`, `duration`, `seed`, `quality` |
| omnihuman-api | Generate realistic human videos from a single photo with natural expressions | video | omnihuman-1.5 | 16 credits | `prompt`, `image_url`, `aspect_ratio`, `duration`, `seed`, `quality` |
| qwen-image | Tongyi Qianwen image generation with excellent Chinese understanding and diverse styles | image | qwen-image-edit | 3 credits | `prompt`, `aspect_ratio`, `seed`, `count`, `quality` |
| seedance-1-0-lite | Lightweight Seedance with faster inference while maintaining excellent quality | video | doubao-seedance-1-0-pro-fast | 9 credits | `prompt`, `image_url`, `aspect_ratio`, `duration`, `seed`, `quality` |
| seedance-1-0-pro | Professional image-to-video with fluid motion and high fidelity | video | seedance-1.5-pro | 14 credits | `prompt`, `image_url`, `aspect_ratio`, `duration`, `seed`, `quality` |
| seedance-2-0 | Next-gen image-to-video with upgraded motion dynamics and detail | video | seedance-2.0-text-to-video | 16 credits | `prompt`, `image_url`, `aspect_ratio`, `duration`, `seed`, `quality` |
| seedream-4-0 | Seedream 4.0 with balanced quality and cost for most scenarios | image | doubao-seedream-4.0 | 4 credits | `prompt`, `aspect_ratio`, `seed`, `count`, `quality` |
| seedream-5-0 | ByteDance Seedream 5.0 with commercial-grade image quality | image | doubao-seedream-5.0-lite | 5 credits | `prompt`, `aspect_ratio`, `seed`, `count`, `quality` |
| skyreels-api | Cinematic video generation with advanced camera movement and scene composition | video | skyreels-v4-std | 13 credits | `prompt`, `negative_prompt`, `aspect_ratio`, `duration`, `seed`, `quality` |
| trellis-2 | Enhanced 3D generation with improved geometry structure and texture quality | 3d | trellis-2 | 10 credits | `prompt`, `image_url`, `seed` |
| trellis-3d | Generate high-quality 3D assets from images or text descriptions | 3d | trellis-3d | 8 credits | `prompt`, `image_url`, `seed` |
| udio | High-quality music generation with natural vocals for professional production | audio | suno-v5 | 4 credits | `prompt`, `duration` |
| veo-3 | Google's advanced video generation with stunning visual fidelity | video | veo3 | 20 credits | `prompt`, `image_url`, `aspect_ratio`, `duration`, `quality` |
| veo-3-1 | Refined Veo with improved temporal consistency and longer videos | video | veo3.1-pro | 22 credits | `prompt`, `image_url`, `end_image_url`, `aspect_ratio`, `duration`, `quality` |
| wan-2-1 | Open-source video generation with strong performance and flexibility | video | wan2.6-text-to-video | 8 credits | `prompt`, `negative_prompt`, `aspect_ratio`, `duration`, `seed`, `quality` |
| wan-2-2 | Enhanced Wan model with higher motion quality and longer sequence support | video | wan2.6-text-to-video | 9 credits | `prompt`, `negative_prompt`, `aspect_ratio`, `duration`, `seed`, `quality` |
| wan-2-5 | Mid-range Wan model optimized for quality-speed balance | video | wan2.6-text-to-video | 10 credits | `prompt`, `negative_prompt`, `aspect_ratio`, `duration`, `seed`, `quality` |
| wan-2-6 | Latest generation Wan with state-of-the-art video synthesis capabilities | video | wan2.6-text-to-video | 12 credits | `prompt`, `negative_prompt`, `aspect_ratio`, `duration`, `seed`, `quality` |
| z-image-turbo | Ultra-fast Z Image Turbo generation | image | z-image-turbo | 2 credits | `prompt`, `aspect_ratio`, `seed`, `quality` |
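
The catalog can be used locally to check a parameter set and estimate credits before a request is sent. The sketch below is illustrative only. It copies three rows from the table, reads the Supported Params column as separate identifiers, and assumes the listed cost is charged once per call, which the page does not state. The names `CatalogEntry`, `CATALOG`, `unsupported_params`, and `total_cost` are hypothetical helpers, not part of any platform SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    category: str
    provider: str
    cost_credits: int
    params: frozenset


# Three rows copied from the table; extend with further rows as needed.
CATALOG = {
    "deepseek": CatalogEntry(
        "llm", "deepseek-chat", 1,
        frozenset({"text", "temperature", "max_tokens"}),
    ),
    "flux-api": CatalogEntry(
        "image", "flux-2-flex", 4,
        frozenset({"prompt", "negative_prompt", "aspect_ratio",
                   "seed", "count", "quality"}),
    ),
    "veo-3": CatalogEntry(
        "video", "veo3", 20,
        frozenset({"prompt", "image_url", "aspect_ratio",
                   "duration", "quality"}),
    ),
}


def unsupported_params(model: str, params: dict) -> list:
    """Return the sorted parameter names the catalog does not list for `model`."""
    entry = CATALOG[model]
    return sorted(set(params) - entry.params)


def total_cost(models: list) -> int:
    """Sum listed credits, assuming one charge per call (an assumption)."""
    return sum(CATALOG[m].cost_credits for m in models)
```

For example, `unsupported_params("veo-3", {"prompt": "a sunrise", "seed": 7})` flags `seed`, because the table lists no `seed` parameter for veo-3.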