Video, Audio & Voice Generation
AI Media Studio offers powerful tools for creating videos, sound effects, and voiceovers to bring your content to life.
Video Generation
How It Works
Create videos from text descriptions or animate existing images using AI.
Generation Options
- Text-to-Video - Describe a scene and the AI creates the video
- Image-to-Video - Upload an image and add motion
Quality Tiers
- Standard Quality (Veo 2) - 400 sparks
  - Good for testing and drafts
  - 720p resolution
  - 5-10 second clips
- Premium Quality (Veo 3) - 600 sparks
  - Professional output
  - 1080p resolution
  - Up to 15 seconds
  - Better motion quality
Video Prompting Tips
Focus on a Single Action:
- Good: "A bird flying over a calm lake at sunset"
- Avoid: "A bird lands, walks around, then flies away"
Describe Camera Movement:
- "Slow zoom in on a flower"
- "Camera panning across a city skyline"
- "Static shot of waves crashing"
Include Environment Details:
- Lighting: "golden hour", "overcast", "dramatic shadows"
- Mood: "peaceful", "energetic", "mysterious"
- Setting: "forest", "city street", "abstract space"
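The tips above can be combined into one prompt. The sketch below is only an illustration of that idea: `build_video_prompt` and its parameter names are hypothetical helpers, not part of AI Media Studio, and the comma-separated layout is an assumption about what reads well rather than a required format.

```python
def build_video_prompt(action, camera="", setting="", lighting="", mood=""):
    """Combine one action with optional camera and environment details.

    Hypothetical helper: the parameter names and the comma-separated
    layout are assumptions, not a format the product requires.
    """
    details = [action.strip()]
    if camera:
        details.append(camera.strip())
    if setting:
        details.append(f"{setting.strip()} setting")
    if lighting:
        details.append(f"{lighting.strip()} lighting")
    if mood:
        details.append(f"{mood.strip()} mood")
    return ", ".join(details)


prompt = build_video_prompt(
    "A bird flying over a calm lake",
    camera="slow zoom in",
    lighting="golden hour",
    mood="peaceful",
)
print(prompt)
```

Keeping the action as a single required argument mirrors the advice to focus on one action per clip; the remaining details are optional and can be left out.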
Best Practices
- Start with 5-second clips to test concepts
- Keep motion simple - complex actions may look unnatural
- Use Premium for final deliverables
- Generate multiple variations and pick the best one
Common Video Use Cases
- Course introductions and outros
- Product demonstrations
- Concept explanations
- Background visuals for presentations
- Social media content
Audio Generation (Sound Effects)
How It Works
Create realistic sound effects from text descriptions using ElevenLabs AI audio.
Spark Cost
- 6 sparks per sound effect
- Duration: 0.5 to 22 seconds
- Format: MP3, 44.1kHz, 128kbps
Audio Prompting Tips
Be Descriptive About the Sound:
- Good: "Thunder rumbling in the distance with light rain"
- Avoid: "Storm sounds"
Include Context:
- Environment: "indoor", "outdoor", "echoing cave"
- Distance: "close-up", "distant", "approaching"
- Intensity: "loud", "soft", "subtle", "intense"
Specify Duration When Needed:
- "Short doorbell ring (2 seconds)"
- "Long ocean waves ambience (20 seconds)"
- Leave blank for auto-detection
Creativity Control
Adjust prompt influence (0-1):
- Low (0.2-0.3) - More realistic, closer to training data
- Medium (0.5) - Balanced creativity and realism
- High (0.7-0.9) - More experimental and unique
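If you keep track of sound effect settings outside the app, a small check like the one below can catch out-of-range values early. It is a minimal sketch: `SoundEffectSettings`, `make_settings`, and the preset values 0.25, 0.5, and 0.8 are hypothetical choices picked from inside the ranges listed above, and the duration limits are the 0.5 to 22 seconds stated earlier in this guide.

```python
from dataclasses import dataclass
from typing import Optional

# Limits taken from this guide: 0.5 to 22 seconds per sound effect.
MIN_DURATION_S = 0.5
MAX_DURATION_S = 22.0

# Assumed representative values inside the Low / Medium / High ranges.
CREATIVITY_PRESETS = {"low": 0.25, "medium": 0.5, "high": 0.8}


@dataclass
class SoundEffectSettings:
    prompt: str
    prompt_influence: float
    duration_s: Optional[float] = None  # None means auto-detect


def make_settings(prompt, creativity="medium", duration_s=None):
    """Validate inputs and return settings for one sound effect."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if creativity not in CREATIVITY_PRESETS:
        raise ValueError(f"creativity must be one of {sorted(CREATIVITY_PRESETS)}")
    if duration_s is not None and not (MIN_DURATION_S <= duration_s <= MAX_DURATION_S):
        raise ValueError("duration must be between 0.5 and 22 seconds")
    return SoundEffectSettings(prompt.strip(), CREATIVITY_PRESETS[creativity], duration_s)


settings = make_settings("Short doorbell ring", creativity="low", duration_s=2)
print(settings)
```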
Example Prompts
- "Keyboard typing in a quiet office"
- "Car engine starting and driving away"
- "Cheerful notification ping"
- "Footsteps on wooden floor"
- "Coffee machine brewing espresso"
- "Wind rustling through leaves"
- "Applause from a small audience"
Best Practices
- Generate variations if the first result isn't perfect
- Use lower creativity for common sounds (door close, bell)
- Use higher creativity for abstract or unique sounds
- Test in your target application before finalizing
Voice Generation (Text-to-Speech)
How It Works
Convert text into natural-sounding voiceovers using professional AI voices.
Spark Cost
- 15 sparks per voiceover
- Up to 500 characters per generation
- Format: MP3, high-quality audio
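Scripts longer than 500 characters need to be split across several generations. The sketch below shows one way to do that at sentence boundaries and to estimate the spark cost, using the 500-character limit and the 15-spark price listed above. `split_script` and `voiceover_cost` are hypothetical helpers, and the sentence detection is a simple assumption (it splits after `.`, `!`, or `?` followed by whitespace).

```python
import re

MAX_CHARS = 500            # per-generation limit from this guide
SPARKS_PER_VOICEOVER = 15  # per-generation cost from this guide


def split_script(text, max_chars=MAX_CHARS):
    """Split a script into chunks of at most max_chars, at sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        # A single sentence over the limit is hard-split (may cut mid-word).
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def voiceover_cost(text):
    """Estimate sparks needed to voice the whole script."""
    return len(split_script(text)) * SPARKS_PER_VOICEOVER


script = "Welcome to this course on project management. Let's get started!"
print(split_script(script), voiceover_cost(script))
```

Splitting at sentence ends keeps each generation ending on a natural pause, which fits the punctuation advice later in this guide.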
Voice Options
Preset Voices
Choose from a library of professional voices:
- Male and female options
- Various ages and accents
- Different tones (friendly, authoritative, energetic)
Custom Voice Design
Create your own voice by specifying:
- Gender: Male, female, neutral
- Age: Young, middle-aged, mature
- Accent: American, British, Australian, etc.
- Tone: Friendly, professional, enthusiastic, calm
Text Formatting for Best Results
Use Proper Punctuation
- Periods for natural pauses
- Commas for shorter pauses
- Question marks for rising intonation
- Exclamation points for emphasis
SSML Support (Coming Soon)
- Control speed, pitch, and emphasis
- Add breaks and pronunciations
Writing Script Tips
- Conversational Tone: Write the way you'd speak, not the way you'd write formally
- Short Sentences: Easier to deliver naturally
- Avoid Jargon: May be mispronounced
- Include Context: Mention the tone in the prompt ("enthusiastic delivery", "calm narration")
Common Use Cases
- Course narration and explanations
- Video voiceovers
- Podcast intros and outros
- Audiobook samples
- Phone system messages
- Character voices for stories
Example Texts
eLearning Intro:
"Welcome to this course on project management. Over the next few modules, you'll learn essential skills for leading successful projects. Let's get started!"
Product Demo:
"Here's how easy it is to create stunning visuals. Simply type your description, hit generate, and watch as AI brings your ideas to life in seconds."
Call-to-Action:
"Ready to transform your content? Sign up today and get your first 100 sparks free. No credit card required."
Combining Media Types
Complete Video Project
- Generate video clip (premium quality)
- Generate sound effects to match action
- Generate voiceover narration
- Combine in video editor
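Before starting a project like this, it can help to add up the sparks it will use. The sketch below uses the per-item costs listed in this guide (400 or 600 sparks per video, 6 per sound effect, 15 per voiceover); `SPARK_COSTS` and `estimate_sparks` are hypothetical names, and image costs are left out because this guide does not list them.

```python
# Per-item spark costs as listed in this guide.
SPARK_COSTS = {
    "video_standard": 400,
    "video_premium": 600,
    "sound_effect": 6,
    "voiceover": 15,
}


def estimate_sparks(items):
    """Return total sparks for a mapping of item type to quantity."""
    unknown = set(items) - set(SPARK_COSTS)
    if unknown:
        raise ValueError(f"no listed cost for: {sorted(unknown)}")
    return sum(SPARK_COSTS[kind] * count for kind, count in items.items())


# One premium clip, three sound effects, two voiceovers:
# 600 + 3 * 6 + 2 * 15 = 648 sparks.
total = estimate_sparks({"video_premium": 1, "sound_effect": 3, "voiceover": 2})
print(total)  # 648
```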
eLearning Module
- Generate character images
- Generate voiceover for dialogue
- Generate background audio ambience
- Assemble in authoring tool
Social Media Post
- Generate eye-catching image
- Generate short video clip
- Add voice narration
- Post to platforms
Quality and File Management
File Formats
- Video: MP4 (H.264 codec)
- Audio: MP3 (44.1kHz, 128-320kbps)
- Voice: MP3 (high-quality stereo)
Gallery Organization
- All media automatically saved to gallery
- Filter by type (video/audio/voice)
- Search by prompt or date
- Download in original quality
Commercial Rights
You own all content you generate:
- Use in commercial projects
- Include in client deliverables
- Distribute and publish freely
- No attribution required
Troubleshooting
Video Issues
- Unnatural Motion: Simplify the prompt and focus on a single action
- Poor Quality: Upgrade to Premium (Veo 3)
- Wrong Style: Add style keywords ("photorealistic", "animated")
Audio Issues
- Wrong Sound: Be more specific in description
- Too Short/Long: Specify duration in prompt
- Not Realistic: Lower creativity setting
Voice Issues
- Mispronunciation: Spell the word phonetically or choose a different voice
- Wrong Tone: Specify desired emotion in prompt
- Unnatural Pauses: Check punctuation