Wan 2.2 Animate and Replace work best when you understand the division of labor: the image defines the subject, the source video defines the motion or scene timing, and the prompt refines identity, realism, stability, and integration.
Reference subject + source motion or scene + preservation instructions + integration instructions + artifact control
Unlike pure text-to-video, Animate and Replace already inherit structure from the source media. The prompt should refine and stabilize that structure rather than try to reinvent the whole scene.
Preserve first, enhance second.
Identity + what stays fixed + what transfers + scene match + cleanup constraints
Lock the face, hair, clothing, proportions, and defining visual traits from the reference image.
Let the motion, expressions, timing, or scene flow come from the source clip.
Preserve lighting, perspective, background, framing, and camera behavior where appropriate.
Name the artifacts to suppress: flicker, edge problems, warped anatomy, unstable hands, and drifting identity.
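The five-part formula above can be sketched as a small prompt builder. This is only an illustration of the ordering, assuming the prompt is passed to the model as a single text string. The `PromptParts` class, the `build_prompt` function, and the example wording are hypothetical and not part of any Wan 2.2 API.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    """Hypothetical container for the five prompt components."""
    identity: str      # who the subject is (from the reference image)
    stays_fixed: str   # traits that must not change
    transfers: str     # what comes from the source clip
    scene_match: str   # lighting, perspective, camera behavior
    cleanup: str       # artifacts to suppress


def build_prompt(parts: PromptParts) -> str:
    """Join the non-empty components in preserve-first order."""
    ordered = [
        parts.identity,
        parts.stays_fixed,
        parts.transfers,
        parts.scene_match,
        parts.cleanup,
    ]
    # Drop empty parts and normalize trailing periods before joining.
    cleaned = [p.strip().rstrip(".") for p in ordered if p.strip()]
    if not cleaned:
        return ""
    return ". ".join(cleaned) + "."


# Illustrative wording only; adjust to the actual reference image and clip.
example = PromptParts(
    identity="The woman from the reference image",
    stays_fixed="Keep her face, hair, clothing, and proportions unchanged",
    transfers="Follow the motion, expressions, and timing of the source clip",
    scene_match="Match the original lighting, perspective, and camera behavior",
    cleanup="No flicker, clean edges, stable hands, consistent identity",
)
print(build_prompt(example))
```

Keeping each component in its own field makes it easy to leave one out or reword it without disturbing the preserve-first order of the rest.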
Animate mode transfers motion from a driving video onto the subject in the reference image.
Replace mode swaps a performer in a video with the subject from the reference image while preserving the original scene timing and camera behavior.
These modes usually work best when you preserve the original scene structure instead of fighting it.
Animate = transfer motion onto the subject. Replace = swap the subject into the scene. In both cases, preserve first and stylize second.
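The distinction between the two modes, and the preserve-then-stylize ordering, can be sketched as follows. The `Mode` enum, the `compose` function, and the sentence wording are hypothetical helpers written for this illustration. They only assemble text and do not call any Wan 2.2 interface.

```python
from enum import Enum


class Mode(Enum):
    ANIMATE = "animate"
    REPLACE = "replace"


# One opening sentence per mode, paraphrasing the descriptions above.
MODE_INSTRUCTION = {
    Mode.ANIMATE: (
        "Transfer the motion from the driving video onto the subject "
        "in the reference image."
    ),
    Mode.REPLACE: (
        "Swap the performer in the video with the subject from the "
        "reference image, preserving the original scene timing and "
        "camera behavior."
    ),
}


def compose(mode: Mode, preserve: str, stylize: str = "") -> str:
    """Build a prompt: mode instruction, then preservation, then styling."""
    parts = [MODE_INSTRUCTION[mode], preserve, stylize]
    # Skip any empty section so optional styling can be omitted.
    return " ".join(p.strip() for p in parts if p.strip())


print(compose(
    Mode.REPLACE,
    preserve="Keep the face, hair, and clothing from the reference image.",
    stylize="Soft, natural color grading.",
))
```

Styling is an optional last argument here so that a prompt with only the mode instruction and the preservation clause is still valid.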