⚠️ Ethical Use & Disclaimer

This model is a technical tool designed for Digital Identity Research, Professional VFX Workflows, and Cinematic Prototyping.

By downloading or using this LoRA, you acknowledge and agree to the following:

  • Intended Use: Designed for filmmakers, VFX artists, and researchers exploring high-fidelity video identity transformation.
  • Consent & Rights: You must possess explicit legal consent and all necessary rights from any individual whose likeness is being processed.
  • Legal Compliance: You are fully responsible for complying with all local and international laws regarding synthetic media.
  • Liability Waiver: This model is provided “as is.” As the creator (Alissonerdx), I assume no responsibility for misuse. Any legal, ethical, or social consequences are solely the responsibility of the end user.

📺 Video Examples (V1)

Generated using the Frame 0 Anchoring Technique. All examples follow the guide video's motion while preserving the identity provided in the first frame.

  • Example 1
  • Example 2
  • Example 3
  • Example 4
  • Example 5

🛠 Technical Background (V1)

To achieve this level of identity transfer, I heavily modified the official LTX-2 training scripts.

Key Improvements

  • Novel Conditioning Injection: Custom latent injection methods for reference identity stabilization.
  • Noise Distribution Overhaul: Implemented a custom High-Noise Power Law timestep distribution, forcing the model to prioritize target identity reconstruction over guide-video context (sketched after this list).
  • Training Compute: 60+ hours of training on NVIDIA RTX PRO 6000 Blackwell GPUs, iterating through 300GB+ of experimental checkpoints.
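
For intuition, here is a minimal sketch of a high-noise power-law sampler, assuming a trainer that draws a continuous timestep t in [0, 1] per sample and that t = 1 corresponds to maximum noise; the exponent value is illustrative, not the one used for this LoRA:

```python
import torch

def sample_timesteps(batch_size: int, alpha: float = 3.0) -> torch.Tensor:
    # u ~ Uniform(0, 1); t = u ** (1 / alpha) concentrates mass near t = 1
    # for alpha > 1, so training spends most steps on heavily noised latents
    # where the target identity must be reconstructed rather than copied
    # from the guide-video context.
    u = torch.rand(batch_size)
    return u ** (1.0 / alpha)
```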

📊 Dataset Specifications

V1 Dataset

  • 300 high-quality head swap video pairs
  • Trained on 512x512 buckets
  • Primarily Landscape format
  • Optimized for close-up framing

Wide shots may reduce identity fidelity.


💡 Inference Guide (V1)

🔴 CRITICAL – Frame 0 Requirement

This version was trained to use Frame 0 as the identity anchor.

You MUST prepare the first frame correctly.

Recommended Workflow

  1. Perform a high-quality head swap on Frame 0.
  2. Use that processed frame as conditioning input.
  3. Run the full video generation.

For best results, prepare Frame 0 using my previous BFS Image Models. A sketch of this preparation step follows.
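
Assuming OpenCV is available, something like this covers steps 1 and 2; `run_head_swap` is a hypothetical placeholder for whatever image head-swap tool you use:

```python
import cv2

def run_head_swap(frame):
    """Hypothetical placeholder: substitute your image head-swap model here."""
    raise NotImplementedError

def prepare_frame0(guide_video_path: str, out_path: str = "frame0_swapped.png") -> str:
    cap = cv2.VideoCapture(guide_video_path)
    ok, frame0 = cap.read()           # Frame 0 is the identity anchor
    cap.release()
    if not ok:
        raise RuntimeError("Could not read the first frame of the guide video")
    swapped = run_head_swap(frame0)   # step 1: high-quality head swap on Frame 0
    cv2.imwrite(out_path, swapped)    # step 2: save as the conditioning input
    return out_path
```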


Optimization

LoRA Strength

  • 1.0 → Best motion fidelity
  • >1.0 → Stronger identity and hair capture, but may distort the original motion

Multi-Pass Workflows

You can experiment with multiple passes at different strengths, as sketched below.
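
As one possible schedule, a two-pass sketch assuming a diffusers-style pipeline with this LoRA loaded under the adapter name "bfs"; the call signature of `pipe` is hypothetical, not the actual LTX-2 API:

```python
def two_pass(pipe, guide_video, frame0, prompt="head swap"):
    # Pass 1 at strength 1.0 favors motion fidelity.
    pipe.set_adapters(["bfs"], adapter_weights=[1.0])
    draft = pipe(prompt=prompt, video=guide_video, image=frame0).frames
    # Pass 2 above 1.0 reinforces identity and hair on the draft,
    # at some risk of distorting the original motion.
    pipe.set_adapters(["bfs"], adapter_weights=[1.2])
    return pipe(prompt=prompt, video=draft, image=frame0).frames
```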

Prompting

Detailed prompts currently have no effect.

Trigger remains:

head swap

⚠️ Known Issues (V1 – Alpha)

  • Identity Leakage: Hair from the guide video may reappear.
  • Hard Cuts: Jump cuts can reset identity.
  • Portrait Format: Performance is significantly better in landscape.

🚀 Version 2 – Major Update

V2 introduces a complete redesign of conditioning strategy and masking logic, significantly improving identity robustness and reducing leakage.


🔹 Multiple Conditioning Modes (Using First Frame)

V2 supports multiple identity injection approaches:

1️⃣ Direct Photo Conditioning

Use a clean photo of the new face as reference input.

This method works and can produce strong results. However, because the model must internally reconcile lighting, perspective, depth, and occlusion differences, it may need to "fight" to correctly integrate the new identity into the guide video. In some cases, this can reduce stability or identity consistency.

2️⃣ First-Frame Head Swap (Recommended)

Applying a proper head swap on Frame 0 still produces extremely strong and reliable results.

Because the first frame is already structurally correct (pose, lighting, depth, occlusions), the model has significantly less work to do. Instead of forcing alignment from a static photo, it simply propagates and stabilizes the identity through time.

This approach generally:

  • Produces higher identity fidelity
  • Reduces deformation
  • Minimizes integration artifacts
  • Improves overall temporal stability

3️⃣ Automatic Magazine-Style Overlay

The new face is automatically cut and positioned over the guide face using mask alignment. This simulates a "magazine cutout" overlay, but performed automatically based on mask positioning.
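
A rough manual equivalent with Pillow, assuming you already have a binary mask of the guide face; file names are illustrative:

```python
from PIL import Image

guide = Image.open("guide_frame0.png").convert("RGB")
new_face = Image.open("new_face.png").convert("RGB")
mask = Image.open("guide_face_mask.png").convert("L")   # white = guide face

# Resize the new face to the mask's bounding box and paste it only
# where the mask is set, approximating the "magazine cutout" overlay.
x0, y0, x1, y1 = mask.getbbox()
region = mask.crop((x0, y0, x1, y1))
guide.paste(new_face.resize((x1 - x0, y1 - y0)), (x0, y0), region)
guide.save("frame0_overlay.png")
```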

4️⃣ Manual Overlay

Advanced users may manually composite the new face over Frame 0 before running inference.


🔹 Facial Motion Behavior (Important Change)

Unlike V1:

V2 does NOT follow the original guide face’s facial micro-movements.

The guide face is fully masked to prevent identity leakage.

This makes masking quality critical.

Mask Requirements

  • The guide face MUST be completely covered.
  • The mask color must be a magenta tone (see the sketch after this list).
  • Any visible guide identity may leak into the final output.
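
A minimal sketch of applying that cover per frame, assuming an HxWx3 uint8 RGB frame, a matching binary mask, and pure magenta (255, 0, 255) as the "magenta tone":

```python
import numpy as np

MAGENTA = np.array([255, 0, 255], dtype=np.uint8)

def cover_guide_face(frame: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """frame: HxWx3 uint8 RGB; mask: HxW, nonzero where the guide face is."""
    out = frame.copy()
    out[mask.astype(bool)] = MAGENTA   # fully hide the guide identity
    return out
```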

🔹 Mask Types

Users may alternate between:

▪ Square Masks

  • More stable identity
  • Better consistency
  • Often produce stronger overall results
  • May generate slightly oversized heads due to spatial padding

In most scenarios, square masks tend to perform better because they give the model additional spatial context for reconstructing structure and hair. A sketch for deriving one follows.
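
A minimal sketch of expanding a tight face mask into a padded square mask; the padding value is illustrative, not a tuned setting:

```python
import numpy as np

def to_square_mask(tight_mask: np.ndarray, pad: int = 32) -> np.ndarray:
    ys, xs = np.nonzero(tight_mask)
    cy, cx = (ys.min() + ys.max()) // 2, (xs.min() + xs.max()) // 2  # face center
    side = max(ys.max() - ys.min(), xs.max() - xs.min()) + 2 * pad   # padded square side
    h, w = tight_mask.shape
    top, left = max(cy - side // 2, 0), max(cx - side // 2, 0)
    out = np.zeros_like(tight_mask)
    out[top:min(top + side, h), left:min(left + side, w)] = 1
    return out
```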

▪ Tight / Adjusted Masks

  • More natural head proportions
  • May deform if guide head shape differs significantly
  • Sensitive to long-hair mismatches

If the original guide has long hair and the new identity does not, deformation risk increases.



🔹 Dataset & Training Improvements (V2)

  • 800+ video pairs
  • Trained at 768 resolution
  • 768 is the recommended inference resolution
  • Improved hair stability
  • Reduced identity leakage compared to V1
  • More robust identity transfer under motion

🔹 First Pass vs Second Pass

You may:

  • Run a single pass at 768 (recommended)
  • Run a downscaled first pass followed by an upscaling second pass

⚠️ Important:

A second pass may alter the identity established in the first pass and reduce consistency in some cases.


🔹 Trigger

Trigger remains:

head swap

🎬 Upcoming Demonstration Video

A full workflow breakdown will be shared soon, covering:

  • Mask preparation best practices
  • Conditioning variations comparison
  • First pass vs second pass differences
  • Failure cases and correction strategies


🔴 Critical Success Factor (V2)

In this new version, the single most important factor is mask quality.

Everything depends on the mask.

  • Absolutely no detail from the original guide face can leak.
  • There must be no visible facial fragments.
  • Avoid small holes, gaps, or transparency artifacts.
  • Ensure full coverage of skin, facial hair, eyebrows, and hairline when necessary.

If any portion of the original identity remains visible, the model may reintroduce it during generation.
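
A quick sanity check along these lines can catch holes and gaps before inference (a sketch assuming SciPy and a binary HxW mask):

```python
import numpy as np
from scipy.ndimage import binary_fill_holes

def check_mask_coverage(mask: np.ndarray) -> None:
    m = mask.astype(bool)
    if not m.any():
        print("Warning: mask is empty")
        return
    holes = binary_fill_holes(m) & ~m   # interior pixels left uncovered
    if holes.any():
        print(f"Warning: {int(holes.sum())} hole pixel(s) inside the mask")
```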

Mask precision directly determines:

  • Identity stability
  • Leakage prevention
  • Deformation resistance
  • Overall realism

Take time to refine your mask. A high-quality mask will produce dramatically better results than increasing LoRA strength.


🔧 Advanced Technique: Combine with LTX-2 Inpainting

Advanced users can experiment with combining this LoRA with the native LTX-2 inpainting workflow.

This can help:

  • Refine problematic areas
  • Correct small deformation zones
  • Improve edge blending
  • Recover detail in hair or jaw regions

When properly combined, inpainting can significantly enhance final output quality, especially in challenging frames.


💙 Support

Maintaining R&D and renting Blackwell GPUs is expensive.

If this project helps you, consider supporting the development of:

  • V3 improvements
  • Advanced conditioning pipelines
  • SAM 3 integration
  • Full reference-photo-only workflows

Support here:

https://buymeacoffee.com/nrdx
