# Source notes — LTX 2.3 ComfyUI Character & Background Replacement Guide

- Video: [How to Replace Characters or Backgrounds in Videos with LTX 2.3](https://www.youtube.com/watch?v=5KX_1JkiNCE)
- Channel: [FutuTek](https://www.youtube.com/channel/UCXG5FVJlVLSHREQUE-ha5OA)
- Video ID: `5KX_1JkiNCE`
- Duration: 5:00
- Upload date from metadata: 2026-05-22
- Transcript segments: 52
- Local transcript: `/Users/alanwheat/.hermes/guide-publishing/youtube-5KX_1JkiNCE/transcript.txt`
- Local metadata: `/Users/alanwheat/.hermes/guide-publishing/youtube-5KX_1JkiNCE/metadata.txt`
- Retrieval date: 2026-05-22

## What the video contributed

- Workflow purpose: replace a video character from one reference image, or replace the background/environment while preserving performance and timing.
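- Settings described: mode selection (character or background replacement), video length, starting frame, FPS, resolution, and keep-original-voice (true preserves the source voice; false lets LTX generate a voice for the character).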
- Prompt method: user writes only character/environment/action; workflow extracts original dialogue and injects it into final prompt.
- Character replacement flow: Flux 2 Klein Edit generates the first output frame; IC LoRA replicates the source motion onto the new character.
- Background replacement flow: Flux 2 Klein Edit replaces first-frame environment; SAM3 isolates the main character; DWpose creates a clean pose reference; IC LoRA preserves movement inside the new scene.
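
The settings, prompt method, and stage order summarized above can be sketched in Python. This is a minimal illustration, not the workflow's actual node graph: the names `WorkflowSettings`, `build_final_prompt`, and `stages_for`, all default values, and the `Dialogue:` joining format are assumptions made for illustration. Only the setting names, the two modes, and the order of the stages come from the video.

```python
from dataclasses import dataclass


# Hypothetical names and defaults: the video lists these settings but does
# not state their values or how the workflow stores them.
@dataclass
class WorkflowSettings:
    mode: str = "character"           # "character" or "background"
    video_length: int = 5             # unit is not stated in the video
    start_frame: int = 0
    fps: int = 24
    width: int = 1280
    height: int = 720
    keep_original_voice: bool = True  # False lets LTX generate the voice


def build_final_prompt(user_prompt: str, dialogue: str) -> str:
    """Append the extracted dialogue to the user's description.

    The joining format is an assumption; the video only says the dialogue
    is read from the source video and inserted into the prompt.
    """
    user_prompt = user_prompt.strip()
    dialogue = dialogue.strip()
    if not dialogue:
        return user_prompt
    return f'{user_prompt} Dialogue: "{dialogue}"'


def stages_for(mode: str) -> list[str]:
    """Return the stage order described in the video for each mode."""
    if mode == "character":
        return ["Flux 2 Klein Edit first frame", "IC LoRA motion transfer"]
    if mode == "background":
        return [
            "Flux 2 Klein Edit first frame",
            "SAM3 character segmentation",
            "DWpose pose reference",
            "IC LoRA motion transfer",
        ]
    raise ValueError(f"unknown mode: {mode!r}")
```

For example, `stages_for("background")` lists the two extra steps (SAM3, then DWpose) that the background mode adds between the first-frame edit and the IC LoRA pass.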

## Description/resource links preserved

- Workflow & Resources — `Google Drive workflow/resources folder` — Import workflow JSON/resources from the author: https://ko-fi.com/fututek
- Flux 2 Klein Edit — `/ComfyUI/models/diffusion_models` — First-frame image editing for character/background integration: https://huggingface.co/silveroxides/FLUX.2-dev-fp8_scaled/resolve/main/flux-2-klein-9b-fp8mixed.safetensors
- Qwen text encoder for Flux Klein — `/ComfyUI/models/text_encoders` — Text encoder used by Flux Klein subgraph: https://huggingface.co/Comfy-Org/vae-text-encorder-for-flux-klein-9b/resolve/main/split_files/text_encoders/qwen_3_8b_fp8mixed.safetensors
- Flux2 VAE — `/ComfyUI/models/vae` — VAE for Flux Klein: https://huggingface.co/Comfy-Org/flux2-dev/resolve/main/split_files/vae/flux2-vae.safetensors
- MelBand RoFormer audio model — `/ComfyUI/models/diffusion_models` — Audio/dialogue extraction or separation support: https://huggingface.co/Kijai/MelBandRoFormer_comfy/resolve/main/MelBandRoformer_fp32.safetensors?download=true
- LTX 2.3 low-VRAM GGUF — `/ComfyUI/models/diffusion_models` — 12 GB VRAM path / quantized model: https://huggingface.co/vantagewithai/LTX-2.3-GGUF/blob/main/dev/ltx-2-3-22b-dev-Q4_K_M.gguf
- LTX 2.3 FP8 transformer — `/ComfyUI/models/diffusion_models` — 16 GB+ VRAM path: https://huggingface.co/Kijai/LTX2.3_comfy/blob/main/diffusion_models/ltx-2.3-22b-dev_transformer_only_fp8_scaled.safetensors
- Gemma 3 text encoder — `/ComfyUI/models/text_encoders` — LTX 2.3 text encoder: https://huggingface.co/Comfy-Org/ltx-2/resolve/main/split_files/text_encoders/gemma_3_12B_it_fp4_mixed.safetensors
- LTX text projection — `/ComfyUI/models/text_encoders` — LTX 2.3 text projection: https://huggingface.co/Kijai/LTX2.3_comfy/blob/main/text_encoders/ltx-2.3_text_projection_bf16.safetensors
- LTX audio VAE — `/ComfyUI/models/vae` — Audio VAE for LTX 2.3: https://huggingface.co/Kijai/LTX2.3_comfy/blob/main/vae/LTX23_audio_vae_bf16.safetensors
- LTX video VAE — `/ComfyUI/models/vae` — Video VAE for LTX 2.3: https://huggingface.co/Kijai/LTX2.3_comfy/blob/main/vae/LTX23_video_vae_bf16.safetensors
- Tiny VAE preview — `/ComfyUI/models/vae or preview-related folder` — Faster previews; verify workflow expected path: https://github.com/madebyollin/taehv/blob/main/safetensors/taeltx2_3.safetensors
- Spatial upscaler — `/ComfyUI/models/latent_upscale_models` — 2x latent spatial upscaling: https://huggingface.co/Lightricks/LTX-2.3/blob/main/ltx-2.3-spatial-upscaler-x2-1.0.safetensors
- Distilled LoRA — `/ComfyUI/models/loras` — Faster/distilled LTX generation: https://huggingface.co/Kijai/LTX2.3_comfy/blob/main/loras/ltx-2.3-22b-distilled-lora-dynamic_fro09_avg_rank_105_bf16.safetensors
- Camera movement LoRAs — `/ComfyUI/models/loras` — Optional movement/control LoRAs: https://github.com/Lightricks/LTX-2?tab=readme-ov-file
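
To check that the downloads landed in the folders listed above, a small script along these lines may help. It is a sketch under assumptions: the function name `missing_files` and the `EXPECTED_FILES` table are hypothetical, the filenames are taken from the download URLs and assume the files were not renamed, and only a subset of the list is covered. The GGUF and FP8 LTX transformers are alternatives, so they are left out of the table.

```python
from pathlib import Path

# Folder -> filenames, copied from the download links above (subset only).
# Assumes files keep their download names; the workflow may expect others.
EXPECTED_FILES = {
    "models/diffusion_models": [
        "flux-2-klein-9b-fp8mixed.safetensors",
        "MelBandRoformer_fp32.safetensors",
    ],
    "models/text_encoders": [
        "qwen_3_8b_fp8mixed.safetensors",
        "gemma_3_12B_it_fp4_mixed.safetensors",
        "ltx-2.3_text_projection_bf16.safetensors",
    ],
    "models/vae": [
        "flux2-vae.safetensors",
        "LTX23_audio_vae_bf16.safetensors",
        "LTX23_video_vae_bf16.safetensors",
    ],
    "models/latent_upscale_models": [
        "ltx-2.3-spatial-upscaler-x2-1.0.safetensors",
    ],
}


def missing_files(comfy_root: str) -> list[str]:
    """Return relative paths of expected files not found under comfy_root."""
    root = Path(comfy_root)
    missing = []
    for folder, names in EXPECTED_FILES.items():
        for name in names:
            if not (root / folder / name).is_file():
                missing.append(f"{folder}/{name}")
    return missing
```

Calling `missing_files("/path/to/ComfyUI")` with your own install path returns an empty list when every file in the table is present.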

## Transcript excerpt / timing map

```text
0:00 Today, I’m going to introduce a workflow that 
lets you replace any character in an existing  
0:05 video using just a single reference image.
You can swap the original character while  
0:11 fully preserving the performance, emotion, 
timing, and natural motion of the source video. 
0:16 This workflow brings any character to life 
by accurately replicating the performer’s  
0:20 facial expressions, eye movement, body 
language, and overall motion dynamics. 
0:25 From subtle facial expressions to eye contact, 
movement rhythm, and emotional delivery,  
0:29 every detail is preserved to create 
realistic and cinematic results. 
0:33 You also can completely transform the background 
and surrounding environment while keeping the  
0:37 main character perfectly consistent.
Everything around can shift dynamically  
0:41 to create a fresh, dramatic, 
and emotionally rich atmosphere,  
0:44 while the character still feels as if they 
truly belong in every world they step into.
0:50 I won’t cover the installation process for ComfyUI 
and LTX 2.3 again — just check the tutorial videos  
0:57 on my channel and follow the setup instructions 
there. Now let me explain how everything works. 
1:04 Inside the Settings section, you can choose 
between Character Replacement or Background  
1:09 Replacement. You can also configure the video 
length, starting frame, FPS, resolution, and  
1:16 decide whether to keep the original voice audio.
Set it to true if you want to preserve the  
1:22 original voice, or false if you want LTX to 
generate a more natural voice for the character. 
1:28 In the green Prompt section, you only need 
to describe the character, environment,  
1:33 and actions — no need to type the dialogue.
I already built a feature that automatically reads  
1:39 the dialogue from the original video and inserts 
it into the prompt for you, as shown in the black  
1:44 Final Prompt section. It’s surprisingly 
accurate… almost suspiciously accurate. 
1:51 The core idea behind this workflow is using IC 
LoRA to replicate the motion of the original  
1:57 character onto the new character. Because of 
that, the first frame of the input video must  
2:03 match the first frame of the output video almost 
perfectly, except for the replaced character. 
2:10 Normally, doing this manually with separate AI 
tools or editing software would be a complete  
2:15 nightmare. So I automated the whole process 
by integrating a Flux 2 Klein Edit subgraph.  
2:23 As you can see, the character replacement 
blends naturally into the original  
2:27 scene with very impressive accuracy, 
integrates perfectly into the scene,  
2:32 matching the lighting, visual style, 
and atmosphere in a very natural way. 
2:38 For Option 1, after reviewing output setting, 
selecting the character’s natural voice,  
2:44 inputting the initial prompt, verifying 
the audio, I simply press Run.
2:50 The workflow will automatically 
generate the first frame for output. 
2:54 At this point, if I’m not satisfied with the image 
generated by Flux, I can restart the process. But  
3:01 luckily… everything is going smoothly today.
Then the workflow reads the video dialogue,
3:07 injects it into the initial prompt, and
combines them into the final prompt.
3:12 After waiting a few minutes with Option 1, 
I get the result shown on screen. Honestly,  
3:18 I’m pretty happy with how this turned out.
Next, let’s move on to Option 2:  
3:29 Background Replacement.
First, I’ll review some basic settings  
3:34 to make sure everything is set correctly.
Since we’re only replacing the background  
3:38 this time, I’ll keep the original 
voice audio from the character. 
3:43 After pressing Run, the system 
automatically reads the dialogue  
3:46 and inserts it into the prompt step by step.
Then, Flux 2 Klein Edit generates the first  
3:53 output frame by replacing the background image to 
the first input frame. As you can see, the image  
4:00 quality looks fantastic, and the character 
blends naturally into the new environment. 
4:08 Then, inside the subgraph, the SAM3 
Segmentation node isolates only the  
4:14 main character, allowing the DWpose node to 
generate a single clean pose reference. This  
4:21 helps prevent random unwanted characters 
from appearing in the new background. 
4:26 From there, IC LoRA recreates and preserves 
the original character’s movements while  
4:32 smoothly integrating them into the 
newly generated animated environment. 
4:37 The final result is a video featuring the 
original character placed inside an entirely  
4:42 new scene - all generated quickly, naturally, and 
with minimal effort, as you can see on the screen. 
4:55 Hope you have fun creating with this workflow, 
and don’t forget to support the channel!
```
