My Chaotic Journey with HunyuanVideo-Avatar
Nobody Should Ever Attempt This.
To put it bluntly, the HunyuanVideo-Avatar project is fundamentally unfinished "paperware" and, in my view, outright hype.
I naively trusted the marketing, cloned the repo, and tried to patch it up. After 10 hours of fighting the code, I reached a clear conclusion and deleted the HunyuanVideo-Avatar repository with a single `rm -rf`.

Why Are There No YouTube Reviews or GitHub Activity?
Eight months after its launch, the project remains silent; in the open-source world, that essentially means it's dead. Such silence is typical of big-tech AI labs, and Tencent is a prime example.
I suspect the promotional material was cherry-picked: the demo videos likely showcase a single miraculous output produced after dozens of runs on an internal supercomputing cluster.
- KPI-driven dump: Researchers probably wrote a paper claiming SOTA results, tossed the code onto GitHub for credibility, and never cared about polishing it for public use. The code I cloned was riddled with bugs and hard-coded paths, and required an outdated `transformers` version (4.5-4.6). Even though the requirements file states ">=4.50.0", any version outside 4.5-4.6 crashes. Months have passed, and the developers have neither updated the dependencies nor fixed the bugs.
- Hellish errors and unrealistic VRAM demands: The Spark reports 120 GB of unified memory, with peak inference hitting 80 GB, yet the process is suddenly terminated by SIGKILL (signal 9) with no OOM warning. Even with FP16 disabled and Griffin-quantized FP8, rendering a mere 17-frame (0.7 s) clip triggers the kill. This isn't a hardware issue; it points to a severe memory leak in the C++ kernels or the Python code, and the attention implementation is completely broken.
- Memory-leak complaints: Issue reports indicate that even an A100 can run out of memory, and a 30-minute run yields abysmal quality.
- Ridiculous PyTorch basics error (the `--cpu-offload` bug): The error `Cannot generate a cpu tensor from a generator of type cuda.` is a beginner's mistake. If `--cpu-offload` is implemented, every tensor and random generator must be moved to the CPU, but somewhere the code hard-codes `.cuda()` and never checks the device. In short, the feature shipped without any testing.
- Other attempts to patch the code revealed countless bugs and hacky workarounds, which I will not enumerate here.
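For context, the `--cpu-offload` failure mode is easy to reproduce in isolation. Here is a minimal sketch (my own illustration, not the repo's actual code) of the device-consistent pattern the offload path should have used: the random generator must live on the same device as the tensors it seeds.

```python
import torch


def make_generator(device: str, seed: int = 42) -> torch.Generator:
    # A torch.Generator is bound to a specific device; sampling a CPU
    # tensor with a CUDA generator raises exactly the kind of error
    # quoted above. Hard-coding .cuda() here is the bug.
    gen = torch.Generator(device=device)
    gen.manual_seed(seed)
    return gen


# Correct: pick ONE device and use it for both the generator and the tensor.
device = "cuda" if torch.cuda.is_available() else "cpu"
gen = make_generator(device)
noise = torch.randn(16, device=device, generator=gen)  # devices match, no error
```

With offload enabled, `device` would simply be `"cpu"` everywhere; the point is that the choice is made once and threaded through, never hard-coded per call site.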
My Bottom Line: Unfixable Code
I spent 10 hours trying to salvage the project, hoping Tencent’s engineers had left something usable. I’m writing this post so others don’t waste the same effort. The code is beyond repair.
If anyone actually got it to run, please share your hardware specs and results as evidence; I'll gladly apologize. I doubt that will happen.
My test environment:
- DGX Spark (GB10) with 120 GB unified memory, CUDA 13.0
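For anyone attempting a similar run: since the SIGKILL arrives with no OOM traceback, it helps to log the process's peak resident memory between inference steps yourself, so you can tell a monotonically growing leak from one legitimately huge allocation. A stdlib-only sketch (my own diagnostic, not part of the repo):

```python
import resource
import sys


def peak_rss_gib() -> float:
    """Peak resident set size of this process, in GiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in KiB on Linux but in bytes on macOS.
    if sys.platform == "darwin":
        peak /= 1024
    return peak / (1024 * 1024)


# Call this between denoising steps: steady growth across steps points to
# a leak; a single jump right before the kill points to a real allocation.
print(f"peak RSS: {peak_rss_gib():.2f} GiB")
```

On a unified-memory box like the Spark there is no separate VRAM counter, so process RSS is the number the kernel's OOM killer actually acts on.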