Inference with a list of prompts without re-loading the model each time #122

Open
JosephPai opened this issue Dec 12, 2024 · 5 comments

@JosephPai

Hi authors, I would like to run the model on a list of prompts in multi-GPU mode. To avoid re-loading the pre-trained model for each prompt, I modified sample_video.py with a for loop that iterates over the prompts.
However, the code works well for the first prompt but always fails on the second one.
Could you help look into this issue? Thanks.

import os
import time
from pathlib import Path
from loguru import logger
from datetime import datetime
import torch

from hyvideo.utils.file_utils import save_videos_grid
from hyvideo.config import parse_args
from hyvideo.inference import HunyuanVideoSampler


def main():
    args = parse_args()
    print(args)
    models_root_path = Path(args.model_base)
    if not models_root_path.exists():
        raise ValueError(f"`models_root` does not exist: {models_root_path}")
    
    # Create save folder to save the samples
    save_path = args.save_path if args.save_path_suffix == "" else f'{args.save_path}_{args.save_path_suffix}'
    os.makedirs(save_path, exist_ok=True)

    # Load models
    hunyuan_video_sampler = HunyuanVideoSampler.from_pretrained(models_root_path, args=args)
    
    # Get the updated args
    args = hunyuan_video_sampler.args

    for i in range(5):
        # Start sampling
        # TODO: batch inference check
        outputs = hunyuan_video_sampler.predict(
            prompt=args.prompt + f"_test_{i}",
            height=args.video_size[0],
            width=args.video_size[1],
            video_length=args.video_length,
            seed=args.seed,
            negative_prompt=args.neg_prompt,
            infer_steps=args.infer_steps,
            guidance_scale=args.cfg_scale,
            num_videos_per_prompt=args.num_videos,
            flow_shift=args.flow_shift,
            batch_size=args.batch_size,
            embedded_guidance_scale=args.embedded_cfg_scale
        )
        samples = outputs['samples']

        # Save samples
        if 'LOCAL_RANK' not in os.environ or int(os.environ['LOCAL_RANK']) == 0:
            for idx, sample in enumerate(samples):  # use `idx`, not `i`, to avoid shadowing the prompt-loop variable
                sample = sample.unsqueeze(0)
                time_flag = datetime.fromtimestamp(time.time()).strftime("%Y-%m-%d-%H:%M:%S")
                # Build the file path in a separate variable so `save_path` (the
                # output directory) is not overwritten on the first iteration
                out_path = f"{save_path}/{time_flag}_seed{outputs['seeds'][idx]}_{outputs['prompts'][idx][:100].replace('/','')}.mp4"
                save_videos_grid(sample, out_path, fps=24)
                logger.info(f'Sample saved to: {out_path}')

        torch.cuda.empty_cache()
        torch.distributed.barrier()

if __name__ == "__main__":
    main()

Error message:

HunyuanVideo/hyvideo/inference.py", line 63, in new_forward
[rank1]:     raise ValueError(f"Cannot split video sequence into ulysses_degree x ring_degree ({get_sequence_parallel_world_size()}) parts evenly")
[rank1]: ValueError: Cannot split video sequence into ulysses_degree x ring_degree (8) parts evenly
@tavyra

tavyra commented Dec 12, 2024

Does it work if you just pass prompt as a list, e.g. ["Prompt List", "List of Prompts"]? inference.py seems to have this built in, and your code would be doing the same thing except dumping the CUDA cache and trying to reload the pipeline on every loop. The docstring says:
Args: prompt (str or List[str]): The input text.
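
A minimal sketch of this suggestion, assuming predict() really accepts a List[str] as that docstring indicates (the prompts below are placeholders):

# Hypothetical usage: pass every prompt in a single predict() call,
# relying on the documented `prompt (str or List[str])` signature.
prompts = ["a cat playing piano", "a dog surfing a wave"]  # placeholder prompts
outputs = hunyuan_video_sampler.predict(
    prompt=prompts,                  # list instead of a single string
    height=args.video_size[0],
    width=args.video_size[1],
    video_length=args.video_length,
    seed=args.seed,
    infer_steps=args.infer_steps,
    guidance_scale=args.cfg_scale,
)
samples = outputs['samples']         # samples for all prompts in one pass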

@JosephPai
Author

@tavyra
According to this issue, it seems that this feature is not supported yet. (Sad...)

@feifeibear
Contributor

Passing a list of prompts as input is not currently supported, whether on a single GPU or on multiple GPUs with xDiT.

@guankaisi
Contributor

Hello, I have encountered the same problem as @JosephPai. Do you have a solution for it?

guankaisi added a commit to guankaisi/HunyuanVideo that referenced this issue Dec 16, 2024
In issue Tencent#122, every iteration of the for loop calls parallelize_transformer, which resets the pipeline and causes the problem. Moving parallelize_transformer into __init__ solves the issue without affecting other functions.
@guankaisi
Contributor

I found that this problem is caused by re-running the parallelize_transformer function on every call. I have solved it in #130.
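
A minimal sketch of the fix being described, with hypothetical structure (the real class in hyvideo/inference.py differs in detail; see #130 for the actual change):

# Hypothetical sketch: parallelize the transformer once at construction
# time instead of re-wrapping the pipeline inside every predict() call.
class HunyuanVideoSampler:
    def __init__(self, pipeline, args):
        self.pipeline = pipeline
        self.args = args
        if args.ulysses_degree > 1 or args.ring_degree > 1:
            # Wrap exactly once; re-wrapping on each loop iteration is
            # what made the second predict() call fail above.
            parallelize_transformer(self.pipeline)

    def predict(self, prompt, **kwargs):
        # No parallelize_transformer(...) call here anymore.
        return self.pipeline(prompt=prompt, **kwargs)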

JacobKong added a commit that referenced this issue Dec 18, 2024
Solve the issue #122 by updating inference.py