[MiniGPT4] Add MiniGPT4 to SHARK #1554

Merged
4 commits merged into nod-ai:main from minigpt4_v1 on Jul 25, 2023
Conversation

Abhishek-Varma
Contributor

-- This is the first installment of MiniGPT4 in SHARK.

Signed-off-by: Abhishek Varma [email protected]

Download the vmfbs from SHARK Tank

The script currently expects different vmfb names, so after downloading you need to rename the files accordingly, e.g. the latest fp16 Vision Model's vmfb becomes minigpt4_vision_model_fp16_cuda.vmfb. (I'm working on refactoring all of this to get rid of the manual downloading; this is a WIP patch.)

Disclaimer: this is a pretty big WIP patch.
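As a hedged illustration of the manual step described above (not code from this PR), the download-and-rename could look roughly like the stdlib-only sketch below; the Tank URL and the name mapping are placeholders, not the actual artifact locations.

```python
# Sketch only: the SHARK Tank URL and the name mapping below are placeholders,
# not the actual artifact locations used by this PR.
import urllib.request
from pathlib import Path

TANK_BASE = "https://example.com/shark_tank/minigpt4"  # placeholder base URL
EXPECTED_NAMES = {
    # remote artifact name   -> name the current script expects
    "vision_model_fp16.vmfb": "minigpt4_vision_model_fp16_cuda.vmfb",
}

def fetch_and_rename(dest_dir: str = ".") -> None:
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    for remote_name, expected_name in EXPECTED_NAMES.items():
        target = dest / expected_name
        if target.exists():
            continue  # already downloaded and renamed, skip
        urllib.request.urlretrieve(f"{TANK_BASE}/{remote_name}", str(target))

if __name__ == "__main__":
    fetch_and_rename()
```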

@powderluv
Contributor

Let's call this "MultiModal" in the UI, since I assume we will want to switch out LLMs etc. in the future.

shark/shark_importer.py (review thread, outdated, resolved)
shark/shark_inference.py (review thread, outdated, resolved)
shark/shark_runner.py (review thread, outdated, resolved)
@Abhishek-Varma force-pushed the minigpt4_v1 branch 6 times, most recently from 4bdd01a to 0d7f62a on June 30, 2023 16:50
@Abhishek-Varma force-pushed the minigpt4_v1 branch 2 times, most recently from a7de215 to 14a2d31 on July 7, 2023 13:04
@Abhishek-Varma changed the title from [WIP][MiniGPT4] Add MiniGPT4 to SHARK to [MiniGPT4] Add MiniGPT4 to SHARK on Jul 7, 2023
@Abhishek-Varma marked this pull request as ready for review on July 7, 2023 15:45
@Abhishek-Varma
Contributor Author

Hi @powderluv - we can merge this and handle bugs in a subsequent PR.

  1. I've used a source build of IREE to compile the first/second llama in fp16 for CUDA (a rough sketch of this compile step is at the end of this comment).
  2. I've verified it works in both the CLI and the WebUI on Linux CUDA.
  3. As a workaround, I've cloned the llama model used for initialising the first/second instances - after compiling the first llama for fp16 we change the fx graph, and that was crashing the second llama.

In the subsequent PR I'll try to handle your comment, automate certain aspects of running this script that currently require manual intervention, and try to get rid of other parts of this still-huge code base.

Also, I'll have to try int8 - let me know your thoughts on that as well!
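Point 1 above (compiling the llama modules in fp16 for CUDA with a source build of IREE) corresponds roughly to the sketch below. This is an assumption-laden illustration using IREE's Python compiler API, not the exact code in this PR; the input/output file names are placeholders.

```python
# Rough sketch of compiling an exported MLIR module to a CUDA vmfb with
# IREE's Python compiler API. File names are placeholders and the exact
# flags used in this PR may differ.
import iree.compiler as ireec

def compile_to_cuda_vmfb(mlir_path: str, vmfb_path: str) -> None:
    # target_backends=["cuda"] selects IREE's CUDA HAL target; additional
    # options (e.g. the target SM architecture) could be passed via extra_args.
    flatbuffer_blob = ireec.compile_file(
        mlir_path,
        target_backends=["cuda"],
    )
    with open(vmfb_path, "wb") as f:
        f.write(flatbuffer_blob)

if __name__ == "__main__":
    compile_to_cuda_vmfb(
        "first_llama_fp16.mlir",               # placeholder input
        "minigpt4_first_llama_fp16_cuda.vmfb",  # placeholder output
    )
```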

@Abhishek-Varma force-pushed the minigpt4_v1 branch 3 times, most recently from f9ce1ca to 1495997 on July 12, 2023 12:29
@powderluv enabled auto-merge (squash) on July 12, 2023 16:37
@Abhishek-Varma force-pushed the minigpt4_v1 branch 3 times, most recently from ef2c983 to 5ac977d on July 17, 2023 06:45
@powderluv
Contributor

Please lint and we can merge it in and iterate.

@Abhishek-Varma
Contributor Author

> Please lint and we can merge it in and iterate.

Hi @powderluv
I've formatted the said files.
We can merge this in.

@powderluv
Contributor

Is this ready to merge?

-- This is the first installment of MiniGPT4 in SHARK.

Signed-off-by: Abhishek Varma <[email protected]>
-- This commit adds int8 support for MiniGPT4.

Signed-off-by: Abhishek Varma <[email protected]>
@Abhishek-Varma
Contributor Author

> Is this ready to merge?

Hi @powderluv .
This can be merged. Since I was working on a separate branch for int8 and shared that yesterday, I've added the extra set of commits on top of the fp16/fp32 commit.

The CI failure looks unrelated.

@powderluv disabled auto-merge on July 25, 2023 16:42
@powderluv merged commit 47f8a79 into nod-ai:main on Jul 25, 2023
2 of 6 checks passed