-
Notifications
You must be signed in to change notification settings - Fork 1
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
* update the post * update the index
- Loading branch information
Showing
3 changed files
with
43 additions
and
0 deletions.
There are no files selected for viewing
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,43 @@ | ||
--- | ||
title: "PLM-Local" | ||
description: "Local Compilation of AMPLIFY on Metal" | ||
author: "Zachary Charlop-Powers" | ||
date: "2024-12-09" | ||
categories: [rust, ai, proteins] | ||
image: "images/cargo_run.jpg" | ||
--- | ||
|
||
|
||
# PLM-Local | ||
|
||
|
||
This is a quick update post. | ||
|
||
- Release of [ESMC](https://www.evolutionaryscale.ai/blog/esm-cambrian) | ||
- Began to port ESMC model in subcrate [ferritin-esm](https://github.com/zachcp/ferritin/tree/main/ferritin-esm) | ||
- Ported the model in [this PR](https://github.com/zachcp/ferritin/pull/66), and [this PR](https://github.com/zachcp/ferritin/pull/67) | ||
- However, when I went to implement the tokenizer, I ended up encountering some issues. Although ESMC is similar to ESM2 in that it is trained only on sequences, it shares a common codebase with ESM3 which uses structural information | ||
- The major implication is that the tokenizer is not as simple and self-contained, and porting requires either: | ||
- a full minimal reimplementation of ESM functionality or | ||
- a simpler tokenization scheme that deviates from the current codebase | ||
- While brainstorming on ESMC, I remembered that I hadn't implemented METAL for AMPLIFY; so [I did](https://github.com/zachcp/ferritin/pull/69) | ||
- This makes AMPLIFY really pleasantly fast! | ||
- I was therefore inspired to build and serve the binary via a new project, [plm-local](https://github.com/zachcp/plm-local), and can build and deploy to a Conda channel | ||
- h/t to [luizirber](https://bsky.app/profile/luizirber.bsky.social/post/3lcss7hz6v22k) for showing me this nice trick | ||
|
||
```shell | ||
pixi exec -c https://repo.prefix.dev/protein-language-local-public \ | ||
plm-local --protein-string MSVVGIDLGFQSCYVAVARAGGIETIANEYSDRCTPACISFGPKNR | ||
|
||
# that's pretty awesome. | ||
time pixi exec ... <as above> 0.53s user 0.65s system 42% cpu 2.800 total | ||
time pixi exec ... <as above> 0.53s user 0.27s system 172% cpu 0.464 total | ||
``` | ||
|
||
|
||
## What it looks like at the moment. | ||
|
||
Current CLI simply encodes and decodes the protein. But it can do it in half a second with no compilation step | ||
or dependencies! Boom. | ||
|
||
![](images/binary_conda.jpg) |