{"payload":{"feedbackUrl":"https://github.com/orgs/community/discussions/53140","repo":{"id":689773665,"defaultBranch":"main","name":"llamafile","ownerLogin":"Mozilla-Ocho","currentUserCanPush":false,"isFork":false,"isEmpty":false,"createdAt":"2023-09-10T21:12:32.000Z","ownerAvatar":"https://avatars.githubusercontent.com/u/117940224?v=4","public":true,"private":false,"isOrgOwned":true},"refInfo":{"name":"","listCacheKey":"v0:1719859662.0","currentOid":""},"activityList":{"items":[{"before":"1601118bde65ff4533e39fb75295014932a77ddf","after":"d7c8e33da0ccdd9a9361a33b8362860375374d9d","ref":"refs/heads/main","pushedAt":"2024-07-05T13:39:24.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"jart","name":"Justine Tunney","path":"/jart","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49262?s=80&v=4"},"commit":{"message":"Add support for JSON parameters to new server\n\nThis uses the Redbean JSON parser adapted for C++.","shortMessageHtmlLink":"Add support for JSON parameters to new server"}},{"before":"21a30bed3dbcf91a158376345699014d282c42c7","after":"1601118bde65ff4533e39fb75295014932a77ddf","ref":"refs/heads/main","pushedAt":"2024-07-04T21:55:39.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"jart","name":"Justine Tunney","path":"/jart","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49262?s=80&v=4"},"commit":{"message":"Revert \"Disable warmup\"\n\nThis reverts commit 21a30bed3dbcf91a158376345699014d282c42c7.\n\nSee #485","shortMessageHtmlLink":"Revert \"Disable warmup\""}},{"before":"cd84736433182552901f375edb57c436003c7208","after":"21a30bed3dbcf91a158376345699014d282c42c7","ref":"refs/heads/main","pushedAt":"2024-07-04T18:38:28.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"jart","name":"Justine Tunney","path":"/jart","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49262?s=80&v=4"},"commit":{"message":"Disable warmup\n\nThe llama.cpp warmup isn't needed. We have our own mmap() loader, with a\npregress bar, that activates when logging is enabled and stderr is a tty\n\n    llamafile -m model.gguf --cli -n 1 --log-disable\n\nAbove is an example of how to get the fastest cold start.\n\nFixes #485","shortMessageHtmlLink":"Disable warmup"}},{"before":"1346ef4010f2bdbf78cf5425d75036aff0c1e6d6","after":"cd84736433182552901f375edb57c436003c7208","ref":"refs/heads/main","pushedAt":"2024-07-01T18:47:36.000Z","pushType":"push","commitsCount":9,"pusher":{"login":"jart","name":"Justine Tunney","path":"/jart","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49262?s=80&v=4"},"commit":{"message":"Release llamafile v0.8.9","shortMessageHtmlLink":"Release llamafile v0.8.9"}},{"before":"571b4e5ae86f47208378062430c9c8dcb77450a2","after":"1346ef4010f2bdbf78cf5425d75036aff0c1e6d6","ref":"refs/heads/main","pushedAt":"2024-06-30T23:42:12.000Z","pushType":"push","commitsCount":3,"pusher":{"login":"jart","name":"Justine Tunney","path":"/jart","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49262?s=80&v=4"},"commit":{"message":"Create /embedding endpoint in new server\n\n    make -j o//llamafile/server/main\n    o//llamafile/server/main -m /weights/all-MiniLM-L6-v2.F32.gguf\n    curl http://127.0.0.1:8080/embedding?prompt=orange","shortMessageHtmlLink":"Create /embedding endpoint in new server"}},{"before":"b2f587cedf8e9eef6fba09f4510cca5687ee8c8f","after":"571b4e5ae86f47208378062430c9c8dcb77450a2","ref":"refs/heads/main","pushedAt":"2024-06-29T18:39:30.000Z","pushType":"push","commitsCount":5,"pusher":{"login":"jart","name":"Justine Tunney","path":"/jart","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49262?s=80&v=4"},"commit":{"message":"Release llamafile v0.8.8","shortMessageHtmlLink":"Release llamafile v0.8.8"}},{"before":"6d3590c49076de3db62e4f1cfb4789228465edbe","after":"b2f587cedf8e9eef6fba09f4510cca5687ee8c8f","ref":"refs/heads/main","pushedAt":"2024-06-24T14:47:32.000Z","pushType":"push","commitsCount":4,"pusher":{"login":"jart","name":"Justine Tunney","path":"/jart","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49262?s=80&v=4"},"commit":{"message":"Release llamafile v0.8.7","shortMessageHtmlLink":"Release llamafile v0.8.7"}},{"before":"60404a831edd135565f8a152ef180baac41ba4d4","after":"6d3590c49076de3db62e4f1cfb4789228465edbe","ref":"refs/heads/main","pushedAt":"2024-06-24T06:26:51.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"jart","name":"Justine Tunney","path":"/jart","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49262?s=80&v=4"},"commit":{"message":"Pacify --temp flag when running in server mode\n\nThis caused some confusion for the granite 34b llamafiles, which specify\nthe temperature flag in the .args file. While it worked fine for the CLI\nmode of operation, if you ran the llamafile without arguments, then it'd\nfail with an error message instead of running the server :'(","shortMessageHtmlLink":"Pacify --temp flag when running in server mode"}},{"before":"a28250b82ea37c8db52e82827f512c051295f7ae","after":"60404a831edd135565f8a152ef180baac41ba4d4","ref":"refs/heads/main","pushedAt":"2024-06-22T16:56:39.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"jart","name":"Justine Tunney","path":"/jart","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49262?s=80&v=4"},"commit":{"message":"Always use tinyBLAS with AMD GPUs on Windows\n\nWhen llamafile uses hipBLAS with ROCm SDK 5.7.1 on Windows10 the process\ncrashes shortly after tokens start getting printed. This is possibly the\nworst heisenbug I've ever seen in my career. It seems to to crash in AMD\ncode, in a separate thread, inside hipGraphicsUnregisterResource, when a\nvqmovdqu instruction is being executed. While this happens, cosmo's main\nthread is usually doing something like std::string and std::locale stuff\nwhich appears unrelated. Could possibly be related to C++ exceptions and\nthread-local storage. Using --tinyblas appears to make it go away, but I\ncan't say for certain it has anything to do with hipBLAS, since it might\nsimply not manifest itself, because the binary footprint, stack, or heap\nmemory layout changed. Let's keep our fingers crossed that tinyBLAS will\nsave us from this issue. Note also that no one else has reported the bug\neven though it's been impacting me for months.","shortMessageHtmlLink":"Always use tinyBLAS with AMD GPUs on Windows"}},{"before":"c38feb4f4896216458b77665aca532897476c040","after":"a28250b82ea37c8db52e82827f512c051295f7ae","ref":"refs/heads/main","pushedAt":"2024-06-20T14:44:31.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"jart","name":"Justine Tunney","path":"/jart","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49262?s=80&v=4"},"commit":{"message":"Update GGML_HIP_UMA (#473)\n\nAdd UMA config for higher speed like in (https://github.com/ggerganov/llama.cpp/pull/7414)\r\nbut made 2 changes:\r\n\r\n- Remove UMA build option\r\n- Use it in all case if hipalloc failed with 'not have enough memory'\r\n\r\nAnother change is look for 'hipcc' on linux and not 'amdclang++'","shortMessageHtmlLink":"Update GGML_HIP_UMA (<a class=\"issue-link js-issue-link\" data-error-text=\"Failed to load title\" data-id=\"2352261661\" data-permission-text=\"Title is private\" data-url=\"https://github.com/Mozilla-Ocho/llamafile/issues/473\" data-hovercard-type=\"pull_request\" data-hovercard-url=\"/Mozilla-Ocho/llamafile/pull/473/hovercard\" href=\"https://github.com/Mozilla-Ocho/llamafile/pull/473\">#473</a>)"}},{"before":"842a421c67fdf909d6cd8690b1204894add5a4eb","after":"c38feb4f4896216458b77665aca532897476c040","ref":"refs/heads/main","pushedAt":"2024-06-08T04:51:51.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"jart","name":"Justine Tunney","path":"/jart","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49262?s=80&v=4"},"commit":{"message":"Optimized matrix multiplications for i-quants on __aarch64__ (#464)\n\n* Arm for i-quants\r\n\r\nThis carries over what I had done within llama.cpp.\r\nIn llamafile we have nice performance gains for PP, but\r\nwe get performance regression for TG.\r\nFor now, just adjusted iq2_xxs to also outperform in TG\r\n(~10% beter @ 4 and 8 threads).\r\nWill tackle the other quants next.\r\n\r\n* Arm for i-quants: iq2_xxs\r\n\r\nSo, improving TG speed results in a drop of performance for PP.\r\nBefore I had PP-512 = 56.78 t/s, TG-128 = 12.42 t/s @ 8 threads.\r\nNow we have  PP-512 = 52.77 t/s, TG-128 = 15.97 t/s @ 8 threads.\r\n\r\n* Arm for i-quants: iq3_s\r\n\r\nImproved TG from 4.96 t/s yto 5.43 t/s. Still ~3.5$ slower\r\nthan mainline.\r\nPP-512 became slightly better (47.9 vs 46.8 t/s).\r\nThis is 3.9X mainline (!)\r\n\r\n* Arm for i-quants: iq3_xxs\r\n\r\nPP stays the same - 3.67X mainline.\r\nTG improves slightly to 5.05 t/s from 4.74 t/s @ 4 threads.\r\nThis is still 15% slower than mainline.\r\n\r\n* Arm for i-quants: iq2_s\r\n\r\nWe get 3.32X mainline for PP.\r\nTG is, sadly, 0.92X @ 4 threads\r\n\r\n* Arm for i-quants: iq2_xs\r\n\r\nWe get 2.87X mainline for PP.\r\nTG is, sadly, 0.95X @ 4 threads\r\n\r\n* Arm for i-quants: abandoning special-casing Ny = 1\r\n\r\n* Arm for i-quants: cleanup and disable iqk_mul_mat for Ny = 1\r\n\r\n* Arm for i-quants: holding the compiler's hand\r\n\r\nTurns out we can improve quite a bit by explicitely\r\nasking the compiler to never inline some functions, and\r\nto always inline some other.\r\nWith that, PP performance gains are > 3X for all i-quants,\r\nreacing 4.3X for iq3_s. TG is also always better, except\r\nfor iq3_xxs, where it is 0.99X, so re-enabled iql_mul_mat\r\nfor Ny = 1.\r\n\r\n* Arm for i-quants: iterating\r\n\r\nTurns out changing one method of a quant affects the\r\nperformance of other qunts(s). Is the compiler somehow\r\ntrying to optimize all template instantiations together?\r\nAnyway, with this version I have this:\r\n|                     cpu_info | model_filename |       size |    test |     t/s |\r\n| ---------------------------: | -------------: | ---------: | ------: | ------: |\r\n| Apple M2 Max (+fp16+dotprod) |         iq2xxs |   1.73 GiB |   tg128 |    9.02 |\r\n| Apple M2 Max (+fp16+dotprod) |         iq2xxs |   1.73 GiB |   pp512 |   61.31 |\r\n| Apple M2 Max (+fp16+dotprod) |          iq2xs |   1.89 GiB |   tg128 |   10.58 |\r\n| Apple M2 Max (+fp16+dotprod) |          iq2xs |   1.89 GiB |   pp512 |   56.11 |\r\n| Apple M2 Max (+fp16+dotprod) |           iq2m |   2.20 GiB |   tg128 |    7.07 |\r\n| Apple M2 Max (+fp16+dotprod) |           iq2m |   2.20 GiB |   pp512 |   45.78 |\r\n| Apple M2 Max (+fp16+dotprod) |         iq3xxs |   2.41 GiB |   tg128 |    6.40 |\r\n| Apple M2 Max (+fp16+dotprod) |         iq3xxs |   2.41 GiB |   pp512 |   47.51 |\r\n| Apple M2 Max (+fp16+dotprod) |           iq3m |   2.90 GiB |   tg128 |    5.97 |\r\n| Apple M2 Max (+fp16+dotprod) |           iq3m |   2.90 GiB |   pp512 |   47.98 |\r\n\r\nTG is with 4 threads, PP with 8.\r\n\r\n* Arm for i-quants: iterating\r\n\r\nWith this version we get\r\n|                     cpu_info | model_filename |       size |   test |     t/s |\r\n| ---------------------------: | -------------: | ---------: | -----: | ------: |\r\n| Apple M2 Max (+fp16+dotprod) |         iq2xxs |   1.73 GiB |  tg128 |   10.83 |\r\n| Apple M2 Max (+fp16+dotprod) |         iq2xxs |   1.73 GiB |  pp512 |   60.82 |\r\n| Apple M2 Max (+fp16+dotprod) |          iq2xs |   1.89 GiB |  tg128 |   10.79 |\r\n| Apple M2 Max (+fp16+dotprod) |          iq2xs |   1.89 GiB |  pp512 |   57.10 |\r\n| Apple M2 Max (+fp16+dotprod) |           iq2m |   2.20 GiB |  tg128 |    7.45 |\r\n| Apple M2 Max (+fp16+dotprod) |           iq2m |   2.20 GiB |  pp512 |   46.39 |\r\n| Apple M2 Max (+fp16+dotprod) |         iq3xxs |   2.41 GiB |  tg128 |    6.77 |\r\n| Apple M2 Max (+fp16+dotprod) |         iq3xxs |   2.41 GiB |  pp512 |   48.74 |\r\n| Apple M2 Max (+fp16+dotprod) |           iq3m |   2.90 GiB |  tg128 |    5.97 |\r\n| Apple M2 Max (+fp16+dotprod) |           iq3m |   2.90 GiB |  pp512 |   48.59 |\r\n\r\n* Arm for i-quants: cleanup and comments\r\n\r\n* Remove forgotten experimental change in q3_K implementation","shortMessageHtmlLink":"Optimized matrix multiplications for i-quants on __aarch64__ (<a class=\"issue-link js-issue-link\" data-error-text=\"Failed to load title\" data-id=\"2330821055\" data-permission-text=\"Title is private\" data-url=\"https://github.com/Mozilla-Ocho/llamafile/issues/464\" data-hovercard-type=\"pull_request\" data-hovercard-url=\"/Mozilla-Ocho/llamafile/pull/464/hovercard\" href=\"https://github.com/Mozilla-Ocho/llamafile/pull/464\">#464</a>)"}},{"before":"1c08faddd237a9b323149331b9214ac50ab13dda","after":"842a421c67fdf909d6cd8690b1204894add5a4eb","ref":"refs/heads/main","pushedAt":"2024-06-06T12:20:01.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"jart","name":"Justine Tunney","path":"/jart","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49262?s=80&v=4"},"commit":{"message":"Add back missing build rule","shortMessageHtmlLink":"Add back missing build rule"}},{"before":"e0656ea190fa1687712c46641a721b02164e06d0","after":"1c08faddd237a9b323149331b9214ac50ab13dda","ref":"refs/heads/main","pushedAt":"2024-06-05T11:53:22.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"jart","name":"Justine Tunney","path":"/jart","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49262?s=80&v=4"},"commit":{"message":"Fix the build","shortMessageHtmlLink":"Fix the build"}},{"before":"8e23c73d1ec953737a6baa3656b0bab7e237bf64","after":"e0656ea190fa1687712c46641a721b02164e06d0","ref":"refs/heads/main","pushedAt":"2024-06-05T11:43:16.000Z","pushType":"push","commitsCount":5,"pusher":{"login":"jart","name":"Justine Tunney","path":"/jart","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49262?s=80&v=4"},"commit":{"message":"Introduce new llamafile server\n\nYou can now build and run `o//llamafile/server/main` which launches an\nHTTP server that currently supports a single endpoint at /tokenize. If\nwrk sends it a request to tokenize a string that has 51 tokens then it\nserves two million requests per second on my workstation, where 99 pct\nlatency is 179 µs. This server is designed to be crash proof, reliable\nand preeempting. Workers are able to be asynchronously canceled so the\nsupervisor thread can respawn them. Cosmo's new memory allocator helps\nthis server be high performance for llama.cpp's STL-heavy use case too","shortMessageHtmlLink":"Introduce new llamafile server"}},{"before":"5447f2d8dba6d50b8a847480c03a5739ce1431a6","after":"8e23c73d1ec953737a6baa3656b0bab7e237bf64","ref":"refs/heads/main","pushedAt":"2024-06-03T22:52:38.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"stlhood","name":"Stephen Hood","path":"/stlhood","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/42821?s=80&v=4"},"commit":{"message":"Add Mozilla logo to README","shortMessageHtmlLink":"Add Mozilla logo to README"}},{"before":"9cd8d70942a049ba3c3bddd12e87e1fb599fbd49","after":"5447f2d8dba6d50b8a847480c03a5739ce1431a6","ref":"refs/heads/main","pushedAt":"2024-06-03T22:50:11.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"stlhood","name":"Stephen Hood","path":"/stlhood","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/42821?s=80&v=4"},"commit":{"message":"add Mozilla logo","shortMessageHtmlLink":"add Mozilla logo"}},{"before":"7d8dd1b33fd54e9e54d4ad8074f8df64e547b75d","after":"9cd8d70942a049ba3c3bddd12e87e1fb599fbd49","ref":"refs/heads/main","pushedAt":"2024-06-01T15:51:45.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"jart","name":"Justine Tunney","path":"/jart","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49262?s=80&v=4"},"commit":{"message":"Update sever README build/testing instructions (#461)","shortMessageHtmlLink":"Update sever README build/testing instructions (<a class=\"issue-link js-issue-link\" data-error-text=\"Failed to load title\" data-id=\"2329170488\" data-permission-text=\"Title is private\" data-url=\"https://github.com/Mozilla-Ocho/llamafile/issues/461\" data-hovercard-type=\"pull_request\" data-hovercard-url=\"/Mozilla-Ocho/llamafile/pull/461/hovercard\" href=\"https://github.com/Mozilla-Ocho/llamafile/pull/461\">#461</a>)"}},{"before":"293a5284c49318bb2cef4ab781331edce3f2243c","after":"7d8dd1b33fd54e9e54d4ad8074f8df64e547b75d","ref":"refs/heads/main","pushedAt":"2024-06-01T10:09:40.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"jart","name":"Justine Tunney","path":"/jart","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49262?s=80&v=4"},"commit":{"message":"Upgrade to Cosmopolitan v3.3.10 (#460)\n\nNeeded to fix https://github.com/Mozilla-Ocho/llamafile/issues/446 on windows","shortMessageHtmlLink":"Upgrade to Cosmopolitan v3.3.10 (<a class=\"issue-link js-issue-link\" data-error-text=\"Failed to load title\" data-id=\"2329009495\" data-permission-text=\"Title is private\" data-url=\"https://github.com/Mozilla-Ocho/llamafile/issues/460\" data-hovercard-type=\"pull_request\" data-hovercard-url=\"/Mozilla-Ocho/llamafile/pull/460/hovercard\" href=\"https://github.com/Mozilla-Ocho/llamafile/pull/460\">#460</a>)"}},{"before":"73088c3bb0e3143fec0d356feb97a0cacd2c0d70","after":"293a5284c49318bb2cef4ab781331edce3f2243c","ref":"refs/heads/main","pushedAt":"2024-05-30T00:10:31.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"jart","name":"Justine Tunney","path":"/jart","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49262?s=80&v=4"},"commit":{"message":"Performance improvements on Arm for legacy and k-quants (#453)","shortMessageHtmlLink":"Performance improvements on Arm for legacy and k-quants (<a class=\"issue-link js-issue-link\" data-error-text=\"Failed to load title\" data-id=\"2319448482\" data-permission-text=\"Title is private\" data-url=\"https://github.com/Mozilla-Ocho/llamafile/issues/453\" data-hovercard-type=\"pull_request\" data-hovercard-url=\"/Mozilla-Ocho/llamafile/pull/453/hovercard\" href=\"https://github.com/Mozilla-Ocho/llamafile/pull/453\">#453</a>)"}},{"before":"31419d0b718f318ab23ab40eeb10a170e0eb2edc","after":"73088c3bb0e3143fec0d356feb97a0cacd2c0d70","ref":"refs/heads/main","pushedAt":"2024-05-29T17:38:11.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"jart","name":"Justine Tunney","path":"/jart","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49262?s=80&v=4"},"commit":{"message":"github: delete question in favor of link to discussion [no ci] (#457)","shortMessageHtmlLink":"github: delete question in favor of link to discussion [no ci] (<a class=\"issue-link js-issue-link\" data-error-text=\"Failed to load title\" data-id=\"2322801809\" data-permission-text=\"Title is private\" data-url=\"https://github.com/Mozilla-Ocho/llamafile/issues/457\" data-hovercard-type=\"pull_request\" data-hovercard-url=\"/Mozilla-Ocho/llamafile/pull/457/hovercard\" href=\"https://github.com/Mozilla-Ocho/llamafile/pull/457\">#457</a>)"}},{"before":"397175e673c4334962f446d9470e3bceefc88fb0","after":"31419d0b718f318ab23ab40eeb10a170e0eb2edc","ref":"refs/heads/main","pushedAt":"2024-05-29T07:24:34.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"jart","name":"Justine Tunney","path":"/jart","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49262?s=80&v=4"},"commit":{"message":"github: add ci (#454)","shortMessageHtmlLink":"github: add ci (<a class=\"issue-link js-issue-link\" data-error-text=\"Failed to load title\" data-id=\"2319477243\" data-permission-text=\"Title is private\" data-url=\"https://github.com/Mozilla-Ocho/llamafile/issues/454\" data-hovercard-type=\"pull_request\" data-hovercard-url=\"/Mozilla-Ocho/llamafile/pull/454/hovercard\" href=\"https://github.com/Mozilla-Ocho/llamafile/pull/454\">#454</a>)"}},{"before":"92be52a3bbde8366becff2cdd550cc6a249f7c43","after":"397175e673c4334962f446d9470e3bceefc88fb0","ref":"refs/heads/main","pushedAt":"2024-05-26T11:55:21.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"jart","name":"Justine Tunney","path":"/jart","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49262?s=80&v=4"},"commit":{"message":"github: add mention of strace and ftrace (#449)","shortMessageHtmlLink":"github: add mention of strace and ftrace (<a class=\"issue-link js-issue-link\" data-error-text=\"Failed to load title\" data-id=\"2317666015\" data-permission-text=\"Title is private\" data-url=\"https://github.com/Mozilla-Ocho/llamafile/issues/449\" data-hovercard-type=\"pull_request\" data-hovercard-url=\"/Mozilla-Ocho/llamafile/pull/449/hovercard\" href=\"https://github.com/Mozilla-Ocho/llamafile/pull/449\">#449</a>)"}},{"before":"ba7193043ba5c51fde6a5e146883dc87aaf07a85","after":"92be52a3bbde8366becff2cdd550cc6a249f7c43","ref":"refs/heads/main","pushedAt":"2024-05-26T11:44:11.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"jart","name":"Justine Tunney","path":"/jart","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49262?s=80&v=4"},"commit":{"message":"actions: add labeler + editorconfig github actions (#443)\n\n* actions: add labler + editorconfig github actions\r\n\r\n* Update labeler.yml","shortMessageHtmlLink":"actions: add labeler + editorconfig github actions (<a class=\"issue-link js-issue-link\" data-error-text=\"Failed to load title\" data-id=\"2317148282\" data-permission-text=\"Title is private\" data-url=\"https://github.com/Mozilla-Ocho/llamafile/issues/443\" data-hovercard-type=\"pull_request\" data-hovercard-url=\"/Mozilla-Ocho/llamafile/pull/443/hovercard\" href=\"https://github.com/Mozilla-Ocho/llamafile/pull/443\">#443</a>)"}},{"before":"076dfb0dae2169abb62f490218b5053f37f61cfc","after":"ba7193043ba5c51fde6a5e146883dc87aaf07a85","ref":"refs/heads/main","pushedAt":"2024-05-26T10:56:49.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"jart","name":"Justine Tunney","path":"/jart","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49262?s=80&v=4"},"commit":{"message":"github: delete assignees and about --> description (#448)","shortMessageHtmlLink":"github: delete assignees and about --&gt; description (<a class=\"issue-link js-issue-link\" data-error-text=\"Failed to load title\" data-id=\"2317637284\" data-permission-text=\"Title is private\" data-url=\"https://github.com/Mozilla-Ocho/llamafile/issues/448\" data-hovercard-type=\"pull_request\" data-hovercard-url=\"/Mozilla-Ocho/llamafile/pull/448/hovercard\" href=\"https://github.com/Mozilla-Ocho/llamafile/pull/448\">#448</a>)"}},{"before":"81cfbcf48ee037912eed78e34cc214dac0d2a6d5","after":"076dfb0dae2169abb62f490218b5053f37f61cfc","ref":"refs/heads/main","pushedAt":"2024-05-26T01:16:46.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"jart","name":"Justine Tunney","path":"/jart","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49262?s=80&v=4"},"commit":{"message":"github: add issue templates (#442)","shortMessageHtmlLink":"github: add issue templates (<a class=\"issue-link js-issue-link\" data-error-text=\"Failed to load title\" data-id=\"2317137086\" data-permission-text=\"Title is private\" data-url=\"https://github.com/Mozilla-Ocho/llamafile/issues/442\" data-hovercard-type=\"pull_request\" data-hovercard-url=\"/Mozilla-Ocho/llamafile/pull/442/hovercard\" href=\"https://github.com/Mozilla-Ocho/llamafile/pull/442\">#442</a>)"}},{"before":"ea2a96e5bf8216d002ff40d3283cce4f2100b181","after":"81cfbcf48ee037912eed78e34cc214dac0d2a6d5","ref":"refs/heads/main","pushedAt":"2024-05-25T14:24:24.000Z","pushType":"push","commitsCount":3,"pusher":{"login":"jart","name":"Justine Tunney","path":"/jart","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49262?s=80&v=4"},"commit":{"message":"Release llamafile v0.8.6","shortMessageHtmlLink":"Release llamafile v0.8.6"}},{"before":"b79ecf465befa8018e3331720372917454097a90","after":"ea2a96e5bf8216d002ff40d3283cce4f2100b181","ref":"refs/heads/main","pushedAt":"2024-05-25T09:17:40.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"jart","name":"Justine Tunney","path":"/jart","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49262?s=80&v=4"},"commit":{"message":"Disable GPU in llava-quantize","shortMessageHtmlLink":"Disable GPU in llava-quantize"}},{"before":"e67571914779c847233c2ea1e05c587769298f7f","after":"b79ecf465befa8018e3331720372917454097a90","ref":"refs/heads/main","pushedAt":"2024-05-25T08:45:56.000Z","pushType":"push","commitsCount":4,"pusher":{"login":"jart","name":"Justine Tunney","path":"/jart","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49262?s=80&v=4"},"commit":{"message":"Release llamafile v0.8.5","shortMessageHtmlLink":"Release llamafile v0.8.5"}},{"before":"4451c6d98f31325c9eae3e4be0351883096a831d","after":"e67571914779c847233c2ea1e05c587769298f7f","ref":"refs/heads/main","pushedAt":"2024-05-24T20:06:18.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"jart","name":"Justine Tunney","path":"/jart","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49262?s=80&v=4"},"commit":{"message":"Make some more benchmark tool fixes","shortMessageHtmlLink":"Make some more benchmark tool fixes"}},{"before":"91dd4d371ef383a0c22b7c94aea963863ba8c30d","after":"4451c6d98f31325c9eae3e4be0351883096a831d","ref":"refs/heads/main","pushedAt":"2024-05-24T16:38:32.000Z","pushType":"push","commitsCount":4,"pusher":{"login":"jart","name":"Justine Tunney","path":"/jart","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/49262?s=80&v=4"},"commit":{"message":"Reclaim mapped memory","shortMessageHtmlLink":"Reclaim mapped memory"}}],"hasNextPage":true,"hasPreviousPage":false,"activityType":"all","actor":null,"timePeriod":"all","sort":"DESC","perPage":30,"cursor":"djE6ks8AAAAEd9alzAA","startCursor":null,"endCursor":null}},"title":"Activity · Mozilla-Ocho/llamafile"}