How to chat with Mesolitica chat

HUSEIN ZOLKEPLI edited this page Feb 28, 2024 · 15 revisions

Please make sure you are logged in to your GitHub account to see the images.

Screenshot 2023-12-30 at 5 24 18 PM

We host a Chat UI at https://chat.us.mesolitica.com/ that serves the following models,

  1. https://huggingface.co/mesolitica/malaysian-mistral-7b-32k-instructions-v3.5
  2. https://huggingface.co/mesolitica/malaysian-tinyllama-1.1b-16k-instructions-v2
  3. https://huggingface.co/mesolitica/mallam-1.1b-20k-instructions-rag
  4. https://huggingface.co/mesolitica/mallam-5b-20k-instructions-v2

The list is sorted in descending order of speed, measured in TPS (Tokens Per Second); https://huggingface.co/mesolitica/mallam-1.1b-20k-instructions-v2 is super fast.

To get the best responses, please use Malaysian Mistral 7B v3.5 with 32k context length.

These models are a quick demonstration that the base model can be easily fine-tuned to achieve some performance, and we are going to make them better in the future. They have minimal moderation mechanisms, so please be patient with us.

We cannot tell how long we will keep this Chat UI open to the public, because GPU VMs are super expensive. If you find that chats crash or no longer respond, the traffic is probably heavy at that time, so chill, ok.

Choose specific model

Just press the drop-down button on the Current Model card, and it will list the available models,

Screenshot 2023-12-14 at 10 36 38 PM

Personally I like Mistral the most: it can code, it can answer general QA about the world, and most importantly it handles multi-turn QA.

Buttons you need to understand

Download prompt button

Screenshot 2023-12-16 at 10 54 18 PM

This will pop up a new JSON page,

{
  "note": "This is a preview of the prompt that will be sent to the model when retrying the message. It may differ from what was sent in the past if the parameters have been updated since",
  "prompt": "<s>[INST] KWSP tu apa [/INST]",
  "model": "Malaysian Mistral 7B 32k",
  "parameters": {
    "temperature": 0.3,
    "truncate": 15360,
    "max_new_tokens": 1024,
    "stop": [
      "</s>",
      "[/INST]"
    ],
    "top_p": 0.95,
    "repetition_penalty": 1.15,
    "top_k": 50,
    "return_full_text": false
  }
}

So you can try the same prompt and parameters yourself on any local setup.
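As a rough sketch of reusing that JSON locally, the snippet below parses the preview and separates the sampling settings from the stop sequences. The helper names `split_preview` and `cut_at_stop` are made up for illustration. It assumes your local runtime accepts Hugging Face style sampling arguments (`temperature`, `top_p`, `top_k`, `repetition_penalty`, `max_new_tokens`); `truncate`, `stop` and `return_full_text` look like serving-side options, so they are not forwarded here, and stopping is handled by hand instead.

```python
import json

# Preview JSON from the "Download prompt" button (the "note" field is omitted).
PREVIEW = """{
  "prompt": "<s>[INST] KWSP tu apa [/INST]",
  "model": "Malaysian Mistral 7B 32k",
  "parameters": {
    "temperature": 0.3,
    "truncate": 15360,
    "max_new_tokens": 1024,
    "stop": ["</s>", "[/INST]"],
    "top_p": 0.95,
    "repetition_penalty": 1.15,
    "top_k": 50,
    "return_full_text": false
  }
}"""

# Assumption: these keys map one-to-one onto local sampling arguments.
SAMPLING_KEYS = ("temperature", "top_p", "top_k",
                 "repetition_penalty", "max_new_tokens")


def split_preview(preview_text):
    """Return (prompt, sampling_kwargs, stop_sequences) from the preview JSON."""
    data = json.loads(preview_text)
    params = data["parameters"]
    sampling = {k: params[k] for k in SAMPLING_KEYS if k in params}
    # Assumption: with Hugging Face transformers, sampling must be switched on
    # for temperature / top_p / top_k to have any effect.
    sampling["do_sample"] = True
    return data["prompt"], sampling, params.get("stop", [])


def cut_at_stop(text, stop):
    """Trim generated text at the earliest stop sequence, if one appears."""
    cut = len(text)
    for s in stop:
        idx = text.find(s)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]
```

With a locally loaded model, `sampling` could then be passed along the lines of `model.generate(**inputs, **sampling)`, and `cut_at_stop` applied to the decoded text. Check this against the runtime you actually use.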

Regenerate

If you are not satisfied with the output, you can always regenerate it,

Screenshot 2023-12-16 at 10 55 32 PM

Thumbs up or down

If you want to help make the models better, please give the responses a thumbs up or down,

Screenshot 2023-12-16 at 10 56 16 PM

Thumbs up or down feedback will become part of a continuous human feedback learning loop.

Benchmark

https://huggingface.co/spaces/mesolitica/malay-llm-leaderboard, currently we only focus on the tatabahasa (grammar) benchmark.

Datasets trained on

  1. Multi-turn Malaysian QA, it can answer follow-up questions related to the Malaysian context.
  2. Multi-turn General QA, it can answer follow-up questions related to general or world context.
  3. Multi-turn Coding QA, it can answer follow-up questions related to coding.
  4. Multi-turn Math and Science QA, it can answer follow-up questions on mathematics and science.
  5. Multi-turn RAG, you can give any document as context and it can answer follow-up questions based on that context.

Multi-turn Malaysian QA

You can ask about EPF.

Screenshot 2023-12-16 at 10 17 04 PM

Multi-turn General QA

You can ask about the Palestine and Israel conflict.

Screenshot 2023-12-16 at 10 18 03 PM

Multi-turn Coding QA

How to start Python

Screenshot 2023-12-16 at 10 19 24 PM

Sentiment analysis in Python

Screenshot 2023-12-16 at 10 20 46 PM

Multi-turn RAG

All models should be able to do RAG, including multi-turn QA. You just provide the context and the model should be able to answer, for example,

Screenshot 2023-12-14 at 10 40 03 PM Screenshot 2023-12-14 at 10 40 11 PM

I copied the context from https://www.sinarharian.com.my/article/253799/berita/politik/isu-perkauman-kerajaan-didakwa-terdesak-kaitkan-pemimpin-pn, pasted it into the chat and asked "apa isu dengan konteks diatas" ("what is the issue in the context above"), and it replied,

Isu yang dibincangkan dalam artikel ini adalah tentang kenyataan berunsur perkauman yang dikaitkan dengan pemimpin Perikatan Nasional (PN). Artikel ini menyebut bahawa pemimpin PN, termasuk Presiden Pas dan Ketua Pemuda Pas Pulau Pinang, sedang disiasat atas kenyataan tersebut.

After that I asked another follow-up question, as shown in the screenshots.
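The same paste-the-context flow can be sketched outside the UI. The single-turn layout `<s>[INST] ... [/INST]` is taken from the download prompt preview; the multi-turn layout below (answer followed by `</s>`, then the next `[INST]` block) is an assumption based on the usual Mistral instruct convention, and the helper names and placeholder strings are hypothetical.

```python
def first_rag_message(context, question):
    """Put the pasted document first and the question after it, as in the chat."""
    return f"{context.strip()}\n\n{question.strip()}"


def build_prompt(turns, next_question):
    """Build a prompt from earlier (user, assistant) turns plus a new question.

    Assumption: earlier answers are closed with </s> and each user message
    is wrapped in [INST] ... [/INST], following the Mistral instruct layout.
    """
    parts = ["<s>"]
    for user, assistant in turns:
        parts.append(f"[INST] {user} [/INST]{assistant}</s>")
    parts.append(f"[INST] {next_question} [/INST]")
    return "".join(parts)


# Hypothetical placeholders: paste the real article text and model reply here.
article = "<teks artikel>"
opening = first_rag_message(article, "apa isu dengan konteks diatas")

# First turn: context plus question in a single user message.
prompt_turn_1 = build_prompt([], opening)

# Second turn: resend the first exchange, then the follow-up question.
reply_turn_1 = "<jawapan model>"
prompt_turn_2 = build_prompt([(opening, reply_turn_1)], "<soalan susulan>")
```

Because the context travels inside the first user message, every follow-up prompt carries it again, which is why long documents depend on the model's context length.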

Currently the UI cannot upload documents, and we do not intend to add this for now.

This does not reflect our political views.

Malaysian instruction

All these outputs are from Malaysian Mistral 7B v3 with 32k context length.

Sajak nasi ayam

You ask it to generate a sajak (poem), for example "buat sajak tentang nasi ayam" ("write a poem about chicken rice"),

Screenshot 2023-12-14 at 10 53 19 PM

Food in Penang

You ask about food in Penang,

Screenshot 2023-12-16 at 3 49 53 PM

Translate local slur

Screenshot 2023-12-30 at 5 16 45 PM

Translate to Bahasa Kelantan

Screenshot 2023-12-30 at 5 22 56 PM

Translate to Bahasa Utara

Screenshot 2023-12-30 at 5 23 28 PM

Summarize any document including long social media text

Copied and pasted from https://iiumc.com/madu-atau-hempedu-ii-akhir/

Screenshot 2023-12-30 at 5 30 45 PM

Very little alignment

All these outputs are from Malaysian Mistral 7B v3 with 32k context length.

Make a bomb

Screenshot 2023-12-14 at 11 35 17 PM

How to avoid getting caught

Screenshot 2023-12-14 at 11 42 46 PM

Gulingkan kerajaan

Screenshot 2023-12-30 at 5 19 59 PM

Jadi cybertrooper

Screenshot 2023-12-30 at 5 21 05 PM

Hallucination problem

All these outputs are from Malaysian Mistral 7B v3 with 32k context length.

Do not 100% believe the outputs

Screenshot 2023-12-16 at 4 10 19 PM

Always, always verify the outputs.