|
--- |
|
base_model: internistai/base-7b-v0.2 |
|
datasets: |
|
- omi-health/medical-dialogue-to-soap-summary |
|
language: |
|
- en |
|
license: apache-2.0 |
|
metrics: |
|
- accuracy |
|
tags: |
|
- medical |
|
- mlx |
|
pipeline_tag: text-generation
|
--- |
|
|
|
![image/png](/static-proxy?url=https%3A%2F%2Fcdn-uploads.huggingface.co%2Fproduction%2Fuploads%2F651d96a3e8c4c2ebaafc1e7d%2FuyiryuBhU4y62f4CRxabO.png)
|
|
|
The model [cogbuji/MrGrammaticaOntology-internistai-SCT-DRIFT-clinical-problem-0.6.5](https://huggingface.co/cogbuji/MrGrammaticaOntology-internistai-SCT-DRIFT-clinical-problem-0.6.5) was converted to MLX format from [internistai/base-7b-v0.2](https://huggingface.co/internistai/base-7b-v0.2) using mlx-lm version **0.16.0**.
|
|
|
The name of the model is a homage to Fela Kuti's song __Mr Grammarticalogy-Lisationalism Is The Boss__, released on the B-side of his 1976 LP [Excuse O](https://www.discogs.com/release/3149841-Fela-And-The-Africa-70-Excuse-O).
|
|
|
It is an experimental model intended for non-production environments, inspired by explorations into how large language models can be trained to be more conversant in medical terminology and concepts and then used in various medical informatics scenarios.
|
|
|
It is a LoRA fine-tune of [internistai/base-7b-v0.2](https://huggingface.co/internistai/base-7b-v0.2) using controlled natural language (CNL) phrases generated from the September 23rd release of [SNOMED CT United States Edition](https://www.snomed.org/snomed-ct/Use-SNOMED-CT). The general idea is described in [Reference Domain Ontologies and Large Medical Language Models](https://www.slideshare.net/slideshow/reference-domain-ontologies-and-large-medical-language-modelspptx/267024290).
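As a rough illustration (not the actual DRIFT generation pipeline), CNL phrases of this kind can be rendered from SNOMED CT-style relationship triples with simple templates; the relation names, templates, and example concepts below are hypothetical:

```python
# Hypothetical sketch: rendering SNOMED CT-style (source, relation, target)
# triples as controlled natural language (CNL) sentences. The templates and
# concepts are illustrative only, not the DRIFT phrase-generation code.

TEMPLATES = {
    "is_a": "{source} is a kind of {target}.",
    "finding_site": "{source} is located in the {target}.",
    "causative_agent": "{source} is caused by {target}.",
}

def to_cnl(source: str, relation: str, target: str) -> str:
    """Render one relationship triple as a CNL sentence."""
    return TEMPLATES[relation].format(source=source, target=target)

triples = [
    ("Bacterial pneumonia", "is_a", "infective pneumonia"),
    ("Bacterial pneumonia", "finding_site", "lung structure"),
]

for s, r, t in triples:
    print(to_cnl(s, r, t))
```

Applied across the relevant concept hierarchies, a generator of this shape would yield the hundreds of thousands of training phrases described below.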
|
|
|
During training, LoRA was applied to all linear layers using a dataset comprising 318,798 SNOMED CT DRIFT phrases from the SNOMED CT [concept hierarchies](https://nhsengland.kahootz.com/gf2.ti/f/762498/152743141.1/PDF/-/SNOMED%20Implementation_User%20Guide_Hierarchies.pdf) relevant to medical problems (findings, morphologic abnormalities, situations with explicit context, and disorders) and 7,400 records from the Synthetic Medical Dialogues and SOAP Summaries [dataset](https://huggingface.co/datasets/omi-health/medical-dialogue-to-soap-summary). Training ran for two days, 13 hours, and 55 minutes using mlx-tuning-fork, a framework for parameterized large language model (Q)LoRA fine-tuning on Apple Metal.
|
|
|
Below is a snippet of the configuration used (the format has changed over time): |
|
|
|
```yaml |
|
lora_parameters: |
|
keys: ["self_attn.q_proj","self_attn.v_proj","self_attn.k_proj","self_attn.o_proj"] |
|
rank: 32 |
|
alpha: 32 |
|
dropout: 0.3205 |
|
scale: 10.0 |
|
|
|
epochs: 2 |
|
|
|
learning_schedule: |
|
type: "cosine_w_warmup" |
|
warmup_proportion: .1 |
|
min_lr: 1e-7 |
|
cycle_length: -1 |
|
min_cos_lr: 7e-6 |
|
``` |
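For intuition, the `cosine_w_warmup` schedule above can be sketched as a linear warmup from `min_lr` to a peak learning rate, followed by cosine decay down to `min_cos_lr`. The peak learning rate (`peak_lr`) is an assumed value here, since it does not appear in the snippet:

```python
import math

def lr_at(step: int, total_steps: int, peak_lr: float = 2e-5,
          warmup_proportion: float = 0.1, min_lr: float = 1e-7,
          min_cos_lr: float = 7e-6) -> float:
    """Sketch of a cosine-with-warmup schedule.

    Linear warmup from min_lr to peak_lr over the first
    warmup_proportion of steps, then cosine decay from peak_lr to
    min_cos_lr. peak_lr is an assumption, not a value from the config.
    """
    warmup_steps = int(total_steps * warmup_proportion)
    if step < warmup_steps:
        return min_lr + (peak_lr - min_lr) * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_cos_lr + 0.5 * (peak_lr - min_cos_lr) * (1 + math.cos(math.pi * progress))
```

With `cycle_length: -1`, the cosine phase typically spans the full remainder of training rather than restarting, which is what the single decay above models.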
|
|
|
The wandb (Weights & Biases) log summary is below:

> 79,700 iterations at 39,850 iterations per epoch on a dataset of 318,798 records, 8 at a time.
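The iteration counts follow directly from the dataset size, batch size, and epoch count; a quick sanity check:

```python
import math

records = 318_798
batch_size = 8
epochs = 2

# 318,798 records / 8 per batch, rounded up to whole batches.
iters_per_epoch = math.ceil(records / batch_size)  # 39,850
total_iters = iters_per_epoch * epochs             # 79,700

print(iters_per_epoch, total_iters)
```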
|
|
|
## MMLU-SR benchmarks |
|
|
|
Below are before-and-after [MMLU-SR benchmark](https://github.com/EleutherAI/lm-evaluation-harness/tree/main/lm_eval/tasks/mmlusr) scores for two MMLU medical topics. MMLU-SR is a dataset used by the LM Evaluation Harness for rigorous benchmarking of true model comprehension.
|
|
|
### Before (unquantized internistai lm-eval run on Apple Metal) |
|
|
|
hf (pretrained=internistai/base-7b-v0.2,dtype=float), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: 64
|
| Tasks |Version|Filter|n-shot|Metric| |Value | |Stderr| |
|
|---------------------|------:|------|-----:|------|---|-----:|---|-----:| |
|
|clinical knowledge | 0|none | 0|acc |↑ |0.5019|± |0.0308| |
|
|professional medicine| 0|none | 0|acc |↑ |0.5441|± |0.0303| |
|
|
|
### After (unquantized internistai lm-eval run on Apple Metal) |
|
|
|
hf (pretrained=../raw_models/outbox/MrGrammaticaOntology-internistai-SCT-DRIFT-clinical-problem-0.6.5,dtype=float), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: 64 |
|
| Tasks |Version|Filter|n-shot|Metric| |Value | |Stderr| |
|
|---------------------|------:|------|-----:|------|---|-----:|---|-----:| |
|
|clinical knowledge | 0|none | 0|acc |↑ |0.5208|± |0.0307| |
|
|professional medicine| 0|none | 0|acc |↑ |0.5625|± |0.0301| |
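A quick check (not part of the original evaluation output) shows that each accuracy gain is smaller than the standard error of the difference between the two runs, so the improvements should be read as suggestive rather than statistically decisive:

```python
import math

# Accuracy (value, stderr) pairs transcribed from the tables above.
before = {"clinical knowledge": (0.5019, 0.0308),
          "professional medicine": (0.5441, 0.0303)}
after = {"clinical knowledge": (0.5208, 0.0307),
         "professional medicine": (0.5625, 0.0301)}

for task, (b, sb) in before.items():
    a, sa = after[task]
    delta = a - b
    # Stderr of the difference, assuming the two runs are independent.
    combined = math.hypot(sb, sa)
    print(f"{task}: +{delta:.4f} (stderr of difference ~{combined:.4f})")
```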
|
|
|
|
|
## Use with mlx |
|
|
|
```bash |
|
pip install mlx-lm |
|
``` |
|
|
|
```python |
|
from mlx_lm import load, generate |
|
|
|
model, tokenizer = load("cogbuji/MrGrammaticalOntology-internistai-SCT-core-0.6.5") |
|
response = generate(model, tokenizer, prompt="hello", verbose=True) |
|
``` |
|
|