---
library_name: transformers
tags:
- phi3
- python
- dpo
- mypo
license: mit
datasets:
- joshuasundance/mypo-4k-rfc
language:
- en
pipeline_tag: text-generation
---

**This is a pipeline version of `joshuasundance/phi3-mini-4k-qlora-python-code-20k-mypo-4k-rfc`**

# Model Card for `joshuasundance/phi3-mini-4k-qlora-python-code-20k-mypo-4k-rfc-pipe`

* **Base Model**: https://huggingface.co/edumunozsala/phi3-mini-4k-qlora-python-code-20k
* **Preference Dataset**: https://huggingface.co/datasets/joshuasundance/mypo-4k-rfc
* **Training Code**: https://gist.github.com/joshuasundance-swca/a94672960733782865932a645587ccdc
* **Training Metrics**: [trainer_state.json](trainer_state.json)

This is an experimental model produced by DPO training of `edumunozsala/phi3-mini-4k-qlora-python-code-20k` on the `joshuasundance/mypo-4k-rfc` preference dataset. The goal is to learn about model training and, potentially, to get the base model to reliably produce Python with type hints.

I chose `edumunozsala/phi3-mini-4k-qlora-python-code-20k` as the base model because I was able to train this model in one hour on my laptop.

## Model Details

### Model Description

This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card has been automatically generated.

- **Developed by:** Joshua Sundance Bailey
- **Model type:** Phi-3 QLoRA fine-tune trained with DPO
- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model [optional]:** `edumunozsala/phi3-mini-4k-qlora-python-code-20k`

### Model Sources [optional]

- **Training Code:** https://gist.github.com/joshuasundance-swca/a94672960733782865932a645587ccdc

## Uses

For evaluation and testing only. Do not expect great results, and do not use this model for anything important. It has not been evaluated in any way after training.

### Direct Use

```python
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="joshuasundance/phi3-mini-4k-qlora-python-code-20k-mypo-4k-rfc-pipe",
    trust_remote_code=True,
)

prompt_template = """### Instruction:
Below is an instruction that describes a task. Write a response that appropriately completes the request. ALWAYS use Python type hints for mypy.

### Instruction:
{instruction}

### Input:
{input}

### Output:
"""


def invoke(user_instruction: str, user_input: str = "") -> str:
    # Fill the instruction template, then wrap it in the tokenizer's chat template.
    prompt_str = prompt_template.format(instruction=user_instruction, input=user_input)
    prompt = pipe.tokenizer.apply_chat_template(
        [{"role": "user", "content": prompt_str}],
        tokenize=False,
        add_generation_prompt=True,
    )
    outputs = pipe(
        prompt,
        max_new_tokens=256,
        do_sample=True,
        num_beams=1,
        temperature=0.3,
        top_k=50,
        top_p=0.95,
        max_time=180,
    )
    # The pipeline returns prompt + completion; keep only the completion.
    return outputs[0]["generated_text"][len(prompt) :].strip()


user_instruction = (
    "Write a Python function that takes 3 ints, x, y, and z, and returns (x*z)//y."
)
user_input = ""
print(invoke(user_instruction, user_input))
```

### Downstream Use [optional]

[More Information Needed]

### Out-of-Scope Use

[More Information Needed]

## Bias, Risks, and Limitations

[More Information Needed]

### Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed for further recommendations.

## How to Get Started with the Model

Use the code below to get started with the model.

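
Because the stated goal of this model is type-hinted Python, one quick check when getting started is whether a generation actually carries annotations. The following is a minimal sketch under stated assumptions, not part of the training code: `missing_type_hints` is a hypothetical helper name, it uses only the standard-library `ast` module, and it assumes the text passed in is plain Python source with no Markdown fences or surrounding prose (otherwise `ast.parse` raises `SyntaxError`). It checks only that annotations are present; it does not run mypy.

```python
import ast
from typing import List


def missing_type_hints(source: str) -> List[str]:
    """Return the names of functions in `source` that lack a parameter or return annotation.

    Hypothetical helper for illustration; it is not part of this repo.
    """
    tree = ast.parse(source)
    missing: List[str] = []
    for node in ast.walk(tree):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        args = node.args
        # Collect every parameter kind: positional-only, regular, keyword-only, *args, **kwargs.
        params = args.posonlyargs + args.args + args.kwonlyargs
        if args.vararg is not None:
            params.append(args.vararg)
        if args.kwarg is not None:
            params.append(args.kwarg)
        # `self` and `cls` are conventionally left unannotated, so they are skipped.
        unannotated = [
            p.arg
            for p in params
            if p.annotation is None and p.arg not in ("self", "cls")
        ]
        if unannotated or node.returns is None:
            missing.append(node.name)
    return missing


# Hand-written samples for illustration; these are NOT outputs of the model.
typed = "def f(x: int, y: int, z: int) -> int:\n    return (x * z) // y\n"
untyped = "def f(x, y, z):\n    return (x * z) // y\n"

print(missing_type_hints(typed))    # []
print(missing_type_hints(untyped))  # ['f']
```

In practice the string returned by `invoke` from the Direct Use example could be passed to this helper, provided any non-code text is stripped first.
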
[More Information Needed]

## Training Details

### Training Data

* Original QLoRA fine-tune: `iamtarun/python_code_instructions_18k_alpaca`
* DPO: `joshuasundance/mypo-4k-rfc`

### Training Procedure

See the [training code](https://gist.github.com/joshuasundance-swca/a94672960733782865932a645587ccdc), which uses `peft`, `transformers`, and `trl`.

#### Preprocessing [optional]

See the training code, which uses `peft`, `transformers`, and `trl`.

#### Training Hyperparameters

See the training code, which uses `peft`, `transformers`, and `trl`.

#### Speeds, Sizes, Times [optional]

See [trainer_state.json](trainer_state.json) in this repo.

## Evaluation

See [trainer_state.json](trainer_state.json) in this repo.

### Testing Data, Factors & Metrics

#### Testing Data

20% of the DPO dataset (see training code).

#### Factors

[More Information Needed]

#### Metrics

[More Information Needed]

### Results

[More Information Needed]

#### Summary

## Model Examination [optional]

[More Information Needed]

## Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** [More Information Needed]
- **Hours used:** [More Information Needed]
- **Cloud Provider:** [More Information Needed]
- **Compute Region:** [More Information Needed]
- **Carbon Emitted:** [More Information Needed]

## Technical Specifications [optional]

### Model Architecture and Objective

[More Information Needed]

### Compute Infrastructure

[More Information Needed]

#### Hardware

[More Information Needed]

#### Software

[More Information Needed]

## Citation [optional]

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]

## Glossary [optional]

[More Information Needed]

## More Information [optional]

[More Information Needed]

## Model Card Authors [optional]

Joshua Sundance Bailey

## Model Card Contact

Joshua Sundance Bailey