Instructions for FIM

#6
by skapadia-zalando - opened

Hey,

This is the code I'm using to run FIM (fill-in-the-middle), but I suspect I'm not doing it correctly. I'm following the instructions from a SantaCoder discussion on how to use the FIM tokens: https://huggingface.co/bigcode/santacoder/discussions/10#63c324e0e25fc176427928aa

# pip install -q git+https://github.com/huggingface/transformers.git

import os
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# checkpoint and MODEL_CACHE_DIR are defined elsewhere in my script (not shown here).
tokenizer = AutoTokenizer.from_pretrained(checkpoint, cache_dir=MODEL_CACHE_DIR)
model = AutoModelForCausalLM.from_pretrained(
    checkpoint,
    # quantization_config=quantization_config,
    cache_dir=MODEL_CACHE_DIR,
)

def generate_completion(text: str) -> str:
    # Encode the prompt, generate up to 128 tokens in total (prompt included), then decode.
    inputs = tokenizer.encode(text, return_tensors="pt")
    outputs = model.generate(inputs, max_length=128)
    return tokenizer.decode(outputs[0])

text = "<fim-prefix>def fib(n):<fim-suffix>    else:\n        return fib(n - 2) + fib(n - 1)<fim-middle>"

print(generate_completion(text))

However, the generated output is not the expected infill. Instead, the model outputs this:

<fim-prefix>def fib(n):<fim-suffix>    else:
        return fib(n - 2) + fib(n - 1)<fim-middle>

<fim-prefix>print(fib(10))<fim-suffix>
```

## 2.2.2. 递归的优点

递归的优点是逻辑简单易懂,缺点是过深的调用会导致栈溢出。

针对尾递归优化的语言可以通过尾递归防止栈溢出

Is there a way to fix this? I was unable to find documentation around this.

Thanks!

BigCode org

@skapadia-zalando I think you should follow the StarCoder example rather than the SantaCoder one. The special tokens changed slightly: they use underscores instead of hyphens.

input_text = "<fim_prefix>def print_hello_world():\n    <fim_suffix>\n    print('Hello world!')<fim_middle>"
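
A minimal sketch of assembling that prompt from its two halves, assuming the underscore-style tokens shown in the example above. `build_fim_prompt` and the constant names are hypothetical and not part of transformers.

```python
# Hypothetical helper (not part of any library): assembles a StarCoder-style
# FIM prompt in prefix, suffix, middle order, matching the example above.
FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"


def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Return the prompt; the model is expected to generate the missing middle after it."""
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"


input_text = build_fim_prompt(
    prefix="def print_hello_world():\n    ",
    suffix="\n    print('Hello world!')",
)
print(input_text)
```

Keeping the tokens in named constants makes it harder to slip back into the hyphenated SantaCoder spelling by accident.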

However, as reported in Table 16 (page 30) of the paper, FIM may not work very well with StarCoder2-15b, so I'd suggest trying StarCoder2-3b or StarCoder2-7b instead.

Thanks! Posting a screenshot of said table here for curious onlookers:
[image.png: screenshot of Table 16 from the paper]

skapadia-zalando changed discussion status to closed

Could you please provide an example prompt for the infilling task?

skapadia-zalando changed discussion status to open
BigCode org

@skapadia-zalando You could try this. I think it should work, though I haven't verified it with the latest model:

input_text = "<fim_prefix>def print_hello_world():\n    <fim_suffix>\n    print('Hello world!')<fim_middle>"
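
Since `tokenizer.decode(outputs[0])` returns the prompt along with the completion, here is a rough sketch of how the generated middle might be separated out afterwards. It assumes the underscore-style tokens above and that generation ends with `<|endoftext|>` (worth checking against `tokenizer.eos_token`). `extract_middle` is a hypothetical helper, and the sample string is hand-written, not real model output.

```python
FIM_MIDDLE = "<fim_middle>"
EOS = "<|endoftext|>"  # assumed end-of-text marker; check tokenizer.eos_token


def extract_middle(decoded: str) -> str:
    """Return only the text generated after <fim_middle>, up to the end-of-text marker."""
    # Everything before the first <fim_middle> is the prompt (prefix + suffix).
    _, sep, generated = decoded.partition(FIM_MIDDLE)
    if not sep:
        raise ValueError("decoded text contains no <fim_middle> token")
    # Drop the end-of-text marker and anything after it.
    return generated.split(EOS, 1)[0]


# Hand-written stand-in for tokenizer.decode(outputs[0]); not real model output.
decoded = (
    "<fim_prefix>def print_hello_world():\n    <fim_suffix>\n"
    "    print('Hello world!')<fim_middle>print('Hello')<|endoftext|>"
)
print(extract_middle(decoded))
```

In the snippet from the original post, passing `max_new_tokens` to `model.generate` instead of `max_length` may also help, because `max_length` counts the prompt tokens as well as the generated ones.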
