BinT5

BinT5 is a binary code summarization model. It is based on CodeT5 and fine-tuned on the Capybara dataset.

We offer five variants of the model, each fine-tuned on a different form of the training data:

Name              Training Data
BinT5-C           C Source
BinT5-Decom       Decompiled C Binaries
BinT5-Stripped    Stripped Decompiled C Binaries
BinT5-Demi        Demi-stripped Decompiled C Binaries
BinT5-NoFunName   Decompiled C Binaries with the Function Name removed
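
The variants listed above can be loaded like any other sequence-to-sequence checkpoint. The following is a minimal sketch, not an official usage recipe: it assumes the Hugging Face `transformers` library (with PyTorch), assumes every variant is published under the `AISE-TUDelft` organization with the same naming pattern as `AISE-TUDelft/BinT5-NoFunName` (the only repository id confirmed on this page), and assumes each repository ships its own tokenizer files. The helper names `repo_id` and `summarize` and the generation settings are illustrative choices.

```python
# Variant names come from the table above. Only the BinT5-NoFunName
# repository id is confirmed on this page; the others are ASSUMED to
# follow the same "<org>/<variant>" pattern.
ORG = "AISE-TUDelft"
VARIANTS = (
    "BinT5-C",
    "BinT5-Decom",
    "BinT5-Stripped",
    "BinT5-Demi",
    "BinT5-NoFunName",
)


def repo_id(variant: str) -> str:
    """Return the (assumed) Hub repository id for a BinT5 variant."""
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant {variant!r}; expected one of {VARIANTS}")
    return f"{ORG}/{variant}"


def summarize(code: str, variant: str = "BinT5-NoFunName", max_new_tokens: int = 64) -> str:
    """Generate a short natural-language summary for a C function.

    Hypothetical helper: downloads the checkpoint on first use.
    """
    # Imported lazily so repo_id() works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    name = repo_id(variant)
    # ASSUMPTION: the repository includes tokenizer files.
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSeq2SeqLM.from_pretrained(name)

    # Truncate long functions to the encoder's input budget.
    inputs = tokenizer(code, return_tensors="pt", truncation=True, max_length=512)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize(decompiled_function_text, variant="BinT5-Decom")` would then return the generated summary as a string. It is probably best to pick the variant whose training data matches the input, for example `BinT5-Stripped` for decompiled code from stripped binaries. If a repository turns out not to include tokenizer files, loading the tokenizer from a CodeT5 checkpoint such as `Salesforce/codet5-base` may work instead, though this has not been verified here.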

Citation Information

@inproceedings{alkaswan2023extending,
  title={Extending Source Code Pre-Trained Language Models to Summarise Decompiled Binaries},
  author={Al-Kaswan, Ali and Ahmed, Toufique and Izadi, Maliheh and Sawant, Anand Ashok and Devanbu, Premkumar and van Deursen, Arie},
  booktitle={2023 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER)},
  pages={260--271},
  year={2023},
  organization={IEEE}
}
