---
license: apache-2.0
datasets:
- AISE-TUDelft/Capybara
tags:
- code
---

# BinT5

- **Repository: https://github.com/AISE-TUDelft/Capybara-BinT5**
- **Paper: https://huggingface.co/papers/2301.01701**
- **Point of Contact: https://huggingface.co/aalkaswan**
- **Raw Data: https://zenodo.org/records/7229913**

BinT5 is a binary code summarization model. It is based on [CodeT5]() and fine-tuned on [Capybara]().

We offer five variants of the model:

| Name | Training Data |
|------|---------------|
| [BinT5-C](https://huggingface.co/AISE-TUDelft/BinT5-C) | C source |
| [BinT5-Decom](https://huggingface.co/AISE-TUDelft/BinT5-Decom) | Decompiled C binaries |
| [BinT5-Stripped](https://huggingface.co/AISE-TUDelft/BinT5-Stripped) | Stripped decompiled C binaries |
| [BinT5-Demi](https://huggingface.co/AISE-TUDelft/BinT5-Demi) | Demi-stripped decompiled C binaries |
| [BinT5-NoFunName](https://huggingface.co/AISE-TUDelft/BinT5-NoFunName) | Decompiled C binaries with the function name removed |

### Citation Information

```
@inproceedings{alkaswan2023extending,
  title={Extending Source Code Pre-Trained Language Models to Summarise Decompiled Binaries},
  author={Al-Kaswan, Ali and Ahmed, Toufique and Izadi, Maliheh and Sawant, Anand Ashok and Devanbu, Premkumar and van Deursen, Arie},
  booktitle={2023 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER)},
  pages={260--271},
  year={2023},
  organization={IEEE}
}
```
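
### Example Usage

The model variants listed in the table can be selected by repository ID. The following is a minimal sketch, not an official usage recipe: the helper names `VARIANTS`, `model_id`, and `summarize` are hypothetical, and it assumes the checkpoints can be loaded through the `transformers` auto classes like other CodeT5-style sequence-to-sequence models. The 512-token input limit, beam width, and output length are illustrative defaults, not values taken from the paper.

```python
# Repository IDs are taken from the variant table; everything else is an assumption.
VARIANTS = {
    "C": "AISE-TUDelft/BinT5-C",
    "Decom": "AISE-TUDelft/BinT5-Decom",
    "Stripped": "AISE-TUDelft/BinT5-Stripped",
    "Demi": "AISE-TUDelft/BinT5-Demi",
    "NoFunName": "AISE-TUDelft/BinT5-NoFunName",
}


def model_id(variant: str) -> str:
    """Map a short variant name (hypothetical convention) to its Hub repository ID."""
    try:
        return VARIANTS[variant]
    except KeyError:
        raise ValueError(
            f"unknown variant {variant!r}; expected one of {sorted(VARIANTS)}"
        ) from None


def summarize(code: str, variant: str = "Decom", max_new_tokens: int = 64) -> str:
    """Generate a natural-language summary for a C or decompiled function."""
    # Imported lazily so model_id() works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    repo = model_id(variant)
    # Assumption: the repository ships a tokenizer and config the auto classes can resolve.
    tokenizer = AutoTokenizer.from_pretrained(repo)
    model = AutoModelForSeq2SeqLM.from_pretrained(repo)

    # Assumption: inputs are truncated to 512 tokens, a common limit for T5-style encoders.
    inputs = tokenizer(code, return_tensors="pt", truncation=True, max_length=512)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Under these assumptions, calling `summarize(decompiled_function, variant="Stripped")` downloads the stripped-binary checkpoint on first use and returns a one-line description of the function. Picking the variant that matches the input (source, decompiled, stripped, and so on) is likely to matter, since each model was fine-tuned on a different form of the code.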