---
license: llama2
base_model:
- codellama/CodeLlama-13b-Instruct-hf
pipeline_tag: text-generation
---

CodeLlama-13b-MORepair is a program repair model fine-tuned from CodeLlama-13b-Instruct with MORepair, a multi-objective fine-tuning framework. The model is designed to improve automated program repair by learning both code transformations and the reasoning behind repairs.

[Paper](https://arxiv.org/abs/2404.12636) | [Code](https://github.com/buaabarty/morepair) | [Colab](https://colab.research.google.com/drive/1vlabdN5Oucm-5kVtMHuEw-kvqDOtB5hg)

## Citation

If you use this model in your research, please cite:

```bibtex
@article{10.1145/3735129,
  author    = {Yang, Boyang and Tian, Haoye and Ren, Jiadong and Zhang, Hongyu and Klein, Jacques and Bissyande, Tegawende and Le Goues, Claire and Jin, Shunfu},
  title     = {MORepair: Teaching LLMs to Repair Code via Multi-Objective Fine-Tuning},
  year      = {2025},
  publisher = {Association for Computing Machinery},
  issn      = {1049-331X},
  url       = {https://doi.org/10.1145/3735129},
  doi       = {10.1145/3735129},
  journal   = {ACM Trans. Softw. Eng. Methodol.},
}
```

## Model Description

- **Base Model**: CodeLlama-13b-Instruct
- **Training Technique**: Multi-objective fine-tuning with the MORepair framework
- **Supported Languages**: Primarily evaluated on C++ and Java; likely generalizes to other languages
- **Primary Use**: Automated program repair
- **License**: Llama 2 Community License
- **Evaluation Benchmarks**: [EvalRepair-Java](https://huggingface.co/datasets/barty/EvalRepair-Java) | [EvalRepair-C++](https://huggingface.co/datasets/barty/EvalRepair-Cpp) | [D4J-Repair](https://huggingface.co/datasets/barty/D4J-Repair) | [SWE-Repair](https://huggingface.co/datasets/barty/SWE-Repair)

## Training Details

### Training Data

- **Dataset**: [TutorLLMCode](https://tutorcode.org/docs/)
- **Size**: 1,535 pairs of buggy and repaired code
- **Nature**: Programming task corrections with LLM-generated repair guidance

### Training Approach

The model was trained with MORepair, which employs:

- Multi-objective learning with two objectives:
  1. Generating repaired code
  2. Producing repaired code with explanatory guidance
- QLoRA fine-tuning (only 1.84% of parameters updated; see the sketch after this list)
- NEFTune for improved generalization
- LLM-generated guidance for understanding repair logic
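For readers who want to reproduce a comparable setup, the sketch below shows how QLoRA fine-tuning with NEFTune can be wired together using the `peft` and `trl` libraries. It illustrates only the single-objective plumbing, not MORepair's multi-objective loss, and all hyperparameter values and the dataset file name are illustrative assumptions rather than the paper's exact configuration.

```python
# Illustrative sketch only: QLoRA + NEFTune fine-tuning with peft/trl.
# Hyperparameters and the dataset file below are assumptions, not the
# released MORepair training configuration.
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from trl import SFTConfig, SFTTrainer

base = "codellama/CodeLlama-13b-Instruct-hf"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base,
    quantization_config=BitsAndBytesConfig(  # 4-bit base weights (the "Q" in QLoRA)
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
    device_map="auto",
)

peft_config = LoraConfig(  # low-rank adapters; only these weights are trained
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

args = SFTConfig(
    output_dir="morepair-qlora",
    dataset_text_field="text",
    neftune_noise_alpha=5,           # NEFTune: noisy embeddings during training
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,
    learning_rate=2e-4,
    num_train_epochs=3,
)

# A JSON file of (buggy code, repaired code + guidance) training texts would
# go here; "tutorllmcode_pairs.json" is a hypothetical placeholder name.
train_dataset = load_dataset("json", data_files="tutorllmcode_pairs.json")["train"]

trainer = SFTTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    peft_config=peft_config,
)
trainer.train()
```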
## Usage

Here's how to use the model with the Hugging Face Transformers library:

### Installation

```bash
pip install transformers torch accelerate bitsandbytes
```

### Basic Usage

````python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Load model and tokenizer
model_name = "barty/CodeLlama-13B-MORepair"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    load_in_8bit=True,  # 8-bit quantization; requires bitsandbytes
    torch_dtype=torch.float16
)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

def repair_code(buggy_code, filename="example.java"):
    # Construct the prompt in the format the model expects
    prompt = f"""[INST] This is an incorrect code({filename}):
```java
{buggy_code}
```
You are a software engineer. Can you repair the incorrect code?
[/INST] ```java
"""

    # Generate a candidate repair, bounding only the newly generated tokens
    output = pipe(
        prompt,
        min_new_tokens=64,
        max_new_tokens=500,
        temperature=1.0,
        do_sample=True
    )

    # Extract the generated code: everything after the instruction block
    # (it may still include the closing ``` fence)
    full_text = output[0]['generated_text']
    fixed_code = full_text.split('[/INST]')[1].strip()
    return full_text, fixed_code

# Example usage
buggy_code = """
public static int findMinRotated(int[] arr) {
    int left = 0;
    int right = arr.length - 1;
    while (left < right) {
        int mid = (left + right) / 2;
        if (arr[mid] > arr[right])
            left = mid; // Bug: should be mid + 1
        else
            right = mid;
    }
    return arr[left];
}
"""

full_response, fixed_code = repair_code(buggy_code)
print("Fixed code:")
print(fixed_code)
````

### Important Parameters

- `load_in_8bit=True`: Enables 8-bit quantization for efficient inference
- `temperature=1.0`: Controls randomness in generation
- `do_sample=True`: Enables sampling-based generation
- `min_new_tokens`: Minimum number of newly generated tokens
- `max_new_tokens`: Maximum number of newly generated tokens

## Limitations

- Performance varies across programming languages
- May require multiple attempts to generate a correct fix (see the sketch at the end of this card)
- Should be used with appropriate test cases to validate repairs
- May not handle very complex or multi-file program repairs

## Technical Specifications

- **Architecture**: Based on CodeLlama-13b-Instruct
- **Parameters**: Same as base model (13B)
- **Fine-tuning Method**: QLoRA + NEFTune
- **Context Window**: Same as CodeLlama-13b-Instruct
- **Input Format**: Code snippets with optional repair guidance
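## Example: Validating Candidate Repairs

Because a single sample may not yield a plausible patch, a common pattern is to draw several candidates and keep only those that pass the project's tests. Below is a minimal sketch that builds on the `repair_code` function defined above; `run_tests` is a hypothetical project-specific hook you supply (e.g., compile the file and run its JUnit suite), not part of this model or any library.

```python
# Minimal sketch: sample several candidate repairs and keep the ones that
# pass the tests. `run_tests` is a hypothetical callable supplied by you;
# it should return True when the patched code passes the test suite.
def repair_with_validation(buggy_code, run_tests, attempts=10):
    plausible = []
    for _ in range(attempts):
        _, candidate = repair_code(buggy_code)  # repair_code is defined above
        if run_tests(candidate):
            plausible.append(candidate)
    return plausible
```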