|
|
--- |
|
|
library_name: peft |
|
|
base_model: codellama/CodeLlama-7b-Instruct-hf |
|
|
tags: |
|
|
- terraform |
|
|
- terraform-configuration |
|
|
- infrastructure-as-code |
|
|
- iac |
|
|
- hashicorp |
|
|
- codellama |
|
|
- lora |
|
|
- qlora |
|
|
- peft |
|
|
- code-generation |
|
|
- devops |
|
|
- cloud |
|
|
- aws |
|
|
- azure |
|
|
- gcp |
|
|
- multi-cloud |
|
|
- automation |
|
|
- configuration-management |
|
|
- cloud-infrastructure |
|
|
license: apache-2.0 |
|
|
language: |
|
|
- en |
|
|
pipeline_tag: text-generation |
|
|
--- |
|
|
|
|
|
# terraform-cloud-codellama-7b |
|
|
|
|
|
**RECOMMENDED MODEL** - An advanced LoRA fine-tuned model for Terraform infrastructure-as-code generation across multiple cloud providers (AWS, Azure, GCP). It generates Terraform configurations in HCL, from single resources to multi-cloud automation setups.
|
|
|
|
|
## Model Description |
|
|
|
|
|
This is the **enhanced model**: an advanced version of terraform-codellama-7b that has been further trained on public AWS, Azure, and GCP documentation. It improves on the Stage 1 model for multi-cloud Terraform development, with deeper coverage of cloud provider-specific resources and best practices.
|
|
|
|
|
### Key Features |
|
|
|
|
|
- **Multi-Cloud Support**: Trained on AWS, Azure, and GCP documentation |
|
|
- **Enhanced Performance**: Improves on the Stage 1 terraform-codellama-7b model
|
|
- **Production Ready**: Optimized for real-world multi-cloud infrastructure development |
|
|
- **Comprehensive Coverage**: Handles complex cloud provider-specific configurations |
|
|
- **Efficient Training**: Uses QLoRA (4-bit quantization + LoRA) for memory efficiency |
|
|
|
|
|
## Model Details |
|
|
|
|
|
- **Developed by**: Rafi Al Attrach, Patrick Schmitt, Nan Wu, Helena Schneider, Stefania Saju (TUM + IBM Research Project) |
|
|
- **Model type**: LoRA fine-tuned CodeLlama (Enhanced) |
|
|
- **Language(s)**: English |
|
|
- **License**: Apache 2.0 |
|
|
- **Finetuned from**: [codellama/CodeLlama-7b-Instruct-hf](https://huggingface.co/codellama/CodeLlama-7b-Instruct-hf) |
|
|
- **Training method**: QLoRA (4-bit quantization + LoRA) |
|
|
- **Stage 1 Model**: Built on [rafiaa/terraform-codellama-7b](https://huggingface.co/rafiaa/terraform-codellama-7b)
|
|
|
|
|
### Technical Specifications |
|
|
|
|
|
- **Base Model**: CodeLlama-7b-Instruct-hf |
|
|
- **LoRA Rank**: 64 |
|
|
- **LoRA Alpha**: 16 |
|
|
- **Target Modules**: q_proj, v_proj |
|
|
- **Training Epochs**: 3 (Stage 1) + Additional training (Stage 2) |
|
|
- **Max Sequence Length**: 512 |
|
|
- **Quantization**: 4-bit (fp4) |
|
|
|
|
|
## Uses |
|
|
|
|
|
### Direct Use |
|
|
|
|
|
This model is designed for: |
|
|
- **Multi-cloud Terraform development** |
|
|
- **AWS resource configuration** (EC2, S3, RDS, Lambda, etc.) |
|
|
- **Azure resource management** (Virtual Machines, Storage Accounts, App Services, etc.) |
|
|
- **GCP resource deployment** (Compute Engine, Cloud Storage, Cloud SQL, etc.) |
|
|
- **Complex infrastructure orchestration** |
|
|
- **Cloud provider-specific best practices** |
|
|
|
|
|
### Example Use Cases |
|
|
|
|
|
```python |
|
|
# Generate AWS multi-service infrastructure |
|
|
prompt = "Create a Terraform configuration for an AWS application with VPC, EC2, RDS, and S3" |
|
|
``` |
|
|
|
|
|
```python |
|
|
# Generate Azure App Service with database |
|
|
prompt = "Create a Terraform configuration for an Azure App Service with PostgreSQL database" |
|
|
``` |
|
|
|
|
|
```python |
|
|
# Generate GCP Kubernetes cluster |
|
|
prompt = "Create a Terraform configuration for a GCP GKE cluster with node pools" |
|
|
``` |
|
|
|
|
|
```python |
|
|
# Generate multi-cloud setup |
|
|
prompt = "Create a Terraform configuration for a hybrid cloud setup using AWS and Azure" |
|
|
``` |
|
|
|
|
|
## How to Get Started |
|
|
|
|
|
### Installation |
|
|
|
|
|
```bash |
|
|
pip install transformers torch peft accelerate bitsandbytes |
|
|
``` |
|
|
|
|
|
### Loading the Model |
|
|
|
|
|
#### GPU Usage (Recommended) |
|
|
```python |
|
|
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
import torch

# Load base model with 4-bit (fp4) quantization on GPU
base_model = "codellama/CodeLlama-7b-Instruct-hf"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="fp4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    torch_dtype=torch.float16,
    device_map="auto"
)
|
|
|
|
|
# Load LoRA adapter |
|
|
model = PeftModel.from_pretrained(model, "rafiaa/terraform-cloud-codellama-7b") |
|
|
tokenizer = AutoTokenizer.from_pretrained(base_model) |
|
|
|
|
|
# Set pad token |
|
|
if tokenizer.pad_token is None: |
|
|
tokenizer.pad_token = tokenizer.eos_token |
|
|
``` |
|
|
|
|
|
#### CPU Usage (Alternative) |
|
|
```python |
|
|
from transformers import AutoTokenizer, AutoModelForCausalLM |
|
|
from peft import PeftModel |
|
|
import torch |
|
|
|
|
|
# Load base model (CPU compatible) |
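# Note: fp32 weights for a 7B model need roughly 28 GB of CPU RAM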
|
|
base_model = "codellama/CodeLlama-7b-Instruct-hf" |
|
|
model = AutoModelForCausalLM.from_pretrained( |
|
|
base_model, |
|
|
torch_dtype=torch.float32, |
|
|
device_map="cpu" |
|
|
) |
|
|
|
|
|
# Load LoRA adapter |
|
|
model = PeftModel.from_pretrained(model, "rafiaa/terraform-cloud-codellama-7b") |
|
|
tokenizer = AutoTokenizer.from_pretrained(base_model) |
|
|
|
|
|
# Set pad token |
|
|
if tokenizer.pad_token is None: |
|
|
tokenizer.pad_token = tokenizer.eos_token |
|
|
``` |
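
For deployment without the PEFT runtime, the adapter can also be merged into the base weights. A minimal sketch, assuming enough memory for the full fp16 model (merging is most reliable with an unquantized base, so it is reloaded here without 4-bit quantization); the output directory name is illustrative:

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel
import torch

# Reload the base model in fp16 (not 4-bit); merging works on full-precision weights
base = AutoModelForCausalLM.from_pretrained(
    "codellama/CodeLlama-7b-Instruct-hf",
    torch_dtype=torch.float16,
)
merged = PeftModel.from_pretrained(base, "rafiaa/terraform-cloud-codellama-7b")
merged = merged.merge_and_unload()  # folds the LoRA deltas into the base weights
merged.save_pretrained("terraform-cloud-codellama-7b-merged")  # illustrative path
```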
|
|
|
|
|
### Usage Example |
|
|
|
|
|
```python |
|
|
def generate_terraform(prompt, max_length=512):
    # Move inputs to the model's device (needed when device_map places the model on GPU)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_length=max_length,  # total length, including the prompt tokens
            temperature=0.7,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id
        )

    return tokenizer.decode(outputs[0], skip_special_tokens=True)
|
|
|
|
|
# Example: Multi-cloud infrastructure |
|
|
prompt = """ |
|
|
Create a Terraform configuration for a multi-cloud setup: |
|
|
- AWS: VPC with public/private subnets, EC2 instances |
|
|
- Azure: Storage account and App Service |
|
|
- GCP: Cloud SQL database |
|
|
""" |
|
|
|
|
|
result = generate_terraform(prompt) |
|
|
print(result) |
|
|
``` |
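
Note that CodeLlama-Instruct base models are usually prompted with `[INST] ... [/INST]` wrappers. The fine-tuning prompt template for this adapter is not documented here, so treat the following as an assumption and compare results against the plain prompts above:

```python
# Assumption: the standard Llama-2/CodeLlama instruct wrapper; the adapter's
# actual training template is not documented, so compare with unwrapped prompts.
instruct_prompt = (
    "[INST] Create a Terraform configuration for an AWS VPC "
    "with two public subnets and an internet gateway. [/INST]"
)
print(generate_terraform(instruct_prompt))
```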
|
|
|
|
|
### Advanced Usage |
|
|
|
|
|
```python |
|
|
# Cloud-specific prompts |
|
|
aws_prompt = "Create a Terraform configuration for AWS EKS cluster with managed node groups" |
|
|
azure_prompt = "Create a Terraform configuration for Azure Kubernetes Service (AKS)" |
|
|
gcp_prompt = "Create a Terraform configuration for GCP Cloud Run service" |
|
|
|
|
|
# Generate configurations |
|
|
aws_config = generate_terraform(aws_prompt) |
|
|
azure_config = generate_terraform(azure_prompt) |
|
|
gcp_config = generate_terraform(gcp_prompt) |
|
|
``` |
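
Generated configurations are plain HCL strings, so they can be written straight to `.tf` files and inspected with standard Terraform tooling. A small sketch (directory names are illustrative):

```python
from pathlib import Path

# Write each generated configuration to its own working directory so it can
# be reviewed with `terraform fmt` and `terraform validate` before any apply.
for name, config in [("aws", aws_config), ("azure", azure_config), ("gcp", gcp_config)]:
    out_dir = Path(name)
    out_dir.mkdir(exist_ok=True)
    (out_dir / "main.tf").write_text(config)
```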
|
|
|
|
|
## Training Details |
|
|
|
|
|
### Training Data |
|
|
|
|
|
**Stage 1**: Public Terraform Registry documentation |
|
|
**Stage 2**: Additional training on: |
|
|
- **AWS Documentation**: EC2, S3, RDS, Lambda, VPC, IAM, etc. |
|
|
- **Azure Documentation**: Virtual Machines, Storage Accounts, App Services, Key Vault, etc. |
|
|
- **GCP Documentation**: Compute Engine, Cloud Storage, Cloud SQL, GKE, etc. |
|
|
|
|
|
### Training Procedure |
|
|
|
|
|
- **Method**: QLoRA (4-bit quantization + LoRA) |
|
|
- **Two-Stage Training**: |
|
|
1. Terraform Registry documentation |
|
|
2. Cloud provider documentation (AWS, Azure, GCP) |
|
|
- **LoRA Rank**: 64 |
|
|
- **LoRA Alpha**: 16 |
|
|
- **Target Modules**: q_proj, v_proj |
|
|
- **Training Epochs**: 3 (Stage 1) + Additional training (Stage 2) |
|
|
- **Max Sequence Length**: 512 |
|
|
- **Quantization**: 4-bit (fp4) |
|
|
|
|
|
### Training Hyperparameters |
|
|
|
|
|
- **Training regime**: 4-bit mixed precision |
|
|
- **LoRA Dropout**: 0.0 |
|
|
- **Learning Rate**: Optimized for QLoRA training |
|
|
- **Batch Size**: Optimized for memory efficiency |
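
The listed procedure and hyperparameters map onto a standard peft + bitsandbytes setup roughly like the sketch below. Dataset preparation, learning rate, and batch size are not published, so this shows only the model and adapter configuration:

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
import torch

# 4-bit (fp4) quantized base, as stated in the card
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="fp4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    "codellama/CodeLlama-7b-Instruct-hf",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Adapter configuration taken from the specifications above
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.0,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```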
|
|
|
|
|
## Performance Comparison |
|
|
|
|
|
| Model | Terraform Knowledge | AWS Support | Azure Support | GCP Support | Multi-Cloud Capability | |
|
|
|-------|-------------------|-------------|---------------|-------------|-------------------| |
|
|
| terraform-codellama-7b | Excellent | Limited | Limited | Limited | Basic | |
|
|
| **terraform-cloud-codellama-7b** | Excellent | Excellent | Excellent | Excellent | Advanced | |
|
|
|
|
|
## Limitations and Bias |
|
|
|
|
|
### Known Limitations |
|
|
|
|
|
- **Context Length**: Limited to 512 tokens due to training configuration |
|
|
- **Domain Specificity**: Optimized for Terraform and cloud infrastructure |
|
|
- **Base Model Limitations**: Inherits limitations from CodeLlama-7b-Instruct-hf |
|
|
- **Cloud Provider Updates**: May not include the latest cloud provider features |
|
|
|
|
|
### Recommendations |
|
|
|
|
|
- Use for Terraform and cloud infrastructure tasks |
|
|
- Validate generated configurations before deployment |
|
|
- Consider the 512-token context limit for complex configurations (a helper sketch follows this list)
|
|
- For production use, always review and test generated code |
|
|
- Stay updated with cloud provider documentation for latest features |
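
Two of these recommendations can be automated. A sketch, assuming the `terraform` CLI is installed and the target directory has been initialized with `terraform init`; `check_prompt_length` and `validate_config` are illustrative helpers, not part of this model's API:

```python
import subprocess
from pathlib import Path

MAX_CONTEXT = 512  # training-time sequence limit noted above

def check_prompt_length(prompt):
    # Count prompt tokens so prompt + completion stays inside the 512-token window
    n_tokens = len(tokenizer(prompt)["input_ids"])
    if n_tokens >= MAX_CONTEXT:
        raise ValueError(f"Prompt uses {n_tokens} tokens; the limit is {MAX_CONTEXT}")
    return n_tokens

def validate_config(hcl, workdir="generated"):
    # Write the generated HCL and let Terraform itself check it
    out = Path(workdir)
    out.mkdir(exist_ok=True)
    (out / "main.tf").write_text(hcl)
    return subprocess.run(
        ["terraform", "validate"], cwd=out, capture_output=True, text=True
    )
```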
|
|
|
|
|
## Environmental Impact |
|
|
|
|
|
- **Training Method**: QLoRA reduces computational requirements significantly |
|
|
- **Hardware**: Trained using efficient 4-bit quantization |
|
|
- **Carbon Footprint**: Reduced compared to full fine-tuning due to QLoRA efficiency |
|
|
- **Two-Stage Approach**: Efficient incremental training |
|
|
|
|
|
## Citation |
|
|
|
|
|
If you use this model in your research, please cite: |
|
|
|
|
|
```bibtex |
|
|
@misc{terraform-cloud-codellama-7b, |
|
|
title={terraform-cloud-codellama-7b: A Multi-Cloud LoRA Fine-tuned Model for Terraform Code Generation}, |
|
|
author={Rafi Al Attrach and Patrick Schmitt and Nan Wu and Helena Schneider and Stefania Saju}, |
|
|
year={2024}, |
|
|
url={https://huggingface.co/rafiaa/terraform-cloud-codellama-7b} |
|
|
} |
|
|
``` |
|
|
|
|
|
## Related Models |
|
|
|
|
|
- **Base Model**: [codellama/CodeLlama-7b-Instruct-hf](https://huggingface.co/codellama/CodeLlama-7b-Instruct-hf) |
|
|
- **Stage 1 Model**: [rafiaa/terraform-codellama-7b](https://huggingface.co/rafiaa/terraform-codellama-7b) |
|
|
- **This Model**: [rafiaa/terraform-cloud-codellama-7b](https://huggingface.co/rafiaa/terraform-cloud-codellama-7b) (Recommended) |
|
|
|
|
|
## Model Card Contact |
|
|
|
|
|
- **Author**: rafiaa |
|
|
- **Model Repository**: [HuggingFace Model](https://huggingface.co/rafiaa/terraform-cloud-codellama-7b) |
|
|
- **Issues**: Please report issues through the HuggingFace model page |
|
|
|
|
|
## Acknowledgments |
|
|
|
|
|
- **Research Project**: Early 2024 research project at TUM + IBM |
|
|
- **Training Data**: Public documentation from Terraform Registry, AWS, Azure, and GCP |
|
|
- **Base Model**: Meta's CodeLlama-7b-Instruct-hf |
|
|
- **Training Method**: QLoRA for efficient fine-tuning |
|
|
|
|
|
--- |
|
|
|
|
|
*This model represents the culmination of a two-stage fine-tuning approach, combining Terraform expertise with comprehensive cloud provider knowledge for superior infrastructure-as-code generation.* |