LLM Behavior Lab

A testing environment for analyzing Large Language Model behavior, covering energy consumption and comparative performance.

⚡

Energy Testing

Analyze energy consumption and carbon footprint of LLM modifications. Track Wh/token usage across different benchmarks.

• Real-time energy tracking
• Custom benchmark creation
• CO2 emissions calculation
• Modification impact analysis
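The CO2 figure follows directly from the measured Wh/token. A minimal sketch of that conversion, assuming a hypothetical grid-intensity constant (the ~400 gCO2/kWh global-average value and the sample numbers below are illustrative, not the lab's actual configuration):

```python
# Sketch: convert measured energy per token into a CO2 estimate.
# GRID_INTENSITY_G_PER_KWH is an assumption (rough global average);
# a real deployment should substitute its region's figure.
GRID_INTENSITY_G_PER_KWH = 400.0

def co2_grams(wh_per_token: float, tokens: int,
              intensity_g_per_kwh: float = GRID_INTENSITY_G_PER_KWH) -> float:
    """Estimate grams of CO2 emitted while generating `tokens` tokens."""
    kwh = wh_per_token * tokens / 1000.0  # Wh -> kWh
    return kwh * intensity_g_per_kwh

# Example: 0.05 Wh/token over a 2,000-token benchmark run.
print(round(co2_grams(0.05, 2000), 2))  # -> 40.0
```

The same formula scales to whole benchmark suites by summing token counts per run.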
📊

Model Comparison

A lightweight environment for side-by-side model testing: compare base and instruction-tuned models using basic performance metrics.

• Base vs RLHF comparison
• Basic performance metrics
• Simple model testing
• Token counting
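The basic metrics above can be derived from the timing statistics an Ollama generate response carries. A sketch, assuming Ollama's documented stats fields (`eval_count`, `eval_duration`, `prompt_eval_count`); the sample values are made up:

```python
# Sketch: per-response metrics from Ollama-style response statistics.
# Field names assumed from Ollama's /api/generate response format;
# the sample values below are illustrative only.
sample_response = {
    "eval_count": 128,               # tokens generated
    "eval_duration": 4_000_000_000,  # generation time in nanoseconds
    "prompt_eval_count": 42,         # prompt tokens
}

def tokens_per_second(resp: dict) -> float:
    """Generation throughput, converting nanosecond timings to seconds."""
    return resp["eval_count"] / (resp["eval_duration"] / 1e9)

def total_tokens(resp: dict) -> int:
    """Prompt tokens plus generated tokens, for token-counting reports."""
    return resp["prompt_eval_count"] + resp["eval_count"]

print(tokens_per_second(sample_response))  # -> 32.0
print(total_tokens(sample_response))       # -> 170
```

Running both a base and an RLHF model against the same prompt and comparing these numbers is the essence of the side-by-side view.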

Standalone Applications

For specialized use cases, standalone versions of each environment are available:

python3 app_energy.py (port 8002, energy testing only)
python3 app_comparison.py (port 8004, model comparison only)

Quick Start

1. Make sure Ollama is running: ollama serve
2. Pull a test model, e.g.: ollama pull llama3.1:8b
3. Choose your testing environment above and start analyzing!
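Step 1 can be verified programmatically before launching either app. A minimal sketch, assuming Ollama's default listen address (http://localhost:11434); the function name is hypothetical:

```python
# Sketch: check that an Ollama server is reachable before starting a lab.
# Assumes Ollama's default address; adjust OLLAMA_URL if you serve elsewhere.
import urllib.error
import urllib.request

OLLAMA_URL = "http://localhost:11434"

def ollama_is_up(url: str = OLLAMA_URL, timeout: float = 2.0) -> bool:
    """Return True if a server answers with HTTP 200 at `url`."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

if __name__ == "__main__":
    print("Ollama is running" if ollama_is_up()
          else "No server found; start one with: ollama serve")
```

Either app could call a check like this at startup and fail fast with a helpful message instead of erroring mid-benchmark.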