Can Large Language Models Solve Security Challenges? We test LLMs' ability to interact with and break out of shell environments using the OverTheWire wargames, showing the models' surprising ability to perform action-oriented cyberexploits in shell environments.
llm-security-challenge

This is our open-source submission for the Eval Hackathon hosted by Apart Research.

Check out our paper: Can Large Language Models Solve Security Challenges?

Steps to replicate:

  1. `conda create -n llm-security python=3.10`, then activate the environment with `conda activate llm-security`
  2. `pip install -r requirements.txt`
  3. If you want to query the API instead of reading existing results, make sure your OpenAI API key is set as an environment variable. On Linux, add `export OPENAI_API_KEY=xxxx` to `~/.bashrc` and run `source ~/.bashrc` to activate it.
  4. To generate the results: `python main.py -m gpt3 -m gpt3.5 -m gpt4`
  5. To read previously generated results: `python main.py -m gpt3 -m gpt3.5 -m gpt4 --use_prev=True --version=20230820-19:12`
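For reference, the command-line interface used in the steps above could be sketched with `argparse` as follows. This is a hypothetical sketch, not the actual contents of `main.py`: the argument names (`--model`, `--use_prev`, `--version`) are assumptions inferred from the commands shown, with a repeatable `-m` flag collecting the models to evaluate.

```python
import argparse
import os

def build_parser():
    # Hypothetical reconstruction of main.py's CLI -- argument behavior is
    # assumed from the example commands in the steps above.
    parser = argparse.ArgumentParser(
        description="Run or replay LLM security-challenge evaluations")
    parser.add_argument("-m", "--model", dest="models", action="append",
                        help="Model to evaluate; repeat the flag for multiple models")
    parser.add_argument("--use_prev", type=bool, default=False,
                        help="Read previously generated results instead of querying the API")
    parser.add_argument("--version",
                        help="Timestamped results version, e.g. 20230820-19:12")
    return parser

# Parse the flags from step 5 above.
args = build_parser().parse_args(
    ["-m", "gpt3", "-m", "gpt3.5", "-m", "gpt4",
     "--use_prev=True", "--version=20230820-19:12"])

# Per step 3, the API key would be read from the environment when querying.
api_key = os.environ.get("OPENAI_API_KEY")
```

With `action="append"`, `args.models` collects `["gpt3", "gpt3.5", "gpt4"]`, which matches how the same `-m` flag is repeated once per model in both commands.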
