It’s titled “Navigating LLM Threats: Detecting Prompt Injections and Jailbreaks”. The virtual event will air at 10am PST on Tuesday, January 9th. Can’t attend? No worries, just like the rest of Deep’s events, it will be streamed. Just be sure to register!
If you’re not familiar with these concepts, there’s a short course on LearnPrompting.org on prompt injecting, which is a way to maneuver the LLM into doing something other than intended. Jailbreaking uses prompt injections to bypass safety features. Understanding these techniques is the start of learning how to detect and mitigate hijacking.
While it won’t be an issue in the work I’m presently doing, which is training LLMs, it could be when the GPT store opens. As I’ve mentioned, I made a GPT for fun. I did add some guardrails, but there’s not any personal or proprietary information for a would-be hacker to obtain. Still, I may build something else that could be more meaty, so I am interested to see how someone might sneak in and throw wrenches into my creation.
Hope you can catch the live event. I’ll be watching the streamed version and will report back.