Christian Schroeder de Witt
AI Safety
Illusory Attacks: Detectability Matters in Adversarial Attacks on Sequential Decision-Makers
We introduce illusory attacks, a novel form of adversarial attack on reinforcement learning agents whose statistical detectability is bounded.
Tim Franzmeyer, Stephen Marcus McAleer, Joao F. Henriques, Jakob Nicolaus Foerster, Philip Torr, Adel Bibi, Christian Schroeder de Witt
Rethinking Out-of-Distribution Detection for Reinforcement Learning: Advancing Methods for Evaluation and Detection
We propose DEXTER, a novel out-of-distribution dynamics detection method that achieves state-of-the-art performance in this setting.
Linas Nasvytis, Kai Sandbrink, Jakob Foerster, Tim Franzmeyer, Christian Schroeder de Witt