Security of Agentic AI Systems
(CS 7670)


This site is maintained for public access. If you are enrolled in the class, see the Canvas page for detailed information.

Class Information
Instructor
Course Description

Agentic AI systems are autonomous entities capable of perceiving, reasoning, learning, and acting toward goals using large language models (LLMs) with minimal human oversight. While these systems offer significant potential advantages, they also introduce systemic risks. Misaligned or poorly defined objectives can drive agents to take unsafe shortcuts, bypass safeguards, or behave deceptively. As AI agents become increasingly embedded in real-world applications, ensuring their security, reliability, and alignment is a critical priority. In this class we will study architectures and applications of agentic AI systems, examine threat models and attacks against them, and analyze existing proposed defenses.

The objectives of the course are the following:

  • Provide an overview of current frameworks for developing agentic AI systems, and of the threat models relevant in this context.
  • Read recent, state-of-the-art research papers from both security and machine learning conferences focused on attacks against agentic AI systems and proposed defenses, and discuss them in class. Students will actively participate in class discussions, and lead discussions on multiple papers during the semester.
  • Experiment with agentic AI systems through programming exercises and a semester-long research project. Students can select the topic of the research project.

Grade

The grade will be based on participation in paper discussions in class (PD), paper presentations and discussion leading in class (PL), one programming assignment (PA), and a research project (RP). Paper reviews are due by 9pm the day before the lecture in which the paper is discussed; submission is through Gradescope. The grade is computed as follows:

Grade = 15%*PD + 15%*PL + 20%*PA + 50%*RP.
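The formula above is a simple weighted average. As a minimal sketch (the component scores below are hypothetical examples on a 0-100 scale, not actual data):

```python
# Weights taken from the grading formula in the syllabus.
WEIGHTS = {"PD": 0.15, "PL": 0.15, "PA": 0.20, "RP": 0.50}

def final_grade(scores):
    """Weighted average: 15% paper discussions, 15% presentations/leads,
    20% programming assignment, 50% research project."""
    return sum(WEIGHTS[c] * scores[c] for c in WEIGHTS)

# Hypothetical example: strong project performance dominates the total.
print(final_grade({"PD": 90, "PL": 85, "PA": 80, "RP": 95}))  # 89.75
```

Note that the research project alone carries half the weight, so project performance dominates the final grade.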
Academic Integrity

Academic honesty and ethical behavior are required in this course, as they are in all courses at Northeastern University. There is zero tolerance for cheating.

You are encouraged to talk with the professor about any questions you have about what is permitted on any particular assignment.

Resources

Schedule

A tentative schedule is posted below for public access. The class platform is Canvas, available through myNortheastern. All additional material for the class and all class communication will be on Canvas; for the most up-to-date information, check Canvas.

Week Topics
Week 1 Introduction.
  • Class overview. Introduction to security, LLMs, and Agentic AI.
Week 2 Attacks against LLMs.
  • Taxonomy of adversarial attacks on predictive and generative AI. Chapters 1 and 2. PDF
  • Great, Now Write an Article About That: The Crescendo Multi-Turn LLM Jailbreak Attack. PDF
Week 3 Governing Agentic Systems.
  • Practices for Governing Agentic AI Systems PDF
  • Harms from Increasingly Agentic Algorithmic Systems. PDF
Week 4 Agent Directory Services
  • SAGA: A Security Architecture for Governing AI Agentic Systems. PDF
  • The AGNTCY Agent Directory Service: Architecture and Implementation. PDF
Week 5 Attacks against MCP
  • When MCP Servers Attack: Taxonomy, Feasibility, and Mitigation. PDF
  • MINDGUARD: Tracking, Detecting, and Attributing MCP Tool Poisoning Attack via Decision Dependence Graph. PDF
  • Securing the Model Context Protocol (MCP): Risks, Controls, and Governance. PDF
Week 6 Attacks against web-agents
  • Mind the Web: The Security of Web Use Agents PDF
  • WASP: Benchmarking Web Agent Security Against Prompt Injection Attacks PDF
  • Context Manipulation Attacks: Web Agents are Susceptible to Corrupted Memory. PDF
Week 7 Attacks against multi-agent systems
  • Multi-Agent Systems Execute Arbitrary Malicious Code PDF
  • On the Resilience of LLM-Based Multi-Agent Collaboration with Faulty Agents PDF
  • Demonstrations of Integrity Attacks in Multi-Agent Systems PDF
Week 8 Project proposal presentation.
Week 9 Spring break.
Week 10 Solutions against prompt injection.
  • Defeating Prompt Injections by Design PDF
  • StruQ: Defending Against Prompt Injection with Structured Queries PDF
Week 11 Defenses against attacks in agent systems
  • Breaking and Fixing Defenses Against Control-Flow Hijacking in Multi-Agent Systems. PDF
  • ACE: A Security Architecture for LLM-Integrated App Systems PDF
  • Progent: Programmable Privilege Control for LLM Agents PDF
Week 12 Defenses against attacks in web-based agents
  • BrowseSafe: Understanding and Preventing Prompt Injection Within AI Browser Agents. PDF
  • AI Kill Switch for Malicious Web-Based LLM Agent PDF
Week 13 Defenses against attacks in agent systems
  • Systems Security Foundations for Agentic Computing PDF
  • Trusted AI Agents in the Cloud PDF
Week 14 Privacy issues in agentic systems
  • Terrarium: Revisiting the Blackboard for Multi-Agent Safety, Privacy, and Security Studies PDF
  • Privacy in Action: Towards Realistic Privacy Mitigation and Evaluation for LLM-Powered Agents PDF
Week 15 Automated red-teaming
  • RedCodeAgent: Automatic Red-Teaming Agent Against Diverse Code Agents PDF
  • Multi-Agent Penetration Testing AI for the Web PDF
Week 16 Project presentations.



Additional Reading List




Copyright © 2025 Cristina Nita-Rotaru. Send your comments and questions to Cristina Nita-Rotaru.