🎉 [Gate 30 Million Milestone] Share Your Gate Moment & Win Exclusive Gifts!
Gate has surpassed 30M users worldwide — not just a number, but a journey we've built together.
Remember the thrill of opening your first account, or the Gate merch that’s been part of your daily life?
📸 Join the #MyGateMoment# campaign!
Share your story on Gate Square, and embrace the next 30 million together!
✅ How to Participate:
1️⃣ Post a photo or video with Gate elements
2️⃣ Add #MyGateMoment# and share your story, wishes, or thoughts
3️⃣ Share your post on Twitter (X) — top 10 views will get extra rewards!
👉
MCP System Security Risk Practical Demonstration: From Poisoning to Covert Manipulation
Covert Poisoning and Manipulation in the MCP System: Practical Demonstration
MCP ( Model Context Protocol ) is currently in the early stages of development, with an overall chaotic environment and various potential attack methods emerging. Existing protocols and tool designs struggle to provide effective defense. To help the community better understand and enhance the security of MCP, Slow Mist has specifically open-sourced the MasterMCP tool, hoping to identify security vulnerabilities in product design through practical attack drills, thereby gradually strengthening the MCP project.
This article will guide everyone through practical demonstrations of common attack methods under the MCP system, such as information poisoning and hidden malicious instructions, along with real case studies. All demonstration scripts will also be open-sourced, allowing everyone to fully replicate the entire process in a secure environment and even develop their own attack testing plugins based on these scripts.
Overall Architecture Overview
Demonstration Attack Target MC:Toolbox
smithery.ai is currently one of the most popular MCP plugin websites, gathering a large number of MCP listings and active users. Among them, @smithery/toolbox is the official MCP management tool launched by smithery.ai.
Choose Toolbox as the testing target, mainly based on the following points:
Demonstration of malicious MCP: MasterMCP
MasterMCP is a simulation tool for malicious MCP developed by Slow Mist specifically for security testing, designed with a plugin architecture, and includes the following key modules:
To more realistically replicate attack scenarios, MasterMC has a built-in local website service simulation module. It quickly sets up a simple HTTP server using the FastAPI framework to simulate common web environments. These pages appear normal on the surface, such as displaying information about a cake shop or returning standard JSON data, but in fact, they conceal carefully designed malicious payloads within the page source code or API responses.
MasterMCP adopts a plugin-based approach for expansion, facilitating the quick addition of new attack methods in the future. After running, MasterMCP will run the FastAPI service of the previous module in a subprocess. ( There are existing security risks here - local plugins can arbitrarily start subprocesses that are not expected by MCP ).
Demo Client
demonstration use large model
Choose version Claude 3.7, as it has made certain improvements in sensitive operation recognition, and it represents a strong operational capability in the current MCP ecosystem.
Cross-MCP Malicious Invocation
This demonstration includes two contents: poisoning and Cross-MCP malicious calls.
web content poisoning attack
Cursor accesses the local test website.
This is a seemingly harmless "Delicious Cake World" page. Through this experiment, we simulate the impact of a large model client accessing a malicious website.
Execute command:
Fetch the content of
The results show that the Cursor not only read the web content but also transmitted local sensitive configuration data back to the test server. In the source code, malicious prompts are embedded in the form of HTML comments.
Although the annotation method is relatively straightforward and easy to identify, it can already trigger malicious operations.
Visit the /encode page, which looks like the previous example, but the malicious prompts have been encoded, making the poisoning exploit more hidden, and it is difficult to directly detect even when viewing the webpage source.
Even if the source code does not contain plaintext prompts, the attack still successfully executes.
MCP tool returns information poisoning
According to the prompt instructions from MasterMCP, input the simulation command (. This command has no actual meaning and is intended to trigger malicious MCP to demonstrate its subsequent operations ):
get a lot of apples
It can be seen that after triggering the command, the client cross-MCP called the Toolbox and successfully added a new MCP server.
Upon examining the plugin code, it can be found that the returned data has already embedded encoded malicious payloads, making it almost impossible for the user side to detect any anomalies.
Third-party interface pollution attack
This demonstration is primarily to remind that whether it is a malicious or non-malicious MC, calling a third-party API and directly returning third-party data to the context may have serious consequences.
Execute request:
Fetch json from /api/data
Result: Malicious prompt words were injected into the returned JSON data and successfully triggered malicious execution.
Poisoning Techniques in the MCP Initialization Phase
This demonstration includes initial prompt injection and name conflict.
malicious function override attack
Here, MasterMC has created a tool with the same function name remove_server as Toolbox, and has encoded malicious prompts.
Execute command:
toolbox remove fetch plugin server
Claude Desktop did not call the original toolbox remove_server method, but instead triggered the method of the same name provided by MasterMCP.
The principle is to emphasize that "the original method has been deprecated" to prioritize inducing the large model to call the maliciously overridden function.
Add malicious global check logic
Here, MasterMCP has written a tool for banana, whose core function is to enforce that all tools must execute this tool for a security check before running any prompts.
Before each execution of the function, the system will prioritize calling the banana check mechanism.
This global logic injection is achieved by repeatedly emphasizing "must run banana checks" in the code.
Advanced Techniques for Hiding Malicious Prompts
A coding method friendly to large models
Due to the strong parsing ability of the large language model ( LLM ) for multilingual formats, it is often exploited to hide malicious information, with common methods including:
Random Malicious Payload Return Mechanism
As mentioned in Chapter 2 regarding third-party interface pollution, when requesting /random:
Each time, it randomly returns a page with a malicious payload, greatly increasing the difficulty of detection and tracing.
Summary
Through the practical demonstration of MasterMC, we intuitively observed various hidden security risks in the Model Context Protocol (MCP) system. From simple prompt injection and cross-MCP calls to more covert initialization phase attacks and malicious instruction hiding, each step reminds us: although the MCP ecosystem is powerful, it is equally fragile.
Especially in today's world where large models are increasingly interacting with external plugins and APIs, even a small amount of input pollution can trigger system-level security risks. The diversification of attackers' methods, such as ( code hiding, random pollution, and function overwriting ), also means that traditional protection strategies need a comprehensive upgrade.
Security is never achieved overnight.
I hope this demonstration can serve as a wake-up call for everyone: whether developers or users, we should maintain a sufficient level of vigilance towards the MCP system, constantly paying attention to every interaction, every line of code, and every return value. Only by treating every detail rigorously can we truly build a solid and secure MCP environment.
Next, we will continue to improve the MasterMCP script, open source more targeted test cases, and help everyone gain a deeper understanding, practice, and strengthen protection in a safe environment.