Sampling is a mechanism that allows an MCP server to access a language model like Claude through the connected MCP client, instead of calling the model directly. Without sampling, a server that needs text generation would require its own API key, authentication logic, cost management, and full Claude integration code. With sampling, the server creates a prompt and asks the client to make the Claude call on its behalf. The client, which already has a connection to Claude, handles the request and returns the generated text. This shifts both complexity and cost from the server to the client.