Spring AI Recipes
AI isn’t just for Python developers anymore. With Spring AI, Java and Spring developers can build powerful AI agents and intelligent applications directly within the platform they already trust for enterprise software.
As I've worked with Spring AI for the past few years, I've created a sizable number of “recipes” in my notebook. Rather than keep them to myself, I've decided to share them for anyone who may learn from them. (All recipes are hosted at Medium.com.)
If you like what you read here, then you should check out my books.
All recipes
Using ElevenLabs Voices
Your AI application can listen and speak, but it doesn't have to sound like every other AI assistant. In this recipe, you'll replace OpenAI's built-in voices with ElevenLabs, unlocking a vast library of high-quality voices that can give your application its own unique personality.
Adding Voice to AI
Talking to an AI application feels surprisingly natural — until it answers with a wall of text. In this recipe, you’ll close that gap by adding text-to-speech support, enabling your Spring AI application to speak its responses aloud.
Talking to AI
Why type your prompts when you can just say them out loud? In this recipe, we'll add voice input to our chat loop by recording audio from the microphone and transcribing it into text with Spring AI.
Tool-Based RAG
Traditional RAG retrieves context for every question, whether it's needed or not. What if the LLM could decide for itself when retrieval is necessary and only fetch additional information when it would actually help answer the question?
Essential RAG
Even the most capable LLM is limited by what it knew when it was trained. RAG breaks through that limitation by supplying relevant information at prompt time, allowing your applications to answer questions about documents the model has never seen before.
Semantic Caching with Any Vector Store
Semantic caching can use any vector store. If you're already using a vector store such as Qdrant, you can use it to speed up semantically similar requests and reduce token usage without adding another database to your stack.
Semantic Caching (with Redis)
Traditional caches only work when users ask the exact same question twice. Semantic caching goes a step further, recognizing when two differently worded questions mean the same thing and serving a cached answer without ever calling the LLM again.
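To make the idea concrete, here is a minimal sketch of a semantic cache lookup in plain Java. It is not the recipe's implementation: the class name SemanticCacheSketch, the in-memory list standing in for Redis or another vector store, the hand-written embedding vectors, and the 0.9 similarity threshold are all assumptions chosen for illustration. In a real application the embeddings would come from an embedding model.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Toy semantic cache: an in-memory list stands in for a real vector store (assumption). */
final class SemanticCacheSketch {

    // One cached entry: the embedding of an earlier question and the answer returned for it.
    record Entry(double[] embedding, String answer) {}

    private final List<Entry> entries = new ArrayList<>();
    private final double threshold; // hypothetical similarity cutoff for a cache hit

    SemanticCacheSketch(double threshold) {
        this.threshold = threshold;
    }

    void put(double[] embedding, String answer) {
        entries.add(new Entry(embedding.clone(), answer));
    }

    /** Returns the answer of the most similar cached question, if it is similar enough. */
    Optional<String> lookup(double[] queryEmbedding) {
        Entry best = null;
        double bestScore = -1.0;
        for (Entry entry : entries) {
            double score = cosine(entry.embedding(), queryEmbedding);
            if (score > bestScore) {
                bestScore = score;
                best = entry;
            }
        }
        return (best != null && bestScore >= threshold)
                ? Optional.of(best.answer())
                : Optional.empty();
    }

    /** Cosine similarity; returns 0.0 when either vector has zero length. */
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same dimension");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        SemanticCacheSketch cache = new SemanticCacheSketch(0.9);
        // Made-up vector for an earlier question and its answer.
        cache.put(new double[] {1.0, 0.0, 0.1}, "Paris");
        // A vector pointing in nearly the same direction: treated as the same question.
        System.out.println(cache.lookup(new double[] {0.9, 0.05, 0.1}));
        // An orthogonal vector: no cached answer, so the LLM would be called.
        System.out.println(cache.lookup(new double[] {0.0, 1.0, 0.0}));
    }
}
```

The threshold is the main design choice: set it too low and unrelated questions share answers, set it too high and the cache behaves like an exact-match cache.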
Controlling MCP Tool Visibility
A well-stocked toolbox is useful. That is, until you have to dig through dozens of tools to find the one you need. The same is true for MCP servers, where exposing fewer tools can often make an AI application faster, cheaper, and more focused.
MCP Tool Progress
Not every tool can return a result immediately. When an MCP tool needs extra time to complete, progress notifications allow the server to keep the client informed, providing visibility into long-running operations and a better overall user experience.
MCP Elicitation
MCP tools don't always have everything they need to do their job. With MCP Elicitation, a tool can pause, gather additional context from the client, and then continue with a more complete understanding of the task at hand.
MCP Sampling
MCP tools usually serve LLMs. But with sampling, the tools can turn around and ask the LLM for help. In this recipe, you’ll see how MCP Sampling enables a server-side tool to collaborate with the client’s LLM to produce richer, more intelligent results.
MCP Logging
When an MCP-powered agent invokes tools across multiple systems, it can sometimes feel like the real action is happening in a black box. Fortunately, MCP includes built-in cross-system logging support, allowing servers to stream log events directly to clients so you can finally see what's happening behind the curtain.
MCP Prompt Completions
A prompt field that merely accepts text is functional, but a prompt field that guides the user with intelligent suggestions creates a dramatically better experience. MCP prompt completions make prompts feel less like raw API inputs and more like polished, interactive UI components.
Defining MCP Prompts
Most users shouldn't have to become prompt engineers just to get great results from AI. MCP prompts let you encapsulate sophisticated prompt engineering into reusable, standardized capabilities that deliver consistent, high-quality LLM interactions across every client that uses them.
Explaining Tool Selection
Understanding why an LLM chose a specific tool can be just as important as the tool call itself. Today's Spring AI Recipe shows how to use AugmentedToolCallbackProvider to capture and inspect tool-selection reasoning from the model.
Better LLM Request/Response Logging with ToolCallAdvisor
Sometimes the most interesting part of an LLM interaction isn't the initial prompt or final response. It's everything that happens in between. With Spring AI's ToolCallAdvisor, you can finally see the full tool-calling conversation unfold in your logs, no matter which model abstraction you're using.
Securing an MCP Server with an API Key
Tools give agents power. Security determines who gets to use that power. In this recipe, learn how to add API Key security to your MCP servers.
Adding a Loop to a Graph-Based Workflow
What if your workflow could take a second pass at its own answer before showing it to the user? By adding a simple loop, you can turn a one-shot response into an iterative process that refines itself until it’s actually useful.
Adding Human-in-the-Loop to a Graph-Based Agentic Workflow
Sometimes the smartest thing an AI can do is admit that it’s unsure. Instead of guessing, a graph-based agentic workflow can pause, ask for human guidance, and then continue with confidence. See how to add human-in-the-loop to a graph-based workflow.
Building a Graph-Based Agentic Workflow
Completely autonomous agents are like unplanned road trips--flexible and full of adventure, but unpredictable and potentially surprising. Graph-based workflows provide the roadmap, while still allowing intelligent decisions along the way. Build a graph-based agentic workflow with Spring AI Alibaba Graph.
Creating an MCP Client
Build an MCP client and give tools from an MCP server to your agent.
Streamable HTTP MCP Server
So you’ve built an MCP server with STDIO? That’s great! Now take it from local integration to network-accessible service by creating a Streamable HTTP MCP server.
Creating an STDIO MCP Server
An agent without tools is just thinking really hard. Give it something to do. Create an STDIO MCP server to provide capabilities to an agent.
Enabling Long-Term Memory
Your agent may be smart. But how is its memory? Learn how to add long-term memory to your agent so that it learns and remembers significant information.
Invoking A2A Sub-Agents with TaskTool
A2A isn’t just about exposing agents--it’s about using them. Use TaskTool to invoke A2A sub-agents.
Enabling Agent-to-Agent Communication with A2A
You've built an agent. But if no other agent can find or use it...what’s the point? See how to enable agent discovery and communication using Agent-to-Agent (A2A).
Reusing Agent Behavior with SkillsJars
Tools give capabilities. Skills give direction. SkillsJars make it reusable. Find out how to add plug-and-play agent behavior with Spring AI and SkillsJars.
Guiding Agentic Behavior with Skills
Tools give your agent capabilities. Skills give it direction. Learn how to guide agent behavior with Skills in Spring AI.
Agentic Planning with TodoWriteTool
What if your LLM didn’t just answer questions—but made a plan and executed it? See how to enable agentic planning with TodoWriteTool.
Asking Questions to the User
We often ask LLMs questions and get back answers. What if the LLM wants to ask us something? See how to use AskUserQuestionTool to make the human a part of the loop.
Logging LLM Requests and Responses
Keep your ChatClient configuration code neat and tidy, and gain the opportunity to enable or disable it with configuration properties.
Composing ChatClient Behavior
Keep your ChatClient configuration code neat and tidy, and gain the opportunity to enable or disable it with configuration properties.
Building a Text-Based Chat Loop Around ChatClient
Build a simple conversational loop with Spring AI’s ChatClient that continuously accepts user input, sends it to an LLM, and returns the response in real time.
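The shape of such a loop can be sketched in plain Java. This is a rough illustration, not the recipe's code: the class name ChatLoopSketch, the "exit" command, and the Function standing in for the model call are assumptions. In a Spring AI application, the function would typically wrap a ChatClient call such as chatClient.prompt().user(text).call().content().

```java
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.util.Scanner;
import java.util.function.Function;

/** Minimal read-send-print loop; the model is a plain function so the sketch runs without Spring. */
final class ChatLoopSketch {

    /**
     * Reads lines until end of input or the (hypothetical) "exit" command,
     * passes each non-blank line to the model, and prints the reply.
     * Returns the number of prompts that were sent to the model.
     */
    static int run(Readable in, PrintStream out, Function<String, String> model) {
        Scanner scanner = new Scanner(in);
        int turns = 0;
        out.print("> ");
        while (scanner.hasNextLine()) {
            String input = scanner.nextLine().trim();
            if (input.equalsIgnoreCase("exit")) {
                break;
            }
            if (!input.isEmpty()) {
                // In a Spring AI application, this is where the ChatClient call would go.
                out.println(model.apply(input));
                turns++;
            }
            out.print("> ");
        }
        return turns;
    }

    public static void main(String[] args) {
        // Echo function as a stand-in for a real LLM call (assumption for illustration).
        run(new InputStreamReader(System.in), System.out, prompt -> "You said: " + prompt);
    }
}
```

Passing the model in as a function keeps the loop itself testable without a live LLM.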


