Bringing agentic Retrieval Augmented Generation to Amazon Q Business

Amazon Q Business is a generative AI-powered enterprise assistant that helps organizations unlock value from their data. By connecting to enterprise data sources, employees can use Amazon Q Business to quickly find answers, generate content, and automate tasks—from accessing HR policies to streamlining IT support workflows, all while respecting existing permissions and providing clear citations. At the heart of systems like Amazon Q Business lies Retrieval Augmented Generation (RAG), which enables AI models to ground their responses in an organization’s enterprise data.

The evolution of RAG

Traditional RAG implementations typically follow a straightforward approach: retrieve relevant documents or passages based on a user query, then generate a response using these documents or passages as context for the large language model (LLM) to respond. While this methodology works well for basic, factual queries, enterprise environments present uniquely complex challenges that expose the limitations of this single-shot retrieval approach.

Consider an employee asking about the differences between two benefits packages or requesting a comparison of project outcomes across multiple quarters. These queries require synthesizing information from various sources, understanding company-specific context, and often need multiple retrieval steps to gather comprehensive information around each aspect of the query.

Traditional RAG systems struggle with such complexity, often providing incomplete answers or failing to adapt their retrieval strategy when initial results are insufficient. When processing these more involved queries, users are left waiting without visibility into the system’s progress, leading to an opaque experience.

Bringing agency to Amazon Q Business

Bringing agency to Amazon Q Business is a new paradigm to handle sophisticated enterprise queries through intelligent, agent-based retrieval strategies. By introducing AI agents that dynamically plan and execute sophisticated retrieval strategies with a suite of data navigation tools, Agentic RAG represents a significant evolution in how AI assistants interact with enterprise data, delivering more accurate and comprehensive responses while maintaining the speed users expect.

With Agentic RAG in Amazon Q Business you have several new capabilities, including query decomposition and transparent events, agentic retrieval tool use, improved conversational capabilities, and agentic response optimization. Let’s dive deeper into what each of these mean.

Query decomposition and transparent response events

Traditional RAG systems often face significant challenges when processing complex enterprise queries, particularly those involving multiple steps, composite elements, or comparative analysis. With this release of Agentic RAG in Amazon Q Business, we aim to solve this problem through sophisticated query decomposition techniques, where AI agents intelligently break down complex questions into discrete, manageable components.

When an employee asks Please compare the vacation policies of Washington and California?, the question is decomposed into two queries on Washington and California policies. The first decomposed query being washington state vacation policies and the second query being california state vacation policies.

Because Agentic RAG presumes a series of parallel steps to explore the data source and collect thorough information for more accurate query resolution, we are now providing real-time visibility into its processing steps that will be displayed on the screen as data is being retrieved to generate the response. After the response is generated, the steps will be collapsed with the response streamed. In the following image, we see how the decomposed queries are displayed and the relevant data retrieved for response generation.

This allows users to see meaningful updates to the system’s operations, including query decomposition patterns, document retrieval paths, and response generation workflows. This granular visibility into the system’s decision-making process enhances user confidence and provides valuable insights into the sophisticated mechanisms driving accurate response generation.

This agentic solution facilitates comprehensive data collection and enables more accurate, nuanced responses. The result is enhanced responses that maintain both granular precision and holistic understanding of complex, multi-faceted business questions, while relying on the LLM to synthesize the information retrieved. As shown in the following image, the information fetched individually for California and Washington vacation policies were synthesized by the LLM and presented in a rich markdown format.

Agentic tool use

The designed RAG agents can intelligently deploy various data exploration tools and retrieval methods in optimal strategies by thinking about the retrieval plan while maintaining context over multiple turns of the conversations. These retrieval tools include tools built within Amazon Q Business such as tabular search, allowing intelligent retrieval of data through either code generation or tabular linearization across small and large tables embedded in documents (such as DOCX, PPTX, PDF, and so on) or stored in CSV or XLSX files. Another retrieval tool includes long context retrieval, which determines when the full context of a document is required for retrieval. An example of long context retrieval: if a user asks a query such as Summarize the 10K of Company X, the agent could identify the query’s intent as a summarization query that requires document-level context and, as a result, deploy the long context retrieval tool that fetches the complete document—the 10K of Company X—as part of the context for the LLM to generate a response (as shown in the following figure). This intelligent tool selection and deployment represents a significant advancement over traditional RAG systems, which often rely on fragmented passage retrieval that can compromise the coherence and completeness of complex document analysis for question answering.

Improved conversational capabilities

Agentic RAG introduces multi-turn query capabilities that elevate the conversational capabilities of Amazon Q Business into dynamic, context-aware dialogues. The agent maintains conversational context across interactions by storing short-term memory, enabling natural follow-up questions without requiring users to restate previous context. Additionally, when the agent encounters multiple possible answers based on your enterprise data, it asks clarifying questions to disambiguate the query to better understand what you’re looking for to provide more accurate responses. For instance, Q refers to any of the many implementations of Amazon Q. The system handles semantic ambiguity gracefully by recognizing multiple potential interpretations of what Q could be and asks for clarifications in its responses to verify accuracy and relevance. This sophisticated approach to dialogue management makes complex tasks like policy interpretation or technical troubleshooting more efficient, because the system can progressively refine its understanding through targeted clarification and follow-up exchanges.

In the following image, the user asks tell me about Q with the system providing a high-level overview of the various implementations and asking a follow-up question to disambiguate the user’s search intent.

Upon successful disambiguation, the system persists both the conversation state and previously retrieved contextual data in-memory, enabling the generation of precisely targeted responses that align with the user’s clarified intent thus being more accurate, relevant, and complete.

Agentic response optimization

Agentic RAG introduces dynamic response optimization where AI agents actively evaluate and refine their responses. Unlike traditional systems that provide answers even when the context is insufficient, these agents continuously assess response quality and iteratively plan out new actions to improve information completeness. They can recognize when initial retrievals miss crucial information and autonomously initiate additional searches or alternative retrieval strategies. This means when discussing complex topics like compliance policies, the system captures all relevant updates, exceptions, and interdependencies while maintaining context across multiple turns of the conversation. The following diagram shows how Agentic RAG handles the conversation history across multiple turns of the conversation. The agent plans and reasons across the retrieval tool use and response generation process. Based on the initial retrieval, while taking into account the conversation state and history, the agent re-plans the process as needed to generate the most complete and accurate response for the user’s query.

Using the Agentic RAG feature

Getting started with Agentic RAG’s advanced capabilities in Amazon Q Business is straightforward and can immediately improve how your organization interacts with your enterprise data. To begin, in the Amazon Q Business web interface, you can switch on the Advanced Search toggle to enable Agentic RAG, as shown in the following image.

After advanced search is enabled, users can experience richer and more complete responses from Amazon Q Business. Agentic RAG particularly shines when handling complex business scenarios based on your enterprise data—imagine asking about cross-AWS Region performance comparisons, investigating policy implications across departments, or analyzing historical trends in project deliveries. The system excels at breaking down these complex queries into manageable search tasks while maintaining context throughout the conversation.

For the best experience, users should feel confident in asking detailed, multi-part questions. Unlike traditional search systems, Agentic RAG handles nuanced queries like

How have our metrics changed across the southeast and northeast regions in 2024?

The system will work through such questions methodically, showing its progress as it analyzes and breaks the query down into composite parts to fetch sufficient context and generate a complete and accurate response.

Conclusion

Agentic RAG represents a significant leap forward for Amazon Q Business, transforming how organizations use their enterprise data while maintaining the robust security and compliance that they expect with AWS services. Through its sophisticated query processing and contextual understanding, the system enables deeper, more nuanced interactions with enterprise data—from comparative and multi-step queries to interactive multi-turn chat experiences. All of this occurs within a secure framework that respects existing permissions and access controls, making sure that users receive only authorized information while maintaining the rich, contextual responses needed for meaningful insights.

By combining advanced retrieval capabilities with intelligent, conversation-aware interactions, Agentic RAG allows organizations to unlock the full potential of their data while maintaining the highest standards of data governance. The result is an improved chat experience and a more capable query answering engine that maximizes the value of your data assets.

Try out Amazon Q Business for your organization with your data and share your thoughts in the comments.

About the authors

Sanjit Misra is a technical product leader at Amazon Web Services, driving innovation on Amazon Q Business, Amazon’s generative AI product. He leads product development for core Agentic AI features that enhance accuracy and retrieval — including Agentic RAG, conversational disambiguation, tabular search, and long-context retrieval. With over 15 years of experience across product and engineering roles in data, analytics, and AI/ML, Sanjit combines deep technical expertise with a track record of delivering business outcomes. He is based in New York City.

Venky Nagapudi is a Senior Manager of Product Management for Amazon Q Business. His focus areas include RAG features, accuracy evaluation and enhancement, user identity management and user subscriptions.

Yi-An Lai is a Senior Applied Scientist with the Amazon Q Business team at Amazon Web Services in Seattle, WA. His expertise spans agentic information retrieval, conversational AI systems, LLM tool orchestration, and advanced natural language processing. With over a decade of experience in ML/AI, he has been enthusiastic about developing sophisticated AI solutions that bridge state-of-the-art research and practical enterprise applications.

Yumo Xu is an Applied Scientist at AWS, where he focuses on building helpful and responsible AI systems for enterprises. His primary research interests are centered on the foundational challenges of machine reasoning and agentic AI. Prior to AWS, Yumo received his PhD in Natural Language Processing from the University of Edinburgh.

Danilo Neves Ribeiro is an Applied Scientist on the Q Business team based in Santa Clara, CA. He is currently working on designing innovative solutions for information retrieval, reasoning, language model agents, and conversational experience for enterprise use cases within AWS. He holds a Ph.D. in Computer Science from Northwestern University (2023) and has over three years of experience working as an AI/ML scientist.

Kapil Badesara is a Senior Machine Learning Engineer on AWS Q Business, focusing on optimizing RAG systems for accuracy and efficiency. Kapil is based out of Seattle and has more than 10 years of building large scale AI/ML services.

Sunil Singh is an Engineering Manager on the Amazon Q Business team, where he leads the development of next-generation agentic AI solutions designed to enhance Retrieval-Augmented Generation (RAG) systems for greater accuracy and efficiency. Sunil is based out of Seattle and has more than 10 years of experience in architecting secure, scalable AI/ML services for enterprise-grade applications.