Build a conversational natural language interface for Amazon Athena queries using Amazon Nova


Data analysis often presents significant challenges for business users who aren’t proficient in SQL. Traditional methods require technical expertise to query databases, leading to delayed insights and dependence on data teams. Many organizations struggle with making their data accessible to business users while maintaining the analytical capabilities of Amazon Athena.

Modern AI agents are transforming how businesses work with their data by creating natural conversations between people and machines. Instead of learning complex SQL commands, users can simply ask questions in plain English. Amazon Bedrock Agents makes this possible by using foundation models (FMs) that can understand human language, work with various data sources, and perform specific tasks automatically. With these tools, business users can get the answers they need directly from their data, without waiting for technical support.

Amazon Bedrock offers a range of FMs, each designed for specific use cases. Among these, Amazon Nova stands out as the next-generation FM from Amazon, delivering frontier intelligence and industry-leading price-performance. At its core, the Amazon Nova family consists of four distinct types of models, each serving unique business needs. The understanding models (available in Micro, Lite, Pro, and Premier variants) act as sophisticated interpreters of human language, with Amazon Nova Premier offering the most advanced capabilities for complex data analysis. Complementing these are the content generation models, Amazon Nova Canvas and Amazon Nova Reel, which handle creative and content production tasks. There is also a speech-to-speech model, Amazon Nova Sonic. Lastly, the Amazon Nova family offers Amazon Nova Act, a model trained to perform actions in a web browser.

What makes Amazon Nova particularly powerful for our Athena query solution is its exceptional ability to handle complex reasoning tasks, generate precise text, and provide accurate summarizations. These capabilities are essential when translating natural language questions into SQL queries and explaining results back to users in clear, understandable terms.

The Amazon Nova family’s combination of sophisticated features and competitive pricing makes it an ideal choice for businesses looking to bridge the gap between technical data systems and non-technical users.

In this post, we explore an innovative solution that uses Amazon Bedrock Agents, powered by Amazon Nova Lite, to create a conversational interface for Athena queries. We use AWS Cost and Usage Reports (AWS CUR) as an example, but this solution can be adapted for other databases you query using Athena. This approach democratizes data access while preserving the powerful analytical capabilities of Athena, so you can interact with your data using natural language.

Solution overview

The architecture combines several AWS services to transform natural language questions into precise Athena queries for AWS CUR. Users can interact with their data using everyday language, automatically generating and executing appropriate SQL queries. Amazon Bedrock Agents, with Amazon Nova Lite as the FM, serves as the intelligent layer that maintains context throughout the conversation, handles query refinements, and facilitates accurate data retrieval. A key component of the solution is the conversational query agent – an Amazon Bedrock agent powered by Amazon Nova Lite that translates natural language into Athena SQL queries.

The agent integration with Athena supports seamless data exploration through natural conversations, with a streamlined architecture that supports efficient query processing. Other key features of the solution include:

  • Secure user authentication through Amazon Cognito with role-based access control
  • Frontend application hosted on AWS Amplify
  • Real-time query processing and result visualization
  • Natural language to SQL query transformation
  • Context-aware conversation management

The architecture shown in the following diagram demonstrates how we’ve built a secure, scalable system for conversational data queries using several AWS services, including AWS Lambda functions.

The workflow consists of the following steps:

  1. Users interact with a web interface built using HTML, CSS, and JavaScript, hosted on Amplify.
  2. Authentication is handled through Amazon Cognito, which verifies user identities and provides temporary AWS credentials from its identity pool.
  3. After users are authenticated, they can send natural language queries through the interface.
  4. Our conversational query agent, powered by Amazon Bedrock with Amazon Nova Lite, processes these queries with support from two key action groups:
    • A Clock and Calendar action group that provides temporal context.
    • A Build and Run Athena Query action group that handles query execution.
  5. When a user submits a query, Amazon Nova Lite transforms it into SQL, which is then passed to the appropriate Lambda function.
  6. The Lambda function, operating with the necessary AWS Identity and Access Management (IAM) roles, executes the SQL query in Athena.
  7. Athena processes the query against data cataloged in AWS Glue, retrieving results that are then formatted and returned to the user through the same conversational interface.

This architecture provides secure, efficient query processing while maintaining a simple, conversation-like experience for users. The system scales automatically and maintains security through role-based access controls and secure credential management.

Prerequisites

You must have the following in place to complete the solution in this post:

Perform the following steps to set up AWS CUR 2.0 and Athena integration to AWS CUR 2.0 only if you haven’t completed this setup already. If you have the AWS CUR 2.0 set up and its integration with Athena, collect the database name and table name and proceed to the next section to deploy the solution.

  • AWS CUR 2.0 set up:
  • Athena integration to AWS CUR 2.0: The CloudFormation template cur2-glue-athena-integration is designed to automate the integration of AWS CUR 2.0 with Athena. The template includes an AWS Glue crawler, an AWS Glue database, and a Lambda event. During the CloudFormation template deployment, specify the following required parameters:
    • CURReportName – Name of the AWS CUR 2.0 report created.
    • S3BucketName – Destination S3 bucket name for the AWS CUR 2.0 report.
    • S3BucketPath – Amazon S3 path of the bucket (cur-report-path-prefix/cur-report-name/data).

After the CloudFormation template deployment, copy the following from the Outputs tab on the AWS CloudFormation console to use during the deployment of next CloudFormation template as inputs:

  • GlueDatabaseName
  • ExpectedCURTableName

The following screenshot shows an example of the Outputs tab.

Deploy solution resources using AWS CloudFormation

The CloudFormation template conversational-query-agent is designed to run in the us-east-1 Region. If you deploy in a different Region, you must configure cross-Region inference profiles to have proper functionality and update the CloudFormation template accordingly.

During the CloudFormation template deployment, specify the following required parameters:

  • Stack name
  • Foundation model (amazon.nova-lite-v1:0 or amazon.nova-v1:0)
  • Valid user email address
  • Database name (GlueDatabaseName from the stack output)
  • Table name (ExpectedCURTableName from the stack output)
  • Data source S3 bucket name (AWS CUR 2.0 S3 bucket)
  • Athena query results S3 bucket name (provide a new or existing S3 bucket name depending on if the parameter to create an Athena query results S3 bucket is false or true)
  • Parameter to create an Athena query results S3 bucket (true or false)

AWS resource usage will incur costs. When deployment is complete, the following resources will be deployed:

  • Amazon Cognito resources:
  • Lambda functions:
    • BuildandRunAthenaQuery
    • ClockandCalendar
  • Amazon Bedrock agents:
    • ConversationalQueryAgent with action groups:
      • BuildandRunAthenaQueryActionGroup
      • ClockandCalendarActionGroup

After the CloudFormation template deployment, copy the following from the Outputs tab on the AWS CloudFormation console to use during the configuration of your application after it’s deployed in Amplify:

  • AWSRegion
  • BedrockAgentAliasId
  • BedrockAgentId
  • BedrockAgentName
  • IdentityPoolId
  • UserPoolClientId
  • UserPoolId

The following screenshot shows an example of the Outputs tab.

Deploy the Amplify application

You must manually deploy the Amplify application using the frontend code found on GitHub. Complete the following steps:

  • Download the frontend code AWS-Amplify-Frontend.zip from GitHub.
  • Use the .zip file to manually deploy the application in Amplify.
  • Return to the Amplify console page and use the domain it automatically generated to access the application.

Secure access with Amazon Cognito

The application’s security and access management are built on the Amazon Cognito authentication framework. Through the implementation of Amazon Cognito user pools, the system manages user authentication and group organization, and Amazon Cognito identity pools distribute temporary AWS credentials that align with designated IAM roles. This architecture makes sure the Amazon Bedrock Agents API and overall application remain accessible exclusively to authenticated team members, delivering strong security controls without compromising user experience.

Amazon Bedrock Agents for conversational Athena queries

The Amazon Bedrock Agents architecture facilitates data analysis through natural language interactions with AWS CUR data stored in the AWS Glue Data Catalog and accessed using Athena. This agent uses Amazon’s FMs like Amazon Nova Lite to interpret complex cost analysis requests, transform them into precise SQL queries, and present results in a user-friendly format. The agent coordinates with specialized action groups: ClockandCalendarActionGroup for accurate date calculations and BuildandRunAthenaQueryActionGroup for executing optimized Athena queries against AWS CUR data. With comprehensive knowledge of AWS service naming conventions and AWS CUR schema details, the agent handles complex date ranges, cost calculations, and service-specific queries while maintaining security through Amazon Cognito authentication. This architecture demonstrates how Amazon Bedrock Agents can bridge the gap between natural language requests and technical database queries, helping users gain financial insights through simple conversations without needing SQL expertise or deep knowledge of the underlying data structure.

Lambda functions for Amazon Bedrock action groups

As part of this solution, Lambda functions are deployed to support the action groups defined for the ConversationalQueryAgent. These functions enable the agent to perform complex queries on AWS CUR data and provide temporal context for accurate analysis.

The action group ConversationalQueryAgent uses two distinct Lambda backed action groups to deliver comprehensive cost analysis capabilities. The ClockandCalendar Lambda function provides current date and time functionality. This capability makes sure the agent has access to accurate temporal information, which is crucial for generating time-sensitive reports and aligning cost analyses with specific billing periods or custom timeframes.

The BuildandRunAthenaQuery Lambda function serves as the core for the agent’s analytical capabilities. This function connects directly with Athena to execute SQL queries against the AWS CUR data. For demonstration purposes, we use a database named cid-cur with a data table that contains the AWS CUR 2.0 information. The function processes these queries and returns formatted results that are straightforward to interpret and analyze.

Amplify for frontend

Amplify provides a streamlined solution for deploying and hosting web applications with built-in security and scalability features. The service reduces the complexity of managing infrastructure, so developers can concentrate on application development. In our solution, we use the manual deployment capabilities of Amplify to host our frontend application code.

Amazon Bedrock Agents testing

To validate the solution before using the Amplify deployed frontend, we can conduct testing directly on the Amazon Bedrock console. By navigating to the ConversationalQueryAgent, we can pose questions about cost analysis, such as “What are my Top 5 Services cost in each month of first quarter of 2025?” The ConversationalQueryAgent processes the request by first determining the correct time period through ClockandCalendarActionGroup, then constructs and executes appropriate SQL queries through BuildandRunAthenaQueryActionGroup to retrieve the requested cost information. The agent formats the response to provide the month-wise cost of the top 5 services in the first quarter of 2025.

After you set up the application in Amplify, navigate to the specified URL. When you access the application URL, you must provide Amazon Cognito and Amazon Bedrock Agents related details that facilitate secure user authentication and establish a connection between the frontend and the agent. This setup enables the application to manage user sessions and make authorized API calls to AWS services on behalf of the user.

You can enter information with the values you collected from the CloudFormation stack outputs. You will be required to enter the following fields, as shown in the following screenshot:

  • User Pool ID
  • User Pool Client ID
  • Identity Pool ID
  • Region
  • Agent Name
  • Agent ID
  • Agent Alias ID
  • Region

Sign in with your user name and password. A temporary password was automatically generated during deployment and sent to the email address you provided when launching the CloudFormation template. At first sign-in attempt, you will be asked to reset your password.

Now you can ask the same question in the frontend application, for example, “What are my Top 5 Services cost in each month of first quarter of 2025?” In a few seconds, the application will provide you detailed results for the question asked.

The following are a few additional sample queries to demonstrate the capabilities of this tool:

  • What was my costliest Region in 2024?
  • Out of all 4 quarters, which one was the costliest one in 2024?
  • What is the cost of S3, VPC, and Guard duty in Q4 2024?

Adapting the solution for other Amazon S3 backed Athena databases

Although this solution is specifically designed for AWS CUR 2.0 data, you can adapt it for other Amazon S3 backed Athena databases by modifying key sections in the Amazon Bedrock agent instructions:

  • Replace the entire CUR Query Column List section (which contains all the AWS CUR specific column definitions like bill_payer_account_id, line_item_unblended_cost, and so on) with your own table’s column names and descriptions.
  • Modify the AWS Service Names section to reflect the specific services or categories relevant to your data domain or remove it entirely if not applicable.
  • Update the Sample Query section to demonstrate queries appropriate for your data structure and use cases.
  • Adjust the Cost Types and Amortized Cost Calculation sections if your data doesn’t involve financial calculations, replacing them with domain-specific calculation logic relevant to your dataset.

The core query construction principles, date handling, error prevention, and response formatting sections can remain largely unchanged because they provide universal best practices for Athena querying through conversational AI.

Clean up

If you decide to discontinue using the ConversationalQueryAgent application, you can follow these steps to remove it, its associated resources deployed using AWS CloudFormation, and the Amplify deployment:

  1. Delete the CloudFormation stack:
    • On the AWS CloudFormation console, choose Stacks in the navigation pane.
    • Locate the stack you created during the deployment process (you assigned a name to it).
    • Select the stack and choose Delete.
  2. Delete the Amplify application and its resources. For instructions, refer to Clean Up Resources.

Considerations

For optimal visibility across your organization, deploy this Amazon Bedrock agent-powered Athena query solution in your AWS payer account to seamlessly access and analyze cost data from your AWS CUR database. This integration demonstrates how Amazon Bedrock agents can execute complex analytical queries against Athena databases, providing conversational access to your organization’s data while maintaining comprehensive visibility across all linked accounts.

Before deploying to production, enhance security by implementing additional safeguards. You can do this by associating guardrails with your agent in Amazon Bedrock Guardrails.

Conclusion

The integration of Amazon Bedrock Agents with Athena demonstrates the transformative potential of conversational AI for data analytics and cost management. Our ConversationalQueryAgent solution, powered by Amazon Nova Lite, showcases how natural language processing can simplify complex database queries, helping users extract valuable insights from their AWS CUR data through simple conversational prompts. This implementation not only addresses the challenges of SQL query construction for cost analysis, but also provides a flexible framework that can be extended to query different SQL databases through Athena across an organization. By combining the power of the language capabilities of Amazon Bedrock with the analytical capabilities of Athena, this approach sets a new standard for democratizing data access, helping both technical and non-technical users derive actionable insights from their organizational data. As AI and analytics technologies continue to evolve, this solution provides organizations with an intuitive interface to their data repositories, transforming how teams interact with and extract value from their AWS cost data and beyond.

To learn more about Amazon Bedrock Agents, refer to the following resources:


About the Authors

Ravi Kumar is a Senior Technical Account Manager in AWS Enterprise Support who helps customers in the travel and hospitality industry to streamline their cloud operations on AWS. He is a results-driven IT professional with over 20 years of experience. Ravi is passionate about generative AI and actively explores its applications in cloud computing. In his free time, Ravi enjoys creative activities like painting. He also likes playing cricket and traveling to new places.

Salman Ahmed is a Senior Technical Account Manager in AWS Enterprise Support. He specializes in guiding customers through the design, implementation, and support of AWS solutions. Combining his networking expertise with a drive to explore new technologies, he helps organizations successfully navigate their cloud journey. Outside of work, he enjoys photography, traveling, and watching his favorite sports teams.

Sergio Barraza is a Senior Technical Account Manager at AWS, helping customers on designing and optimizing cloud solutions. With more than 25 years in software development, he guides customers through AWS services adoption. Outside work, Sergio is a multi-instrument musician playing guitar, piano, and drums, and he also practices Wing Chun Kung Fu.



Source link

Leave a Reply

Your email address will not be published. Required fields are marked *