Building AIOps with Amazon Q Developer CLI and MCP Server


IT teams face mounting challenges as they manage increasingly complex infrastructure and applications, often spending countless hours manually identifying operational issues, troubleshooting problems, and performing repetitive maintenance tasks. This operational burden diverts valuable technical resources from innovation and strategic initiatives. Artificial intelligence for IT operations (AIOps) presents a transformative solution, using AI to automate operational workflows, detect anomalies, and resolve incidents with minimal human intervention. Organizations can optimize their operational efficiency while maintaining security as they manage their infrastructure and applications.

You can use Amazon Q Developer CLI and Model Context Protocol (MCP) servers to build AIOps solutions that reduce manual effort through natural language interactions. Amazon Q Developer can help developers and IT professionals with many of their tasks, from coding, testing, and deploying to troubleshooting, performing security scanning and fixes, modernizing applications, optimizing AWS resources, and creating data engineering pipelines. MCP extends these capabilities by enabling Amazon Q to connect with custom tools and services through a standardized interface, allowing for more sophisticated operational automation.

In this post, we discuss how to implement a low-code no-code AIOps solution that helps organizations monitor, identify, and troubleshoot operational events while maintaining their security posture. We show how these technologies work together to automate repetitive tasks, streamline incident response, and enhance operational efficiency across your organization.

This is the third post in a series on AIOps using generative AI services on AWS. Refer to the following two posts for building AIOps using Amazon Bedrock and Amazon Q Business:

Solution overview

MCP servers act like a universal connector for AI models, enabling them to interact with external systems, fetch live data, and integrate with various tools seamlessly. This helps Amazon Q provide more contextually relevant assistance by accessing the information it needs in real time. The following architecture diagram illustrates how you can use a single configuration file, mcp.json, to configure MCP servers in Amazon Q Developer CLI to connect to external systems.

The workflow consists of the following steps:

  1. The user configures an MCP client in Amazon Q Developer CLI using the mcp.json file.
  2. The user logs in to Amazon Q Developer CLI and asks operational queries in natural language.
  3. Depending on the query, Amazon Q decides which configured MCP servers or existing tools to invoke to perform the task.
  4. The MCP server interacts with the respective external system to get the live data that is used by Amazon Q to perform the required task.

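In this post, we show how to use Amazon Q Developer CLI to address the following operational issues:

  • High CPU utilization in an Amazon EC2 instance
  • Public access enabled on an Amazon S3 bucket
  • An unwanted open port for inbound connections to an EC2 instance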

Prerequisites

Complete the following prerequisites before you start setting up the demo:

Configure MCP in Amazon Q Developer CLI

MCP configuration in Amazon Q Developer CLI is managed through JSON files. You will configure the Amazon Bedrock Knowledge Base Retrieval MCP Server. At the time of writing, only the stdio transport is supported in Amazon Q Developer CLI.

Amazon Q Developer CLI supports two levels of MCP configuration:

  • Global configuration – Uses ~/.aws/amazonq/mcp.json and applies to all workspaces
  • Workspace configuration – Uses .amazonq/mcp.json and is specific to the current workspace

For this post, we use the workspace configuration, but you can use either one.
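One way to picture how the two levels combine is a small merge routine. The following Python sketch assumes that a server defined in the workspace file overrides a same-named server from the global file; that precedence rule is an assumption made for illustration, so check the Amazon Q Developer CLI documentation for the actual behavior. The function names are hypothetical.

```python
import json
from pathlib import Path


def load_mcp_config(path: Path) -> dict:
    """Return the parsed mcp.json at path, or an empty config if it is missing."""
    if not path.is_file():
        return {"mcpServers": {}}
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)


def merge_mcp_configs(global_cfg: dict, workspace_cfg: dict) -> dict:
    """Combine server entries; workspace entries win on name clashes (assumed)."""
    merged = dict(global_cfg.get("mcpServers", {}))
    merged.update(workspace_cfg.get("mcpServers", {}))
    return {"mcpServers": merged}


if __name__ == "__main__":
    # Paths taken from the two configuration levels described above.
    global_cfg = load_mcp_config(Path.home() / ".aws" / "amazonq" / "mcp.json")
    workspace_cfg = load_mcp_config(Path(".amazonq") / "mcp.json")
    effective = merge_mcp_configs(global_cfg, workspace_cfg)
    print(sorted(effective["mcpServers"]))
```

Running the script from a workspace folder prints the names of the servers that would be in effect under that assumed precedence.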

  1. Create a new workspace folder, and inside that folder, create the file .amazonq/mcp.json with the following content:
{
  "mcpServers": {
    "awslabs.bedrock-kb-retrieval-mcp-server": {
      "command": "uvx",
      "args": ["awslabs.bedrock-kb-retrieval-mcp-server@latest"],
      "env": {
        "AWS_PROFILE": "your-profile-name",
        "AWS_REGION": "your-region",
        "FASTMCP_LOG_LEVEL": "ERROR",
        "KB_INCLUSION_TAG_KEY": "name=aiops-knowledge-base",
        "BEDROCK_KB_RERANKING_ENABLED": "false"
      },
      "disabled": false,
      "autoApprove": []
    }  
  }
}

See the AWS MCP Servers GitHub repository for an updated list of available MCP servers.
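Small mistakes in mcp.json, such as a stray space in a profile name, can be hard to spot once a server fails to start. The following Python sketch is a hypothetical lint check and not part of Amazon Q Developer CLI. It assumes only the fields shown in the preceding configuration (command, args, and env) and reports entries that look malformed.

```python
import json


def find_config_problems(config: dict) -> list[str]:
    """Return human-readable problems found in an mcp.json-style dictionary."""
    servers = config.get("mcpServers")
    if not isinstance(servers, dict) or not servers:
        return ["mcpServers must be a non-empty object"]
    problems = []
    for name, server in servers.items():
        if not isinstance(server, dict):
            problems.append(f"{name}: server entry must be an object")
            continue
        command = server.get("command")
        if not isinstance(command, str) or not command.strip():
            problems.append(f"{name}: command must be a non-empty string")
        if not isinstance(server.get("args", []), list):
            problems.append(f"{name}: args must be a list")
        for key, value in server.get("env", {}).items():
            if not isinstance(value, str):
                problems.append(f"{name}: env {key} must be a string")
            elif value != value.strip():
                problems.append(f"{name}: env {key} has leading or trailing whitespace")
    return problems


if __name__ == "__main__":
    # Illustrative input only; "kb" and "dev " are made-up values.
    sample = json.loads(
        '{"mcpServers": {"kb": {"command": "uvx", "env": {"AWS_PROFILE": "dev "}}}}'
    )
    for problem in find_config_problems(sample):
        print(problem)
```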

  2. Open a terminal, navigate to the workspace folder that you created, and run the following command to log in to Amazon Q Developer CLI:
q login

  3. Follow the instructions to log in to Amazon Q Developer on the command line.
  4. Initiate the chat session by running q and then run /tools to validate that the Amazon Bedrock Knowledge Base Retrieval MCP server is configured.

Tool permissions have two possible states:

  • Trusted – Amazon Q can use the tool without asking for confirmation each time
  • Per-request – Amazon Q must ask for your confirmation each time before using the tool

By default, this tool will not be trusted.

  5. Run /tools trust awslabsbedrock_kb_retrieval_mcp_server___QueryKnowledgeBases to trust the tool.
  6. Run the /tools command again to validate that the tool is now trusted.

Deploy AWS resources

Deploy the following AWS CloudFormation template to create the AWS resources that you will use to test AIOps. You can deploy this template in either the us-east-1 or us-west-2 AWS Region. To deploy it in another Region, add the applicable AMI IDs to the template. The template creates two EC2 instances and four S3 buckets (three for testing and one for the knowledge base), along with the supporting network resources and a CloudWatch alarm.

This CloudFormation template is for demo purposes only and not meant for production usage.

AWSTemplateFormatVersion: '2010-09-09'
Description: >-
  This template creates the necessary AWS resources which will be used to test AIOps using 
  Amazon Q Developer CLI with MCP server integration.
Metadata:
  AWS::CloudFormation::Interface:
    ParameterGroups:
      - Label:
          default: Network
        Parameters:
          - SecurityGroupIngressCidrIp
      - Label:
          default: General
        Parameters:
          - Prefix
    ParameterLabels:
      SecurityGroupIngressCidrIp:
        default: Security group ingress CIDR IP
Parameters:
  Prefix:
    Type: String
    Description: Unique name prefix for resources that are created by the stack.
    ConstraintDescription: >-
      must not start with a dash, and must only contain lowercase a-z, digits,
      and a dash.
    AllowedPattern: ^[a-z0-9][a-z0-9-]+$
    MinLength: 1
    MaxLength: 30
    Default: aiops-qdevcli
  SecurityGroupIngressCidrIp:
    Type: String
    Description: >-
      IPv4 address in CIDR format for allowed incoming traffic to the EC2 instance. Defaults to allowing all IPs.
    ConstraintDescription: >-
      must be in the form x.x.x.x/s, where x is 0-255, and s is 0-32.
    AllowedPattern: ^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])(\/([0-9]|[1-2][0-9]|3[0-2]))$
    Default: 0.0.0.0/0
Resources:
  # AIOps Amazon S3 bucket1
  AIOpsQDeveloperCliS3Bucket1:
    Type: AWS::S3::Bucket
    Properties:
      AccessControl: Private
      BucketName:
        Fn::Sub: ${Prefix}-bucket1-${AWS::AccountId}
      PublicAccessBlockConfiguration:
        BlockPublicAcls: true
        BlockPublicPolicy: true
        IgnorePublicAcls: true
        RestrictPublicBuckets: true
  # AIOps Amazon S3 bucket2
  AIOpsQDeveloperCliS3Bucket2:
    Type: AWS::S3::Bucket
    Properties:
      AccessControl: Private
      BucketName:
        Fn::Sub: ${Prefix}-bucket2-${AWS::AccountId}
      PublicAccessBlockConfiguration:
        BlockPublicAcls: true
        BlockPublicPolicy: true
        IgnorePublicAcls: true
        RestrictPublicBuckets: true
  # AIOps Amazon S3 bucket3
  AIOpsQDeveloperCliS3Bucket3:
    Type: AWS::S3::Bucket
    Properties:
      AccessControl: Private
      BucketName:
        Fn::Sub: ${Prefix}-bucket3-${AWS::AccountId}
      PublicAccessBlockConfiguration:
        BlockPublicAcls: true
        BlockPublicPolicy: true
        IgnorePublicAcls: true
        RestrictPublicBuckets: true
  # AIOps Knowledgebase S3 bucket
  AIOpsQDeveloperKBS3Bucket:
    Type: AWS::S3::Bucket
    Properties:
      AccessControl: Private
      BucketName:
        Fn::Sub: ${Prefix}-kb-${AWS::AccountId}
      PublicAccessBlockConfiguration:
        BlockPublicAcls: true
        BlockPublicPolicy: true
        IgnorePublicAcls: true
        RestrictPublicBuckets: true
  # AIOps VPC resources
  AIOpsQDeveloperCliVPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: 10.0.0.0/16
      Tags:
        - Key: Name
          Value: AIOpsQDeveloperCliVPC
  AIOpsQDeveloperCliSubnet1:
    Type: AWS::EC2::Subnet
    Properties:
      CidrBlock: 10.0.1.0/24
      VpcId:
        Ref: AIOpsQDeveloperCliVPC
      AvailabilityZone: !Select 
        - 0
        - !GetAZs 
          Ref: 'AWS::Region'
      Tags:
        - Key: Name
          Value: AIOpsQDeveloperCliSubnet1
  AIOpsQDeveloperCliSubnet2:
    Type: AWS::EC2::Subnet
    Properties:
      CidrBlock: 10.0.3.0/24
      VpcId:
        Ref: AIOpsQDeveloperCliVPC
      AvailabilityZone: !Select 
        - 1
        - !GetAZs 
          Ref: 'AWS::Region'
      Tags:
        - Key: Name
          Value: AIOpsQDeveloperCliSubnet2
  AIOpsQDeveloperIGW:
    Type: AWS::EC2::InternetGateway
    Properties:
      Tags:
        - Key: Name
          Value: AIOpsQDeveloperIGW
  AIOpsQDeveloperCliVPCGatewayAttachment:
    Type: AWS::EC2::VPCGatewayAttachment
    Properties:
      InternetGatewayId:
        Ref: AIOpsQDeveloperIGW
      VpcId:
        Ref: AIOpsQDeveloperCliVPC
  AIOpsQDeveloperCliRT:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId:
        Ref: AIOpsQDeveloperCliVPC
      Tags:
        - Key: Name
          Value: AIOpsQDeveloperCliRT
  AIOpsRoute:
    Type: AWS::EC2::Route
    DependsOn:
      - AIOpsQDeveloperCliVPCGatewayAttachment
    Properties:
      DestinationCidrBlock: 0.0.0.0/0
      GatewayId:
        Ref: AIOpsQDeveloperIGW
      RouteTableId:
        Ref: AIOpsQDeveloperCliRT
  AIOpsQDeveloperCliSubnetRouteTableAssociation1:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      RouteTableId:
        Ref: AIOpsQDeveloperCliRT
      SubnetId:
        Ref: AIOpsQDeveloperCliSubnet1
  AIOpsQDeveloperCliSubnetRouteTableAssociation2:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      RouteTableId:
        Ref: AIOpsQDeveloperCliRT
      SubnetId:
        Ref: AIOpsQDeveloperCliSubnet2
  AIOpsQDeveloperCliSG1:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: >-
        Allows incoming traffic on port 22 and denies all outgoing traffic.
      SecurityGroupEgress:
        - Description: Denies all outgoing traffic.
          IpProtocol: -1
          CidrIp: 0.0.0.0/32
      SecurityGroupIngress:
        - Description: Allows incoming TCP traffic on port 22.
          IpProtocol: tcp
          FromPort: 22
          ToPort: 22
          CidrIp:
            Ref: SecurityGroupIngressCidrIp        
      VpcId:
        Ref: AIOpsQDeveloperCliVPC
      Tags:
        - Key: Name
          Value: AIOpsQDeveloperCliSG1
  AIOpsQDeveloperCliSG2:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: >-
        Allows incoming traffic on ports 5080 and 22 and denies all outgoing traffic.
      SecurityGroupEgress:
        - Description: Denies all outgoing traffic.
          IpProtocol: -1
          CidrIp: 0.0.0.0/32
      SecurityGroupIngress:
        - Description: Allows incoming TCP traffic on port 5080.
          IpProtocol: tcp
          FromPort: 5080
          ToPort: 5080
          CidrIp:
            Ref: SecurityGroupIngressCidrIp
        - Description: Allows incoming TCP traffic on port 22.
          IpProtocol: tcp
          FromPort: 22
          ToPort: 22
          CidrIp:
            Ref: SecurityGroupIngressCidrIp        
      VpcId:
        Ref: AIOpsQDeveloperCliVPC
      Tags:
        - Key: Name
          Value: AIOpsQDeveloperCliSG2
  EC2KeyPair:
    Type: AWS::EC2::KeyPair
    Properties:
      KeyName: 
        Fn::Sub: ${Prefix}-keypair-${AWS::AccountId}
  # EC2 instance to demo high CPU Utilization AIOps  
  EC2InstanceHighCPUUtilDemo:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: t2.micro
      KeyName: !Ref EC2KeyPair      
      ImageId: !FindInMap [RegionMap, !Ref 'AWS::Region', AL2023]
      NetworkInterfaces:
        - AssociatePublicIpAddress: true
          DeviceIndex: 0
          SubnetId: !Ref AIOpsQDeveloperCliSubnet1
          GroupSet: 
            - !Ref AIOpsQDeveloperCliSG1
      Tags:
        - Key: Name
          Value:
            Fn::Sub: ${Prefix}-high-cpu-util
  # EC2 instance to demo unwanted open port detection AIOps  
  EC2InstanceOpenPortDemo:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: t2.micro
      KeyName: !Ref EC2KeyPair      
      ImageId: !FindInMap [RegionMap, !Ref 'AWS::Region', AL2023]
      NetworkInterfaces:
        - AssociatePublicIpAddress: true
          DeviceIndex: 0
          SubnetId: !Ref AIOpsQDeveloperCliSubnet1
          GroupSet: 
            - !Ref AIOpsQDeveloperCliSG2
      Tags:
        - Key: Name
          Value:
            Fn::Sub: ${Prefix}-open-port-demo
  CPUUtilizationAlarm:
    Type: AWS::CloudWatch::Alarm
    Properties:
      AlarmName: 
        Fn::Sub: ${Prefix}-EC2-Instance-CPU-Utilization
      AlarmDescription: Alarm when server CPU exceeds 70%
      ComparisonOperator: GreaterThanThreshold
      EvaluationPeriods: 1
      MetricName: CPUUtilization
      Namespace: AWS/EC2
      Period: 60
      Statistic: Average
      Threshold: 70.0
      ActionsEnabled: false
      Dimensions:
        - Name: InstanceId
          Value: !Ref EC2InstanceHighCPUUtilDemo
      Unit: Percent
Mappings:
  RegionMap:
    us-east-1:
      AL2023: ami-085ad6ae776d8f09c
    us-west-2:
      AL2023: ami-0005ee01bca55ab66
Outputs:
  AIOpsQDeveloperCliS3Bucket1:
    Description: S3 bucket created for testing AIOps
    Value:
      Ref: AIOpsQDeveloperCliS3Bucket1
  AIOpsQDeveloperCliS3Bucket2:
    Description: S3 bucket created for testing AIOps
    Value:
      Ref: AIOpsQDeveloperCliS3Bucket2
  AIOpsQDeveloperCliS3Bucket3:
    Description: S3 bucket created for testing AIOps
    Value:
      Ref: AIOpsQDeveloperCliS3Bucket3
  AIOpsQDeveloperKBS3Bucket:
    Description: S3 bucket created for testing AIOps
    Value:
      Ref: AIOpsQDeveloperKBS3Bucket
  EC2InstanceHighCPUUtilDemo:
    Description: EC2 instance for testing AIOps
    Value:
      Ref: EC2InstanceHighCPUUtilDemo
  EC2InstanceOpenPortDemo:
    Description: EC2 instance for testing AIOps
    Value:
      Ref: EC2InstanceOpenPortDemo
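
Before deploying, it can help to check parameter values locally against the constraints declared in the template. The following Python sketch mirrors the AllowedPattern and length rules for Prefix and SecurityGroupIngressCidrIp, and builds the ParameterKey/ParameterValue list that the CloudFormation CreateStack API accepts. The function names are hypothetical, and the sketch does not call AWS.

```python
import re

# Patterns copied from the template's AllowedPattern constraints.
PREFIX_PATTERN = re.compile(r"^[a-z0-9][a-z0-9-]+$")
_OCTET = r"([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])"
CIDR_PATTERN = re.compile(
    rf"^({_OCTET}\.){{3}}{_OCTET}(/([0-9]|[1-2][0-9]|3[0-2]))$"
)


def validate_parameters(prefix: str, ingress_cidr: str) -> list[str]:
    """Return a list of constraint violations; an empty list means valid."""
    errors = []
    if not (1 <= len(prefix) <= 30) or not PREFIX_PATTERN.fullmatch(prefix):
        errors.append(
            "Prefix must be 1-30 characters, must not start with a dash, "
            "and must only contain lowercase a-z, digits, and a dash"
        )
    if not CIDR_PATTERN.fullmatch(ingress_cidr):
        errors.append(
            "SecurityGroupIngressCidrIp must be in the form x.x.x.x/s, "
            "where x is 0-255 and s is 0-32"
        )
    return errors


def to_stack_parameters(prefix: str, ingress_cidr: str) -> list[dict]:
    """Build the Parameters list in the shape used by CreateStack."""
    return [
        {"ParameterKey": "Prefix", "ParameterValue": prefix},
        {"ParameterKey": "SecurityGroupIngressCidrIp", "ParameterValue": ingress_cidr},
    ]


if __name__ == "__main__":
    print(validate_parameters("aiops-qdevcli", "0.0.0.0/0"))
```

Note that the Prefix pattern requires at least two characters, so a single-character prefix fails even though MinLength is 1.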

Validate that the template deployed two EC2 instances, which are in Running state.

Additionally, validate that the template created three S3 buckets named aiops-qdevcli-bucketX-<account-id> (where X is 1, 2, or 3) and one bucket named aiops-qdevcli-kb-<account-id> in your selected Region.
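
The bucket names follow directly from the Fn::Sub expressions in the template, so the expected set can be computed and compared with what exists in the account. The following Python sketch is illustrative only: the helper names are hypothetical, and the account ID shown is a placeholder.

```python
def expected_bucket_names(prefix: str, account_id: str) -> list[str]:
    """Derive bucket names from the template's naming pattern."""
    names = [f"{prefix}-bucket{index}-{account_id}" for index in (1, 2, 3)]
    names.append(f"{prefix}-kb-{account_id}")
    return names


def missing_buckets(expected: list[str], actual: list[str]) -> list[str]:
    """Return expected bucket names that are absent from the actual list."""
    return sorted(set(expected) - set(actual))


if __name__ == "__main__":
    # "123456789012" is a placeholder account ID, not a real account.
    for name in expected_bucket_names("aiops-qdevcli", "123456789012"):
        print(name)
```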

Create an Amazon Bedrock knowledge base

Upload the sample high CPU utilization runbook to the aiops-qdevcli-kb-<account-id> bucket. Create a knowledge base that uses the bucket as its data source, and note the knowledge base ID to use in the first use case.

Use case 1: Identify and remediate high CPU utilization in an EC2 instance

In this use case, you introduce CPU stress in one of the EC2 instances and then use Amazon Q Developer CLI to identify and remediate it.

  1. On the Amazon EC2 console, log in to the aiops-qdevcli-high-cpu-util instance using EC2 Instance Connect.
  2. Run the following command to install stress-ng:
sudo dnf install stress-ng

  3. Run the following command to stress the EC2 instance for 1 hour:
stress-ng --cpu 1 --timeout 3600s

Wait approximately 10 minutes for the Amazon CloudWatch alarm to be triggered.

  4. Return to the Amazon EC2 console and check that the aiops-qdevcli-high-cpu-util instance is in the Alarm state.
  5. From Amazon Q Developer CLI, use a natural language query to check for operational issues in your account. Use the knowledge base ID that you saved in the previous section.

Amazon Q Developer CLI autocorrects errors that it encounters while running the commands.

Watch the following video for more details.

Because foundation models (FMs) are inherently nondeterministic, the responses you receive from Amazon Q Developer CLI might not exactly match those shown in the demo.
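
The alarm that drives this use case compares the average CPUUtilization over a 60-second period against a threshold of 70 percent, with one evaluation period. The following Python sketch models that comparison in a deliberately simplified way. It ignores missing-data handling and other CloudWatch evaluation details, and the function names are hypothetical.

```python
from statistics import mean

# Values taken from the CPUUtilizationAlarm resource in the template.
THRESHOLD_PERCENT = 70.0
EVALUATION_PERIODS = 1


def period_averages(samples_by_period: list[list[float]]) -> list[float]:
    """Average the CPU samples in each period, skipping empty periods."""
    return [mean(samples) for samples in samples_by_period if samples]


def alarm_state(averages: list[float]) -> str:
    """Apply GreaterThanThreshold to the most recent evaluation periods."""
    recent = averages[-EVALUATION_PERIODS:]
    if len(recent) < EVALUATION_PERIODS:
        return "INSUFFICIENT_DATA"
    breaching = all(value > THRESHOLD_PERCENT for value in recent)
    return "ALARM" if breaching else "OK"


if __name__ == "__main__":
    # Made-up samples: an idle period followed by a stressed period.
    averages = period_averages([[3.0, 5.0], [98.0, 100.0]])
    print(alarm_state(averages))
```

Because the comparison is strictly greater than, an average of exactly 70 percent does not breach the threshold.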

Use case 2: Identify and remove public access from an S3 bucket

In this use case, you will simulate an accidental security issue by unblocking public access for one of the buckets and then use Amazon Q Developer CLI to identify and remediate the issue.

  1. On the Amazon S3 console, open one of the aiops-qdevcli-xxxx buckets, and on the Permissions tab, choose Edit and change Block all public access to Off.

  2. Return to the Amazon Q Developer CLI and ask questions in natural language to identify and remediate the operational issue.

Watch the following video for more details.
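
For comparison, the same check and fix can be expressed programmatically. The following Python sketch inspects a PublicAccessBlockConfiguration dictionary, which has the same four settings the template enables, and lists any that are turned off. The remediation function uses the boto3 put_public_access_block call and is shown for illustration only. It is not necessarily what Amazon Q Developer CLI runs, and the function names are hypothetical.

```python
# The four settings that the template sets to true on every bucket.
REQUIRED_FLAGS = (
    "BlockPublicAcls",
    "IgnorePublicAcls",
    "BlockPublicPolicy",
    "RestrictPublicBuckets",
)


def disabled_flags(public_access_block: dict) -> list[str]:
    """Return the settings that are missing or set to false."""
    return [flag for flag in REQUIRED_FLAGS if not public_access_block.get(flag, False)]


def restore_block_public_access(bucket_name: str) -> None:
    """Turn all four settings back on for the given bucket."""
    import boto3  # imported lazily so the check above works without boto3

    s3 = boto3.client("s3")
    s3.put_public_access_block(
        Bucket=bucket_name,
        PublicAccessBlockConfiguration={flag: True for flag in REQUIRED_FLAGS},
    )


if __name__ == "__main__":
    # Simulates a bucket where Block all public access was turned off.
    print(disabled_flags({flag: False for flag in REQUIRED_FLAGS}))
```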

Use case 3: Identify and block a specific unwanted open port for inbound connection to an EC2 instance

In this use case, you will use Amazon Q Developer CLI to identify the EC2 instance that has a specific port open and then close the port.

  1. On the Amazon EC2 console, note that the aiops-qdevcli-open-port-demo instance has port 5080 open for all inbound TCP connections. This is an unwanted security risk that you want to identify and remediate.

  2. Return to Amazon Q Developer CLI and use natural language queries to identify the EC2 instance with port 5080 open and fix the issue.

Watch the following video for details.
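
To make the detection logic concrete, the following Python sketch scans security group data for inbound rules that expose a given port to 0.0.0.0/0. It assumes the input has the shape of the SecurityGroups list returned by the EC2 DescribeSecurityGroups API, and it checks IPv4 ranges only. The function names and sample group IDs are hypothetical.

```python
def rule_exposes_port(permission: dict, port: int, cidr: str = "0.0.0.0/0") -> bool:
    """Return True if an ingress rule opens the TCP port to the given CIDR."""
    protocol = permission.get("IpProtocol")
    if protocol not in ("tcp", "-1"):
        return False
    ranges = permission.get("IpRanges", [])
    if not any(entry.get("CidrIp") == cidr for entry in ranges):
        return False
    if protocol == "-1":
        return True  # "-1" means all protocols and all ports
    return permission.get("FromPort", 0) <= port <= permission.get("ToPort", 65535)


def groups_exposing_port(security_groups: list[dict], port: int) -> list[str]:
    """Return the IDs of security groups with at least one exposing rule."""
    return [
        group["GroupId"]
        for group in security_groups
        if any(rule_exposes_port(rule, port) for rule in group.get("IpPermissions", []))
    ]


if __name__ == "__main__":
    # Made-up data modeled on the two security groups in the template.
    ssh_rule = {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
                "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}
    app_rule = {"IpProtocol": "tcp", "FromPort": 5080, "ToPort": 5080,
                "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}
    groups = [
        {"GroupId": "sg-example1", "IpPermissions": [ssh_rule]},
        {"GroupId": "sg-example2", "IpPermissions": [app_rule, ssh_rule]},
    ]
    print(groups_exposing_port(groups, 5080))
```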

Clean up

Properly decommissioning provisioned AWS resources is an important best practice to optimize costs and enhance security posture after concluding proofs of concept and demonstrations. Complete the following steps to delete the resources created in your AWS account:

  1. On the Amazon Bedrock console, delete the Amazon Bedrock knowledge base.
  2. On the Amazon S3 console, empty the aiops-qdevcli-kb-xxx bucket.
  3. On the AWS CloudFormation console, delete the CloudFormation stack.

As an alternative, try the preceding steps using natural language queries in Amazon Q Developer CLI.

  4. Finally, delete the .amazonq/mcp.json file from your workspace folder to remove the MCP configuration for Amazon Q Developer CLI.

Conclusion

In this post, we showed how Amazon Q Developer CLI interprets natural language queries, automatically converts them into appropriate commands, and identifies the necessary tools for execution. The solution’s intelligent error-handling capabilities analyze logs and perform auto-corrections, minimizing manual intervention. By implementing Amazon Q Developer CLI, you can enhance your team’s operational efficiency, reduce human errors, and manage complex environments more effectively through a conversational interface.

We encourage you to explore additional use cases and share your feedback with us. For more information on Amazon Q Developer CLI and AWS MCP servers, refer to the following resources:


About the authors

Biswanath Mukherjee is a Senior Solutions Architect at Amazon Web Services. He works with large strategic customers of AWS by providing them technical guidance to migrate and modernize their applications on AWS Cloud. With his extensive experience in cloud architecture and migration, he partners with customers to develop innovative solutions that leverage the scalability, reliability, and agility of AWS to meet their business needs. His expertise spans diverse industries and use cases, enabling customers to unlock the full potential of the AWS Cloud.

Upendra V is a Senior Solutions Architect at Amazon Web Services, specializing in Generative AI and cloud solutions. He helps enterprise customers design and deploy production-ready Generative AI workloads, implement Large Language Models (LLMs) and Agentic AI systems, and optimize cloud deployments. With expertise in cloud adoption and machine learning, he enables organizations to build and scale AI-driven applications efficiently.


