Accelerating data science innovation: How Bayer Crop Science used AWS AI/ML services to build their next-generation MLOps service


The world’s population is expanding at a rapid rate. The growing global population requires innovative solutions to produce food, fiber, and fuel, while restoring natural resources like soil and water and addressing climate change. Bayer Crop Science estimates farmers need to increase crop production by 50% by 2050 to meet these demands. To support their mission, Bayer Crop Science is collaborating with farmers and partners to promote and scale regenerative agriculture—a future where farming can produce more while restoring the environment.

Regenerative agriculture is a sustainable farming philosophy that aims to improve soil health by incorporating nature to create healthy ecosystems. It’s based on the idea that agriculture should restore degraded soils and reverse degradation, rather than sustain current conditions. The Crop Science Division at Bayer believes regenerative agriculture is foundational to the future of farming. Their vision is to produce 50% more food by restoring nature and scaling regenerative agriculture. To make this mission a reality, Bayer Crop Science is driving model training with Amazon SageMaker and accelerating code documentation with Amazon Q.

In this post, we show how Bayer Crop Science manages large-scale data science operations by training models for their data analytics needs and maintaining high-quality code documentation to support developers. Through these solutions, Bayer Crop Science projects up to a 70% reduction in developer onboarding time and up to a 30% improvement in developer productivity.

Challenges

Bayer Crop Science faced the challenge of scaling genomic predictive modeling to increase its speed to market. It also needed data scientists to focus on building the high-value foundation models (FMs), rather than worrying about constructing and engineering the solution itself. Prior to building their solution, the Decision Science Ecosystem, provisioning a data science environment could take days for a data team within Bayer Crop Science.

Solution overview

Bayer Crop Science’s Decision Science Ecosystem (DSE) is a next-generation machine learning operations (MLOps) solution built on AWS to accelerate data-driven decision making for data science teams at scale across the organization. AWS services assist Bayer Crop Science in creating a connected decision-making system accessible to thousands of data scientists. The company is using the solution for generative AI, product pipeline advancements, geospatial imagery analytics of field data, and large-scale genomic predictive modeling that will allow Bayer Crop Science to become more data-driven and increase speed to market. This solution helps the data scientist at every step, from ideation to model output, including the entire business decision record made using DSE. Other divisions within Bayer are also beginning to build a similar solution on AWS based on the success of DSE.

Bayer Crop Science teams’ DSE integrates cohesively with SageMaker, a fully managed service that lets data scientists quickly build, train, and deploy machine learning (ML) models for different use cases so they can make data-informed decisions quickly. This boosts collaboration within Bayer Crop Science across product supply, R&D, and commercial. Their data science strategy no longer needs self-service data engineering, but rather provides an effective resource to drive fast data engineering at scale. Bayer Crop Science chose SageMaker because it provides a single cohesive experience where data scientists can focus on building high-value models, without having to worry about constructing and engineering the resource itself. With the help of AWS services, cross-functional teams can align quickly to reduce operational costs by minimizing redundancy, addressing bugs early and often, and quickly identifying issues in automated workflows. The DSE solution uses SageMaker, Amazon Elastic Kubernetes Service (Amazon EKS), AWS Lambda, and Amazon Simple Storage Service (Amazon S3) to accelerate innovation at Bayer Crop Science and to create a customized, seamless, end-to-end user experience.

The following diagram illustrates the DSE architecture.

Comprehensive AWS DSE architecture showing data scientist workflow from platform through deployment, with monitoring and security controls

Solution walkthrough

Bayer Crop Science had two key challenges in managing large-scale data science operations: maintaining high-quality code documentation and optimizing existing documentation across multiple repositories. With Amazon Q, Bayer Crop Science tackled both challenges, which empowered them to onboard developers more rapidly and improve developer productivity.

The company’s first use case focused on automatically creating high-quality code documentation. When a developer pushes code to a GitHub repository, a webhook—a lightweight, event-driven communication that automatically sends data between applications using HTTP—triggers a Lambda function through Amazon API Gateway. This function then uses Amazon Q to analyze the code changes and generate comprehensive documentation and change summaries. The updated documentation is then stored in Amazon S3. The same Lambda function also creates a pull request with the AI-generated summary of code changes. To maintain security and flexibility, Bayer Crop Science uses Parameter Store, a capability of AWS Systems Manager, to manage prompts for Amazon Q, allowing for quick updates without redeployment, and AWS Secrets Manager to securely handle repository tokens.

This automation significantly reduces the time developers spend creating documentation and pull request descriptions. The generated documentation is also ingested into Amazon Q, so developers can quickly answer questions they have about a repository and onboard onto projects.

The second use case addresses the challenge of maintaining and improving existing code documentation quality. An AWS Batch job, triggered by Amazon EventBridge, processes the code repository. Amazon Q generates new documentation for each code file, which is then indexed along with the source code. The system also generates high-level documentation for each module or functionality and compares the AI-generated documentation with existing human-written documentation. This process makes it possible for Bayer Crop Science to systematically evaluate and enhance their documentation quality over time.

To improve search capabilities, Bayer Crop Science added repository names as custom attributes in the Amazon Q index and prefixed them to indexed content. This enhancement improved the accuracy and relevance of documentation searches. The development team also implemented strategies to handle API throttling and variability in AI responses, maintaining robustness in production environments. Bayer Crop Science is considering developing a management plane to streamline the addition of new repositories and centralize the management of settings, tokens, and prompts. This would further enhance the scalability and ease of use of the system.

Organizations looking to replicate Bayer Crop Science’s success can implement similar webhook-triggered documentation generation, use Amazon Q Business for both generating and evaluating documentation quality, and integrate the solution with existing version control and code review processes. By using AWS services like Lambda, Amazon S3, and Systems Manager, companies can create a scalable and manageable architecture for their documentation needs. Amazon Q Developer also helps organizations further accelerate their development timelines by providing real-time code suggestions and a built-in next-generation chat experience.

“One of the lessons we’ve learned over the last 10 years is that we want to write less code. We want to focus our time and investment on only the things that provide differentiated value to Bayer, and we want to leverage everything we can that AWS provides out of the box. Part of our goal is reducing the development cycles required to transition a model from proof-of-concept phase, to production, and ultimately business adoption. That’s where the value is.”

– Will McQueen, VP, Head of CS Global Data Assets and Analytics at Bayer Crop Science.

Summary

Bayer Crop Science’s approach aligns with modern MLOps practices, enabling data science teams to focus more on high-value modeling tasks rather than time-consuming documentation processes and infrastructure management. By adopting these practices, organizations can significantly reduce the time and effort required for code documentation while improving overall code quality and team collaboration.

Learn more about Bayer Crop Science’s generative AI journey, and discover how Bayer Crop Science is redesigning sustainable practices through cutting-edge technology.

About Bayer

Bayer is a global enterprise with core competencies in the life science fields of health care and nutrition. In line with its mission, “Health for all, Hunger for none,” the company’s products and services are designed to help people and the planet thrive by supporting efforts to understand the major challenges presented by a growing and aging global population. Bayer is committed to driving sustainable development and generating a positive impact with its businesses. At the same time, Bayer aims to increase its earning power and create value through innovation and growth. The Bayer brand stands for trust, reliability, and quality throughout the world. In fiscal 2023, the Group employed around 100,000 people and had sales of 47.6 billion euros. R&D expenses before special items amounted to 5.8 billion euros. For more information, go to www.bayer.com.


About the authors

Headshot of Lance SmithLance Smith is a Senior Solutions Architect and part of the Global Healthcare and Life Sciences industry division at AWS. He has spent the last 2 decades helping life sciences companies apply technology in pursuit of their missions to help patients. Outside of work, he loves traveling, backpacking, and spending time with his family.

Headshot of Kenton BlacuttKenton Blacutt is an AI Consultant within the Amazon Q Customer Success team. He works hands-on with customers, helping them solve real-world business problems with cutting-edge AWS technologies. In his free time, he likes to travel and run an occasional marathon.

Headshot of Karthik PrabhakarKarthik Prabhakar is a Senior Applications Architect within the AWS Professional Services team. In this role, he collaborates with customers to design and implement cutting-edge solutions for their mission-critical business systems, focusing on areas such as scalability, reliability, and cost optimization in digital transformation and modernization projects.

Headshot of Jake MalmadJake Malmad is a Senior DevOps Consultant within the AWS Professional Services team, specializing in infrastructure as code, security, containers, and orchestration. As a DevOps consultant, he uses this expertise to collaboratively works with customers, architecting and implementing solutions for automation, scalability, reliability, and security across a wide variety of cloud adoption and transformation engagements.

Headshot of Nicole BrownNicole Brown is a Senior Engagement Manager within the AWS Professional Services team based in Minneapolis, MN. With over 10 years of professional experience, she has led multidisciplinary, global teams across the healthcare and life sciences industries. She is also a supporter of women in tech and currently holds a board position within the Women at Global Services affinity group.



Source link

Leave a Reply

Your email address will not be published. Required fields are marked *