Leveraging Generative AI In E-Commerce For Catalog Enrichment

Udit Mehrotra is a distinguished product leader who currently serves as Head of Product Management for Customer Experience at Amazon Canada.

The emergence of generative artificial intelligence (GenAI) has transformed how retailers approach catalog management. Large language models (LLMs) now produce text that approaches human quality for tasks such as writing product descriptions, and they have been widely adopted as a result.

With text generation becoming mainstream, image generation has emerged as the next major area of innovation. This article explores the current capabilities of image generation technology and where it’s headed next.

Text Generation: A Mature Technology

Customers rely heavily on accurate and complete catalog data to make informed purchasing decisions, making high-quality product catalogs essential for e-retailers. However, maintaining this quality at scale has historically been difficult due to the sheer volume of data and the need to gather input from multiple sources.

Manual review and correction of catalog issues is prohibitively expensive, which is why past efforts were typically limited to only the most popular products. Generative AI is now transforming this process. General-purpose LLMs like GPT-4 and Claude use advanced natural language processing to analyze product attributes and generate compelling, context-aware descriptions. Retailers are further improving performance by training custom LLMs on extensive shopping datasets and integrating them into catalog management pipelines to automatically produce high-quality product descriptions and detail page content at scale.
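As a rough illustration of how structured product attributes might be handed to an LLM, the sketch below builds a prompt that restricts the model to the facts supplied. The function name `build_description_prompt`, the prompt wording and the sample attributes are all assumptions made for this example. No specific retailer's pipeline or vendor API is implied, and the call to the model itself is deliberately left out.

```python
from typing import Mapping


def build_description_prompt(attributes: Mapping[str, str], max_words: int = 80) -> str:
    """Turn structured catalog attributes into a grounded prompt for an LLM.

    Hypothetical helper for illustration; the prompt wording is an assumption.
    """
    # Skip empty values so the model is not invited to guess at missing data.
    facts = [f"- {name}: {value}" for name, value in attributes.items() if value.strip()]
    if not facts:
        raise ValueError("at least one non-empty attribute is required")
    return "\n".join([
        f"Write a product description of at most {max_words} words.",
        "Use only the facts listed below; do not invent specifications.",
        "Facts:",
        *facts,
    ])


# Sample attributes are made up for the example.
example_prompt = build_description_prompt(
    {"Brand": "Acme", "Product type": "Air fryer", "Capacity": "5.5 L", "Color": ""}
)
print(example_prompt)
```

Restricting the model to the listed facts is one simple way to reduce invented details, which matters most for attributes such as technical specifications.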

Such integrations are reported to lift engagement and conversion rates at a fraction of the cost of manual curation. For instance, Walmart reported using large language models in 2024 to create or improve more than 850 million pieces of data across its product catalog. According to the company, completing that work in the same amount of time without generative AI would have required roughly 100 times the headcount.

Some challenges remain in use cases such as technical specifications, where accuracy is of utmost importance, but it is safe to say that text generation is now reliably deployable across diverse catalog applications at major retailers.

The Current Focus: Image Generation

While text generation has reached maturity for deployment at scale, image generation is more nascent. Today's leading image generation tools are built primarily on diffusion models, which have largely displaced earlier approaches such as generative adversarial networks (GANs) and variational autoencoders (VAEs).

Tools like DALL-E 3, Midjourney and Stable Diffusion have been transformational for retail marketing teams. Companies such as Amazon have also invested heavily in building models that generate lifestyle images from simple text prompts. A prompt such as “air fryer in a kitchen setting” can superimpose a branded air fryer on a synthetically generated kitchen island within seconds, a task that would otherwise have taken days of traditional photography.

With that said, the challenges are substantial. Unlike text, where spellcheck and fact-checking can catch most problems, image quality is largely a matter of subjective judgment. Moving from text to image generation is not just adding pixels to words. It is far more complex, requiring visual quality, brand identity and aesthetic judgment to be handled all at once.

Take the use case of generating images with superimposed text. Multinational brands generate such creatives in multiple languages simultaneously and want consistency across images to preserve the brand ethos. Doing so involves extracting the text from the source image, translating it into the target language and inpainting the translated text back onto the source image as an overlay. Each of these steps is complex and requires advanced machine learning models.
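The three steps described above can be sketched as a small pipeline in which each stage is passed in as a function. Everything here is hypothetical: `TextRegion`, `localize_creative` and the `fake_*` stand-ins are names invented for this example, and in practice each stage would be backed by a real text-detection, translation or inpainting model rather than the toy functions shown.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Tuple, TypeVar

Image = TypeVar("Image")  # whatever image type the real models operate on


@dataclass(frozen=True)
class TextRegion:
    text: str
    box: Tuple[int, int, int, int]  # left, top, right, bottom in pixels


def localize_creative(
    image: Image,
    target_lang: str,
    extract: Callable[[Image], List[TextRegion]],
    translate: Callable[[str, str], str],
    inpaint: Callable[[Image, List[TextRegion]], Image],
) -> Image:
    """Extract text, translate it, and inpaint it back at the same position."""
    regions = extract(image)
    # Keep each bounding box unchanged so the layout stays consistent across languages.
    translated = [replace(r, text=translate(r.text, target_lang)) for r in regions]
    return inpaint(image, translated)


# Toy stand-ins (assumptions) so the sketch runs without any ML models.
def fake_extract(image: str) -> List[TextRegion]:
    return [TextRegion("Crispy in minutes", (40, 20, 360, 60))]


def fake_translate(text: str, target_lang: str) -> str:
    glossary = {("Crispy in minutes", "fr"): "Croustillant en quelques minutes"}
    return glossary.get((text, target_lang), text)


def fake_inpaint(image: str, regions: List[TextRegion]) -> str:
    overlay = "; ".join(f"{r.text} @ {r.box}" for r in regions)
    return f"{image} [{overlay}]"


localized = localize_creative(
    "air_fryer.png", "fr", fake_extract, fake_translate, fake_inpaint
)
print(localized)
```

Passing the stages in as functions keeps the orchestration separate from the models, so any one stage can be swapped out without changing the others.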

Areas Of Active Research

As image generation technology matures, several key research areas have emerged to unlock its full potential for retail applications. These efforts are critical to scaling the technology across large and diverse product catalogs while maintaining quality and efficiency.

Model Improvements

A major focus of research is on enhancing model capabilities to generate high-resolution, photorealistic images that capture fine-grained details—such as fabric texture, stitching or subtle wrinkles in clothing. These visual elements are essential for helping customers assess product quality and make confident purchase decisions online. Generating consistent imagery across product variations (such as the same shirt in different colors or sizes) is another challenging but essential use case.

Quality Assurance

Despite advances in generative AI, quality control for generated images is still predominantly manual. Developing automated quality assurance systems that can evaluate image realism, relevance and consistency—and flag subpar outputs for human review—is crucial to operationalizing image generation in production environments.
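One simple form such a system could take is a triage step that routes any image scoring low on realism, relevance or consistency to a human reviewer. The sketch below assumes those three scores already exist, produced by upstream evaluation models that are not shown. The names `QualityScores` and `triage`, the 0-to-1 score scale and the 0.8 threshold are illustrative assumptions, not established values.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class QualityScores:
    # Scores in [0, 1], assumed to come from upstream evaluation models (not shown).
    realism: float
    relevance: float
    consistency: float


def triage(
    scores_by_image: Dict[str, QualityScores], threshold: float = 0.8
) -> Dict[str, List[str]]:
    """Approve images that clear the threshold on every dimension; flag the rest."""
    approved: List[str] = []
    needs_review: List[str] = []
    for image_id, s in scores_by_image.items():
        # The weakest dimension decides, so one bad score is enough to flag an image.
        if min(s.realism, s.relevance, s.consistency) >= threshold:
            approved.append(image_id)
        else:
            needs_review.append(image_id)
    return {"approved": approved, "needs_review": needs_review}


# Sample scores are made up for the example.
result = triage({
    "img-001": QualityScores(realism=0.95, relevance=0.90, consistency=0.88),
    "img-002": QualityScores(realism=0.97, relevance=0.60, consistency=0.90),
})
print(result)
```

Using the minimum score rather than an average is a deliberately conservative choice: an image that looks photorealistic but shows the wrong product still goes to a reviewer.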

Efficiency Optimization

State-of-the-art image generation models are computationally intensive, often requiring significant GPU resources to produce high-quality outputs. This creates a bottleneck when scaling to thousands of product images.

Research in this area focuses on optimizing both model architectures and hardware utilization. There is also growing interest in creating domain-specific or lightweight models that require less compute while delivering results tailored to retail-specific use cases.

Conclusion

Generative AI is transforming e-commerce catalog management by addressing longstanding challenges and unlocking new possibilities. While text generation has matured into a reliable and widely deployed solution, the evolution of image generation marks an exciting new chapter.

With advancements in quality, efficiency and system integration, image generation models are poised to redefine how retailers create, manage and present visual content. As research continues to tackle the complexities of visual fidelity and aesthetic quality, generative AI is set to enhance customer experience, drive engagement and streamline operations.


Forbes Technology Council is an invitation-only community for world-class CIOs, CTOs and technology executives.

