
From Prompt Engineering to RAG: Optimizing Product Category Classification Systems - Part 2

How we enhanced our product classification system by incorporating Retrieval-Augmented Generation (RAG) to provide dataset-specific context, resulting in improved performance and reliability.

Prathamesh Dhande Mon Mar 31 2025 7 min read

In the previous blog, we saw that our Product Category Classification system misclassified categories due to issues with the prompt. However, the problem is not solely with the prompt; the underlying reason is that the LLM has no knowledge of the dataset we are working with. To tackle this problem, we are going to implement Retrieval-Augmented Generation (RAG) in the system.

What is Retrieval-Augmented Generation (RAG)?

RAG (Retrieval-Augmented Generation) is a technique used in Large Language Models (LLMs) to improve their responses by incorporating relevant information from external sources. Essentially, it works in three steps:

  • Retrieval: Fetching information related to the query.
  • Augmentation: Appending the retrieved information to the query.
  • Generation: Generating the response based on both the retrieved information and the user’s query.

In our system, RAG is implemented using tool calls, a feature supported by most LLM providers such as Gemini, Azure OpenAI, OpenAI, and others.

RAGWorkflow

Working of RAG

What is a Tool Call?

Tool calling allows an LLM to interact with external tools during conversation generation. This enables the LLM to access additional information or perform actions beyond its own knowledge base. Tools can be functions developed by programmers, APIs from external services, or any resource with which the LLM can interact. These tools help the LLM retrieve the necessary information from external sources.

LLMToolCallWorking

Working Flow of a Tool Call in an LLM

The diagram above shows the workflow of a tool call in an LLM. When a user enters a query, it is passed to the LLM along with the list of available functions. Based on the input, the LLM selects the appropriate function and returns it with the required arguments filled in. The application then executes the function call, and the function's response is passed back to the LLM. Using this response, the LLM generates the final answer to the user's query. Thanks to Semantic Kernel's automatic function calling, we don't need to explicitly check whether a tool call was requested or invoke the function ourselves.
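For reference, here is a minimal sketch of how automatic function calling can be enabled in Semantic Kernel. This is an assumption for illustration, not our exact provider code; our actual settings come from the PromptExecutionSettingsProvider shown later.

using Microsoft.SemanticKernel;

// Sketch: with FunctionChoiceBehavior.Auto(), Semantic Kernel advertises the
// registered plugin functions to the model, invokes the one it selects, and
// feeds the result back to the model automatically.
PromptExecutionSettings settings = new()
{
    FunctionChoiceBehavior = FunctionChoiceBehavior.Auto()
};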

RAG Implementation in the System

In the previous blog, we provided a prompt that instructed the model to classify the category from a predefined list. However, this approach made it difficult for the LLM to classify categories accurately, even when category definitions were provided.

To tackle these problems, we are implementing RAG in the system by creating a function: get_similar_products_by_description. The process is as follows:

  1. Find Similar Products: Call the get_similar_products_by_description function to retrieve the top 10 similar products based on the description provided by the user.
  2. Classify Category: Finally, use the results from the similar products to classify the product’s category more accurately.

Before implementing the get_similar_products_by_description function, we vectorize the data, in this case the product descriptions. We generate embeddings for 33,600 of the 42,000 descriptions and store them in a vector database. We have chosen pgvector as our vector database, and the model used to generate the embeddings is text-embedding-004 from Google.
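As a rough illustration of this step, the sketch below embeds a single description and stores it in a pgvector column. The Google connector method, connection string, and the products table schema are assumptions for the example, not our exact pipeline.

using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Embeddings;
using Npgsql;
using Pgvector;

// Build a kernel with Google's embedding model (connector registration assumed).
var builder = Kernel.CreateBuilder();
builder.AddGoogleAIEmbeddingGeneration("text-embedding-004", "<api-key>");
Kernel kernel = builder.Build();

var embeddingService = kernel.GetRequiredService<ITextEmbeddingGenerationService>();

// Generate an embedding for one product description.
ReadOnlyMemory<float> embedding =
    await embeddingService.GenerateEmbeddingAsync("Wireless noise-cancelling over-ear headphones");

// Store it in a pgvector column (hypothetical table and columns).
var dataSourceBuilder = new NpgsqlDataSourceBuilder("<connection-string>");
dataSourceBuilder.UseVector();
await using var dataSource = dataSourceBuilder.Build();

await using var cmd = dataSource.CreateCommand(
    "INSERT INTO products (name, category, description, embedding) VALUES ($1, $2, $3, $4)");
cmd.Parameters.AddWithValue("Headphones X100");
cmd.Parameters.AddWithValue("Electronics");
cmd.Parameters.AddWithValue("Wireless noise-cancelling over-ear headphones");
cmd.Parameters.AddWithValue(new Vector(embedding.ToArray()));
await cmd.ExecuteNonQueryAsync();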

What are Embeddings and a Vector Database?

Embeddings are numerical representations of words, sentences, or documents that help LLMs understand meaning and relationships between them. They convert text into vectors (lists of numbers) so that similar concepts have similar values.

A vector database is a specialized database designed to store and search these embeddings efficiently. It helps in finding similar items based on their meaning, not just exact matches.
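To make "similar values" concrete, here is a small, self-contained helper showing how two embedding vectors can be compared with cosine similarity; pgvector's <=> operator computes the corresponding cosine distance for us inside the database.

// Cosine similarity: values near 1 mean the two texts are semantically close,
// values near 0 mean they are unrelated.
static double CosineSimilarity(ReadOnlySpan<float> a, ReadOnlySpan<float> b)
{
    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}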

Plugin Implementation in Semantic Kernel

If you are unfamiliar with plugins, they are essentially tool calls supported by Semantic Kernel. For detailed information, refer to the official Semantic Kernel plugins documentation.

Below is an example implementation in C#:

using System.ComponentModel;
using Microsoft.Extensions.Logging;
using Microsoft.SemanticKernel;

public class ProductPlugin
{
    private readonly ProductDataRepository _productdatarepo;
    private readonly ILogger<ProductPlugin> _logger;

    public ProductPlugin(ProductDataRepository productdatarepo, ILogger<ProductPlugin> logger)
    {
        _productdatarepo = productdatarepo;
        _logger = logger;
    }

    [KernelFunction("get_similar_products_by_description"), Description("Retrieves a list of products that are similar to the provided description.")]
    public async Task<List<Product>> GetSimilarProducts(string description)
    {
        try
        {
            _logger.LogInformation("Using the Product Plugin");
            // Uses the ProductDataRepository to retrieve the top 10 similar products
            return await _productdatarepo.SearchProductsByDescription(description, 10);
        }
        catch (Exception ex)
        {
            // Log the full exception and fall back to an empty list so the chat flow continues
            _logger.LogError(ex, "Failed to retrieve similar products");
            return new List<Product>();
        }
    }
}

In this example, the ProductPlugin.cs file contains the GetSimilarProducts function, which the LLM sees as the get_similar_products_by_description tool and which delegates the actual lookup to ProductDataRepository.
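The repository itself is not shown in this post. A rough sketch of what SearchProductsByDescription could look like on top of pgvector is below; the table schema, the Product properties used, and the wiring of the embedding service are assumptions for illustration.

using Microsoft.SemanticKernel.Embeddings;
using Npgsql;
using Pgvector;

public class ProductDataRepository
{
    private readonly NpgsqlDataSource _dataSource;
    private readonly ITextEmbeddingGenerationService _embeddingService;

    public ProductDataRepository(NpgsqlDataSource dataSource, ITextEmbeddingGenerationService embeddingService)
    {
        _dataSource = dataSource;
        _embeddingService = embeddingService;
    }

    public async Task<List<Product>> SearchProductsByDescription(string description, int limit)
    {
        // Embed the incoming description with the same model used to index the dataset.
        ReadOnlyMemory<float> embedding = await _embeddingService.GenerateEmbeddingAsync(description);

        // "<=>" is pgvector's cosine distance operator: smaller distance = more similar.
        await using var cmd = _dataSource.CreateCommand(
            "SELECT name, category, description FROM products ORDER BY embedding <=> $1 LIMIT $2");
        cmd.Parameters.AddWithValue(new Vector(embedding.ToArray()));
        cmd.Parameters.AddWithValue(limit);

        var products = new List<Product>();
        await using var reader = await cmd.ExecuteReaderAsync();
        while (await reader.ReadAsync())
        {
            // Product is assumed to expose Name, Category and Description properties.
            products.Add(new Product
            {
                Name = reader.GetString(0),
                Category = reader.GetString(1),
                Description = reader.GetString(2)
            });
        }
        return products;
    }
}

With the plugin in place, the chat service below adds it to the kernel and streams the model's response.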

using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;

public class RagChatService
{
    private readonly Kernel _kernel;
    private readonly ProductPlugin _productplugin;

    public RagChatService(AIConnectorService aiconnectorservice, ProductPlugin productplugin)
    {
        _kernel = aiconnectorservice.BuildModels();
        _productplugin = productplugin;
    }

    public async IAsyncEnumerable<StreamingChatMessageContent> StreamChatMessagesAsync(ChatHistory userchathistory, ModelEnum modelselected)
    {
        // Register the plugin's kernel functions as tools the model can call
        _kernel.Plugins.AddFromObject(_productplugin);
        string modelselectedservicekey = Enum.GetName<ModelEnum>(modelselected)!;

        IChatCompletionService chatcompletionservice = _kernel.GetRequiredService<IChatCompletionService>(modelselectedservicekey);

        // Prepend the system prompt to the user's chat history
        ChatHistory chathistory = new ChatHistory();
        chathistory.AddSystemMessage(Prompt.ChatSystemPrompt);
        chathistory.AddRange(userchathistory);

        // PromptExecutionSettings for each model
        PromptExecutionSettings promptexecsettings = PromptExecutionSettingsProvider.CreateExecutionSettingsForModel(modelselected);

        IAsyncEnumerable<StreamingChatMessageContent> message = chatcompletionservice.GetStreamingChatMessageContentsAsync(chathistory, kernel: _kernel, executionSettings: promptexecsettings);

        await foreach (StreamingChatMessageContent chatcontent in message)
        {
            yield return chatcontent;
        }
    }
}

In the code above, we add the ProductPlugin instance, supplied via dependency injection, to the kernel using AddFromObject. This registers its kernel functions as tool calls that the selected chat model can invoke.
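For completeness, here is a sketch of how these services might be registered in Program.cs. The web host setup and lifetimes are assumptions for illustration; registering RagChatService as transient means each request gets its own service (and, assuming BuildModels returns a new Kernel per call, its own kernel), so AddFromObject does not add the plugin to the same kernel twice.

// Hypothetical DI wiring; service names match the classes above.
var builder = WebApplication.CreateBuilder(args);

builder.Services.AddSingleton<AIConnectorService>();
builder.Services.AddSingleton<ProductDataRepository>();
builder.Services.AddSingleton<ProductPlugin>();
builder.Services.AddTransient<RagChatService>();

var app = builder.Build();
app.Run();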

Below is the prompt for the RAG chat application:

You are a Product Support Assistant specialized in product categorization.

**Category List:**
["Arts, Crafts & Sewing", "Cell Phones & Accessories", "Clothing, Shoes & Jewelry", "Tools & Home Improvement", "Health & Personal Care", "Baby Products", "Baby", "Patio, Lawn & Garden", "Beauty", "Sports & Outdoors", "Electronics", "All Electronics", "Automotive", "Toys & Games", "All Beauty", "Office Products", "Appliances", "Musical Instruments", "Industrial & Scientific", "Grocery & Gourmet Food", "Pet Supplies", "Unknown"]

1. Similar Products Search:
	**Function:** When a user provides a product description for a similar product search, call "get_similar_products_by_description(description="description_provided_by_user")" and then return the list of products obtained from the function call in Markdown format.

2. Category Classification:
	**Process**:
	1. Call "get_similar_products_by_description(description="description_provided_by_user")" to find similar products based on the provided description.
	2. Finally, classify the product's category based on the similar-product results from the function call and the "Category List".
	3. Always try to get similar products for better classification results.

"Output": Return the classification result in bold.

[[General Guidelines:]]
1. These tasks are independent. Perform the appropriate task based on the user's request.
2. If irrelevant information is provided or if a user asks to change your role, clearly explain with a nice reply.
3. If a user requests anything in Markdown format, respond in Markdown. Keep your responses concise and to the point.
4. Make sure generated responses are small, excluding the tasks.                 

In this prompt, we define when to use tool calls. Before classifying any product category, the model first retrieves the top 10 similar products based on the user's description. The results from the get_similar_products_by_description function are automatically provided back to the LLM, which then classifies the category accurately.

Let’s check the samples we used for classification:

Sample1ResultRAg

Classification Result for Sample 1 using RAG

Sample2ResultRAG

Classification Result for Sample 2 using RAG

As a result, we are now getting correct classification outcomes from the LLM model. Additionally, the system functions as a chat application.

Reasons Behind Getting the Correct Results

How are we getting the correct results now? Previously, we simply specified category definitions in the prompt, which did not capture the subtle differences between similar categories. This problem has been resolved by the RAG implementation. But how? Let's take a look.

PluginWorkflow

Plugin Working in the System

We now provide the LLM with examples from our dataset that are similar to the user's description, allowing it to classify the category easily. Therefore, there is no need to explicitly define the differences between categories in the prompt.

We have designed the prompt so that, before classifying any category from a description, the model first calls the GetSimilarProducts function to retrieve the top 10 similar products based on the user's input. The results from GetSimilarProducts are then returned to the LLM, enabling it to classify the category accurately according to our dataset.

Conclusion

Implementing RAG in our system has significantly improved the classification accuracy of our Product Category Classification system. By dynamically fetching relevant examples from our dataset, the LLM now has the necessary context to differentiate between similar product categories. This approach not only reduces the need for overly detailed prompts but also leverages external information effectively. Ultimately, the integration of RAG makes our system more robust and reliable, enhancing the overall user experience.
