
Building a .NET Food Health Analyzer with Azure OpenAI and Semantic Kernel

In-depth look at a .NET-based Food Health Analyzer app using Azure OpenAI and Semantic Kernel for intelligent ingredient and health analysis.

Yash Worlikar Mon Jun 03 2024 10 min read

In the previous blog, we had an overview of the Food Health Checker app.

It's a simple app that extracts the list of ingredients from an image and then checks whether the given food is healthy based on those ingredients. Today we will focus on the Semantic Kernel side of the app.

Traditionally, this problem would have required two different models: an OCR model to extract all the text present in the given image, and a second model to generate a response based on the extracted text.

These models are also quite rigid. A minor change in requirements often meant retraining the entire model.

However, with recent ML breakthroughs, this trend has shifted. Once a large language model reaches a certain scale, it starts displaying a wide range of capabilities.

Rather than limiting LLMs to language tasks alone, we are now steering towards multimodal models capable of vision and audio tasks too.

Thanks to the current capabilities of these multimodal models, the original problem becomes much simpler.

To solve it, we can simply write a different prompt for each task. Think of prompts as functions written in natural language. We will be creating our prompts in .NET using the Semantic Kernel abstractions.

We are keeping all our prompt functions in a single class named “FoodCheckerPlugin”.

Now let’s start with our first prompt for getting the ingredients and nutritional values present in an image. This prompt will contain the SystemMessage, the user instructions, and the image URL that will be sent to the AI model. We organize this with the help of a ChatHistory object.

// First create a ChatHistory object with a System message
ChatHistory chat = new(FoodCheckerTemplates.SystemMessage);

// Add the instruction and image URL as a user message
chat.AddUserMessage(new ChatMessageContentItemCollection
{
    new TextContent(FoodCheckerTemplates.GetIngredients),
    new ImageContent(new Uri(input, UriKind.Absolute))
});

Here are the definitions of the SystemMessage and instructions used in the above ChatHistory:

public static class FoodCheckerTemplates
{   
    public const string SystemMessage = @"You are an AI Food expert with extensive knowledge in Nutrition";
    public const string GetIngredients =
@"
[Instruction]    
Get the ingredients and nutritional values in English from the given food product images as briefly as possible in the given format, else respond with <|ERROR|> if nothing is found
[RESPONSE]
**Ingredients**
**Nutritional Values**";
}

Now let’s convert it into a complete Semantic Kernel function.

[KernelFunction, Description("Get the ingredients and nutritional values from the given food product images")]
public async IAsyncEnumerable<string> GetIngredientsAsync(
    [Description("Food ingredients image URL")] string input, Kernel kernel,
    [EnumeratorCancellation] CancellationToken cancellationToken = default)
{
    // Get the registered chat service
    var chatService = kernel.GetRequiredService<IChatCompletionService>();

    // Since images aren't directly supported yet in prompt templates, we use the `ChatHistory` object instead
    ChatHistory chat = new(FoodCheckerTemplates.SystemMessage);
    chat.AddUserMessage(new ChatMessageContentItemCollection
    {
        new TextContent(FoodCheckerTemplates.GetIngredients),
        new ImageContent(new Uri(input, UriKind.Absolute))
    });

    // Make the AI call with the above chat history and return the streaming response
    await foreach (var result in chatService.GetStreamingChatMessageContentsAsync(chat, s_settings, kernel, cancellationToken))
    {
        var generatedText = result?.ToString() ?? string.Empty;
        Console.Write(generatedText);
        yield return generatedText;
    }
}

The above function GetIngredientsAsync takes an image URL and a kernel as input and returns a streaming response.

s_settings is a static prompt execution settings object defined in the FoodCheckerPlugin class that we pass along with our prompts.

public class FoodCheckerPlugin
{
    static readonly PromptExecutionSettings s_settings = new OpenAIPromptExecutionSettings()
    {
        Temperature = 0.2,
        TopP = 0.5,
        MaxTokens = 400
    };
}

Now on to the next task: assessing whether the food is healthy or not based on the provided ingredients and nutritional values. Since we only need the language capabilities of the AI model, we can directly define the prompt as a kernel function.

private readonly KernelFunction _checkFoodHealthFunction;

public FoodCheckerPlugin()
{
    _checkFoodHealthFunction = KernelFunctionFactory.CreateFromPrompt(
        FoodCheckerTemplates.CheckFoodHealth,
        description: "Given the list of ingredients for a food product give it a Rating from Very Unhealthy to Very Healthy.",
        executionSettings: s_settings);
}

Prompt Template

public const string CheckFoodHealth =
@"
[Instruction]    
Given the list of ingredients for a food product give it a Rating from Very Unhealthy to Very Healthy. Also, give the reasoning in ELI5 format using less than 3 sentences in the below response format. 
Also list any allergens, cancer-causing or harmful substances if present along with the exact reason.
[Ingredients]
{{$input}}
[RESPONSE]
**Predicted Rating**
**Reasoning**
**Harmful substances**
";

When we get the response from the first prompt, we can manually pass it as the input to the above prompt through KernelArguments.
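As a minimal sketch of that hand-off (assuming we already have a kernel instance and the _checkFoodHealthFunction defined above; ingredientsResponse is a stand-in for the text streamed back by the first prompt):

// Text produced by the first prompt (placeholder value)
string ingredientsResponse = "**Ingredients** ... **Nutritional Values** ...";

// Pass it to the health-check prompt through KernelArguments
var arguments = new KernelArguments { ["input"] = ingredientsResponse };
var healthResult = await kernel.InvokeAsync(_checkFoodHealthFunction, arguments);
Console.WriteLine(healthResult.GetValue<string>());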

Let’s complete this function by writing it as a Kernel function.

[KernelFunction, Description("Given the list of ingredients for a food product give it a Rating from Very Unhealthy to Very Healthy.")]
public async IAsyncEnumerable<string> CheckFoodHealthAsync(
    [Description("List of ingredients and nutritional values")] string input, Kernel kernel,
    [EnumeratorCancellation] CancellationToken cancellationToken = default)
{
    await foreach (var result in _checkFoodHealthFunction.InvokeStreamingAsync(kernel, new(s_settings) { { "input", input } }, cancellationToken))
    {
        var generatedText = result?.ToString() ?? string.Empty;
        Console.Write(generatedText);
        yield return generatedText;
    }
}

Here’s the complete code for FoodCheckerPlugin and FoodCheckerTemplates in a single block.

using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.OpenAI;
using System.ComponentModel;
using System.Runtime.CompilerServices;

namespace FoodHealthChecker.SemanticKernel.Plugins
{
    public class FoodCheckerPlugin
    {
        static readonly PromptExecutionSettings s_settings = new OpenAIPromptExecutionSettings()
        {
            Temperature = 0.2,
            TopP = 0.5,
            MaxTokens = 400
        };

        private readonly KernelFunction _checkFoodHealthFunction;

        public FoodCheckerPlugin()
        {
            _checkFoodHealthFunction = KernelFunctionFactory.CreateFromPrompt(
                FoodCheckerTemplates.CheckFoodHealth,
                description: "Given the list of ingredients for a food product give it a Rating from Very Unhealthy to Very Healthy.",
                executionSettings: s_settings);
        }

        [KernelFunction, Description("Given the list of ingredients for a food product give it a Rating from Very Unhealthy to Very Healthy.")]
        public async IAsyncEnumerable<string> CheckFoodHealthAsync([Description("List of ingredients and nutritional values")] string input, Kernel kernel, [EnumeratorCancellation] CancellationToken cancellationToken = default)
        {
            await foreach (var result in _checkFoodHealthFunction.InvokeStreamingAsync(kernel, new(s_settings) { { "input", input } }, cancellationToken))
            {
                var generatedText = result?.ToString() ?? string.Empty;
                Console.Write(generatedText);
                yield return generatedText;
            }
        }

        [KernelFunction, Description("Get the ingredients and nutritional values from the given food product images")]
        public async IAsyncEnumerable<string> GetIngredientsAsync([Description("Food ingredients image URL")] string input, Kernel kernel, [EnumeratorCancellation] CancellationToken cancellationToken = default)
        {
            var chatService = kernel.GetRequiredService<IChatCompletionService>();

            ChatHistory chat = new(FoodCheckerTemplates.SystemMessage);
            chat.AddUserMessage(new ChatMessageContentItemCollection
            {
                new TextContent(FoodCheckerTemplates.GetIngredients),
                new ImageContent(new Uri(input, UriKind.Absolute))
            });

            await foreach (var result in chatService.GetStreamingChatMessageContentsAsync(chat, s_settings, kernel, cancellationToken))
            {
                var generatedText = result?.ToString() ?? string.Empty;
                Console.Write(generatedText);
                yield return generatedText;
            }
        }
    }

    public static class FoodCheckerTemplates
    {
        public const string SystemMessage = @"You are an AI Food expert with extensive knowledge in Nutrition";

        public const string GetIngredients =
@"
[Instruction]
Get the ingredients and nutritional values in English from the given food product images as briefly as possible in the given format, else respond with <|ERROR|> if nothing is found
[RESPONSE]
**Ingredients**
**Nutritional Values**";

        public const string CheckFoodHealth =
@"
[Instruction]
Given the list of ingredients for a food product give it a Rating from Very Unhealthy to Very Healthy. Also, give the reasoning in ELI5 format using less than 3 sentences in the below response format.
Also list any allergens, cancer-causing or harmful substances if present along with the exact reason.
[Ingredients]
{{$input}}
[RESPONSE]
**Predicted Rating**
**Reasoning**
**Harmful substances**";
    }
}

Now that we have our plugin ready, we need to create a kernel for it and expose it through a service.

public class FoodCheckerService
{
    private readonly Kernel _kernel;
    private readonly FoodCheckerPlugin _foodCheckerPlugin;

    // Injected from the service collection
    public FoodCheckerService(IConfiguration config, FoodCheckerPlugin foodCheckerPlugin)
    {
        var azureOptions = config.GetSection("AzureOpenAI").Get<AzureOpenAIOptions>();

        // Defining the kernel
        var kernelBuilder = Kernel.CreateBuilder();
        kernelBuilder.AddAzureOpenAIChatCompletion(azureOptions.DeploymentName, azureOptions.Endpoint, azureOptions.ApiKey);
        var kernel = kernelBuilder.Build();

        // Adding the plugin to the kernel
        // We can also add it before building the kernel
        _foodCheckerPlugin = foodCheckerPlugin;
        kernel.Plugins.AddFromObject(foodCheckerPlugin);
        _kernel = kernel;
    }

    public IAsyncEnumerable<string> CheckFoodHealthAsync(string ingredientResponse, CancellationToken cancellationToken = default)
    {
        return _foodCheckerPlugin.CheckFoodHealthAsync(ingredientResponse, _kernel, cancellationToken);
    }

    public IAsyncEnumerable<string> GetIngredientsAsync(string hostedImageUrl, CancellationToken cancellationToken = default)
    {
        return _foodCheckerPlugin.GetIngredientsAsync(hostedImageUrl, _kernel, cancellationToken);
    }
}

In the constructor, we build our kernel with the Azure OpenAI service and add the FoodCheckerPlugin needed for calling the kernel functions through our kernel. We could also have used OpenAI instead of, or alongside, Azure OpenAI with some small modifications.
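As a rough sketch of that change (the "OpenAI" configuration section and OpenAIOptions class below are hypothetical, not part of this app), the connector swap happens on the kernel builder:

var kernelBuilder = Kernel.CreateBuilder();

// Azure OpenAI, as used above
kernelBuilder.AddAzureOpenAIChatCompletion(azureOptions.DeploymentName, azureOptions.Endpoint, azureOptions.ApiKey);

// Or register the OpenAI service instead (hypothetical "OpenAI" config section with ModelId/ApiKey)
// var openAIOptions = config.GetSection("OpenAI").Get<OpenAIOptions>();
// kernelBuilder.AddOpenAIChatCompletion(openAIOptions.ModelId, openAIOptions.ApiKey);

var kernel = kernelBuilder.Build();

Both connectors expose the same IChatCompletionService, so the plugin code itself does not need to change.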

The AzureOpenAI config is defined in appsettings.json:

{
  "Logging": {
    "LogLevel": {
      "Default": "Information",
      "Microsoft.AspNetCore": "Warning"
    }
  },
  "AllowedHosts": "*",
  "AzureOpenAI": {
    "DeploymentName": "",
    "Endpoint": "",
    "ApiKey": ""
  }
}

The definition of AzureOpenAIOptions matches the above keys.

internal class AzureOpenAIOptions
{
    public string DeploymentName { get; set; }
    public string ApiKey { get; set; }
    public string Endpoint { get; set; }
}

We also need to register the required classes in the ServiceCollection in Program.cs:

builder.Services.AddSingleton<FoodCheckerPlugin>();
builder.Services.AddTransient<FoodCheckerService>();

Finally, we can call our service from the Razor page and display the generated streaming response however we want.

Since these are native C# functions, we can simply call them one after the other, passing the output of the first call as the input to the second.

@code {
    [Inject]
    public FoodCheckerService _foodCheckerService { get; set; } = default!;

    string selectedImgUrl = "https://demoFoodImage/image.jpg";
    string Ingredients = string.Empty;
    string Result = string.Empty;

    private async Task CheckHealth()
    {
        await foreach (var response in _foodCheckerService.GetIngredientsAsync(selectedImgUrl))
        {
            Ingredients += response;
        }
        await foreach (var response in _foodCheckerService.CheckFoodHealthAsync(Ingredients))
        {
            Result += response;
        }
    }
}

NOTE: This is a simplified version of the code focusing on the Semantic Kernel bits. You can check the final implementation in our GitHub repo.

Here we call the functions sequentially since the task has a defined flow, but we could also use the plugin in a chat app where the model calls these functions internally whenever required.
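For instance, a rough sketch of such a chat turn (assuming the same _kernel with the FoodCheckerPlugin registered; the system and user messages here are illustrative) could enable automatic tool calling in the execution settings:

// Let the model decide when to invoke GetIngredientsAsync / CheckFoodHealthAsync on its own
var settings = new OpenAIPromptExecutionSettings
{
    ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions
};

var chatService = _kernel.GetRequiredService<IChatCompletionService>();
var chat = new ChatHistory("You are a food health assistant.");
chat.AddUserMessage("Is this product healthy? https://demoFoodImage/image.jpg");

// The connector resolves any tool calls against the kernel's registered plugins before returning the final answer
var reply = await chatService.GetChatMessageContentAsync(chat, settings, _kernel);
Console.WriteLine(reply.Content);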

Current multimodal models open up new ways to solve problems that weren’t possible before. Rather than writing rule-based applications that require specific inputs, we can now write applications that take natural language as input and fill in the gaps needed to complete a task.

While the computation required is still high compared to traditional approaches, the cost of operating these models should keep decreasing as they are further improved and refined.
