PropertySphere bot: understanding images
In the previous post, I talked about the PropertySphere Telegram bot (you can also watch the full video here). In this post, I want to show how we can make it even smarter. Take a look at the following chat screenshot:

What is actually going on here? This small interaction showcases a number of RavenDB features, all at once. Let’s first focus on how Telegram hands us images. This is done using Photo or Document messages (depending on exactly how you send the message to Telegram).
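For instance, a compressed image upload arrives as a Photo message, while an image sent "as a file" arrives as a Document message. A minimal dispatch might look like this (HandlePhotoAsync and HandleDocumentAsync are hypothetical handler names, not the sample app's actual code):
// Sketch: route the incoming message by how the image was sent.
// HandlePhotoAsync / HandleDocumentAsync are hypothetical handlers.
if (message.Photo is { Length: > 0 })
{
    // Compressed upload: Telegram provides several resolutions
    await HandlePhotoAsync(message);
}
else if (message.Document is { } doc &&
         doc.MimeType?.StartsWith("image/") == true)
{
    // Sent "as a file": a single Document at original quality
    await HandleDocumentAsync(message, doc);
}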
The following code shows how we receive and store a photo from Telegram:
// Download the largest version of the photo from Telegram:
var ms = new MemoryStream();
var fileId = message.Photo.MaxBy(ps => ps.FileSize).FileId;
var file = await botClient.GetInfoAndDownloadFile(fileId, ms, cancellationToken);

// Create a Photo document to store metadata:
var photo = new Photo
{
    ConversationId = GetConversationId(chatId),
    Id = "photos/" + Guid.NewGuid().ToString("N"),
    RenterId = renter.Id,
    Caption = message.Caption ?? message.Text
};

// Store the image as an attachment on the document:
await session.StoreAsync(photo, cancellationToken);
ms.Position = 0;
session.Advanced.Attachments.Store(photo, "image.jpg", ms);
await session.SaveChangesAsync(cancellationToken);

// Notify the user that we're processing the image:
await botClient.SendMessage(
    chatId,
    "Looking at the photo you sent... this may take me a moment...",
    cancellationToken
);

A Photo message in Telegram may contain multiple versions of the image at various resolutions. Here I simply select the best one by file size, download the image from Telegram’s servers into a memory stream, then create a Photo document and add the image stream to it as an attachment.
We also tell the user to wait while we process the image; note that there is no further code here that actually does anything with it.
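For reference, here is roughly what the Photo document class looks like. This is a sketch inferred from the fields used in this post; the actual class in the sample app may carry additional members:
// Sketch of the Photo document, inferred from the fields used in this
// post; the real class in the sample app may have more members.
public class Photo
{
    public string? Id { get; set; }
    public string? ConversationId { get; set; }
    public string? RenterId { get; set; }
    public string? Caption { get; set; }

    // Filled in later by the Gen AI task
    public string? Description { get; set; }
}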
Gen AI & Attachment processing
We use a Gen AI task to actually process the image, handling it in the background since it may take a while and we want to keep the chat with the user open. That said, if you look at the actual screenshots, the entire conversation took under a minute.
Here is the actual Gen AI task definition for processing these photos:
var genAiTask = new GenAiConfiguration
{
    Name = "Image Description Generator",
    Identifier = TaskIdentifier,
    Collection = "Photos",
    Prompt = """
        You are an AI Assistant looking at photos from renters in
        rental property management, usually about some issue they have.
        Your task is to generate a concise and accurate description of what
        is depicted in the photo provided, so maintenance can help them.
        """,
    // Expected structure of the model's response:
    SampleObject = """
        {
            "Description": "Description of the image"
        }
        """,
    // Apply the generated description to the document:
    UpdateScript = "this.Description = $output.Description;",
    // Pass the caption and image to the model for processing:
    GenAiTransformation = new GenAiTransformation
    {
        Script = """
            ai.genContext({
                Caption: this.Caption
            }).withJpeg(loadAttachment("image.jpg"));
            """
    },
    ConnectionStringName = "Property Management AI Model"
};

What we are doing here is asking RavenDB to send the caption and image contents from each document in the Photos collection to the AI model, along with the given prompt. Then we ask it to explain in detail what is in the picture.
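Defining the configuration object on its own doesn’t affect the server; the task still has to be deployed. Assuming the AddGenAiOperation maintenance operation from recent RavenDB client versions, that is a one-liner:
// Sketch: register the task on the server, where it then runs in the
// background over the Photos collection. AddGenAiOperation is assumed
// to exist in your RavenDB client version.
store.Maintenance.Send(new AddGenAiOperation(genAiTask));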
Here is an example of the results of this task after it has completed. For reference, the full description that the model generated for the image is:
A leaking metal pipe under a sink is dripping water into a bucket. There is water and stains on the wooden surface beneath the pipe, indicating ongoing leakage and potential water damage.

What model is required for this?
I’m using the gpt-4.1-mini model here; there is no need for anything beyond that. It is a multimodal model capable of handling both text and images, so it works great for our needs. You can read more about processing attachments with RavenDB’s Gen AI here.
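For completeness, here is a sketch of how the connection string the Gen AI task references might be defined. I’m assuming RavenDB’s AiConnectionString and OpenAiSettings types here, and the exact shape may vary between client versions:
// Sketch only: assumes the AiConnectionString / OpenAiSettings types
// from recent RavenDB clients; adjust to match your client version.
var connectionString = new AiConnectionString
{
    Name = "Property Management AI Model",
    OpenAiSettings = new OpenAiSettings(
        apiKey: Environment.GetEnvironmentVariable("OPENAI_API_KEY"),
        endpoint: "https://api.openai.com/v1",
        model: "gpt-4.1-mini")
};
store.Maintenance.Send(
    new PutConnectionStringOperation<AiConnectionString>(connectionString));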
We still need to close the loop, of course. The Gen AI task that processes the images is actually running in the background. How do we get the output of that from the database and into the chat?
To do that, we create a RavenDB Subscription on the Photos collection, which looks like this:
store.Subscriptions.Create(new SubscriptionCreationOptions
{
    Name = SubscriptionName,
    Query = """
        from "Photos"
        where Description != null
        """
});

This subscription is invoked by RavenDB whenever a document in the Photos collection is created or updated and its Description has a value. In other words, it will be triggered when the Gen AI task updates the photo after it runs.
The actual handling of the subscription is done using the following code:
_documentStore.Subscriptions.GetSubscriptionWorker<Photo>("After Photos Analysis")
    .Run(async batch =>
    {
        using var session = batch.OpenAsyncSession();
        foreach (var item in batch.Items)
        {
            var renter = await session.LoadAsync<Renter>(
                item.Result.RenterId!);
            await ProcessMessageAsync(_botClient, renter.TelegramChatId!,
                $"Uploaded an image with caption: {item.Result.Caption}\r\n" +
                $"Image description: {item.Result.Description}.",
                cancellationToken);
        }
    });

In other words, we run over the items in the subscription batch and, for each one, emit a “fake” message as if it were sent by the user to the Telegram bot. Note that we aren’t invoking the RavenDB conversation directly, but instead reusing the Telegram message-handling logic. This way, the reply from the model goes directly back into the user’s chat.
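One operational note: Run returns a task that completes only when the worker stops, so in practice you would keep the worker alive for the lifetime of the application and restart it on failures. A minimal sketch (the retry delay and error handling here are arbitrary choices, not necessarily what the sample app does):
// Sketch: keep the subscription worker running for the app's lifetime,
// restarting after transient failures. Delay and logging are arbitrary.
while (cancellationToken.IsCancellationRequested == false)
{
    try
    {
        var worker = _documentStore.Subscriptions
            .GetSubscriptionWorker<Photo>("After Photos Analysis");
        await worker.Run(batch => { /* batch handling as shown above */ },
            cancellationToken);
    }
    catch (OperationCanceledException)
    {
        break; // shutting down
    }
    catch (Exception)
    {
        // log the error, then retry after a short pause
        await Task.Delay(TimeSpan.FromSeconds(5), cancellationToken);
    }
}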
You can see how that works in the screenshot above: the model looked at the image and then it acted. In this case, it acted by creating a service request. We previously looked at charging a credit card; now let’s see how we handle the model creating a service request.
The AI Agent is defined with a CreateServiceRequest action, which looks like this:
Actions = [
    new AiAgentToolAction
    {
        Name = "CreateServiceRequest",
        Description = "Create a new service request for the renter's unit",
        ParametersSampleObject = JsonConvert.SerializeObject(
            new CreateServiceRequestArgs
            {
                Type = """
                    Maintenance | Repair | Plumbing | Electrical |
                    HVAC | Appliance | Community | Neighbors | Other
                    """,
                Description = """
                    Detailed description of the issue with all
                    relevant context
                    """
            })
    },
]

As a reminder, this is the description of the action that the model can invoke. Its actual handling is done when we create the conversation, like so:
conversation.Handle<PropertyAgent.CreateServiceRequestArgs>(
    "CreateServiceRequest",
    async args =>
    {
        using var session = _documentStore.OpenAsyncSession();
        var unitId = renterUnits.FirstOrDefault();
        var propertyId = unitId?.Substring(0, unitId.LastIndexOf('/'));
        var serviceRequest = new ServiceRequest
        {
            RenterId = renter.Id!,
            UnitId = unitId,
            Type = args.Type,
            Description = args.Description,
            Status = "Open",
            OpenedAt = DateTime.UtcNow,
            PropertyId = propertyId
        };
        await session.StoreAsync(serviceRequest);
        await session.SaveChangesAsync();
        return $"Service request created with ID `{serviceRequest.Id}` for your unit.";
    });

There isn’t much to do in this case, but hopefully it conveys the kind of code this approach allows you to write.
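For completeness, here are sketches of the two types the handler uses, inferred from the fields above; the actual definitions in the sample app may differ:
// Sketches inferred from usage above; the sample app's actual
// class definitions may differ.
public class CreateServiceRequestArgs
{
    public string? Type { get; set; }         // e.g. "Plumbing"
    public string? Description { get; set; }  // detailed issue description
}

public class ServiceRequest
{
    public string? Id { get; set; }
    public string? RenterId { get; set; }
    public string? UnitId { get; set; }
    public string? PropertyId { get; set; }
    public string? Type { get; set; }
    public string? Description { get; set; }
    public string? Status { get; set; }
    public DateTime OpenedAt { get; set; }
}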
Summary
The PropertySphere sample application and its Telegram bot are interesting, mostly because of everything that isn’t here. We have a bot that has a pretty complex set of behaviors, but there isn’t a lot of complexity for us to deal with.
This behavior is emergent from the capabilities we entrust to the model. At the same time, I’m not simply trusting the model; I verify that what it does always stays within the scope of the user’s own permissions.
Extending what we have here with additional capabilities is easy. Consider adding the ability to get invoices directly from the Telegram interface; it’s a great exercise in working with the sample app.
There is also the full video where I walk you through all aspects of the sample application, and as always, we’d love to talk to you on Discord or in our GitHub discussions.
