Function Details

Caption Image

Provided by

HuggingFace

Takes one or more image URLs or binary strings (passed by the image upload input node) and outputs a description for each image. Separate multiple images with commas; a maximum of 10 images is supported.
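As a rough illustration of the input rules above, the following sketch builds the comma-separated input string from a list of URLs and enforces the 10-image limit before the value is passed to the node. The helper name `build_image_input` is hypothetical and not part of the node, and joining with a bare comma (no space) is an assumption.

```python
MAX_IMAGES = 10  # limit stated in the node description


def build_image_input(urls):
    """Join image URLs into one comma-separated string (hypothetical helper)."""
    # Drop empty entries and surrounding whitespace.
    cleaned = [u.strip() for u in urls if u and u.strip()]
    if not cleaned:
        raise ValueError("at least one image URL is required")
    if len(cleaned) > MAX_IMAGES:
        raise ValueError(
            f"at most {MAX_IMAGES} images are allowed, got {len(cleaned)}"
        )
    for url in cleaned:
        # A comma inside a URL would be read as a separator.
        if "," in url:
            raise ValueError(f"URL contains a comma: {url!r}")
    # Assumption: the separator is a bare comma with no space.
    return ",".join(cleaned)


if __name__ == "__main__":
    print(build_image_input([
        "https://example.com/image.jpg",
        "https://example.com/image2.jpg",
    ]))
```

Checking the count up front avoids sending a request that exceeds the documented limit.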

Outputs:

The output format depends on how many images are passed:

Single Image

When passing a single image, the node outputs a simple text description.

Multiple Images

When passing multiple images, you can choose between two output formats:

  1. Structured (JSON) - Returns a JSON array with the following structure:

    [
      {
        "image_index": 0,
        "image_url": "https://example.com/image.jpg", // included for URL inputs
        "image_caption": "Description of the image"
      },
      {
        "image_index": 1,
        "image_caption": "Description of the second image"
      }
    ]
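A minimal sketch of consuming the structured output downstream, assuming the node returns the array as a JSON string and that the `//` annotation shown above is documentation only and does not appear in the real output. The function name `captions_by_index` is hypothetical.

```python
import json


def captions_by_index(raw_output):
    """Map each image_index to its caption and optional URL."""
    items = json.loads(raw_output)
    result = {}
    for item in items:
        result[item["image_index"]] = {
            "caption": item["image_caption"],
            # image_url is only present for URL inputs, so use .get().
            "url": item.get("image_url"),
        }
    return result


# Sample mirroring the documented structure (without the inline comment).
sample = """
[
  {
    "image_index": 0,
    "image_url": "https://example.com/image.jpg",
    "image_caption": "Description of the image"
  },
  {
    "image_index": 1,
    "image_caption": "Description of the second image"
  }
]
"""

if __name__ == "__main__":
    for index, info in captions_by_index(sample).items():
        print(index, info["caption"], info["url"])
```

Reading `image_url` with `.get()` keeps the code working for binary inputs, where that field is omitted.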

  2. Unstructured (Text) - Returns a text string with the following format:

Image Index: 0
Image URL: https://example.com/image.jpg
Image Caption: Description of the image

Image Index: 1
Image URL: https://example.com/image2.jpg
Image Caption: Description of the second image
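If the text format needs to be turned back into records, a sketch along the following lines may help. It assumes that records are separated by a blank line, that each line has the form `Label: value` as shown above, and that captions do not themselves contain blank lines. The function name `parse_text_output` is hypothetical.

```python
import re

# Map the documented labels to dictionary keys.
LABELS = {
    "Image Index": "image_index",
    "Image URL": "image_url",
    "Image Caption": "image_caption",
}


def parse_text_output(text):
    """Parse the unstructured text output into a list of dicts."""
    records = []
    # Assumption: one blank line separates consecutive image records.
    for block in re.split(r"\n\s*\n", text.strip()):
        record = {}
        for line in block.splitlines():
            # Split on the first colon only, so URLs stay intact.
            label, sep, value = line.partition(":")
            key = LABELS.get(label.strip())
            if not sep or key is None:
                continue  # skip lines that do not match a known label
            record[key] = value.strip()
        if "image_index" in record:
            record["image_index"] = int(record["image_index"])
        if record:
            records.append(record)
    return records


sample_text = """Image Index: 0
Image URL: https://example.com/image.jpg
Image Caption: Description of the image

Image Index: 1
Image URL: https://example.com/image2.jpg
Image Caption: Description of the second image
"""

if __name__ == "__main__":
    for record in parse_text_output(sample_text):
        print(record)
```

When downstream steps need per-image fields, the structured (JSON) format is usually the easier choice, since it avoids this kind of line parsing.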