If you have ever been a photo editor or journalist who has to write caption after caption that not only describes the photo but also engages readers, you know how challenging it can be. It is a matter of weighing what is in the image (center or background) against the thoughts or feelings it evokes. This interesting information came to us from EurekAlert! in their article, "Paying attention to words not just images leads to better image captions."
Computer image captioning brings together two key areas of artificial intelligence: computer vision and natural language processing. On the computer vision side, researchers train their systems on massive datasets of images so that they learn to name the objects an image contains. Language models are then used to arrange those words into sentences.
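To make this two-part idea concrete, here is a minimal sketch using an off-the-shelf vision-and-language model. It is only an illustration of the general approach, not the researchers' system; the model name and image path are placeholder assumptions, and the example assumes the Hugging Face "transformers" package is installed.

```python
# Minimal illustration of image captioning: a vision encoder recognizes
# what is in the picture, and a language decoder turns that into a sentence.
# Assumes: pip install transformers pillow torch
from transformers import pipeline

# The "image-to-text" pipeline bundles the vision model (which identifies
# objects) with the language model (which composes the caption).
captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")

# "photo.jpg" is a placeholder; any local path or image URL works.
result = captioner("photo.jpg")
print(result[0]["generated_text"])  # e.g. "a dog running through a field"
```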
A team of university and Adobe researchers is outperforming other approaches to computer-generated image captioning in an international competition. To win, they changed the rules: their system gives as much weight to what the words mean and how they fit into a sentence as it does to the image itself.
Melody K. Smith
Sponsored by Access Innovations, the world leader in thesaurus, ontology, and taxonomy creation and metadata application.