Rocchietti G., Pugliese C., Sartori Rangel G., Carvalho J. T.
Image Captioning, Large Language Models, Summarization, Urban Regions
Urban research faces challenges in understanding and describing city regions, which are essential for urban planning and tourism management. Traditional methods rely on predefined areas and non-human-readable representations. This paper presents a new unsupervised approach that overcomes these limitations using a data-driven method with Instruction-tuned Large Language Models (ILLMs). Our technique dynamically identifies urban regions with similar features and generates human-readable descriptions. We validate this method using Flickr images from Pisa, Italy, and our results show that it effectively captures the semantic features of urban regions and generates comprehensible textual descriptions.
Publisher: Association for Computing Machinery, Inc
@inproceedings{oai:iris.cnr.it:20.500.14243/543147,
  title = {From geolocated images to urban region identification and description: a large language model approach},
  author = {Rocchietti G. and Pugliese C. and Sartori Rangel G. and Carvalho J. T.},
  publisher = {Association for Computing Machinery, Inc},
  doi = {10.1145/3678717.3691317},
  year = {2024}
}