Abstract: It is not uncommon that real-world data are distributed with a long tail. For such data, the learning of deep neural networks becomes challenging because it is hard to classify tail classes ...
Abstract: Remote sensing image (RSI) captioning is a vision-language multimodal task concentrating on both image comprehension and sentence generation. Several studies suggest that ...