Researchers from institutions including Zhejiang University and the University of Sydney proposed MirrorGAN, a text-image-text framework that combines global-local attention with semantic preservation. It addresses the semantic consistency between textual descriptions and the generated visual content, and reports new state-of-the-art results on the COCO dataset.
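The text-image-text idea can be illustrated with a small sketch: an image is generated from a caption, a captioner re-describes that image, and the re-description is scored against the original caption. The snippet below shows only that last scoring step, as a mean negative log-likelihood over caption tokens. The function name `redescription_loss`, the toy vocabulary, and the probability values are hypothetical and chosen for illustration; this is not the authors' implementation.

```python
import math


def redescription_loss(token_probs, caption_ids):
    """Mean negative log-likelihood of the original caption tokens.

    token_probs: one probability distribution per caption position,
                 assumed to come from a captioner run on the generated image.
    caption_ids: token ids of the original caption.
    (Both inputs are illustrative assumptions, not the paper's interface.)
    """
    if not caption_ids:
        raise ValueError("caption must contain at least one token")
    if len(token_probs) != len(caption_ids):
        raise ValueError("one distribution per caption token is required")
    nll = 0.0
    for probs, token_id in zip(token_probs, caption_ids):
        # Penalize low probability assigned to the token the caption actually used.
        nll -= math.log(probs[token_id])
    return nll / len(caption_ids)


# Toy vocabulary of 3 tokens; the original caption is the id sequence [0, 2].
caption = [0, 2]
faithful = [[0.9, 0.05, 0.05], [0.1, 0.1, 0.8]]   # image matches the caption well
unfaithful = [[0.2, 0.7, 0.1], [0.6, 0.3, 0.1]]   # image drifts from the caption

loss_faithful = redescription_loss(faithful, caption)
loss_unfaithful = redescription_loss(unfaithful, caption)
```

A lower value means the re-described text stays closer to the original caption, which is the sense in which such a loss encourages semantic consistency between text and image.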