This study investigates the effect of visual structure in visual metaphor on the viewer’s cognitive elaboration. It also examines the role of verbal text in enhancing the cognitive elaboration of visual structure. To this end, the research uses Phillips and McQuarrie’s [1] typology of visual metaphor, which distinguishes three types of visual structure: juxtaposition, fusion and replacement. The first finding shows that viewers enjoy processing incongruity presented in less complex visual structures, i.e. juxtaposition and fusion. However, a high level of incongruity in visual structure, notably replacement, requires more cognitive elaboration and extra mental effort, and leads viewers to opt out of processing the visual structure. As far as verbal text is concerned, the main results show that introducing verbal text has a significant effect on the processing of complex visual structure, i.e. replacement. For juxtaposition and fusion, the presence of verbal text does not have a significant effect on processing. Overall, the main finding of this study is that viewers enjoy solving incongruity in print advertisements, but when the incongruity becomes too complex to process and understand, they simply opt out of processing the metaphorical image. In that case, the introduction of verbal text helps viewers process the incongruity.