VANK Report Highlights AI Biases in Depicting Korean Heritage

A surge in foreign tourists following the BTS (Bangtan Sonyeondan) concert at Gwanghwamun has brought renewed attention to how major artificial intelligence (AI) platforms represent Korea’s cultural heritage. Several leading AI services, however, have been found to misidentify or distort key elements of Korean culture, a problem that calls for prompt correction because many international visitors rely on AI for information about Korea.

According to the “AI Performance Evaluation Index: Image Analysis Report” released on March 19, 2026, by the Voluntary Agency Network of Korea (VANK), major AI platforms made a range of errors, including confusion over national identity, distortion of traditional elements, and omission of key components. Over a two-week period starting February 12, VANK asked platforms including ChatGPT, Gemini, Perplexity, Grok, Bing, and Microsoft Copilot to generate images of Korea’s tangible and intangible cultural heritage, as well as its food and culinary traditions, and analyzed the results.

Perplexity, for instance, generated an image of Gyeongbokgung Palace—the backdrop of the BTS Gwanghwamun concert—using predominantly gold and turquoise tones, evoking the appearance of a Chinese imperial palace. Key features such as rank stones and the statues of the twelve zodiac animals were missing. Gemini, meanwhile, inserted an excessive number of cherry blossoms into its depiction of Gyeongbokgung. Lee Sei-yeon, a youth researcher at VANK, noted that “there are many cases where cherry blossoms are automatically included when generating East Asian architecture,” adding that “given that cherry blossoms are widely recognized as a cultural symbol of Japan, foreign viewers may mistakenly perceive the heritage as Japanese rather than Korean.”

When asked to generate a scene featuring the song “Arirang,” which is also the title of BTS’s comeback album, Grok produced an image with no identifiable Korean features, such as traditional attire. Gemini also produced errors, placing Japanese-style red lanterns in the background instead of traditional Korean elements. Grok likewise generated distorted results when prompted with Pansori, mixing in elements of Chinese opera.

In its report, VANK evaluated and compared how accurately major generative AI platforms represent Korean culture in images, using three criteria: accuracy of components, cultural uniqueness and non-confusability, and appropriateness of historical and temporal context. ChatGPT ranked first with a score of 50.3, followed by Copilot (45.2), Gemini (39.7), Perplexity (38.2), and Bing (34.1). Grok ranked last at 30.4.

The analysis found that generative AI demonstrated relatively high accuracy in areas with clear visual characteristics and abundant data, such as food. However, it showed limitations in domains requiring structural understanding and contextual interpretation. In particular, significant disparities were observed among platforms in areas that require a comprehensive reflection of spatial structure, history, and symbolism, such as territory and both tangible and intangible cultural heritage.

Park Gi-tae, head of VANK, stated, “With global interest in Korean tradition and history higher than ever, driven by BTS and KPop Demon Hunters, it is essential to correct the errors shown by AI in order to accurately present Korea.” Following this image performance assessment, VANK also plans to develop an evaluation index for assessing AI’s narrative capabilities.

The report will be made available on the VANK website.
