This contribution is based on a presentation given at The Digital Orientalist’s Virtual Conference 2025 (AI and the Digital Humanities) and was written by Dr. Stephanie Santschi (Department of Art History at the University of Zurich) and Dr. Drew Richardson (Department of History and Cowell College at the University of California Santa Cruz). The recording of the presentation can be found here.
The Drawing from the Crowd Platform was launched in October 2025 and can be accessed here; further information and instructions are available here.
What can a slight shift in Mount Fuji’s position in Japanese prints teach us about how society learned to see the world differently? In early 2025, an interdisciplinary team of Nippon Foundation programs alumni, Dr. Stephanie Santschi (PI), Dr. Drew Richardson, Himanshu Panday, and Hirohito Tsuji, in collaboration with the citizen science geo-referencing project, Smapshot, launched the beta prototype for Drawing from the Crowd [full-version launched in October 2025, available here], a Digital Humanities platform that combined Computer Vision with Citizen Science to investigate topographical realities in Edo period (1603–1868) Japanese woodblock prints (ukiyo-e 浮世絵). This platform allows users to geolocate viewpoints and landmarks in prints to understand how these widely circulated images contributed to early modern Japanese perceptions of geography.
Why Combine AI and Human Intelligence?
Imagine trying to analyze thousands of Japanese prints by hand: the dataset is too large for a single researcher or even a research team. While Artificial Intelligence (AI) makes it possible to work with such large datasets, AI alone is severely limited; it cannot, for example, understand the cultural nuances of an artistic tradition. Citizen science, on the other hand, allows for human discernment and expertise but lacks AI’s capacity for managing big data. For this reason, Drawing from the Crowd combined the two approaches in a workflow where they complement each other: Computer Vision functions as an organizational tool that enables the interpretive work of Citizen Science volunteers. Computational structuring and comparison tools, including algorithms trained on visual patterns, help researchers sort through large print repositories and group prints by visual similarities, such as belonging to the same artistic school, displaying similar wave patterns, or positioning mountains in similar ways. Citizen scientists then use an adapted version of Smapshot, a Swiss georeferencing platform developed by HEIG-VD, to pinpoint the viewing position that each print creates for its audience, that is, the spatial relationship the composition establishes between viewer and landscape. Establishing this viewing position allows us to study the creation of each print and determine whether a viewer sitting at that location would see the same view. In other words, were the landscapes depicted in Edo period woodblock prints visible to the naked eye, or were they products of creative re-imagining?
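To make the grouping step concrete, here is a minimal sketch of how prints might be grouped by visual similarity. It is not the project's pipeline: the three-number feature vectors, the print identifiers, the greedy grouping strategy, and the 0.9 threshold are all assumptions chosen for illustration. In practice, feature vectors would come from a vision model rather than being typed in by hand.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def group_by_similarity(features, threshold=0.9):
    """Greedy grouping: a print joins the first group whose first member is similar enough."""
    groups = []  # each group is a list of print ids; the first id acts as the seed
    for print_id, vector in features.items():
        for group in groups:
            seed_vector = features[group[0]]
            if cosine_similarity(vector, seed_vector) >= threshold:
                group.append(print_id)
                break
        else:
            # No existing group was similar enough, so start a new one.
            groups.append([print_id])
    return groups

# Hypothetical toy features; real ones would be produced by a vision model.
toy_features = {
    "print_a": [0.9, 0.1, 0.0],
    "print_b": [0.8, 0.2, 0.1],
    "print_c": [0.1, 0.9, 0.3],
    "print_d": [0.0, 1.0, 0.2],
}

groups = group_by_similarity(toy_features, threshold=0.9)
print(groups)
```

With these toy vectors, the first two prints end up in one group and the last two in another. The groups only organize the material; deciding what a shared visual pattern means remains interpretive work for human participants.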
Research shows that AI can improve citizen science projects, for example by helping to streamline data analysis and by facilitating accuracy checks. It can also improve how humans and computational processes work together (Fortson et al., 2024). However, the inherent biases of AI vision systems and of the research conducted with them necessitate caution (Impett and Offert 2022). Because models are often trained primarily on Western art, they may misunderstand non-Western traditions such as ukiyo-e (Ananthram et al., 2024). This makes reflections on process and intent even more crucial: designing digital tools for cultural analysis requires careful consideration of how interface decisions may affect what users perceive and deduce (Windhager et al., 2025). Awareness of these problems has informed our development of the platform; we aim to ensure that the workflows of Drawing from the Crowd promote careful engagement with the complexities of artistic representation and never suggest that we can pinpoint locations with false precision.
Theoretical Framework
Ukiyo-e prints depict more than what we colloquially call “landscapes”: they shaped how people in premodern Japan understood their world and were part of what scholar Marcia Yonemoto (2003) calls a “spatial vernacular,” a language for discussing space and place shared by print designers, publishers, and viewers.
Ukiyo-e did not prioritize linear perspective or topographical accuracy, qualities frequently attributed to Western landscape traditions (Kubo et al. 2008). Instead of photographic accuracy, artists used what we are calling “significant topography”: they depicted recognizable landmarks like Mount Fuji but often rearranged them for better visual impact. As more visitors experienced these places firsthand, spatial relationships in the prints increasingly referenced topographical features with greater realism. Thus, the landscapes of woodblock prints reflect Edo period citizens’ negotiation between artistic tradition and geographical reality.
Case Study: Enoshima Peninsula
In the presentation’s case study, we compared different views of Enoshima 江の島, a peninsula that was then about two days’ travel southwest of the city of Edo, a “noteworthy place,” or meisho 名所, and a popular sightseeing spot (Nenzi 2008). A fascinating tendency emerged: in earlier prints that depicted Enoshima from the shore of Shichirigahama 七里ヶ浜, Mount Fuji tends to appear to the left of the Enoshima peninsula, while in later prints it gravitates toward the right. Such discrepancies demonstrate the need for further research into the creative processes that shaped these prints, including the question of how artistic representation relates to physical geography and aesthetic tradition. They also encourage us to take another look at the individual prints themselves.
By comparing artistic representations of Enoshima with actual geography using topographical 3D models, we found that the later positioning is geographically more accurate if the viewer is imagined to have stood on the beach. This trend suggests that the pictorial standard was transformed by the spatial imaginary that illustrators and print buyers developed as they frequented these places more often. This aligns with Nenzi’s (2008) analysis of Enoshima’s treatment in textual sources and extends Yonemoto’s (2003) “spatial vernacular” concept to the visual language of the prints.
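The left-or-right question can be approximated with a simple bearing calculation, sketched below. This is a flat simplification and not the 3D-model comparison used in the project: it ignores terrain, elevation, and visibility, and the coordinates are rough values assumed for illustration. The function names are likewise hypothetical.

```python
import math

def initial_bearing_deg(origin, target):
    """Initial great-circle bearing from origin to target, in degrees clockwise from north."""
    lat1, lon1 = map(math.radians, origin)
    lat2, lon2 = map(math.radians, target)
    dlon = lon2 - lon1
    y = math.sin(dlon) * math.cos(lat2)
    x = math.cos(lat1) * math.sin(lat2) - math.sin(lat1) * math.cos(lat2) * math.cos(dlon)
    return math.degrees(math.atan2(y, x)) % 360

def relative_side(viewpoint, reference, landmark):
    """Return (side, offset): where the landmark appears relative to the reference,
    as seen from the viewpoint. Offset is in degrees, positive meaning clockwise (right)."""
    offset = (initial_bearing_deg(viewpoint, landmark)
              - initial_bearing_deg(viewpoint, reference) + 180) % 360 - 180
    side = "right" if offset > 0 else "left"
    return side, offset

# Approximate (latitude, longitude) pairs, assumed for illustration only.
SHICHIRIGAHAMA = (35.3050, 139.5150)  # a point on the beach
ENOSHIMA = (35.2995, 139.4800)
MOUNT_FUJI = (35.3606, 138.7274)      # summit

side, offset = relative_side(SHICHIRIGAHAMA, ENOSHIMA, MOUNT_FUJI)
print(f"Fuji appears to the {side} of Enoshima (offset {offset:.1f} degrees)")
```

With these assumed coordinates, Fuji's bearing lies clockwise of Enoshima's, so it falls to the right for a viewer on the beach, in line with the later prints. Moving the assumed viewpoint along the shore changes the offset, which is one reason the viewing position matters.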
What This Tells Us
There is more at stake here than simply mapping which places appear in prints. By comparing the prints with 3D models, our research reveals something more profound: people in Edo-period Japan were actively negotiating between what they had been told places looked like (through artistic and poetic tradition) and what they were starting to see with their own eyes (through increased travel). The prints served as tools for making sense of a changing world.
Validation and Preliminary Findings
Of course, making these kinds of claims about historical visual culture requires methodological rigor to ensure our interpretations are sound. Computer Vision and Citizen Science independently face validation challenges when confronted with the multiple realities residing in artistic representation. Our innovation lies in their deliberate integration, creating multiple validation pathways where computational results are verified by human expertise and human interpretations gain consistency through computational patterns. On the one hand, computational methods excel at processing large quantities of material and identifying subtle patterns across large datasets. Human participants, on the other hand, bring contextual knowledge, understanding of artistic conventions, and flexible reasoning about ambiguous spatial relationships. They also contribute cultural sensitivity to non-Western representational traditions.
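One way to picture how several volunteer placements could be combined without overstating precision is the sketch below. It is an assumption made for illustration, not the platform's actual validation logic. It takes a few hypothetical viewpoint placements for a single print, uses the median as a consensus position, and reports the median distance from that position as a rough measure of agreement.

```python
import math
from statistics import median

EARTH_RADIUS_KM = 6371.0

def haversine_km(p, q):
    """Great-circle distance in kilometres between two (latitude, longitude) points."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def summarize_placements(points):
    """Return (consensus point, spread in km) using medians, which resist single outliers."""
    center = (median(lat for lat, _ in points), median(lon for _, lon in points))
    spread_km = median(haversine_km(center, p) for p in points)
    return center, spread_km

# Hypothetical placements for one print: three close together and one outlier.
placements = [
    (35.3050, 139.5150),
    (35.3060, 139.5140),
    (35.3040, 139.5160),
    (35.3300, 139.5500),
]

center, spread_km = summarize_placements(placements)
print(f"consensus viewpoint: {center}, typical disagreement: {spread_km:.2f} km")
```

Reporting the spread alongside the consensus point keeps the uncertainty visible: a large spread signals that a print's viewing position is ambiguous and may call for closer human review rather than a single pinned location.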
Drawing from the Crowd demonstrates the productive potential of integrating computational methods with Citizen Science for studying historical visual culture. Our findings reveal how ukiyo-e landscapes functioned as active participants in negotiating space between geographical reality and artistic convention.
Outlook
The project is currently being implemented and will be released as an active platform in late 2025. At the time of release, we will invite citizen scientists to participate. Computer Vision analysis is continuing in parallel through distant viewing approaches to identify broader patterns across the corpus.
Want to get involved?
Whether you are interested in Japanese history, geography, or just love solving visual puzzles, we would love to hear from you. Follow our progress and sign up for updates by emailing project PI Stephanie Santschi. Do you have expertise in visual AI, ukiyo-e, Japanese geography, or crowdsourcing? We are especially eager to connect with potential collaborators, beta testers, and interested citizen scientists.
References
Ananthram, Amith, Elias Stengel-Eskin, Mohit Bansal, and Kathleen McKeown. “See It from My Perspective: How Language Affects Cultural Bias in Image Understanding.” arXiv preprint arXiv:2406.11665 (2024). https://doi.org/10.48550/arXiv.2406.11665.
Fortson, Lucy, Kevin Crowston, Laure Kloetzer, and Marisa Ponti. “Artificial Intelligence and the Future of Citizen Science.” Citizen Science: Theory and Practice 9, no. 1 (2024): 1–32. https://doi.org/10.5334/cstp.812.
Impett, Leonardo, and Fabian Offert. “There Is a Digital Art History.” Visual Resources 38, no. 2 (2022): 186–209. https://doi.org/10.1080/01973762.2024.2362466.
Kubo, Yuka, Zhao Jie, and Koichi Hirota. “A Method for Transformation of 3D Space into Ukiyo-e Composition.” In ACM SIGGRAPH ASIA 2008 Artgallery: Emerging Technologies, 29–35. New York: Association for Computing Machinery, 2008. https://doi.org/10.1145/1504229.1504250.
Nenzi, Laura. Excursions in Identity: Travel and the Intersection of Place, Gender, and Status in Edo Japan. Honolulu: University of Hawai’i Press, 2008.
Windhager, Florian, Eva Mayr, and Katrin Glinka. “From Exploration to Critique: Catalyzing Critical Inquiry With Cultural Collection Visualizations.” IEEE Computer Graphics and Applications 45, no. 4 (July–August 2025): 45–59. https://doi.org/10.1109/MCG.2025.3559769.
Yonemoto, Marcia. Mapping Early Modern Japan: Space, Place, and Culture in the Tokugawa Period (1603-1868). Berkeley: University of California Press, 2003.
Cover Image: Ryūryūkyo Shinsai, “Turtle Island and Fujiyama” [sic!], 19th c., poem print (surimono 摺物), ink and color on paper, H. O. Havemeyer Collection, The Metropolitan Museum of Art, New York (JP1950), https://www.metmuseum.org/art/collection/search/54518.