2026-01-22: Paper Summary: "Towards a better QA process: Automatic detection of quality problems in archived websites using visual comparisons" (opens in new tab)

Figure 1: Example of an image pair from Reyes Ayala’s dataset of a live web page (left screenshot) and an archived web page (right screenshot).

Reyes Ayala, B. (2025). Towards a better QA process: Automatic detection of quality problems in archived websites using visual comparisons. In M. Cornia et al. (Eds.). Proceedings of the 21st Conference on Information and Research science Connecting to Digital and Library science, 3937. Udine, Italy: CEUR Workshop Proceedings https://ceur-ws.org/Vol-3937/

Measuring Visual Correspondence With Web Archiving Screenshot Compare Tool

Reyes Ayala created the Web Archiving Screenshot Compare tool which assists with automating quality assurance by comparing screenshots of the live web page and the archived web page and determining the visual correspondence. The process for generating screenshots involves several steps. First, the tool reads the settings file that contains the seed list. For each seed, it checks if the web page exists and, if so, takes a screenshot. Next, the tool creates a CSV file with a list of the Archive-It URI-Ms associated with the archived versions of the seed. Then, the Web Archiving Screenshot Compare tool takes a screenshot of each archived web page. Finally, the URI-Ms and their screenshot file names are written to a CSV file.

After the screenshots are taken, the Web Archiving Screenshot Compare tool can use an image similarity metric to compute a score. Before computing a score, this tool checks if the live web page screenshot is not blank and then will crop a screenshot from the (live screenshot and archived screenshot) pair if both images are different sizes. After the score is computed it is output to a CSV file.

Determining the Effectiveness of the Image Similarity Metrics

The image similarity metrics supported by the Web Archiving Screenshot Compare tool are Structural Similarity Index (SSIM), Mean Squared Error (MSE), Normalized Root Mean Square Error (NRMSE), Perceptual Hash (P-Hash), Peak Signal to Noise Ratio (PSNR), and a percentage similarity metric that Reyes Ayala created. Three of these metrics (MSE, P-Hash, and PSNR) were discarded from her evaluation, because these metrics did not have an upper bound. The percentage similarity metric was also discarded, because it had a strong negative correlation with NRMSE.

Loading more...

Keyboard Shortcuts

Navigation
Next / previous item
j/k
Open post
oorEnter
Preview post
v
Post Actions
Love post
a
Like post
l
Dislike post
d
Undo reaction
u
Save / unsave
s
Recommendations
Add interest / feed
Enter
Not interested
x
Go to
Home
gh
Interests
gi
Feeds
gf
Likes
gl
History
gy
Changelog
gc
Settings
gs
Browse
gb
Search
/
General
Show this help
?
Submit feedback
!
Close modal / unfocus
Esc

Press ? anytime to show this help