INPT: INgest and Processing Toolkit
INPT (INgest and Processing Toolkit) is a Bash-scripted command line tool for automating the processing workflow for time-based media at the Hirshhorn Museum and Sculpture Garden (HMSG). I began writing the code as a Project Conservator of Time-based Media at HMSG in 2020. The next year I started a contract focused solely on developing the tool, supervised by HMSG Variable Media conservator Briana Feston-Brunet. For more on the origins of the project and my journey getting started with coding, see the first post in this series: Automating AV Archival Workflows: Part 1.
INPT strings together a series of command line interface (CLI) tools and functions to automate common steps in processing digital artworks. At HMSG, prior to developing INPT, we were running these tools manually, one at a time. Some of the steps in the workflow are quick, like running MediaInfo on a video file, but others, like using FFmpeg to create a frame-level MD5 manifest, can be quite time-consuming and tedious. With INPT, the conservator can select which files they wish to process, which tools they wish to use, and then let the process run unsupervised end-to-end.
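For a sense of the individual steps being chained together, here is a rough sketch of two of them run by hand; the file names and output paths below are placeholders, not part of INPT itself:

mediainfo --Output=XML "artwork_video.mov" > "artwork_video_mediainfo.xml"
# Write MediaInfo's technical metadata report to an XML sidecar file
ffmpeg -i "artwork_video.mov" -f framemd5 "artwork_video.framemd5"
# Create a frame-level MD5 manifest with one checksum per frame of video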
The documentation and user guide for INPT is available here: https://eddycolloton.github.io/Documentation_INPT/
The code for the project is here: https://github.com/eddycolloton/INPT/
Typos: Handling User Input Error
In my previous post I described collecting information with the bash keywords “select” and “case” to create prompts. I used this method to allow the user to input descriptive metadata like the artist’s last name or the artwork’s accession number. But, I didn’t account for typos. Whoops! Mistakes that were easy to make were not easy to undo, and could mean starting the whole process over. This is especially annoying when the whole point of automation is to make the process less tedious!
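As a refresher, a select/case prompt looks roughly like this (the menu below is a placeholder, not one of INPT's actual prompts):

echo "Which tool would you like to run?"
select tool in "MediaInfo" "Exiftool" "Quit"; do
    case $tool in
        "MediaInfo") echo "Running MediaInfo..." ; break;;
        "Exiftool") echo "Running Exiftool..." ; break;;
        "Quit") break;;
    esac
done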
My first solution was to create a “Go back” option after any information was put in manually. In the code example below, the “ConfirmInput” function sets the $select_again variable to “yes” and wraps the prompt in a “while” loop, so the user is asked for input repeatedly until they confirm their entry is correct:
ConfirmInput () {
    local var="$1"
    local var_display_name="$2"
    local prompt_context="$3"
    select_again=yes
    while [[ "$select_again" = yes ]] ; do
        echo -e "\nManually input the ${var_display_name}"
        if [[ ! -z "$prompt_context" ]] ; then
            # Optional additional argument to provide context on prompt for input
            echo -e "$prompt_context"
            ## vars w/ spaces passed to prompt_context not displaying correctly!
        fi
        read -e user_input
        # Read user input as variable $user_input
        user_input="${user_input%"${user_input##*[![:space:]]}"}"
        # Trim trailing whitespace. If the user_input path is dragged and dropped into the terminal, the trailing whitespace can eventually be interpreted as a "\" which breaks the CLI tools.
        logNewLine "The ${var_display_name} manually input: ${user_input}" "$CYAN"
        if [[ "$typo_check" == true ]] ; then
            # If typo check option is turned on, then confirm user_input
            cowsay "Just checking for typos - Is the ${var_display_name} entered correctly?"
            select options in "yes" "no, go back a step"; do
                case $options in
                    yes)
                        select_again=no
                        break;;
                    "no, go back a step")
                        select_again=yes
                        unset user_input
                        break;;
                esac
            done
            if [[ "$select_again" = yes ]] ; then
                echo -e "Let's try again"
            fi
        else
            select_again=no
        fi
        eval "${var}=\"\${user_input}\""
        # Assign the confirmed value to the variable named by $var, then export it
        export "${var}"
    done
}
This code creates an option for the user to go back and correct their last input. It does not solve the underlying problem, though, that typing directly into the command prompt is kind of a bummer.
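For context, a call to the function might look something like this; the argument values are hypothetical, and the snippet assumes INPT's helper functions (like logNewLine) have already been sourced:

typo_check=true
# Turn on the typo check so the confirmation prompt appears
ConfirmInput "ArtistLastName" "artist's last name" "Enter the surname only"
echo "Collected: $ArtistLastName"
# The confirmed value ends up in the variable named by the first argument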
Eventually I created an option to pre-populate the required information using ‘input’ and ‘output’ CSV files, rather than typing everything manually during script execution. The user can provide as much or as little descriptive information about the artwork and processing workflow as they want via CSV files passed as “arguments” to the scripts; the video below demonstrates what this looks like.
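To sketch the idea, a pre-populated input CSV might be created and passed along like this; the column names and the invocation are placeholders, not INPT's actual CSV schema or commands:

cat > input_example.csv << 'EOF'
ArtistLastName,AccessionNumber,ArtworkTitle
Doe,2020.001,Untitled (Example)
EOF
# A small, hypothetical input CSV with a few descriptive fields filled in
bash inpt_script.sh input_example.csv output_example.csv
# Hypothetical invocation passing the input and output CSVs as arguments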
Any required information not provided in the CSV is collected through prompts, but where possible the script will infer it from context clues. For example, if the artist’s last name is provided in the input CSV but the directory path to the artwork file (where the collected metadata is stored) is not known, the script will search for it using the “find” command:
function FindArtworkFile {
    FindArtFile="$(find "$ArtFileDirectory" -maxdepth 1 -type d -iname "*$ArtistLastName*")"
    # searches the artwork files directory for the artist's last name (the parent directory variable name is approximated here)
    if [[ -z "$FindArtFile" ]]; then
    # if the find command returns nothing then
Only if the search returns no results will the script move on to prompting the user to drag and drop the path into the terminal.
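Put together, the infer-then-prompt pattern looks roughly like this (a simplified sketch with illustrative variable names, not INPT's exact code):

ArtFile="$(find "$ArtFileDirectory" -maxdepth 1 -type d -iname "*$ArtistLastName*")"
# Try to locate the artwork file directory automatically
if [[ -z "$ArtFile" ]]; then
    echo "Please drag and drop the artwork file directory into the terminal:"
    read -e ArtFile
    ArtFile="${ArtFile%"${ArtFile##*[![:space:]]}"}"
    # Fall back to prompting, then trim the trailing whitespace left by drag and drop
fi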
Learning on the Job
As I discussed in my first post, my background is in moving image archiving and conservation of time-based media. I learned command line tools on the job, and my familiarity with those tools inspired me to learn simple scripting. I’m not a “software developer” per se, even though I have now developed software. Creating INPT for the Hirshhorn was very much a learning process, and if I could do it again, there are a few things I’d do differently:
INPT is written in Bash. I really like Bash, and it doesn’t have messy versioning or modular libraries to manage. But, it’s not always intuitive, and the “bare bones” nature of the language makes sharing variables and other info across scripts confusing at times.
App packaging is tricky for me as a beginner developer. But, if I were starting this project from scratch, I would write the code for INPT in a way that made it easier to package as a cohesive app. Currently, users still need to run individual scripts, which means they also need to complete onboarding steps like making the scripts executable (chmod +x).
INPT doesn’t have a GUI (Graphical User Interface). The number of details the user needs to track is difficult to manage in a command line tool. Even though all of the “work” of the tool is performed by CLI tools - like MediaInfo, Exiftool, Siegfried and FFmpeg - the descriptive metadata needed to correctly place the outputs requires a lot of manual input. A simple GUI would make this more pleasant than typing into a CSV.
Introducing: AV Spex
When my contract with the Hirshhorn was completed, I was lucky enough to begin a new contract with another Smithsonian museum, the National Museum of African American History and Culture (NMAAHC). As with INPT, I was able to leverage my experience as a media conservator to build a tool to help process digital media, this time focusing specifically on quality control of video. The result is a macOS app written in Python called AV Spex.
In my next post in this series, I will discuss my experience building the app and its current functionality.