Oregon Theater Project: A Dataset of Oregon Cinemas from the Silent Era


DATA PAPER

CORRESPONDING AUTHOR:

Dr. Michael Aronson

Cinema Studies, University of Oregon, Eugene, OR, USA

aronson@uoregon.edu

KEYWORDS:
film exhibition; movie theaters; film history; new cinema history; Oregon film history

TO CITE THIS ARTICLE:
Aronson, M., Peterson, E., & Hayden, G. (2022). Oregon Theater Project: A Dataset of Oregon 
Cinemas from the Silent Era. Journal of Open Humanities Data, 8: 27, pp. 1–7. 
DOI: https://doi.org/10.5334/johd.92

MICHAEL ARONSON 

ELIZABETH PETERSON 

GABRIELE HAYDEN 

ABSTRACT
The Oregon Theater Project (OTP) dataset is part of an ongoing collaborative research 
project by undergraduate students enrolled in successive iterations of “Exhibition & 
Audiences,” a Cinema Studies course at the University of Oregon. It will be updated with 
additional data each time the course is taught. The data set comprises geo/historical 
data about movie theaters (cinemas) and exhibition in the state from approximately 
1894 to 1929. The data is presented on a public website 
(https://oregontheaterproject.uoregon.edu/), which includes maps and individual theater 
profiles produced by the 
students. All profiles, and the underlying data, are reviewed by the course instructors 
and edited as needed for clarification or accuracy. Profiles include, where available, 
the theater name, address, city, state, latitude, longitude, number of seats, owner/
manager names, and a narrative description. The underlying data, shared as Excel 
documents and tab-delimited spreadsheets, invites historical comparative analysis of 
film exhibition practices across time and locale, both local and global.

*Author affiliations can be found in the back matter of this article



2Aronson et al.  
Journal of Open 
Humanities Data  
DOI: 10.5334/johd.92

(1) OVERVIEW 
REPOSITORY LOCATION 

Harvard Dataverse: https://doi.org/10.7910/DVN/FGOUZ3

Front end interface: https://oregontheaterproject.uoregon.edu/ 

CONTEXT 

The Oregon Theater Project (OTP) is one of an increasing number of digital projects documenting 
and sharing the history of movie theaters (cinemas), film programming, and film reception. 
Most of these projects do not make their data publicly available in a usable format, even 
though the value of these data projects is greatly increased if they allow data to be aggregated 
(Aronson et al., 2022a). This data paper contributes to building open data in regional cinema 
history; it describes the preliminary version of a data set that will be updated regularly.

The Oregon Theater Project (OTP) is a collaboration between faculty in Cinema Studies and the 
University of Oregon Libraries, with a goal of integrating information literacy skills and concepts, 
as well as digital humanities tools, into the historical research course “Exhibition & Audiences”. 
Students, guided by faculty mentors, come away from this course with a broad knowledge 
of film exhibition theory and history, along with a firm grasp of research methods. Students 
learn how to identify appropriate sources for their information need; to select appropriate 
research tools from a variety of options; to search efficiently within online databases and 
digital collections, as well as traditional print-based media; to evaluate sources for credibility 
and authority; to analyse and interpret primary sources; to use information ethically; to cite 
their sources appropriately; and to publish their finished work online using a selection of digital 
humanities presentation tools. Each time the course is taught, students build on and improve 
the research conducted by students in previous years. A new, improved data set based on this 
work will be published following each course iteration.

(2) METHOD 
In the OTP, undergraduate students learn cinema studies research methods in the context 
of course content on film exhibition history and audiences. Students conduct original research in 
primary sources to compile data and to compose short narratives about Oregon movie theaters 
during the period of study (1894–1929). Primary sources include newspapers, industry trade 
journals, city and county directories, business directories, maps, and photographs. Students in 
the course use a shared Google Drive with a hierarchical folder and file system to manage their 
research materials. 

STEPS 

Students enter data directly into a structured website platform built on a Drupal content 
management system. Figure 1 shows a screenshot of part of the page students use to enter 
information about a new theater. Data is updated directly in the platform every time the class 
is taught. The Drupal database includes images taken from newspapers that are the source 
of most of the information contained in the database. These images are captured informally as 
screenshots and published on our website under “fair use” terms. 

Because we do not have copyright documentation or permissions for each image, we are 
not including the images as part of this data set. However, we include several data columns 
that reference these files to create more contextual information. First, we include a column, 
‘works_cited’, that offers unstructured text citations to sources. Second, we include both 
plain text and full html versions of text from the website (column names are ‘body’, 
‘body_html’; ‘additional_facts’, ‘additional_facts_html’; ‘works_cited’, ‘works_cited_html’). The html 
versions include relative links to images as they are embedded in the text. Finally, we include 
a variable that lists image file names for images highlighted in a special section on the page 
(‘gallery_images’). In theory, this should allow users to create links back to the images for the 
lifetime of the website.
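
As an illustration of how such links might be rebuilt, the following Python sketch splits a ‘gallery_images’ value on semicolons and resolves each entry against the website address. It assumes the entries are paths relative to the site root. The function name and the sample path are hypothetical, and the real directory layout should be checked against the links in ‘body_html’ before relying on it.

```python
from urllib.parse import urljoin

# Public front end named in this paper; used here as the base for relative links.
BASE_URL = "https://oregontheaterproject.uoregon.edu/"


def gallery_image_urls(gallery_images, base_url=BASE_URL):
    """Turn a semicolon-separated 'gallery_images' cell into absolute URLs.

    Assumption: each entry is a path relative to the site root.
    """
    if not gallery_images:
        return []
    parts = [part.strip() for part in gallery_images.split(";")]
    # Skip empty fragments left by trailing or doubled semicolons.
    return [urljoin(base_url, part) for part in parts if part]


# Hypothetical cell value, not taken from the data set.
print(gallery_image_urls("sites/default/files/example.jpg"))
# → ['https://oregontheaterproject.uoregon.edu/sites/default/files/example.jpg']
```

Links built this way will only resolve while the website keeps its current file layout.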


QUALITY CONTROL 

The course instructors serve as editors for the course data and content. They review every 
entry for accuracy, citations, and correct formatting. Students follow a file-naming convention 
that embeds source citation information within file names to ensure proper attribution during 
data entry and writing. This method also allows the course instructors to easily consult the 
research materials to verify facts as presented in the theater data and narratives. After the 
class is finished, the course instructors remediate any data entry errors that affect data 
completeness (such as missing geospatial coordinates) in the Drupal database. However, 
because proofreading in the early stages of this project focused on the human-readable 
website rather than on creating machine-readable data, we have not systematically corrected 
differences in formatting in string variables such as addresses. Missing data may be blank or 
listed as ‘unknown’ or ‘Unknown’, and there may be extra spaces, periods, or other irregularities. 
We hope to remediate these issues in future versions.
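
Until then, users may want to normalize these string variables themselves. The Python sketch below shows one possible approach, not the cleaning applied in our R script: it collapses repeated white space and maps blank or ‘unknown’ values to None. The function name and the set of missing-value markers are assumptions made for illustration.

```python
import re

# Values treated as missing after lower-casing (an assumption based on the
# irregularities described above).
MISSING_MARKERS = {"", "unknown"}


def clean_text(value):
    """Return a whitespace-normalized string, or None for missing data."""
    if value is None:
        return None
    # Collapse runs of spaces, tabs, and newlines, then trim the ends.
    text = re.sub(r"\s+", " ", value).strip()
    # Ignore a trailing period when testing for markers such as 'Unknown.';
    # note that a cell holding only a period is also treated as missing.
    if text.rstrip(".").strip().lower() in MISSING_MARKERS:
        return None
    return text


# Hypothetical address values, not taken from the data set.
print(clean_text("  123  Main   St. "))  # → 123 Main St.
print(clean_text(" Unknown "))           # → None
```

Returning None rather than an empty string keeps "missing" distinct from "blank in the source" in later processing.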

Data is exported as CSV files from several SQL views in the Drupal database, cleaned using an 
R script, and saved as new spreadsheets. As documented in the Readme file and the R script 
included with the data set, we trim white space from some columns, split out some variables, 
and join several spreadsheets to create the final versions we think will be most useful to future 
users. Blanks have been left as they are rather than converted to NAs. To make this data widely 
accessible, we share results in tab-delimited form and as Excel files; we also share the original 
files downloaded from Drupal and the R script used to process them. In future versions of this 
data set, we hope to also include links to theater URLs in the front-end database and shapefiles 
corresponding to theater locations.
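
For users working outside R, the tab-delimited files can be read with standard tools. The following Python sketch is one way to do so while keeping blanks as empty strings, as in the shared files. The sample rows are hypothetical stand-ins for a ‘theaters_[date].tab’ file, and the raised field size limit is an arbitrary value chosen because the html columns can be very long.

```python
import csv
import io

# Long html cells may exceed the csv module's default per-field limit of
# 131072 characters; the new value here is an arbitrary choice.
csv.field_size_limit(10_000_000)


def read_tab_rows(handle):
    """Read tab-delimited text into a list of dicts, keeping blanks as ''."""
    return list(csv.DictReader(handle, delimiter="\t"))


# Hypothetical two-row sample, not taken from the data set.
sample = io.StringIO(
    "id\ttheater_name\tcity_state\tnumber_of_seats\n"
    "1\tExample Theater\tEugene, OR\t\n"
    "2\tSample Theater\tPortland, OR\t500\n"
)
rows = read_tab_rows(sample)
print(rows[0]["number_of_seats"] == "")  # → True
```

With a downloaded file, the same function can be called on `open(path, newline="", encoding="utf-8")`; the UTF-8 encoding is an assumption and should be checked against the Readme.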

DATA STRUCTURE

While the data readme will include complete, up-to-date documentation of data variables as 
the data set grows and evolves, here we highlight important elements of the processed data that 
we expect will remain stable over time. The tabular data contained in the files 
‘theaters_[date].tab’ and ‘theaters_excel_[date].xlsx’ includes the following important variables:

id (integer) – Unique ID assigned to each theater “entry” in Drupal. A theater with the 
same name will sometimes be listed more than once (and thus will have more than 
one theater id). Sometimes this means that the theater has moved, and sometimes 
it means that two unrelated theaters with the same name appear in two locations.

theater_name (character) – Theater refers to a physical building, sometimes called 
a “cinema” or “cineplex.” We define a theater as any place where a film was 
displayed to a public audience. Theater names are not unique.

address (character) – Full address (if known) or intersection. We hope in future to 
standardize entries in this column.

Figure 1 A partial screenshot of the Drupal form for entering information about theaters in the 
Oregon Theater Project website.



city, state, city_state (character) – City in Oregon, state (OR), or “City, OR”.

latitude, longitude (double/float) – in degrees.

start_date_of_operation, end_date_of_operation (date) – In “yyyy-mm-dd” format.  
Theaters for which no closing date was entered were coded by the Drupal database 
as “ongoing” or “still open.” This may mean they are in fact still open, or it may mean 
that the closing date is unknown. In either case, the data export records their closing 
date as the date the data was last downloaded. These theaters will have the most 
recent ‘end_date_of_operation’ entries and are recognizable because many will “end” on the 
same recent day.

start_year, end_year (integer) – in “yyyy” format.

number_of_seats (character) – venue capacity. This is sometimes an integer, but 
sometimes it includes more extensive notes or estimates.

owner_and_manager_names (character) – If individual names were created as 
separate entries in the Drupal database, then each name is separated by a semicolon 
in this column. However, some entries were created as just one entry separated by 
commas or have complex annotations. We hope in future to standardize this field to 
allow exploration of who owned more than one theater.

body, additional_facts, body_html, additional_facts_html (character) – Descriptions 
of the movie theater written by a student or group of students. “html” versions 
include all html formatting that creates the page, including links to embedded 
images. IMPORTANT NOTE: in the ‘theaters_excel_[date].xlsx’ version of the data set, 
‘body_html’ is replaced by ‘body_html_length’, which is an integer value listing the 
number of characters in the ‘body_html’ column. Because some columns exceed the 
maximum cell length in Excel, ‘body_html’ is omitted from the Excel files.

gallery_images (character) – list of 0 to many relative links to images used in the 
“gallery” section of a blog post, separated by semicolons.
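
Two of the variables above need extra handling before analysis: placeholder closing dates and free-text seat counts. The Python sketch below is one possible treatment under stated assumptions, not part of our processing script. It flags theaters whose ‘end_date_of_operation’ equals the latest date in the file when at least two rows share that date, and it converts ‘number_of_seats’ to an integer only when the cell holds a plain number. Function names, the `min_shared` threshold, and the sample rows are hypothetical.

```python
import re
from collections import Counter
from datetime import date


def placeholder_end_ids(rows, min_shared=2):
    """Return ids of rows whose end date looks like the export-date placeholder."""
    dates = [
        date.fromisoformat(r["end_date_of_operation"])
        for r in rows
        if r["end_date_of_operation"]
    ]
    if not dates:
        return set()
    latest = max(dates)
    # Only treat the latest date as a placeholder if several rows share it.
    if Counter(dates)[latest] < min_shared:
        return set()
    return {r["id"] for r in rows if r["end_date_of_operation"] == latest.isoformat()}


def parse_seats(value):
    """Return an int for plain counts such as '500' or '1,200', else None."""
    match = re.fullmatch(r"\s*(\d{1,3}(?:,\d{3})+|\d+)\s*", value or "")
    return int(match.group(1).replace(",", "")) if match else None


# Hypothetical rows, not taken from the data set.
sample_rows = [
    {"id": "1", "end_date_of_operation": "1925-06-01"},
    {"id": "2", "end_date_of_operation": "2022-08-26"},
    {"id": "3", "end_date_of_operation": "2022-08-26"},
    {"id": "4", "end_date_of_operation": ""},
]
print(sorted(placeholder_end_ids(sample_rows)))  # → ['2', '3']
print(parse_seats("1,200"), parse_seats("about 500"))  # → 1200 None
```

Cells with notes or estimates are deliberately left as None so that they can be reviewed by hand rather than silently coerced.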

The ‘owners_[date].tab’ and ‘owners_excel_[date].xlsx’ files repeat information found in the 
theaters spreadsheets but create a new row for each owner/manager of a particular theater 
that was broken out (separated by a semicolon) in the original data. ‘owner_and_manager_names’ 
(character) is the only column containing unique values in this spreadsheet.
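
The row-per-owner layout can also be derived from the theaters data. The Python sketch below illustrates the idea rather than reproducing our R script: it splits ‘owner_and_manager_names’ on semicolons and groups theater ids by name. The output key `owner_name`, the function names, and the sample rows are hypothetical, and entries separated by commas or carrying annotations would need manual cleaning first.

```python
from collections import defaultdict


def explode_owners(rows):
    """Create one output row per semicolon-separated owner/manager name."""
    exploded = []
    for row in rows:
        for name in row.get("owner_and_manager_names", "").split(";"):
            name = name.strip()
            if name:
                # 'owner_name' is a hypothetical output column.
                exploded.append({**row, "owner_name": name})
    return exploded


def theaters_by_owner(rows):
    """Map each owner/manager name to the list of theater ids it appears with."""
    grouped = defaultdict(list)
    for row in explode_owners(rows):
        grouped[row["owner_name"]].append(row["id"])
    return dict(grouped)


# Hypothetical rows, not taken from the data set.
sample_theaters = [
    {"id": "1", "owner_and_manager_names": "A. Example; B. Sample"},
    {"id": "2", "owner_and_manager_names": "A. Example"},
    {"id": "3", "owner_and_manager_names": ""},
]
print(theaters_by_owner(sample_theaters))
# → {'A. Example': ['1', '2'], 'B. Sample': ['1']}
```

Names that map to more than one id are candidates for the multi-theater ownership questions mentioned above, subject to checking that identical strings refer to the same person.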

The ‘articles_[date].tab’ and ‘articles_excel_[date].xlsx’ spreadsheets include a list of articles 
(blog posts) that are not entries for a specific theater. The articles data have a unique integer 
id assigned by Drupal, ‘gallery_images’, ‘body’, and either ‘body_html’ or ‘body_html_length’ 
columns with the same specifications as the theaters data sets. Columns unique to this data 
set include ‘authored_by’ (character), which is the name of the Drupal user who uploaded the 
article (sometimes but not always the article author), and ‘categories’ (character), a list of 0 to 
many topic tags assigned in Drupal and separated by semicolons.

Data users could link articles to theaters spreadsheets via the ‘related_cities_and_theaters’ 
column in the articles data, which sometimes indicates that the article is describing a theater 
set in a particular city. Any such join would be incomplete, since the column takes between 0 
and many cities or theaters, separated by a semicolon. The column would need to be divided 
into multiple columns and parsed to identify cities vs theaters. In future we plan to parse this 
column for users. Cities are listed in the format “City, OR” and could be joined via the 
‘city_state’ column in the theaters spreadsheet. Theaters should be listed using the same name 
used in the ‘theater_name’ column in the ‘theaters’ spreadsheet, but there may be errors. 
Since the combination of ‘theater_name’ and ‘city_state’ is likely to be unique, articles could be 
imperfectly joined to theaters using both columns as keys.
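
The imperfect join described above can be sketched as follows. This Python example is a minimal illustration under the stated assumptions: entries in ‘related_cities_and_theaters’ that end in “, OR” are treated as cities and all others as theater names, and a theater matches when its name is listed and, if any cities are listed, its ‘city_state’ is among them. The function names and sample records are hypothetical.

```python
def parse_related(cell):
    """Split a 'related_cities_and_theaters' cell into (cities, theater_names)."""
    cities, theater_names = [], []
    for item in (cell or "").split(";"):
        item = item.strip()
        if not item:
            continue
        # Assumption: cities follow the "City, OR" pattern; the rest are theaters.
        if item.endswith(", OR"):
            cities.append(item)
        else:
            theater_names.append(item)
    return cities, theater_names


def match_theaters(article, theaters):
    """Return ids of theaters that an article appears to describe."""
    cities, names = parse_related(article.get("related_cities_and_theaters"))
    return [
        t["id"]
        for t in theaters
        if t["theater_name"] in names and (not cities or t["city_state"] in cities)
    ]


# Hypothetical records, not taken from the data set.
theaters = [
    {"id": "10", "theater_name": "Example Theater", "city_state": "Eugene, OR"},
    {"id": "11", "theater_name": "Example Theater", "city_state": "Portland, OR"},
]
article = {"id": "500", "related_cities_and_theaters": "Eugene, OR; Example Theater"}
print(match_theaters(article, theaters))  # → ['10']
```

Exact string matching will miss theaters whose names were entered inconsistently, so unmatched articles should be reviewed rather than assumed to have no related theater.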

(3) DATASET DESCRIPTION
Object name – Oregon Theater Project Database. See 
‘OR_Theater_Project_Readme_2022-08.txt’ for complete list of filenames.

Format names and versions – tab, txt, xlsx, R, PDF

Creation dates – 2020-01-01 to 2022-08-26



DATASET CREATORS 

Michael Aronson and Elizabeth Peterson (University of Oregon) were responsible for 
conceptualization, funding acquisition, project administration, supervision, dataset creation 
and editing. John Zhao and Gabriele Hayden (University of Oregon) designed the data export 
views, and Gabriele Hayden cleaned and curated the dataset. 

The following University of Oregon students contributed research and writing to create this 
dataset: Lauren Adzima, Khalil Afariogun, Andrew Arachikavitz, Malia Balzer, Jacob Beeson, 
Sylas Bosman, Kyra Brennan, Ezra Brothers, Christian Cancilla, Katy Cannon, Eliza Castillo-
Salazar, Jourdan Cerillo, Tom Chamberlain, Shelby Chapman, Cody Churchill, Jude Corwin, 
Heath Cotter, Julian D’Ambra, Megan Deck, Patrick Dunham, Chloe Duryea, Leah Durkee, 
Morgan Egbert, Maggie Elias, Jack Elliot, Joseph Endler, Emily Fine, Kyle Fleming, Alex Fox, 
Javier Fregoso, Sammie Garcia, Hayden Garrett, Ireland Gill, Austin Griggs, Tayte Hansen, 
Isabella Harrington, Kara Hilton, Ashli Horrell, Amanda James, Zach Jones, Ethan Laarman-
Hughes, Addie Lacewell, Abby Lewis, Jimmy Lieu, Kaden Lipkin, Joie Littleton, Wanfang Long, 
Peter Lovejoy, Shelby Marthaller, Cassie McCready, Carly McDaniel, Brittany McDowell, Brendan 
McMahon, Eric McMichael, Maddie Miner, Maryam Moghaddami, Jack Moran, Parker Morgan, 
Nicholas Mundorff, Alexis Neal, Michael O’Ryan, Kelsey Parker, Dre Parkinson, Reese Patanjo, 
Katherine Pelch, Ben Pettis, Sienna Pigg, Shelby Platt, Ellie Reis, Bailey Rierden, Manuel Rios, 
Jayna Rogers, Anthoni Rosas, Emily Ruthruff, Payton Schiffer, Becca Schomer, Huntley Sims, 
Bella Smith, Megan Snyder, Britnee Spelce-Will, Malley Stanovsek, Connor Templeman, Weston 
Tengan, Jess Thompson, Sarah Tidwell, Evan Vacek, Dylan Wakelin, Jalon Watts, Joe Weber, 
Makaal Williams, Veronica Wilson, Charlie Winn, David Young, and Sam Zepeda.

Language – English

License – CC-BY

Repository name – Harvard Dataverse

Publication date – 2022-10-31

(4) REUSE POTENTIAL
This data is likely to be of interest to scholars in the humanities and social sciences. It could be 
used to create new visualizations or digital exhibitions; re-creating a map of these venues, for 
example, could be a project for an advanced digital humanities course. It could be aggregated 
with other regional, national, or international cinema history projects, such as that shared on 
the Mapping Movies site, or could be modified to fit the data model used by Cinema Context 
or the European Cinema Audiences project¹ to allow for the comparative study of cinema 
venues (Klenotic, 2022; CREATE, 2022). However, this would require standardizing many of the 
freeform columns in our data. The information contained in this data set would map onto 
the Venue, Address, Person, Company, Publication, and Archive tables in the original Cinema 
Context SQL database (van Oort & Noordegraaf, 2020). This data could also be used in social 
science research, for example to track the relationship between the opening and closing of 
theaters and larger socioeconomic trends across Oregon.

One of our anonymous reviewers offered several specific, inspiring suggestions for how our 
data set, aggregated with others, could be useful in tracking historical questions. For example, 
the data on theater owners and managers could be cleaned and aggregated with other data 
sets to map female business ownership during the years leading up to the ratification of the 19th 
Amendment, which granted women’s suffrage in the US in 1920. Theater openings and closings might 
offer insights—particularly when aggregated with other historical business data in Oregon or 
data on other theaters across the US—into how businesses adapted to economic shocks such 
as World War I, the 1918 flu pandemic, or the white supremacist terrorism of the Red Summer 
of 1919. 

Scholars seeking to pursue the kinds of data aggregation that would allow for such work must 
do a great deal of sophisticated data processing to normalize data across differences of data 
definition and structure. We have done our best to document how our data is defined and 
structured to allow for others to build on our work. However, as we discuss in Aronson et al. 
(2022a), the first challenge scholars face is simply gaining access to the data itself. The data 
set from that paper includes links to the minority of projects surveyed that do share data as of 
2022 and may form a starting point for scholars seeking to do comparative work (Aronson et 
al., 2022b). We are inspired to share our own small, imperfect data set to model for colleagues 
what we hope they will do as well: share data early and often, updating as the extent and 
quality of the data improves over time. 

1 https://www.europeancinemaaudiences.org/research/, last accessed date: 8 November 2022.

ACKNOWLEDGEMENTS
The OTP platform was created in collaboration with Shirley Galloway, Loring Hummel, Daniel 
Mundra, Caden Williams and John Zhao, programmers and web designers in the College of Arts 
and Sciences at the University of Oregon. Thank you to our reviewers, whose suggestions have 
greatly improved the quality of this data paper and given us several ideas for how to improve 
our data going forward.

FUNDING INFORMATION
Funding for the Oregon Theater Project was, in part, provided by a 2019 instructional grant 
(approximately $15,000) from the Tom and Carol Williams Fund for Undergraduate Education 
at the University of Oregon.  

COMPETING INTERESTS
The authors have no competing interests to declare.

AUTHOR CONTRIBUTIONS
Michael Aronson: Conceptualization, Funding Acquisition, Project Administration, Supervision, 
Writing

Elizabeth Peterson: Conceptualization, Funding Acquisition, Project Administration, Supervision, 
Writing

Gabriele Hayden: Data Curation, Writing

AUTHOR AFFILIATIONS
Dr. Michael Aronson  orcid.org/0000-0003-1790-7816
Cinema Studies, University of Oregon, Eugene, OR, USA

Elizabeth Peterson  orcid.org/0000-0003-1258-4122
Digital Scholarship Services, University of Oregon Libraries, Eugene, OR, USA

Dr. Gabriele Hayden  orcid.org/0000-0003-4740-4187
Data Services, University of Oregon Libraries, Eugene, OR, USA

REFERENCES 
Aronson, M., Peterson, E., & Hayden, G. (2022a). Local cinema history at scale: Data and methods for 
comparative exhibition studies. (forthcoming). Iluminace: Journal for Film Theory, History, and 
Aesthetics, 34(2). Preprint. DOI: https://doi.org/10.7264/t0ky-0q37

Aronson, M., Peterson, E., & Hayden, G. (2022b). “Replication Data for: Local Cinema History at Scale: Data 
and Methods for Comparative Exhibition Studies”. Harvard Dataverse, V1. 
UNF:6:/qdV535CScvkd2ODC/DAkQ== [fileUNF]. DOI: https://doi.org/10.7910/DVN/6WOQPO

CREATE. (2022). Cinema Context RDF Documentation. Retrieved from 
https://uvacreate.gitlab.io/cinema-context/cinema-context-rdf/ (last accessed date: 8 November 2022).

Klenotic, J. (2022). Mapping movies. Retrieved from http://mappingmovies.unh.edu/ (last accessed date: 
8 November 2022).

van Oort, T., & Noordegraaf, J. (2020). The Cinema Context Database on film exhibition and distribution 
in the Netherlands: A critical guide: arts and media. Research Data Journal for the Humanities and 
Social Sciences, 5(2), 91–108. DOI: https://doi.org/10.1163/24523666-00502008




Published: 12 December 2022

COPYRIGHT:
© 2022 The Author(s). This is an open-access article distributed under the terms of the Creative 
Commons Attribution 4.0 International License (CC-BY 4.0), which permits unrestricted use, 
distribution, and reproduction in any medium, provided the original author and source are 
credited. See http://creativecommons.org/licenses/by/4.0/.

Journal of Open Humanities Data is a peer-reviewed open access journal published by Ubiquity Press.
