Open data visualizations and analytics as tools for policy-making


University of South Florida

From the SelectedWorks of Loni Hagen

2019

Loni Hagen, University of South Florida

Available at: https://works.bepress.com/loni-hagen/14/



Contents lists available at ScienceDirect

Government Information Quarterly

journal homepage: www.elsevier.com/locate/govinf

Open data visualizations and analytics as tools for policy-making

Loni Hagen a,⁎, Thomas E. Keller a, Xiaoyi Yerden b, Luis Felipe Luna-Reyes b

a University of South Florida, 4202 E. Fowler Avenue, Tampa, FL 33620-7800, USA
b University at Albany, 135 Western Ave., Albany, NY 12222, USA

ARTICLE INFO

Keywords:
Policy informatics
Policy analytics
Open data
Topic modeling
Visual analytics
Usability testing

ABSTRACT

Government agencies collect large amounts of structured and unstructured data. Although these data can be used
to improve services as well as policy processes, it is not always clear how to analyze the data and how to glean
insights for policy making, especially when the data includes large volumes of unstructured text data. This article
reports opinions found in “We the People” petition data using topic modeling and visual analytics. It provides an
assessment of the usability of the visual analytics results for policy making based on interviews with data
professionals and policy makers. We found that visual analytics have potentially positive impacts on policy
making practices. Experts also articulated potential barriers regarding the adoption of visual analytics tools, and
made suggestions. Potential barriers included insufficient resources in government agencies and difficulty
integrating analytics with current work practices. The main suggestions involved providing training and
interpretation guidelines along with the visual analytics tools. Major contributions of this study include: (1)
suggesting viable visualization tools for analyzing textual data for policy making, and (2) suggesting how to lower
barriers to adoption by increasing usability.

1. Introduction

Technology developments have been recognized as catalyzers of
organizational change and transformation (Bannister & Connolly, 2014;
Treacy & O'Sullivan, 2010). The Internet has not been an exception, and
it has triggered the development of new business models and service
delivery mechanisms both in the public and private sectors (Bergh &
Benghiat, 2017; Luna-Reyes & Gil-Garcia, 2014). In the private sector,
for example, businesses like Amazon have applied information
technologies and data analysis techniques to transform the retail industry
(Bergh & Benghiat, 2017). In the public sector, technologies have also
promoted change, although research suggests that change has not been
transformational, but incremental (Norris & Reddick, 2013). However,
recent trends in open data, big data, and data analytics have renewed
both the possibility and interest in transforming government activity,
particularly in the development of policy (Janssen & Helbig, 2019;
Puron-Cid, Gil-Garcia, & Luna-Reyes, 2016).

In particular, some researchers have identified the potential impact
of social media data and petitioning systems in the early stages of policy
making, contributing to the improvement of problem definition and
agenda setting activities (Hagen, Harrison, & Dumas, 2018; Janssen &
Helbig, 2019; Luna-Reyes, 2017). More specifically, Janssen and Helbig
(In Press) pointed to the need for developing methods to analyze

content developed with such platforms as sources of inspiration for
policy makers. However, data collected through these platforms pose
at least two challenges for their effective use. First, these datasets include
large amounts of unstructured textual data, which makes manual reading
too burdensome as a way to understand the content. Second, although
recent advances in text mining tools help address the first challenge,
there is still much to learn about how policy makers apply such tools and
interpret their results. As a result, it is rare to find empirical examples
of textual data being successfully adopted for policy making, even though
one of the motivations behind opening government data is to promote
innovations that facilitate the exploitation of these data (Mergel,
Kleibrink, & Sörvik, 2018).

Motivated by these challenges, we explore data from the We the
People petitioning platform to answer two research questions: (1) What
is a potential solution to efficiently extract and effectively present topics
expressed in large volumes of textual data? (2) To what extent do
policy makers consider visual analytics solutions to be usable and useful
for policy making? To answer the first question, we extend previous
work on topic modeling (Hagen, 2018) by applying topic modeling for
topic extraction and visualization tools such as LDAvis for presenting
the extracted topics. Then, to answer the second research question, we
test the usability of these possible solutions with policy makers, data

https://doi.org/10.1016/j.giq.2019.06.004
Received 17 August 2018; Received in revised form 5 June 2019; Accepted 9 June 2019

⁎ Corresponding author at: School of Information, University of South Florida, 4202 E. Fowler Avenue, CIS2031, Tampa, FL 33620-7800, USA.
E-mail address: lonihagen@usf.edu (L. Hagen).

Government Information Quarterly xxx (xxxx) xxx–xxx

0740-624X/ © 2019 Elsevier Inc. All rights reserved.

Please cite this article as: Loni Hagen, et al., Government Information Quarterly, https://doi.org/10.1016/j.giq.2019.06.004



analysts, and communication specialists to empirically show their
perspectives on adopting such visual analytics tools for everyday
practices. In this way, this research contributes to the data-driven
policy making literature by proposing a framework to facilitate the
analysis and visualization of large volumes of text data, and by
diagnosing government practitioners' responses and feedback on such visual
analytics tools for policy making.

The structure of the paper is as follows. The second section presents
background information about “We the People” data. The third section
discusses the theoretical foundation of value creation through open data
and introduces topic modeling and visual analytics research conducted
in open data contexts. The fourth section describes the data and
methods, including a potential solution to distill and present
representative themes expressed in large volumes of text data. The fifth
section presents our key findings from the usability evaluation. The
sixth section discusses the main findings in terms of barriers and
limitations, and the final section presents conclusions and future research.

2. Background: We the People open data

The US e-petitioning platform “We the People” (WtP) was launched
in 2011 as the flagship initiative of the Obama administration to
increase public participation in government (The White House, 2015).
The data created through e-petitioning includes petition title, petition
texts, signatures and their accumulation, some characteristics of
petitioners and signers, issue categories and metadata (The White House,
2017). According to the platform rules, petitions that accumulate more
than 100,000 signatures in less than 30 days get an official update from
the White House. Although not all petitions reach this threshold, data
from past petitions are made available to the public for free use, re-use,
and distribution as Open Data (Ubaldi, 2013). Datasets are updated
about every 6 months by including new data. Following general
principles of open data as a source of innovation, the WtP platform provides
an API to facilitate data access and manipulation (see
https://petitions.whitehouse.gov/developers/get-code). Moreover, the
platform provides some analytical tools developed by civic programmers
(https://obamawhitehouse.archives.gov/blog/2014/06/03/hackathon-here-white-house).
It has been suggested in previous research that open
petitioning data are potential sources of policy topics of public interest
(Hagen et al., 2018). In this way, open petitioning data becomes the
focal point of interest in this research.

WtP open data is unique in three aspects. First, the dataset includes
direct expressions of citizen opinion to governments, which are rarely
available in traditional information sources such as major news outlets,
survey results, or administrative data. Therefore, the petition data can
be used to inform policy makers about public opinion and sentiment
regarding policy matters. Second, the WtP dataset is a good example of a
technically advanced open data set; it is a quality dataset with
defined metadata, arranged in a machine-readable format, and made
available through an open API. Third, WtP data is a by-product of a
petitioning platform, and governments are flooded with similar types of
datasets as the use of social media platforms increases.

The major challenges in using WtP data to create value for the
policy-making process are the volume of data and the unstructured
nature of petitions. Use of unstructured data such as abundant text data
has been recognized as one of the biggest challenges of big data
analytics (Siegel, 2016). While open government and open data initiatives
create and share unprecedented amounts of text data including citizen
expressions, the process of going through them is too time consuming
and complicated to be practical, especially if policy makers need to go
through large volumes of text (Walters, Aydelotte, & Miller, 2000).
These types of big textual data are growing exponentially as the number
of government-led platforms and the adoption of commercial social media
increase. Topic modeling and the recent development of visualization
tools may help to reduce the cost and time related to the analysis of large
volumes of text data.

3. Literature review

3.1. Analytics to create value through open data

Governments around the world have exerted efforts to “create and
institutionalize a culture of Open Government” (Nam, 2012, p. 348) by
embracing the ideas of transparency, civic engagement in governance,
and policy making (Aitamurto, 2012; The White House, 2011). Opening
data not only brings changes in government's culture towards
“openness, transparency and accountability,” but can also increase public
engagement by cultivating a culture of sharing and collaborating
through open data (Ubaldi, 2013). These cultural changes and active
citizen engagement can create economic innovations (Mergel et al.,
2018; Zuiderwijk, Helbig, Gil-Garcia, & Janssen, 2014), improved
government performance (Ubaldi, 2013), and increased accountability
of elected officials (Sivarajah et al., 2016). Unfortunately, actual
creation of value through innovative use of open data has proven to be a
difficult task. Despite the growing number of open data platform
initiatives, reported use cases and created value have been lacking
(Najafabadi & Luna-Reyes, 2017). For example, out of 183,000 datasets
published on data.gov (the United States' open data portal), only 78 apps
were available on the platform as of November 2017.

Data and technology barriers are among the major obstacles to
achieving innovation through open data initiatives (Magalhaes &
Roseira, 2017; Toots, McBride, Kalvet, & Krimmer, 2017; Zuiderwijk
et al., 2014; Zuiderwijk, Janssen, Choenni, Meijer, & Alibaks, 2012).
Early on, scholars stressed the importance of open data technologies,
in terms of uniformity and integration of information sources, as
well as the importance of creating metadata (Dawes, Pardo, &
Cresswell, 2004). Later, studies recommended interactivity and
usability as crucial elements to make open platforms available for
meaningful citizen engagement (Toots et al., 2017). More recently,
open data scholars found that certain technical requirements (such as
machine-readable formats, use of APIs, tools for data wrangling, and
technical competence of users) are lacking for achieving innovation
using open data (Magalhaes & Roseira, 2017; Zuiderwijk et al., 2014).

As scholars commonly have recognized, publishing data is not
enough to attain innovation using open data (Janssen, Charalabidis, &
Zuiderwijk, 2012). The success of open data depends on active external
participation to use the published data (Attard, Orlandi, Scerri, & Auer,
2015). However, for non-technical users, the fundamental lack of
expertise and knowledge required for the collection, manipulation,
analysis, and interpretation of the data hinders meaningful engagement
with open data, and it is a critical problem (Graves & Hendler, 2013).
An important portion of open data users may be non-technical users
who want to analyze trends over time to understand longitudinal
changes but cannot perform the required tasks due to a lack of expertise.
Recent studies have rightly pointed out that supply-side open data
platforms lack capabilities for supporting non-technical users
(Chatfield & Reddick, 2017) and that best practices for using
the data are also lacking (Bertot, Butler, & Travis, 2014).

3.2. Visualization of topic modeling

For understanding topics and themes expressed in large volumes of
text data, topic modeling has been frequently adopted to automatically
discover latent themes in a document collection based on the
co-occurrence of words (Blei, 2012). The outcome of topic modeling includes
topics (a keyword list sorted by the relevance ranking to the topic) and
topic proportions in each document. In general, five to thirty highly
ranked keywords are presented as a topic.
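
To make these two outputs concrete, the following minimal Python sketch shows one way they might be represented. All numbers are invented for illustration and are not taken from the WtP corpus; the names `topic_word_weights`, `top_keywords`, and `doc_topic_proportions` are hypothetical and do not come from the study's own scripts.

```python
# Hypothetical topic-word weights for two topics (illustrative numbers only).
topic_word_weights = {
    0: {"police": 0.08, "officers": 0.05, "law": 0.03, "tax": 0.001},
    1: {"tax": 0.07, "income": 0.04, "law": 0.02, "police": 0.002},
}


def top_keywords(word_weights, n=3):
    """Return the n highest-weighted words of one topic, best first."""
    ranked = sorted(word_weights.items(), key=lambda kv: kv[1], reverse=True)
    return [word for word, _ in ranked[:n]]


# Hypothetical topic proportions per document; each row sums to 1.
doc_topic_proportions = {
    "petition_a": [0.9, 0.1],
    "petition_b": [0.25, 0.75],
}

# A "topic" as presented to readers: a short ranked keyword list.
topics = {k: top_keywords(weights) for k, weights in topic_word_weights.items()}
```

In practice `n` would be set somewhere between five and thirty, in line with the range mentioned above.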

Topic modeling is an unsupervised machine learning method that
extracts topics without relying on prior human knowledge. Consequently,
two issues stand out when applying topic modeling results to policy
making. The first issue is doing it right. It is important to make sound
decisions and take care in the modeling process to produce human


interpretable topics (Boyd-Graber, Mimno, & Newman, 2014). Hagen
(2018) extracted topics using petitioning data, although the focus of that
study is limited to showing “how to train and evaluate” topic models; it
does not show how topic modeling results can be presented and used for
policy making or implemented in everyday practices. Our work
extends these efforts to produce interpretable topics that are, therefore,
amenable to policy making.

The second issue of topic modeling for policy making resides in how
to interpret the meaning of topics and the relationships among them
(Hagen, 2016; Sievert & Shirley, 2014). Given that topics are extracted based
solely on the statistical traits of term co-occurrence, there is no
theoretical reason to believe they are easily interpretable by a human
(Boyd-Graber et al., 2014). Nevertheless, some digital government studies have
adopted topic modeling to identify and understand public opinions
expressed in text data. Reddick, Chatfield, and Ojo (2017), for example,
extracted topics appearing in Facebook posts as an effort to create a social
media text analytics framework. Hagen, Uzuner, Kotfila, Harrison, and
Lamanna (2015) extracted emerging topics from WtP data using a small
set of petitions created in the early years of WtP (initiation to
mid-2014). Although both examples are steps in the right direction, these
studies only displayed topic words with limited interpretations, and it is
still hard for non-technical readers to make sense of topic modeling
results based solely on the presented topic words. In order to
improve the interpretability of topic modeling results, more recent studies
have adopted visual analytics to present them. Cassi,
Lahatte, Rafols, Sautier, and de Turckheim (2017) compared how the
academic literature approaches the topic of obesity with how social needs,
as expressed in discussions among members of the European Parliament,
approach it. Visual analytics tools
were effective in presenting the clear misalignment between academic
studies and social needs in terms of the obesity issue.

In addition to improved interpretability, visual analytics tools
enable meaningful engagement of non-technical users. Graves and
Hendler (2013) proposed the use of visualization methods to provide
simple mechanisms for non-technical users to explore open data. Using
over 160 public datasets, Keshif, a visualization tool, “let the user
define what is being visualized and explored, not how” (Yalçın, Elmqvist,
& Bederson, 2016). Poucke et al. (2016) demonstrated that researchers
can build complex and automated processes with multiple mouse clicks
instead of writing program code. Using RapidMiner (rapidminer, 2017), a
big data analytics tool, non-coding scientists can prepare data, train and
validate models, and embed analytic results. Accordingly, open data
experts have stressed the importance of data analysis and visualization
tools for achieving innovation using open data (Toots et al., 2017).

Consumers and end users of open data are diverse (e.g., government
employees, innovators, citizens, and journalists/researchers/activists)
(Gascó-Hernández, Martin, Reggi, Pyo, & Luna-Reyes, 2018). One of the
most prominent user groups of open data has been technicians who use
open data to develop new tools. Developers and data suppliers (most
often using open data) get together through hackathons in order to
create new services and products using open data. However, we do not
know to what extent these products and services have been used by
governments to create value, nor do we have information regarding
their influence on actual policy making. Perhaps we can achieve
innovation from open data when we make visual analytics tools available
on open data platforms alongside open data sets. Moreover, innovative
use cases, if provided on open data platforms, can stimulate users'
creativity. Further, users' perception of the usefulness of a new technology
also influences their intention to actually use the technology.

4. Methods

4.1. Data

We used data collected through the publicly available White House
application program interface (API), which contains all petition-related
data that appeared on the WtP website between September 22, 2011 (the
initiation date), and July 12, 2016. This corpus contained 4985 petition
documents. We combined each petition title and its corresponding
rationale into one document, which forms the basic unit for this analysis.
Fig. 1 is an example of a WtP petition. Available datasets include
metadata (including signature counts, user tagging information, petition
creation dates, signature dates, and initials of signers).

4.2. Tools for assessing and visualizing data¹

We collected the WtP open government data (OGD) from the WtP API and
stored them in a MySQL database (an open source Structured Query Language
(SQL) database) (Oracle, 2017). We queried relevant data fields (petition
creation date, title, petition body, and signature counts) from the SQL
database for the analysis.
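
As an illustration of this querying step, the sketch below selects the four fields and joins each title with its body into one document, as described in Section 4.1. It is only a sketch under stated assumptions: it uses Python's built-in `sqlite3` module with an in-memory database in place of MySQL, and the table name `petitions`, its column names, and the two rows are hypothetical stand-ins for the actual schema and data.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the MySQL database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE petitions ("
    " id INTEGER PRIMARY KEY, created TEXT, title TEXT,"
    " body TEXT, signature_count INTEGER, issues TEXT)"
)
# Two invented rows; the real corpus contains 4985 petitions.
conn.executemany(
    "INSERT INTO petitions (created, title, body, signature_count, issues)"
    " VALUES (?, ?, ?, ?, ?)",
    [
        ("2012-12-14", "Example title one", "Example body one.", 1200, "x"),
        ("2014-08-13", "Example title two", "Example body two.", 98000, "y"),
    ],
)

# Query only the fields needed for the analysis.
rows = conn.execute(
    "SELECT created, title, body, signature_count"
    " FROM petitions ORDER BY created"
).fetchall()
conn.close()

# One document per petition: title followed by the petition body.
documents = [f"{title} {body}" for _, title, body, _ in rows]
```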

After selecting petitions written in English, we converted all texts to
lower case, normalized white spaces, eliminated punctuation and
non-alphanumeric characters, and removed short words of only one or two
characters using the R tm package. We used the English stopwords
dictionary included in the “mallet” package to eliminate less informative
words such as “a,” “the,” and “of,” which appear in almost every
English document; we added “amp” to the stopwords dictionary to
eliminate “amp,” which is a processed version of the ampersand (&). We
used the R mallet package to train Latent Dirichlet Allocation (LDA)
topic models (Mimno, 2013). Statistical topic modeling such as LDA
(Blei, 2012) extracts coherent themes, each of which is a probability
distribution over a vocabulary, assuming that documents are composed of
multiple themes. Each theme (or topic) is generally represented by
words (we call these topic words) that appear the most frequently in the
relevant documents and is also represented by documents that are the
most representative of the theme. In deciding the number of topics to
produce, we followed two sources: Hagen (2018), who produced 30 topics
from 3344 petitions and found 26 of them to be good quality
topics for direct human interpretation, and a manual content analysis
by PEW (2016), which reported 25 issue categories after manual
analysis of 4799 WtP petitions. Based on the two studies, it is apparent
that about 25 policy issue-dimensions can reasonably reflect the WtP
corpus. We decided to produce 30 topics expecting that about 25 topics
would be “human interpretable” topics because a small portion of the
final topics are likely to be low quality for human interpretation
(Boyd-Graber et al., 2014; Hagen, 2016). Using random initialization, we
produced ten sets of 30 topics to ensure that random initialization does not
influence the stability of the topics. We found that most of the topics
(26 out of 30) make sense for human interpretation (Appendix I reports
the 30 topics, labels, and quality).
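
The preprocessing steps described above can be sketched as follows. The study itself used the R tm and mallet packages; this Python version is only an approximation for illustration. The function name `preprocess` is hypothetical, the stopword set is a tiny stand-in for the mallet English list plus “amp,” and replacing punctuation with a space (rather than deleting it) is an assumption that may differ from tm's behavior.

```python
import re

# Tiny illustrative stopword set (assumption); the study used the English
# list shipped with mallet, extended with "amp".
STOPWORDS = {"the", "and", "for", "amp"}


def preprocess(text, stopwords=STOPWORDS):
    """Lowercase, strip non-alphanumerics, drop short words and stopwords."""
    text = text.lower()
    # Replace punctuation and other non-alphanumeric characters with a space.
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    # split() with no arguments also collapses runs of white space.
    tokens = text.split()
    # Remove words of one or two characters, then stopwords.
    return [t for t in tokens if len(t) > 2 and t not in stopwords]


tokens = preprocess("Stop the R&amp;D   tax, NOW!")
```

Here the HTML-escaped ampersand `&amp;` leaves the residue “amp” after punctuation removal, which is why that token is treated as a stopword.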

We then developed visualizations for these LDA topics using LDAvis,
an open source topic modeling visualization tool (Sievert & Shirley,
2014). We also aggregated available information from the dataset (i.e.,
signature counts and dates of petition creation) as well as Google
Trends for topic interpretation. Fig. 2 shows the framework of the visual
analytics using topic modeling.

To help the interpretation and further analyses, we labeled each
topic based on the LDAvis visualization results. The topic words were
sorted in descending order based on the estimated term frequency
within the selected topic (red bars in Fig. 3), which indicates the topic
words that are highly relevant to the specific topic. The relevance of a
term to a topic is controlled by a weight parameter λ. Topic words displayed in
Fig. 3(a) are obtained using λ = 1. Topic words displayed in Fig. 3(b)
are obtained using λ = 0.6, an optimal value suggested in the
literature (Sievert & Shirley, 2014). The width of the blue bar indicates
the “corpus-wide frequencies of each term,” and the width of the red
bar represents “the topic-specific frequencies of each term” (Sievert &
1 The R script and the data we used for the analysis are available at
https://github.com/lonihagen/Topic-Modeling



Shirley, 2014, p. 68). For example, the bars for “election” and
“clinton” are fully red, with no blue bar showing (in Fig. 3(a)), which
means that these terms are used exclusively in Topic 5, and thus are
highly representative of Topic 5. With λ = 0.6 in Fig. 3(b),
these two terms are the first and second most relevant terms
representing Topic 5. After extracting the 30 topics, we selected labels
from the top 10 topic words (except Police & BLM) displayed by
LDAvis (relevance parameter λ = 0.6), also considering semantic
meaningfulness.
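
The effect of λ can be illustrated with the relevance measure defined by Sievert and Shirley (2014), relevance(w, k | λ) = λ log φ_kw + (1 − λ) log(φ_kw / p_w), where φ_kw is the probability of term w in topic k and p_w is the probability of w in the whole corpus. The Python sketch below applies this formula to invented probabilities; the function names and the numbers are hypothetical and are not values from the WtP model.

```python
import math


def relevance(phi_kw, p_w, lam):
    """Relevance of a term to a topic (Sievert & Shirley, 2014)."""
    return lam * math.log(phi_kw) + (1 - lam) * math.log(phi_kw / p_w)


def rank_terms(topic_probs, corpus_probs, lam):
    """Sort terms by relevance, most relevant first."""
    scores = {
        w: relevance(p, corpus_probs[w], lam) for w, p in topic_probs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Invented probabilities: "people" is frequent everywhere, while "election"
# and "clinton" are rare outside this topic.
topic_probs = {"people": 0.05, "election": 0.03, "clinton": 0.02}
corpus_probs = {"people": 0.04, "election": 0.002, "clinton": 0.001}

by_frequency = rank_terms(topic_probs, corpus_probs, lam=1.0)
by_relevance = rank_terms(topic_probs, corpus_probs, lam=0.6)
```

With λ = 1 the ranking follows within-topic probability alone, so the generic term comes first; with λ = 0.6 the terms that are distinctive to the topic move to the top, which mirrors the difference between Fig. 3(a) and Fig. 3(b).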

The sizes of the circles (on the left side of Fig. 3, which shows the global
topic view) “are proportional to the relative prevalence of the topics in
the corpus” (Sievert & Shirley, 2014, p. 68). For example, Topic 1 is
prevalent in about 20% of the corpus, while Topic 21 is prevalent in
about 2% of the corpus according to the circle sizes displayed in Fig. 3.
The biggest and the smallest topics tend to be hard to interpret
because they often include a mixture of different topics, according to a
study conducted by Hagen (2016). Also, the distance between topics
indicates the semantic distance of topics. For the usability assessment,
we created a software package that has interactive features
(snapshots of the package are shown in Figs. 3, 4, and 5).
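
One way to arrive at such prevalence figures is to weight each document's topic proportions by its length and normalize across topics. The Python sketch below assumes this token-weighted definition, which may not match the tool's internal computation exactly; the function name `topic_prevalence` and all numbers are hypothetical.

```python
def topic_prevalence(doc_topic, doc_lengths):
    """Share of all tokens attributed to each topic (token-weighted)."""
    n_topics = len(doc_topic[0])
    token_mass = [0.0] * n_topics
    for proportions, length in zip(doc_topic, doc_lengths):
        for k, share in enumerate(proportions):
            # Expected number of tokens of this document drawn from topic k.
            token_mass[k] += share * length
    total = sum(token_mass)
    return [mass / total for mass in token_mass]


# Three invented documents over two topics, with their token counts.
doc_topic = [[0.8, 0.2], [0.5, 0.5], [0.1, 0.9]]
doc_lengths = [100, 50, 50]

prevalence = topic_prevalence(doc_topic, doc_lengths)
```

In this toy example the first topic accounts for 110 of 200 tokens, so its circle would be drawn somewhat larger than that of the second topic.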

In addition to the important topic words, the visualization enables
the representation of relations between topics, and the prevalence of
topics in the entire set of petitions. For example, Fig. 4 shows topic 13,
which is a topic about police brutality and the Black Lives Matter (BLM)
movement (Rickford, 2016). The left pane of Fig. 4 shows the topological
positioning of topic 13, which is located close to topics 20 (Http and
China, on the lack of human rights in China), 6 (Prison Sentence), and 19
(White Genocide). The right-side pane in Fig. 4 shows the most
relevant words representing the topic: “police,” “officers,”
“enforcement,” “officer,” “violence,” “black,” “shot,” “law,” “unarmed,”
“brown,” and “killed.” In addition, when we click the first topic word
“police,” for example, we can see other topics that include “police” in
their topic words. For example, Fig. 5 shows that topics 6 (Prison
Sentence topic) and 7 (Terrorism Syria topic) include the term “police”
in their topic words. Since the size of topic 6 is bigger in this case, the term
“police” plays a more important role in forming topic 6 (Prison Sentence
topic) than topic 7 (Terrorism Syria topic).
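
The comparison across topics for a single term can be sketched as the share of that term's occurrences attributed to each topic. The counts below are invented for illustration (they are not the WtP values), and `term_distribution` is a hypothetical helper rather than part of LDAvis.

```python
# Invented counts of how often "police" is attributed to each topic.
term_topic_counts = {"police": {13: 900, 6: 80, 7: 20}}


def term_distribution(term, counts):
    """Share of a term's occurrences that fall in each topic."""
    per_topic = counts[term]
    total = sum(per_topic.values())
    return {topic: count / total for topic, count in per_topic.items()}


police_share = term_distribution("police", term_topic_counts)
```

A larger share for topic 6 than for topic 7 corresponds to the larger circle for topic 6 when the term is selected.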

As such, the LDAvis results show the contextual richness of topic
modeling results by showing the topological position of the topics and the
Fig. 1. An example of a WtP petition.
Note: The first two lines (bold and large font) are the title of the petition, and the rest of the text is the rationale of this petition.

Fig. 2. Framework of the visual analytics of topic modeling.



Fig. 3. LDAvis results using λ = 1 (a) and λ = 0.6 (b) focused on “clinton” topic.



relations of the topic with other closely related topics. Also, the red bar
on the right pane shows the level of importance of each term in the
topic. This additional information provided by LDAvis offers a rich
snapshot of public opinions expressed in WtP petitions.

In addition to LDAvis visualizations, we produced two other types of
visualizations. As a way of visualizing the popularity of each topic, we
decided to show signature counts over time (see Fig. 6). Some topics
such as Election Clinton, Police & BLM, and Prison Sentence seem to
gain public attention over time. Other topics such as Food Labeling, Guns
Firearms, Marijuana, and Secession show overall negative slopes
and thus indicate decreasing levels of attention. The popularity of some
other topics changes in response to external events.
The Police & BLM topic, for example, includes topic words such as “police,”
“law,” “officers,” “violence,” “enforcement,” “officer,” “black,” and
“death.” The majority of petitions representing the topic are critical of
police brutality, especially against African-Americans. Among the top
20 petitions most relevant to the topic, petitions requesting police
officers to wear body cameras were extremely popular, starting on
Fig. 4. LDAvis results using λ = 0.6(b) focused on “Police & BLM” topic.

Fig. 5. Topics including “police” in topic words.



August 13, 2014, right after the case of Michael Brown, a Black man shot
by a police officer on August 9, 2014.

Similarly, several petitions under the Guns Firearms topic were
initiated right after the Sandy Hook Elementary School shooting on
December 14, 2012, but the level of public attention to the Guns
Firearms topic (reflected in the number of signatures) has been decreasing
ever since (see the Guns Firearms topic in the second row of Fig. 6).
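
The aggregation behind such a signatures-over-time view can be sketched as follows. This is a minimal illustration that assumes each petition is assigned to its highest-proportion topic and that counts are grouped by calendar year; the text does not specify either choice, and the records and function names are hypothetical.

```python
from collections import defaultdict

# Invented petition records: creation date, signatures, topic proportions.
petitions = [
    {"created": "2012-12-14", "signatures": 1000, "topic_props": [0.7, 0.3]},
    {"created": "2012-12-20", "signatures": 500, "topic_props": [0.6, 0.4]},
    {"created": "2014-08-13", "signatures": 2000, "topic_props": [0.2, 0.8]},
]


def dominant_topic(props):
    """Index of the topic with the highest proportion (assumed assignment)."""
    return max(range(len(props)), key=props.__getitem__)


def signatures_by_topic_and_year(items):
    """Total signature counts keyed by (topic index, creation year)."""
    totals = defaultdict(int)
    for p in items:
        year = int(p["created"][:4])
        totals[(dominant_topic(p["topic_props"]), year)] += p["signatures"]
    return dict(totals)


totals = signatures_by_topic_and_year(petitions)
```

Plotting these totals per topic against time gives one panel per topic, comparable in spirit to Fig. 6.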

Other information sources such as Google Trends can be used to
compare petition topics and popularities against the keyword search
results of Google Trends (see Fig. 7). Google Trends results can be used
as a proxy to measure what people are thinking (Stephens-Davidowitz,
2017). We selected relevant topic words from a sample of six topics and
searched in Google Trends in the United States
(https://trends.google.com/). The Google Trends results are displayed in the left column, in
contrast to the WtP topics and signature counts displayed in the right
column of Fig. 7.

Some WtP topic popularities seem to correspond to people's
thoughts reflected in Google Trends. For example, the attention paid to
the topics Marijuana, Guns Firearms, and Secession has decreased
since they peaked in 2012 in both Google Trends and the WtP
topics. The Election Clinton and Police & BLM topics have gained higher
attention in Google Trends as well as in WtP (fourth and fifth rows of
Fig. 7). These results indicate that WtP may reflect the public's attention
to certain topics, and topic modeling results combined with signature
counts can reveal the level of popularity of certain topics. However, due
to platform-specific effects, it would be naïve to think that WtP
always correctly reflects the public's attention. For example, the
White Anti Genocide topic was extremely popular in 2012 and has
decreased in popularity on WtP, while making gains in popularity on
Google Trends (the last row of Fig. 7). During 2012 and 2013, after
President Obama's reelection, there were organized activities relating to
petition creation and signing on WtP regarding “white genocide” issues
(Hagen, 2016), which have gradually decreased since then. Specific
groups of people were dedicated to spreading this agenda on WtP. The
public started paying attention to
this topic much later (since mid-2014) than WtP users did, according to Google
Fig. 6. Changes of number of signatures per topic by time.



Trends results. These interpretations are merely examples and were not
provided to the experts. If the visual analytics are effective, we expect
that policy makers can acquire actionable information and insights that
can be used for their policy making.

Note: The y-axis of the Google Trends results represents search interest
relative to the highest point on the chart for the given region (U.S.) and
time. A value of 100 is the peak popularity for the term. A value of 50
means that the term is half as popular. The y-axis of the topic popularity
shows log values of signature counts of petitions assigned to the topic.
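
The two scales described in this note can be sketched in a few lines of Python. The peak-relative scaling follows the description above; rounding to whole numbers and using base-10 logarithms are assumptions, since the note does not state the log base. Function names and inputs are hypothetical.

```python
import math


def scale_to_peak(values):
    """Rescale a series so that its maximum becomes 100."""
    peak = max(values)
    return [round(100 * v / peak) for v in values]


def log_signatures(counts):
    """Base-10 log of positive signature counts (log base is an assumption)."""
    return [math.log10(c) for c in counts if c > 0]


interest = scale_to_peak([20, 40, 10])
log_counts = log_signatures([100, 1000, 0])
```

On the first scale, a period with half the search volume of the peak period receives a value of 50, matching the interpretation given in the note.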

4.3. Usability assessment

Usability assessments have been used as tools to involve users in the
development of technologies to better understand their needs as well as
the ways in which technology can support their work processes (Howell &
Lang, 2017; Rubin, 1994). User-centered approaches to application
development involve the use of tools and methods to help software
developers and analysts improve the usability of their applications. The
International Organization for Standardization (ISO 9241-11) defines usability as
the extent to which a product (in this case, a visualization tool) can
serve the needs of a specific user group. Usability tests are commonly
used to assess information systems. The ISO standard identifies three
main indicators for usability: effectiveness, efficiency, and user
satisfaction (ISO 9241-11). Effectiveness refers to the extent to which the
product features help the user to accomplish the stated goals. Efficiency
is related mainly to the extent to which the product helps the user to
reach these goals with the least possible effort. Finally, user satisfaction
refers to the subjective perception of the user and the interaction with
the product. Nielsen (2012) suggests additional indicators such as
learnability (how easy it is to move around the interface), memorability
(how easy it is to remember how to use it), and errors (how many errors
people make when interacting with the system). The utility of the
system (whether it provides the features you need) is closely related to usability.
In fact, it is suggested that the usefulness of a system results from
considering both usability and utility (Nielsen, 2012).

We adopted a Heuristic Evaluation approach to usability testing
(Nielsen & Molich, 1990) to assess potential ways in which our
visualizations may support the process of policy making, as well as
potential improvements. We were mostly interested in understanding the
utility of the visualizations, as well as their learnability and user
satisfaction, so we designed a set of questions with these dimensions
in mind. We also included questions about the experts' areas of
expertise and current positions to better understand their responses.
Finally, we asked them to give suggestions for improvement and general
comments. The interview included 12 questions (see Appendix II).
Consistent with the Heuristic Evaluation approach, we used these
questions to explore the expert perspective on the visualizations.

Fig. 7. Google trends and topic popularity.

We approached 6 experts who were either policy makers, data
analysts, or communications specialists. Although our original plan was
to involve only policy and data analysts, one of them suggested the
inclusion of a communications specialist. The sample size is consistent
with usability testing practices, and experts were selected using
convenience sampling (Rubin, 1994).

Usability tests were conducted with each expert individually during
May and June 2018. Each interview started by asking experts about their
background, experience in data analysis, and perception of social media
and petition sites for policy making. Then, we introduced to the
interviewees 1) the interactive LDAvis interface, snapshots of which
are shown in Figs. 3, 4, and 5; 2) topics and signature counts by time,
shown in Fig. 6; and 3) Google Trends and topic popularity, shown in
Fig. 7. It is important to note that some visualizations presented to
the interviewees were interactive, allowing them to explore
relationships among topics on the computer and to do some simple
analysis with the graphs. Each expert had a chance to interact with the
LDAvis interactive visualization tool, as well as the two graphs, for
5–10 min. After introducing the visualization tools, we asked experts
for their assessment of the utility, learnability, and satisfaction
of the visualization tools in their daily job. Each interview lasted
45 to 60 min. Five sessions took place in a discussion room
on campus reserved by a member of the research team, and the other
one was conducted at the participant's office.

5. Findings: Usability assessment

In this section of the paper, we include the main findings from the
usability assessment of the visualization tools introduced in previous
sections. Data for the assessment come from the six face-to-face
interviews with experts in data analysis, policy making, and
communications. Among the experts, three were policy makers from
different levels of the public sector in New York State, including the
institution level, district level, and state level. Only one of them had
experience using data visualizations for policy making. Two other
experts were data analysts with a background in information science.
One of them had significant experience in data analysis, algorithm
design, and health informatics, and the other had several years of
experience using data visualization for decision making in the private
sector. We also interviewed a communications specialist from a public
institution, using her potential to use data visualization for
decision making as a criterion for selection. Table 1 presents an overview
of the main responses from experts in the usability assessment. All experts
found at least some topics to be relevant for the policy conversation.
Expert 6 suggested that topics in the interface varied in terms of
relevance: some were more important than others.

We found that experts were able to use the interactive LDAvis
interface, and that, in general, their interpretations of the data were
consistent with one another. In general, experts perceived that it was
easy to interact with the visualization interface and interpret the results,
especially with a brief introduction from the interviewer. As shown
in Table 1, however, at least two of them found the visualizations less
intuitive and harder to interpret than the other experts did. Reactions
included phrases such as “the interface is designed very well, everything
is very clear, I feel comfortable interacting with it,” or “your
introduction helps a lot… for me to understand the interface and to
interpret the visualization.”

Experts thought the tool is potentially helpful for analyzing large
amounts of qualitative data through theme generation, and that the data
visualization provides an easier way to communicate with people
possessing different levels of technical proficiency. For example, one
mentioned, “couple years ago, we have received a lot of feedback from
the residents in our district through the survey we sent out, however,
due to a lack of staff and technique, we did not know what to do with it.
Now I can see that this tool will be very helpful with analyzing those
kind of feedback”. Another one explained, “I think this tool will be very
useful to put information into different categories or themes,” and
“Data visualization provides summarized results and present it in a very
vivid way. It is especially good at presenting the trend and the changes
over time.”

Some interviewees without prior experience using visualization,
however, conveyed their struggle: “The data visualization catches my
eyes but I am not sure whether I understand it correctly. Some of the
themes are very self-explanatory, some are not. Maybe because I do not
have enough experience, but I think it is very important someone can
help people to interpret it in a right way.”

Some experts found visualizations over time (see Fig. 6) particularly
interesting, finding different ways of describing them. Some of them
described the trends using phrases such as: “It seems that the search
interest does not match the signatures over time, I don't know why.
Some results are even opposite….” and “Hmmm, it is interesting, the
search interest does not necessarily match the signatures overtime,
which means that people who are interested in search some topic but
may not end up act on it to sign the petition about that topic….”

In addition, most of the experts recognized the utility of LDA tools in
analyzing qualitative data in general, and they also pointed out po-
tential areas of improvement and obstacles for them to implement these
tools in their own practice. For example:

Currently, this tool only focuses on topic extraction. However, as a
policy maker, when we make decision, we mainly focus on under-
standing people's opinion, whether they are for or against some is-
sues. We would also be interested to know what specific issues about
certain topic that people are interested in. For example, the health
care topic, what specific issues people are interested in, do they
support or against it?

Other shared concerns were a lack of resources and the need for
training in the use of these types of tools. For
example, one expert stated:

I am working in the same office with other legislators, and we share
one analyst. Most of the time, I will conduct research on my own.
For me, I will need some training to be able to use and understand
this tool. Also, we have to consider the budget of the department to
implement this tool, or even hire some technical person to manage
this tool. It is not feasible for my department, at least for now it is
not feasible.

Similarly, another expert shared:

Designing and implementing a visualization tool requires additional
funding, staffs with technical skills, data analytical skill, critical
thinking, reflective ability, communication skills….Training is ne-
cessary, especially for people with no technical background to learn
how to use the tool to help with their daily work.

For one, the actual incorporation of the tool into his daily work was
unclear: “I rarely use data visualization in my own work, I can see its
merit, but I am not sure how to incorporate [it] in my work, maybe in
the future, there will be an opportunity for me to do so.”

Table 1
Overview of main responses from experts.

Expert background
Expert 1: Legislative Director (Policy Analyst)
Expert 2: Principal Data Scientist
Expert 3: Former Technology Director at PwC
Expert 4: Trustee for the NYS Higher Education Services analyzing policy impacts
Expert 5: County Legislature Representative
Expert 6: Communications Specialist. User of data in news.

Relevance of the topics
Expert 1: Only the Ukraine Russia, secession topics are relevant.
Expert 2: Some are relevant.
Expert 3: Some are relevant.
Expert 4: Some are relevant.
Expert 5: Many topics are big issues in real life.
Expert 6: Some are more relevant to general public interest than the other.

Interpretability (responses in source order)
Visualization of topics over time are more meaningful than a simple topic modeling presentation itself.
Interesting topics.
Some topic signatures seem to match the general interest over time, like Clinton & election.
Likes the data visualization of signatures over time, but he would like to see it across longer time frame to get a more information. Some topics are very vague.
Change of the number of the signatures match the change of the public interest in politics in real life.
He felt the results are interesting, but it could have been little confusing without some extra explanations.
The search interest do not necessarily align with the number of the signatures.
It catches my eyes, but not very self-explanatory.
More interested in the results with big increase or decrease.

Learnability
Expert 1: Very user friendly. If implemented into their domain, only need minor training to use it but need more training to understand the mechanism behind it.
Expert 2: Nice design. Very easy to interact with.
Expert 3: Easy to interact with it but not clear what are the insights that can be drawn from these results.
Expert 4: Easy to understand and interact with the interface.
Expert 5: It will be confusing without explanation.
Expert 6: It is pretty self-explanatory after the introduction. It is easy to interact with the interface.

Utility (responses in source order)
Good to analyze feedback from residents automatically. May also be helpful to analyze some controversial issues (only to a certain degree).
It will be useful to track longitudinal change if it can be proved representing general publics.
Data visualization help easily communicate insights with people with different levels of technical backgrounds.
It will be helpful to dealing with large amount of qualitative data.
Google trend results may better represent general public interest rather than petition signatures.
Helpful for elected officials and policy makers to get to know the specific people's concern and attitude towards certain issues.
Help interact with data, and easier to extract information from the visualization.

Skills needed to be able to produce or apply these tools (responses in source order)
Make sure that the data are from expert sources that can represent the general publics.
Need to be trained to understand the mechanics behind the scenes.
Need solid technical skills to put things together.
Also need technical training to use the tools.
How to use the tool to effectively communicate with clients and let them understand the information contained in the visualization.
Critical thinking, be reflective, good at math, computer science and technology.
Data analysis skill, programming skills.
Data analytic skill.

Potential of social media to influence policy conversations (responses in source order)
Social media and petitions sites definitely play a role, but it cannot represent the general public because of access issues.
Not sure how much it will have impact on legislatures or policy making.
Skeptical because they only represent small groups of people who are either far-left or far-right.
Not sure if social media and petition can represent the general public.
It's good to collect feedback and interact with people.
Not sure how it will affect the policy making, may raise awareness.
Petitions won't necessarily lead to any change in policy making.
Inappropriate contents of the petitions make people view them as non-high quality reference.

Experts provided suggestions for the future improvement of the tool.
Referring to the LDAvis interface, one of them suggested, “I think in the
interface, instead of numbers, adding the labels to each topic will make
it easier to see which circle represents which topic.” Another expert
suggested including some measure of people's interest or sentiment
analysis, “For me, I would like to see the tool to generate more in-depth
analysis on what specific aspects related to each topic that people are
interested in, and conduct some sentiment analysis to see what their
attitudes towards these aspects are.” Finally, another expert suggested
including in the visualization information about the validity of the
analysis, “Besides the results of the tool, I would be more interested to
see how to validate the tool.”
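
The first suggestion, showing labels instead of topic numbers, can be illustrated with a small sketch. The labels below are taken from Table A1; the function name and the fallback format are hypothetical, and the snippet does not call the LDAvis API itself.

```python
# A few topic IDs and their human-assigned labels from Table A1.
TOPIC_LABELS = {
    4: "Cancer Disease",
    8: "Guns Firearms",
    13: "Police & BLM",
    17: "Visa Immigration",
}


def display_name(topic_id, labels=TOPIC_LABELS):
    """Return a readable caption for a topic circle instead of a bare number."""
    label = labels.get(topic_id)
    if label is None:
        # Topics without a label fall back to the numeric form.
        return f"Topic {topic_id}"
    return f"{topic_id}: {label}"


for topic_id in (4, 13, 30):
    print(display_name(topic_id))
```

Keeping the numeric ID in the caption lets a reader still cross-reference the topic with the tables, while the label carries the meaning at a glance.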

Finally, all experts expressed some skepticism about how well social
media or online petitions reflect the interests of the general public.
For example, one of them mentioned, “In my opinion,
online petitions only represent a small group of people who may share a
very extreme idea or has strong motivation to express themselves. It is
difficult to assess to what extent it represents the more general in-
terest.” Another expert also discussed the importance of his local con-
stituents and issues related to online platforms access, “As a policy
maker, I care more about the interest and need of the people in my
district. In my district, most of the people do not participate in online
activities, their opinion may not be shown from these petitions.”

6. Discussion

From the expert interviews on our topic modeling and visualization tools,
we observed that the government experts view the adoption of visual
analytics tools as a distant prospect rather than a current and feasible
practice. We therefore begin the discussion by considering barriers to,
and suggestions for, adopting visual analytics tools for policy making,
based on our interview results.

6.1. Barriers, limitations, and suggestions

All experts thought the tools are potentially helpful for analyzing
large amounts of qualitative data by generating themes, and that the
data visualization offers an easier way to communicate with people
with different technical backgrounds. However, the interview data also
identified issues involved with adopting these tools and their corre-
sponding analytics results into their work practices.

Experts stressed that developing “user-centric” tools that support
achieving their goals will be crucial. When it comes to “user-centric,”
previous efforts of providing tools for “users” have mainly assumed
users are citizens or developers (Cisco, 2013; Sahuguet, Krauss,
Palacios, & Sangokoya, 2014). Efforts to develop tools with government
practitioners as “users” have been lacking. Government practitioners
are bounded by structure, rules, regulations, and limited resources,
which makes tool development and implementation often difficult.

Our tool is specifically designed for government practitioners by
adopting no-cost, open source tools in order to address resource
constraints. Even with the open source tools, the experts identified
a lack of skills as the major barrier to adopting visual
analytics tools. Experts stated that training and some level of guidance
on interpreting LDA analytics results will be necessary for them to
adopt these results for policy making. In fact, experts stated that the
minimal level of introductory training provided to them during the
study regarding the LDAvis tool was very helpful in interpreting the
LDA results. This is in line with a previous study's findings, which
stressed the importance of training to increase the confidence of data
users (Gascó-Hernández et al., 2018).

Interestingly, though, when it comes to implementing the tools in
their practices, experts assume that the new tools should work while
keeping current work practices uninterrupted. A policy maker
working in the legislative field stated that he relies on document review
and door-to-door visits to collect feedback, and that he refers to this
information for agenda setting activities. He also stated that new tools
such as LDA analytics are not relevant to his work because they do not fit
into his current work practice. When it comes to implementing new tools,
therefore, it would be helpful to assess current work practices, and to
include a feedback loop so that newly adopted tools are integrated into
current practices and improve them, rather than being
regarded as a disruption. Presumably, this view may vary across levels of
government and with the perceived access to technology by constituents.

6.2. Higher bars for adopting information acquired by data-driven analysis
for policy making

When the visual analytics results were presented (without providing
our interpretations), the experts gave mixed responses regarding the
interpretability of the results. While all the
experts were able to make sense of the LDAvis presentations, which
we thought was promising, they were split on interpreting the signature
trends and the comparison with Google Trends. This is
seemingly because interpreting these additional visual analysis results
requires technical understanding and contextual knowledge of
platform-specific effects.

Experts tended to expect that WtP should represent the entire
public's opinion to add value to the policy process, and based on this
expectation, some of them concluded the analysis was not useful for
their decision making because the results cannot be generalized.
Interestingly, when asked about the usual ways of introducing topics
into the legislative or policy agenda, each expert identified only
one or two pathways that lead to agenda setting in their offices, which are
based on letters written by residents, issues people talk about,
stakeholders' concerns, or an institution's priorities and subsequent
discussions. That is, although experts themselves depend on one or two
paths to agenda setting, when the analysis results are produced from one
or two platform(s) and computational methods, they raise the bar and
conclude that the results are not usable for their policy making because
the data and analysis results are not generalizable (and we do not claim
that they are).

Considering policy makers' higher expectations for information
extracted via a data-driven process, visual analytics should consider
including multiple data sources for conducting analyses. This way,
diverse pathways can be produced that can be helpful for agenda setting
and are not bounded by one specific environment. To clarify, contextual
information attached to data is still important for policy making. What
we need to be careful about in analytics tool development is
understanding the extent to which we can deliver information while
reflecting the contextual basis for that information.

6.3. Implications on tool development

As previous studies have suggested, making good quality open
datasets available would be a good start for open data initiatives, but
analytics tools provided alongside the datasets help create immediate
benefits by extracting useful information from the data. In fact, some
U.S. open data sites provide tools for visualization. For example, New
York City, San Francisco, and Orlando, among many other major cities,
provide interactive visualizations through private vendors.
Unfortunately, no analytical tool that also enables textual analysis is
yet available on these platforms.

Our study has implications for tool development so that engineers
can develop usable and useful tools for government practitioners using
open data. We demonstrated that our topic modeling analytics and
visualizations could be useful for policy making when there are large
volumes of text data. In order for the LDA results and visualizations to
be useful for decision making and agenda setting, government
practitioners wanted to see more granular information regarding each topic.
Specifically, experts suggested that knowing issues at a more granular
level than the topic level, along with the public attitude expressed in
each topic, would be highly valuable for making decisions based on the LDA
analytics results. For example, as stated above, one expert suggested that
replacing numbers with labels would be more useful for understanding topics
at a glance, a move that would make the tool more user-friendly. All in
all, the study highlights the importance of user (in this case, government
practitioner) engagement in the tool development process.

7. Conclusion

In this paper, we extend open data research by suggesting a process
to extract and visualize textual big data in order to make sense of it.
LDA topic modeling was used to extract emerging topics from petitions,
and visualization tools such as LDAvis were used for visual presenta-
tions of the topics. Then, we interviewed 6 experts to assess the us-
ability of the prototype visualizations as well as to gather more general
impressions of their potential value for policy making.

The interview results on the visual analytics tools show that the
experts were positive about the usability of the analytics results and
tools regardless of their technical experience. Still, experts had overall
high standards for usability and usefulness. While acknowledging the
potential of these tools, they also desired to maintain their current
practices for setting the policy agenda. In addition, experts expressed
that a lack of resources and a lack of training are major barriers to
adopting such tools.

Visual analytics tools have evolved so that practitioners, even those
who are not big data scientists or engineers, can use these techniques to
extract useful and actionable information (Marr, 2018). Our results
suggest that achieving tangible benefits from using open data for
government policy making through innovative tools and techniques may
require overcoming major barriers. Nonetheless, involving policy makers
as well as policy analysts in the process of tool building and analytics
may provide insights and lessons for the continued adoption of
visual analytics for policy making.

This study contributes to the open data literature by producing and
testing possible solutions to extract useful information from text data
using visual analytics and LDA topic modeling. We expect that these
solutions may offer insights to government practitioners as well as
scholars of e-government. These possible solutions can be used to
convince and motivate other policy makers and to encourage and in-
spire others to participate in the open data movement. This study also
contributes to the policy and data analytics literature by applying topic
modeling, an automatic topic extraction method, for policy making.
Hagen (2018) demonstrated the process, validity, and evaluation of
topic modeling using a WtP data set. Hagen called for more case studies
using topic modeling with additional datasets in order to establish the
validity of adopting unsupervised learning methods. Compared to
Hagen (2018), we produced similar topics using a bigger data set, and
captured new topics reflecting important issues during 2015–2016,
such as the U.S. Presidential election and the Black Lives Matter
movement. Our study also validates the stability of MALLET topic
modeling for extracting interpretable opinions.

Some limitations should be noted. The LDA topic modeling we
adopted for the study treats words as discrete entities and ignores word
order (a bag-of-words representation), which does not capture the full
meaning of the text. This is considered one of the weaknesses of LDA models.
More advanced topic modeling methods could potentially increase the
quality and interpretability of the topics. Studies show that including
semantic information in topic modeling can improve topic quality
(Batmanghelich, Saeedi, Narasimhan, & Gershman, 2016). Also, putting
higher weights on named entities such as person names, location names,
and events can improve the interpretability and usability of topics
(Krasnashchok & Jouili, 2018; Lau, Baldwin, & Newman, 2013). Topic
modeling is an unsupervised machine learning method devised to enhance
human decision making. Therefore, rigorous vetting of
interpretability and utility is extremely important. Accordingly, some
recent studies showed that incorporating user feedback in the topic
modeling process can improve the interpretability and usefulness of the
topics (Feng & Boyd-Graber, 2019; Kumar, Smith-Renner, Findlater, Seppi, &
Boyd-Graber, 2019).
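
The bag-of-words limitation mentioned above can be seen in a toy example. The sketch below is illustrative only: it uses the Python standard library rather than the topic modeling software used in the study, and the two sentences are made up.

```python
from collections import Counter


def bag_of_words(text):
    """Lowercase, split on whitespace, and count tokens; word order is discarded."""
    return Counter(text.lower().split())


a = bag_of_words("police protect people")
b = bag_of_words("people protect police")

# The sentences differ in meaning, yet their representations are identical.
print(a == b)
```

Any model that sees only these counts cannot distinguish the two sentences, which is one reason methods that add semantic or word-order information may yield more interpretable topics.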

In the future, we plan to adopt more advanced topic modeling tools
to enhance interpretability of the topics, and also to analyze attitude
and sentiment associated with each topic, as the experts suggested.
Further, we are also interested in studies on training programs to fa-
cilitate open data use for value creation using analytical tools. Future
research will benefit from more domain-specific tool development and
from including policy makers in the tool development process. In this
way, the application will be tailored to the needs of users, and both
usability and value will be augmented.

Loni Hagen is an assistant professor at the University of South
Florida's School of Information. Her current research interests are in use
of computational methods to extract actionable information from open
data for data-driven policy making. Her research domains include e-
participation, emergency communication, privacy, and cybersecurity.

Thomas E. Keller is a research scientist affiliated with Research
Computing and the Genomics Program at the University of South
Florida. His current interests are in data relating to text analysis using
open data and social network analysis as well as computational and
evolutionary biology with epigenomics and deep learning.

Luis Felipe Luna-Reyes is an Associate Professor in the Department
of Public Administration and Policy. He has been a Fulbright Scholar
and he is currently Faculty Fellow at the Center for Technology in
Government. He is also a Research Affiliate at the Universidad de las
Americas, Puebla and a member of the Mexican National Research
System. His research is at the intersection of Public Administration,
Information Systems and Systems Sciences. He uses multi-method ap-
proaches to contribute to a better understanding of collaboration and
governance processes in the development of information technologies
across functional and organizational boundaries in government. He is
the author or co-author of more than 100 articles published in leading
Journals and Academic Conferences.

Xiaoyi Zhao is a PhD candidate in the Information Science PhD
program at the University at Albany, College of Emergency Preparedness,
Homeland Security and Cybersecurity. Her current research interests
are exploring the utilization and impact of open data using mixed
methods including quantitative and qualitative data analysis and
system dynamics modeling.

Acknowledgements

Loni Hagen was supported by the National Research Foundation of
Korea Grant funded by the Korean Government
(NRF-2017S1A3A2066084).

Appendix I: 30 topics

Table A1
LDA-topics, labels and topic quality.

Topic ID  Label                        Topic words
1         People                       people time make country american stop government states
2         President Obama**            president obama congress states united petition act administration
3         Tax Budget*                  tax federal pay government money dollars budget employees
4         Cancer Disease**             health care cancer disease research medical treatment patients
5         Election Clinton**           vote investigation election clinton investigate people federal party
6         Prison Sentence**            justice years prison case life trial court release
7         Terrorism Syria**            war terrorist people stop government terrorism genocide syria
8         Guns Firearms**              law amendment gun rights states laws ban weapons
9         Children Gender              children child women sex law sexual parents rights
10        Religion*                    rights government religious human freedom god religion church
11        National Holiday*            day national american house holiday white awareness world
12        Water Park Energy**          water national energy park land oil areas gas
13        Police & BLM**               police law officers violence enforcement officer black death
14        Internet Companies*          internet service information access companies small business government
15        Students School Education**  students school education schools student public children college
16        Ukraine Russia*              ukraine russian russia puerto sanctions japan ukrainian rico
17        Visa Immigration**           visa immigration united states status family green home
18        Military Veterans**          military service members veterans soldiers war army forces
19        White Anti Genocide**        white anti genocide countries whites racist word code
20        Http & China*                http www org chinese people human china world
21        Animal*                      animals animal dogs wild hong dog kong horses
22        Secession*                   states united government state america people powers nature
23        Vehicle & FAA**              vehicles safety vehicle faa aircraft air cars flight
24        Medal Award*                 medal honor freedom award presidential game team american
25        Food Labeling**              food fda products foods health safe labeling ban
26        Marijuana**                  marijuana drug cannabis medical schedule hemp states substances
27        Ebola & TPP*                 ebola trans media trump trade partnership people protect
28        FDA & Blood                  fda blood life india drug sri sikhs drugs
29        McClellan                    mcclellan act iran veterans toxic nuclear congress health
30        Charly Wingate               charly robbery pardon vietnam max retrial wingate circumcision

Based on the topics and visualization results, human coders assigned labels following the guideline reported in Section 3.2 and also judged the quality of each topic. Table A1
shows the label, topic quality (indicated by the number of asterisks), and eight topic words for each of the 30 topics extracted from the petition data. Asterisks in the
“Label” column indicate topic quality as judged by a human annotator: ‘**’ indicates a “good quality” topic, ‘*’ indicates a “fair quality” topic, and no asterisk indicates a
“poor quality” topic.
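
For readers who want to process Table A1 programmatically, the asterisk convention can be decoded with a few lines of Python. This is a minimal sketch and not part of the original analysis: the function name `split_label` and the mapping name are hypothetical, and the sample labels are copied from Table A1.

```python
from typing import Tuple

# Assumed mapping, taken from the legend of Table A1:
# two asterisks = good, one = fair, none = poor.
QUALITY_BY_ASTERISKS = {2: "good", 1: "fair", 0: "poor"}


def split_label(raw_label: str) -> Tuple[str, str]:
    """Split a Table A1 label such as 'Tax Budget*' into (label, quality)."""
    stripped = raw_label.rstrip()
    label = stripped.rstrip("*")          # drop trailing quality markers
    n_asterisks = len(stripped) - len(label)
    return label.strip(), QUALITY_BY_ASTERISKS[n_asterisks]


# Three labels copied from Table A1 (hypothetical usage).
for raw in ["President Obama**", "Tax Budget*", "Children Gender"]:
    print(split_label(raw))
```

Separating the label from its quality marker in this way would make it straightforward, for example, to filter the 30 topics down to those judged “good quality” before further analysis.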

Appendix II: Interview Questions (IRB approved)

Six to eight policy analysts sampled from the Capital Region of New York State will evaluate the practical usefulness of the text mining tools developed
by the researchers. We will visit the interviewees' workplaces and interview them individually. We will prepare three sets of electronic
instruments: (1) the interactive software loaded with the visualization results, (2) one electronic file containing 30 graphs reflecting topics and petition
signatures, and (3) another electronic file containing 12 images showing Google Trends and topics in two columns and six rows. The participants will
be briefed on the data and the data mining tools used to create the visualizations and images, and will then be asked to examine all of
them. Any questions will be answered by the investigator(s). The participants will be given as long as necessary to complete the examination. After
they have finished, they will be prompted to answer the questionnaire, which includes the following questions:

1. What kind of analysis do you do as everyday practice?
2. How things get into the conversation in regards with legislative or policy agenda?
3. What do you think about using social media and petition sites for possible agenda for legislatures or more in general to establish policy?
4. What is your general perception of the relevance of these topics for current legislative and policy agenda?
5. What is your interpretation of these results?
6. How user friendly are these tools from your point of view and experience?
7. How useful do you think this tool and images would be for your work and practice?
8. What would you do to improve the tool and make it more helpful for your practice?
9. How well do you think that these images and analyses represent the interests of the general public?
10. What do you think about skills needed to be able to produce/apply these tools?
11. Do you feel comfortable about applying similar technologies in your work? What kind of skills would be needed to be able to do this?
12. Is there any other relevant topics you would like to discuss or any other thing you want to mention that was not covered in our questions?

References

Aitamurto, T. (2012). Crowdsourcing for democracy: A new era in policy-making. Parliament
of Finland.

Attard, J., Orlandi, F., Scerri, S., & Auer, S. (2015). A systematic review of open
government data initiatives. Government Information Quarterly, 32(4), 399–418.
https://doi.org/10.1016/j.giq.2015.07.006.

Bannister, F., & Connolly, R. (2014). ICT, public values and transformative government: A
framework and programme for research. Government Information Quarterly, 31(1),
119–128. https://doi.org/10.1016/j.giq.2013.06.002.

Batmanghelich, K., Saeedi, A., Narasimhan, K., & Gershman, S. (2016). Nonparametric
spherical topic modeling with word embeddings. Proceedings of the Conference.
Association for Computational Linguistics. Meeting, 2016 (pp. 537–542).
https://doi.org/10.18653/v1/P16-2087.

Bergh, C., & Benghiat, G. (2017). Analytics at Amazon speed: The new normal. Business
Intelligence Journal, 22(2), 46–54.

Bertot, J. C., Butler, B. S., & Travis, D. M. (2014). Local big data: The role of libraries in
building community data infrastructures. Proceedings of the 15th annual international
conference on digital government research (dg.o 2014) (pp. 17–23).
https://doi.org/10.1145/2612733.2612762.

Blei, D. M. (2012). Probabilistic topic models. Communications of the ACM, 55(4), 77–84.

Boyd-Graber, J., Mimno, D., & Newman, D. (2014). Care and feeding of topic models:
Problems, diagnostics, and improvements. In E. M. Airoldi, D. Blei, E. A. Erosheva, &
S. E. Fienberg (Eds.). Handbook of mixed membership models and their applications (pp.
225–255). Boca Raton, FL: CRC Press.

Cassi, L., Lahatte, A., Rafols, I., Sautier, P., & de Turckheim, É. (2017). Improving fitness:
Mapping research priorities against societal needs on obesity. Journal of Informetrics,
11(4), 1095–1113. https://doi.org/10.1016/j.joi.2017.09.010.

Chatfield, A. T., & Reddick, C. G. (2017). A longitudinal cross-sector analysis of open data
portal service capability: The case of Australian local governments. Government
Information Quarterly, 34(2), 231–243. https://doi.org/10.1016/j.giq.2017.02.004.

Cisco. (2013). The internet of everything for cities. Retrieved from
https://www.cisco.com/c/dam/en_us/solutions/industries/docs/gov/everything-for-cities.pdf.

Dawes, S. S., Pardo, T. A., & Cresswell, A. M. (2004). Designing electronic government
information access programs: A holistic approach. Government Information Quarterly,
21(1), 3–23.

Feng, S., & Boyd-Graber, J. (2019). What can AI do for me?: Evaluating machine learning
interpretations in cooperative play. Proceedings of the 24th international conference on
intelligent user interfaces - IUI '19 (pp. 229–239).
https://doi.org/10.1145/3301275.3302265.

Gascó-Hernández, M., Martin, E. G., Reggi, L., Pyo, S., & Luna-Reyes, L. F. (2018).
Promoting the use of open government data: Cases of training and engagement.
Government Information Quarterly, 35(2), 233–242.
https://doi.org/10.1016/j.giq.2018.01.003.

Graves, A., & Hendler, J. (2013). Visualization tools for open government data.
Proceedings of the 14th annual international conference on digital government research
(pp. 136–145).

Hagen, L. (2016). Topic modeling for e-petition analysis: Interpreting petitioners' policy
priorities (Ph.D.). United States – New York: State University of New York at Albany.

Hagen, L. (2018). Content analysis of e-petitions with topic modeling: How to train and
evaluate LDA models? Information Processing and Management, 54(6), 1292–1307.
https://doi.org/10.1016/j.ipm.2018.05.006.

Hagen, L., Harrison, T. M., & Dumas, C. L. (2018). Data analytics for policy informatics:
The case of E-petitioning. Policy analytics, modelling, and informatics (pp. 205–224).
Cham: Springer. https://doi.org/10.1007/978-3-319-61762-6_9.

Hagen, L., Uzuner, O., Kotfila, C., Harrison, T. M., & Lamanna, D. (2015). Understanding
citizens' direct policy suggestions to the federal government: A natural language
processing and topic modeling approach. 2015 48th Hawaii International Conference
on System Sciences (HICSS) (pp. 2134–2143).
https://doi.org/10.1109/HICSS.2015.257.

Howell, E., & Lang, J. (2017). Researching UX: User research. VIC Australia: Sitepoint Pty
Ltd.

Janssen, M., Charalabidis, Y., & Zuiderwijk, A. (2012). Benefits, adoption barriers and
myths of open data and open government. Information Systems Management, 29(4),
258–268. https://doi.org/10.1080/10580530.2012.716740.

Janssen, M., & Helbig, N. (2019). Innovating and changing the policy-cycle:
Policy-makers be prepared! Government Information Quarterly. In press.
https://doi.org/10.1016/j.giq.2015.11.009.

Krasnashchok, K., & Jouili, S. (2018). Improving topic quality by promoting named
entities in topic modeling. Proceedings of the 56th annual meeting of the Association for
Computational Linguistics: Vol. 2. Short Papers (pp. 247–253). Retrieved from
https://www.aclweb.org/anthology/P18-2040.

Kumar, V., Smith-Renner, A., Findlater, L., Seppi, K., & Boyd-Graber, J. (2019). Why
didn't you listen to me? Comparing user control of human-in-the-loop topic models.
ArXiv:1905.09864 [Cs]. Retrieved from http://arxiv.org/abs/1905.09864.

Lau, J. H., Baldwin, T., & Newman, D. (2013). On collocations and topic models. ACM
Transactions on Speech and Language Processing, 10(3), 10:1–10:14.
https://doi.org/10.1145/2483969.2483972.

Luna-Reyes, L. F. (2017). Opportunities and challenges for digital governance in a world
of digital participation. Information Polity, 22(2–3), 197–205.
https://doi.org/10.3233/IP-170408.

Luna-Reyes, L. F., & Gil-Garcia, J. R. (2014). Digital government transformation and
internet portals: The co-evolution of technology, organizations, and institutions.
Government Information Quarterly, 31(4), 545–555.
https://doi.org/10.1016/j.giq.2014.08.001.

Magalhaes, G., & Roseira, C. (2017). Open government data and the private sector: An
empirical view on business models and value creation. Government Information
Quarterly. https://doi.org/10.1016/j.giq.2017.08.004.

Marr, B. (2018, June 20). Comparing data visualization software: Here are the 7 best tools
for 2018. Forbes. Retrieved from
https://www.forbes.com/sites/bernardmarr/2018/06/20/comparing-data-visualization-software-here-are-the-7-best-tools-for-2018/.

Mergel, I., Kleibrink, A., & Sörvik, J. (2018). Open data outcomes: U.S. cities between
product and process innovation. Government Information Quarterly, 35(4), 622–632.
https://doi.org/10.1016/j.giq.2018.09.004.

Mimno, D. (2013). Package “mallet”. Retrieved October 9, 2015, from
https://cran.r-project.org/web/packages/mallet/mallet.pdf.

Najafabadi, M. M., & Luna-Reyes, L. F. (2017). Open government data ecosystems: A
closed-loop perspective. Proceedings of the 50th Hawaii international conference on
system science (HICSS-50) (pp. 2711–2720).

Nam, T. (2012). Citizens' attitudes toward Open Government and Government 2.0.
International Review of Administrative Sciences, 78(2), 346–368.
https://doi.org/10.1177/0020852312438783.

Nielsen, J. (2012). Usability 101: Introduction to usability. Nielsen Norman Group.
Retrieved November 20, 2018, from
http://www.nngroup.com/articles/usability-101-introduction-to-usability/.

Nielsen, J., & Molich, R. (1990). Heuristic evaluation of user interfaces. CHI '90
proceedings of the SIGCHI conference on human factors in computing systems.
https://doi.org/10.1145/97243.97281.

Norris, D. F., & Reddick, C. G. (2013). Local E-Government in the United States:
Transformation or incremental change? Public Administration Review, 73(1), 165–175.
https://doi.org/10.1111/j.1540-6210.2012.02647.x.

Oracle (2017). MySQL. Retrieved November 13, 2017, from https://www.mysql.com/.

Poucke, S. V., Zhang, Z., Schmitz, M., Vukicevic, M., Laenen, M. V., Celi, L. A., & Deyne,
C. D. (2016). Scalable predictive analysis in critically ill patients using a visual open
data analysis platform. PLoS One, 11(1), e0145791.
https://doi.org/10.1371/journal.pone.0145791.

Puron-Cid, G., Gil-Garcia, J. R., & Luna-Reyes, L. F. (2016). Opportunities and challenges
of policy informatics: Tackling complex problems through the combination of open
data, technology and analytics. International Journal of Public Administration in the
Digital Age, 3(2), 66–85.

rapidminer (2017). Lightning fast unified data science platform. Retrieved November 16,
2017, from https://rapidminer.com/products/.

Reddick, C. G., Chatfield, A. T., & Ojo, A. (2017). A social media text analytics framework
for double-loop learning for citizen-centric public services: A case study of a local
government Facebook use. Government Information Quarterly, 34(1), 110–125.
https://doi.org/10.1016/j.giq.2016.11.001.

Rickford, R. (2016). Black lives matter: Toward a modern practice of mass struggle. New
Labor Forum, 25(1), 34–42. https://doi.org/10.1177/1095796015620171.

Rubin, J. (1994). Handbook of usability testing: How to plan, design, and conduct effective
tests (1st ed.). Wiley.

Sahuguet, A., Krauss, J., Palacios, L., & Sangokoya, D. (2014). Open civic data: Of the
people, for the people, by the people. IEEE Data Engineering Bulletin, 37(4), 15–26.

Siegel, E. (2016). Predictive analytics: The power to predict who will click, buy, lie, or die (2nd
ed.). Wiley.

Sievert, C., & Shirley, K. E. (2014). LDAvis: A method for visualizing and interpreting
topics. Proceedings of the workshop on interactive language learning, visualization, and
interfaces (pp. 63–70).

Sivarajah, U., Weerakkody, V., Waller, P., Lee, H., Irani, Z., Choi, Y., & Glikman, Y.
(2016). The role of e-participation and open data in evidence-based policy decision
making in local government. Journal of Organizational Computing and Electronic
Commerce, 26(1–2), 64–79.

Stephens-Davidowitz, S. (2017). Everybody lies: Big data, new data, and what the internet
can tell us about who we really are. New York, NY: Dey Street Books.

The White House. (2011, September 20). Opening remarks by President Obama on open
government partnership. Retrieved May 16, 2015, from
https://www.whitehouse.gov/node/78625.

The White House (2015). The open government partnership: Third open government
National Action Plan for the United States of America. Retrieved from
http://www.whitehouse.gov/blog/2013/12/06/united-states-releases-its-second-open-government-national-action-plan.

The White House (2017). For developers: We the people API. (Retrieved November 3, 2017,
from /developers).

Toots, M., McBride, K., Kalvet, T., & Krimmer, R. (2017). Open data as enabler of public
service co-creation: Exploring the drivers and barriers. 2017 Conference for
E-Democracy and Open Government (CeDEM) (pp. 102–112).
https://doi.org/10.1109/CeDEM.2017.12.

Treacy, M., & O'Sullivan, J. (2010). e-Government and organisational transformation – A
perspective from the Property Registration Authority of Ireland. Presented at the 10th
European conference on e-government (ECEG 2010) (pp. 400–408).

Ubaldi, B. (2013). Open government data: Towards empirical analysis of open government
data initiatives. Paris: OECD Working Papers on Public Governance (22).

Walters, L. C., Aydelotte, J., & Miller, J. (2000). Putting more public in policy analysis.
Public Administration Review, 60(4), 349–359.
https://doi.org/10.1111/0033-3352.00097.

Yalçın, M. A., Elmqvist, N., & Bederson, B. B. (2016). Keshif: Out-of-the-box visual and
interactive data exploration environment - semantic scholar. In: Proceedings of the
IEEE VIS 2016 workshop on visualization in practice: Open source visualization and visual
analytics software (Retrieved from /paper/Keshif-Out-of-the-Box-Visual-and-
Interactive-Data-Yalçın-Elmqvist/4364b7bb4f731bef0c9f22067691fefa42d85c93).

Zuiderwijk, A., Helbig, N., Gil-Garcia, R. J., & Janssen, M. (2014). Special issue on
innovation through open data - a review of the state-of-the-art and an emerging
research agenda: Guest Editors' introduction. Journal of Theoretical & Applied Electronic
Commerce Research, 9(2), I–XIII.
https://doi.org/10.4067/S0718-18762014000200001.

Zuiderwijk, A., Janssen, M., Choenni, S., Meijer, R., & Alibaks, R. S. (2012).
Socio-technical impediments of open data. Electronic Journal of Electronic Government, 10(2),
156–172.

