06 April 2021

AperTO - Archivio Istituzionale Open Access dell'Università di Torino

Original Citation:

Leveraging social semantic components in executable environments for learning

Published version:

DOI:10.1111/exsy.12044

Availability:

This is the author's manuscript

This version is available http://hdl.handle.net/2318/143214 since 2016-06-29T12:17:04Z

Terms of use:

Open Access

Anyone can freely access the full text of works made available as "Open Access". Works made available under a
Creative Commons license can be used according to the terms and conditions of said license. Use of all other works
requires consent of the right holder (author or publisher) if not exempted from copyright protection by the applicable law.


Leveraging social semantic components in executable environments for learning

Rossana Damiano
Dipartimento di Informatica and CIRMA

Università di Torino - Italy
rossana@di.unito.it

Cristina Gena
Dipartimento di Informatica and CIRMA

Università di Torino - Italy
cgena@di.unito.it

Vincenzo Lombardo
Dipartimento di Informatica and CIRMA

Università di Torino - Italy
VRMMP - Torino - Italy

vincenzo@di.unito.it

Abstract

Learning can benefit from the modern web structure through the convergence of top-down encyclopedic institutional knowledge and bottom-up user-generated annotations. A promising approach to such convergence consists in leveraging the social functionalities of Web 3.0 executable environments through the recommendation of tags, mediated by lexical and semantic resources.

This paper addresses these issues through the design and evaluation of a tag recommendation system in a Web 3.0 portal, “150 Digit”. Designed for schools, “150 Digit” encourages students and teachers to interact with a set of four exhibitions on the historical and social aspects of the Italian unification process in a virtual environment. The web site displays the exhibits and their related documents, promoting the users’ active participation through tagging, voting on, and commenting on the exhibits. Tags become a way for students to create and explore new relations among the site contents, orthogonal to the institutional viewpoint. In this paper, we illustrate the recommendation strategy incorporated in “150 Digit”, which relies on a semantic middleware to mediate between the input expressed by the users through tags and the top-down institutional classification provided by the curators of the exhibitions. We then describe the evaluation process conducted in a real experimental setting, and discuss the evaluation results and their implications for learning environments.

Keywords: tag recommendation, Web 3.0, evaluation, learning
environments

1 Introduction

According to W.L. Hosch’s definition, Web 1.0 can be described, using an analogy to file system permissions, as “read-only”, Web 2.0 as “read-write” and Web 3.0 as “read-write-execute”.1 Following the Web 3.0 principle of executability, the “150 Digit” web portal (http://www.150digit.it) has been designed with the goal of creating a virtual environment where schools interact with the exhibitions that celebrate the 150th anniversary of the Unification of Italy. In 150 Digit, a 3D reconstruction allows users to visit the exhibitions and encourages them to be an active part of the site community by tagging, voting on, and commenting on the exhibits, and by uploading new contents. The site contents include both the exhibits, with their related documents (such as the curators’ notes), and user-generated contents, such as multimedia presentations created by teachers and students on the Unification of Italy.

1http://www.britannica.com/blogs/2007/07/web-30-the-dreamer-of-the-vine/

As a result of the user-centered methodology with which the site was designed, tagging emerged as a primary issue as early as the design phase, in the focus groups organized with the teachers involved in the project. Since the pioneering work by [Bateman et al. 2007], tags have been pointed out as an important resource in learning: “The information provided by tags provides insight on learner’s comprehension and activity” (p. 1). Since the site is primarily targeted at educational users, tags play a two-fold role in 150 Digit: on one side, the tagging activity, as stressed by the focus group participants, is part of the educational process and promotes the students’ linguistic reflection on the site contents; on the other, tags complement the institutional stance on the themes covered by the exhibitions, mirroring the users’ own understanding and letting new opinions emerge. Also, tagging fosters new correlations among the site contents and can be exploited by the users to navigate the site as an alternative to following the paths offered by the site’s information architecture. Finally, user preferences and tags are used to generate recommendations of contents and promote the exploration of the site in a “bottom-up” perspective.

In 150 Digit, the support provided to the users (and educational users in particular) to improve tagging is a recommender module, which was designed and developed to meet the project’s specific needs. Suggesting tags to users aims at overcoming the well-known trend according to which a site folksonomy stops or slows its growth after some time because the users keep reusing the same tags and no longer introduce new ones [Trant 2009]. The novelty of the approach developed for 150 Digit is that the tag recommender relies on the semantic description of the exhibitions provided by the curators. Basically, new tags are generated through lexical resources, which allow the recommender to expand the meaning of the existing tags, while irrelevant tags are filtered out by consulting the semantic description given by the curators. The rationale for this approach is to leverage the exhibitions’ institutional perspective to focus the generation of new tags and support the teachers’ work more effectively. At the same time, this strategy aims to avoid the emergence of a semantic gap between the institutional categorization of the contents and the folksonomy, keeping them aligned as the latter is expanded. Notice that this can be seen as a variation of the well-known “vocabulary problem” (first identified by [Furnas et al. 1987]), i.e. the lack of convergence of terms in user-generated vocabularies.

In this paper we describe and evaluate the tag recommendation system of 150 Digit as a method to improve the contribution of the social semantic components in executable environments designed for learning. The semantic layer of the site relies on a light ontology, WordNet Domains [Bentivogli et al. 2004], a taxonomy of domains originally developed to add semantic information to the meanings of the terms in WordNet [Miller 1995]. In 150 Digit, domains are used to categorize the exhibits and the user-contributed contents, providing a background description against which new tags are sought by the recommender. The recommendation of tags exploits the meaning relations encoded in the Italian version of WordNet, MultiWordNet [Pianta et al. 2002], to expand on the existing tags and propose new ones.

From March to June 2012, students and teachers visited the large
exhibition “Fare gli italiani” (“Making Italians”), where they at-
tended a post hoc laboratory in which they were asked to interact
with the “150 Digit” portal. These laboratories were the basis for
a thorough evaluation of the system. In this paper, we analyze the
results of the evaluation, assessing the impact of our approach on
the accuracy of the inserted tags, their quantity and typology. Given
the data recorded in log files (a sort of indirect observation), we also
analyze the users’ behavior in general and compare the folksonomy
generated through the system use with a baseline. The paper is
structured as follows: after surveying the related work (Section 2),
in Section 3 an overview of the “150 Digit” web portal is given, in
terms of its design, goals and functionalities, including a prelimi-
nary evaluation conducted on the prototype system. Section 4 gives
an evaluation of the portal through an experiment with real users.
Discussion and conclusions end the paper.

2 Related work

Since the advent of Web 2.0, tagging has attracted much interest among scholars and has been studied from many perspectives. In particular, we identify three main areas in the corpus of tag-related research. Tags have been studied with the goal of understanding the behavior and interests of users, letting different tagging styles emerge; more recently, they have been studied as a resource, in an attempt to extract ontologies from user-generated folksonomies; finally, promising attempts have been made to exploit tags in order to provide personalized recommendations and services to users, ranging from tag recommendation to content recommendation. In e-learning, tags can be viewed as a means to gain insight into students’ learning [Bateman et al. 2007]; from them, information can be gathered to build user profiles, aimed at making the learning environment adaptive [Ferreira-Satler et al. 2011].

Several attempts have been made to interpret sets of user tags. Users can tag with different purposes: to categorize or describe a resource for future retrieval, or to give an opinion [O’ Donovan 2009]. Concerning the action of tagging in the artwork domain, the results of the experiments in the Steve.museum project [Trant 2006] show that, when users tag artworks, professional users and non-expert users insert complementary information: non-expert users insert information on the subject of the artwork (such as, in the case of a painting, the people and place depicted, the ideas it suggests, the emotions, etc.), while experts provide only “external” information regarding the authors, the historical period, the materials and so on. Moreover, it emerged that users are generally keen on leaving a trace of what they think and feel.

Other works envisage the complementarity of top-down classification and user-driven classification. [Szomszor et al. 2007] suggest that the best solution for the accessibility of resources would be to integrate the users’ subjective perspective with traditional classification systems. This could exploit the benefits of both approaches while limiting their respective problems. This idea also inspired the work on “150 Digit”, which integrates the curators’ knowledge encoded in the semantic component and the users’ perspective expressed by tags.

Regarding user participation in both content creation and tag insertion, Nonnecke et al. [Nonnecke et al. 2004] and Preece et al. [Preece et al. 2004] identify the roles of “lurkers” and “posters”, where lurkers are members of online communities who read, but do not post, and posters are the few members who post content. These results have been recently confirmed by Gena et al. [Gena et al. Accepted for publication.]: most participating users contribute through small actions (clicking on a tag for insertion, clicking on like/dislike), and just a few of them generate bottom-up contents. Analyzing the users’ tagging activity, they reported that 84% of user tags were the ones proposed by the system and just clicked on by the users, while the remaining 16% were inserted by users as free text. This demonstrates that, when proposed tags are available, users tend to select them instead of inserting new ones, thus providing support for the use of tag recommenders. Similar findings are reported in [Ames and Naaman 2007]: in the same domain (photos), users tag more in a system that recommends tags (ZoneTag) than in a system that does not offer tag recommendations (Flickr).

Concerning the recommendation of tags, most approaches rely on statistical techniques (PageRank, evolved into FolkRank [Hotho et al. 2006]) to learn correlations among tags from their co-occurrence in a folksonomy, and use this information to suggest suitable tags for a resource. In our approach, the tag recommendation mechanism relies on the meaning relations encoded in external resources, i.e., WordNet [Miller 1995] and WordNet Domains [Bentivogli et al. 2004], following the approach proposed by [Xu et al. 2006]. Similarly, [Cantador et al. 2011] proposed a method that uses the YAGO ontology (containing information from WordNet and Wikipedia) for filtering and classifying tags into a set of purpose-oriented categories (content-based, context-based, subjective, and organizational). Their results show that content-based and context-based tags are considered superior to subjective and organizational tags in helping a tag recommender component. They found that the transformation of tags into ontology concepts makes it possible to infer semantic relations among concepts for recommendation purposes.

In content-based recommenders, the use of WordNet to improve recommendations is not new. [Degemmis et al. 2007] transform the classic keyword-based profiles into semantic user profiles using WordNet and found that semantic user profiles produce more accurate recommendations. [Laniado et al. 2007] propose integrating WordNet in the navigation interface of a folksonomy. In particular, using WordNet to build a hierarchy (top-down classification) of related tags (where relatedness is calculated according to well-known similarity metrics in WordNet) can help users navigate and find related resources in del.icio.us. This approach is quite similar to our tag recommendation strategy, in which related tags are suggested on the basis of the hierarchical relations encoded in WordNet. Finally, [Djuana et al. 2011] have found that a backbone ontology, such as the 43 categories incorporated in WordNet, may improve tag recommendation. They automatically learn the ontology from user tags, and use this ontology to improve recommendations by re-ranking the proposed tags on the basis of a collaborative filtering algorithm. They have found that the re-ranking procedure improves precision and recall. We currently do not consider WordNet categories in our recommendation process, though an automatic learning component could be added in order to perform this task.

A strong focus on semantics characterizes content-based recommendation systems [Pazzani and Billsus 2007], as is the case of 150 Digit. An essential component of content recommenders is a system to describe the items that may be recommended, and this description very often relies on an ontology. For instance, in e-learning, the Protus 2.0 tutoring system [Vesin et al. 2012] is a content-based recommender that uses an ontology for knowledge representation and inference engines for reasoning. Starting from a mining approach combined with fuzzy logic techniques, [Tang et al. 2012] generate a Personal Web Usage Ontology (written in OWL), which enables personalized recommendation of web resources. Among content-based recommenders in the artwork domain, we mention the CHIP artwork recommender.2 Similarly to 150 Digit, where most content items are artworks, one of the main goals of CHIP is to demonstrate how Semantic Web and recommendation technologies can be deployed together to improve the access to digital museum collections [Wang et al. 2009]. Results from a user test demonstrate that users prefer content-based recommendations that leverage artwork features, and the authors conclude that domain-specific terms are generally more useful for content-based techniques than generic ones. This finding is in line with our decision to use domain knowledge, encoded in the semantic categorization of the contents provided by the curators, to improve the recommendation of tags.

3 System Overview

The goal of 150 Digit is to provide an open environment where students can visit the exhibitions online and access a wide repository of multimedia items related to the Unification of Italy. The site contains both institutional contents, taken from the exhibitions, and user-generated contents. Contents can be commented on and tagged by users, thus generating new connections among them. Tags are exploited to group contents on the fly, through a dedicated tag-based search tool; tags and preferences are exploited to recommend contents to the users. By doing so, the site integrates the top-down perspective reflected in the institutional categorisation of content with the bottom-up perspective induced by the users’ activity.

3.1 Functionalities and Design

The project encompasses three user profiles: the editor, who is in charge of editing and publishing the institutional contents provided by the exhibition curators, and of validating the contents uploaded by the students; the class profile (students and teachers), who can visit the exhibitions, add tags and comments to the exhibits, vote on them, and upload new items; and the registered user, who does not belong to a class but can visit the exhibitions, vote on and tag the exhibits, and create her/his own playlist in a private area. Dedicated tools like the “virtual classroom” (a separate space, shared by a group of students under the guidance of one or more teachers, for commenting on site contents) are aimed at improving the quality of the interaction with the site for educational users (for a full description of the system, see [Damiano et al. 2011]). Given these profiles, the portal has three main functions: content management, content editing and navigation.

• A content management system allows the site editors to cre-
ate the site main sections (in 150 Digit, they consist of exhi-
bitions) and categories within these sections, to add contents
to the categories and describe them through tags and semantic
labels. These labels constitute the semantic layer of the site
(as described in Section 3.2).

• A simple content editor lets the educational users edit and publish contents in the existing exhibitions and categories, describing them through tags. During the tagging process, users can ask the system to recommend tags. Suggesting tags to users is a way to counter the trend according to which a site folksonomy slows its growth because the users stop introducing new tags [Trant 2009]. In 150 Digit, given the educational goals of the site, tag recommendation also serves the purpose of supporting the teachers’ work on the linguistic description of the exhibits and documents contained in the site. Web 2.0 functionalities, such as tagging and commenting, are also available as part of the site navigation.

• Site navigation, open to non-registered users, is the same for the three profiles. In addition to the standard navigation enforced by the site’s information architecture, users can navigate the contents by following content recommendations or by using the tag-based search tool. In a didactic perspective, exploring the site through the recommendations provided by the site or through the tag-based search can be seen as a way to support the teachers’ work in relating the items of the exhibitions into alternative, coherent narratives.

2http://www.chip-project.org/

The interaction design of 150 Digit relies on the ‘visit’ metaphor to structure the information. The user can visit the four exhibitions in a standard hypertext-based format, by following the connections among the items induced by tags through the tag-based search, or in a 3D modality (see Figure 1). The portal features a plugin, tested on major browsers, to navigate the exhibitions in a 3D environment, with the aim of making the access to the exhibits more compelling. This approach is borrowed from entertainment (videogames in particular) in order to offer students an immersive, non-textual access modality they are familiar with.

• The standard, hypertext-based navigation follows a classical
top-down approach from general categories to detailed infor-
mation. The information architecture encompasses three lay-
ers, namely exhibitions, categories and items; at item level,
the user can move across items by following recommenda-
tions.

• The 3D navigation contains a set of navigation paths that mir-
ror those experienced by the visitors in the real exhibitions.
The use of the same structure in both the standard and the 3D
visit is aimed at providing guidance to users in the 3D space.
The 3D visit relies on the paradigm of constrained spatial nav-
igation [Burigat and Chittaro 2007], i.e., it is constrained to
some fixed positions, in sequential order, where the visitor is
“transported” through a stepwise flight simulation (briefly de-
scribed in [Damiano et al. 2012]).

• The tag-based navigation provides a bottom-up approach to
site contents. In this modality, users can take advantage of
the search functionalities (by keyword, artwork’s title, author,
tag, etc.) and sort the items by number of views, users’ pref-
erences, and so on.

Users can switch from one modality to another (for instance from 3D to hypertext, or from hypertext to tag-based navigation) at any time during navigation, and remain in the same (virtual) location (e.g. the same category or item) after the switch. This approach is intended to stress the parallelism between the various navigation modalities, taking the users’ need for orientation into account and giving them the possibility to easily switch among multi-modal information and different viewpoints of navigation.

150 Digit was developed by a multi-disciplinary team, involving AI, computer graphics, interaction design and media experts, with the participation of the target users in all the phases of the project, from design to prototyping, according to a user-centered, iterative design methodology. The resulting portal integrates different components (social, didactic, informative) in a seamless interface that overcomes the challenges posed by the software integration issues and the content production process. The Web 3.0 portal interface design was inspired by usability heuristics and guidelines, as well as by information architecture principles. Moreover, the web pages were created in compliance with the Italian accessibility law (Stanca Act). A usability expert supervised the interface design together with the web designers, and reported the heuristics and guidelines that guided the design decision process.


Figure 1: The visit modalities in 150 Digit. Left, standard hypertext; center, 3D visit; right, tag–based visit.

Different types of evaluation were carried out by the project team at different stages of development. In the system design stage, and in particular during requirements elicitation, a focus group of 5 users (4 males and 1 female, aged 40-62) was selected. The participants were shown a set of 15 scenario-based static interfaces and discussed the main system functionalities, labeling and layout with the designers for 3 hours. In general, this group of teachers highlighted the need for textual content to be associated with the exhibits, and for dedicated tools for content creation. They appreciated the proposed interfaces and functionalities, and considered them valid tools for classroom work and students’ involvement. The main findings that emerged from the focus group led to changes in both labeling (e.g., “favourites” instead of “playlist”) and functionalities. In particular, some of the existing functionalities were modified (for example, teachers suggested showing tag recommendations only on request), and new ones were added: mainly, the possibility of creating a virtual class where students can discuss the exhibits and insert comments that are visible only within this class.

A preliminary evaluation was conducted on a static prototype, which consisted of interface screenshots. This evaluation aimed at verifying the navigation issues (such as breadcrumbs, home button, etc.) in the graphical interface and the users’ reception of the social (tagging) and semantic (tag recommendation) functionalities. Five users were tested: teachers, 3 males and 2 females, aged 25-55. The test on the static interface consisted of showing screenshots to the users and discussing the solutions with respect to the aforementioned functionalities, while the test of the tag recommendation module consisted of the accomplishment of a set of tasks, such as tagging or voting on an item. The issues that emerged from this evaluation concerned the understanding of the social aspects of the site, such as the role of tags in the use of the contents. So, in the redesign, we added tooltips that explain the meaning of the social functions, as well as the possibility of enlarging the pictures that illustrate the contents.

3.2 Semantic Framework and Tag Recommendation

The need to support prototyping, development and production within a tight time schedule determined the choice to rely on ‘light’ semantic tools to support the portal recommendation functions. The system semantics rely on WordNet Domains [Bentivogli et al. 2004], a hierarchy of domain labels (169 labels) integrated in MultiWordNet. While most ontologies require expert knowledge to understand their structure (consider, for example, foundational ontologies like SUMO [Niles and Pease 2003] or DOLCE [Gangemi et al. 2002]), WordNet Domains lends itself to use by non-expert users, providing an off-the-shelf, portable middleware on top of which semantic tools can be built.

Semantic Categorization of Contents. In 150 Digit, the recommendations provided by the system rely on a semantic categorization of the contents, with the aim of integrating the social component with the institutional perspective conveyed by the curators in the conceptual organization of the exhibitions. For each exhibition in 150 Digit, the curators associated each category with the domains that, according to them, best describe the category coverage in semantic terms. For example, the “Timeline” category was associated with the “Time Period” domain, and the “Mass media” category with multiple domains: “Linguistics”, “Photography”, “Telecommunication”, “Cinema”, “Radio”, “Telephony” and “Tv”. The underlying assumption is that semantic tools (and taxonomies in particular) can provide an effective “external grounding” to the relations among tags, as exemplified by the work of [Markines et al. 2009], who employ taxonomies (such as WordNet) to measure the reliability of the emergent semantic relations among tags in folksonomies, thus providing a sound foundation to the Social Semantic approach.
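
As a minimal illustration of this categorization, the association between categories and domains can be pictured as a simple mapping. The sketch below is only an assumption about how such data could be represented: the two entries come from the examples above, while the names `CATEGORY_DOMAINS` and `domains_for` are hypothetical and do not describe the actual storage used by the portal.

```python
from typing import Dict, FrozenSet

# Category -> WordNet Domains labels, as chosen by the curators.
# Only the two examples quoted in the text are listed here.
CATEGORY_DOMAINS: Dict[str, FrozenSet[str]] = {
    "Timeline": frozenset({"Time Period"}),
    "Mass media": frozenset({
        "Linguistics", "Photography", "Telecommunication",
        "Cinema", "Radio", "Telephony", "Tv",
    }),
}


def domains_for(category: str) -> FrozenSet[str]:
    """Return the domain labels of a category (empty if it is unknown)."""
    return CATEGORY_DOMAINS.get(category, frozenset())


if __name__ == "__main__":
    print(sorted(domains_for("Mass media")))
```

A category can carry one domain or several, which is why the values are sets rather than single labels.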

Tag Expansion and Disambiguation. Unlike standard approaches, which exploit statistical techniques to recommend tags (as in the case of PageRank, re-cast into FolkRank [Hotho et al. 2006]), the tag recommendation mechanism in 150 Digit consists of a constrained expansion of the meaning of existing tags, based on the semantic relations among the lexical items incorporated in WordNet [Miller 1995].

In WordNet, words are gathered into sets of synonyms (i.e. words with the same meaning), called synsets; synsets are linked according to meaning relations, such as hyperonymy (more general meaning) and hyponymy (more specific meaning). MultiWordNet includes the Italian language and is aligned to WordNet 1.6. The basic expansion relies on the synonymy relations among lexical items encoded in synsets:

1. For each user tag, get the corresponding lexical entry from the
lemmatizer;

2. Given the lemma, get the synsets from MultiWordNet in
which it appears;

3. For each synset found, get all the lemmas contained;

4. Merge the obtained synsets by deleting the repeated entries.

Further expansion relies on querying MultiWordNet for related
synsets based on hyperonymy and hyponymy relations at step 3.
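
The four steps above, together with the optional extension to hyperonyms and hyponyms, can be sketched as follows. This is a minimal sketch under stated assumptions, not the portal's PHP implementation: the dictionaries `LEMMAS`, `SYNSETS`, `HYPERNYMS` and `HYPONYMS` are hypothetical toy data standing in for the lemmatizer and for MultiWordNet, and removing the input lemma from the result is our own choice.

```python
from typing import Dict, FrozenSet, List, Set

# Hypothetical toy data standing in for the lemmatizer and MultiWordNet.
LEMMAS: Dict[str, str] = {"folle": "folla"}
SYNSETS: Dict[str, FrozenSet[str]] = {
    "s1": frozenset({"folla", "massa", "moltitudine"}),
    "s2": frozenset({"folla", "calca"}),
    "s3": frozenset({"gruppo", "insieme"}),
    "s4": frozenset({"esercito"}),
}
HYPERNYMS: Dict[str, List[str]] = {"s1": ["s3"]}  # more general synsets
HYPONYMS: Dict[str, List[str]] = {"s1": ["s4"]}   # more specific synsets


def lemmatize(tag: str) -> str:
    """Step 1: map a tag to its lexical entry (identity if unknown)."""
    normalized = tag.strip().lower()
    return LEMMAS.get(normalized, normalized)


def synsets_of(lemma: str) -> List[str]:
    """Step 2: ids of all synsets in which the lemma appears."""
    return [sid for sid, words in SYNSETS.items() if lemma in words]


def expand(tag: str, use_related: bool = False) -> Set[str]:
    """Return candidate tags for `tag` via synonymy (and related synsets)."""
    lemma = lemmatize(tag)
    synset_ids = synsets_of(lemma)
    if use_related:
        # Further expansion: add hyperonym and hyponym synsets.
        related: List[str] = []
        for sid in synset_ids:
            related += HYPERNYMS.get(sid, []) + HYPONYMS.get(sid, [])
        synset_ids = synset_ids + related
    candidates: Set[str] = set()
    for sid in synset_ids:
        # Steps 3 and 4: collect the lemmas; the set union drops repeats.
        candidates |= SYNSETS[sid]
    candidates.discard(lemma)  # assumption: do not suggest the input itself
    return candidates


if __name__ == "__main__":
    print(sorted(expand("folle")))
    print(sorted(expand("folle", use_related=True)))
```

Using a set for the merged result makes step 4 implicit, since repeated entries across synsets collapse automatically.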

Figure 2: Screenshots of the tag recommendation interface. The slider allows the user to regulate the quantity of recommended tags (from left, “Pochi-Few” tags, to right, “Molti-Many” tags). Recommended tags (here, given the input tag “folla”, i.e., “crowd”) are arranged in a tag cloud. The terms referring to “crowd” include “mass”, “army”, “bunch”, “swarm”, etc.

The simple expansion mechanism described above, however, does not guarantee that the recommended tags are actually related to the user tags, due to the polysemy of natural language. In other words, a tag may correspond to more than one lexical entry. To overcome this difficulty, two disambiguation strategies are incorporated in the tag recommender. The disambiguation relies both on the ‘syntactic’ knowledge provided by the context of other tags and on the ‘semantic’ knowledge contained in the semantic layer.

The ‘syntactic’ disambiguation relies on the context of the other tags associated with the item: each proposed tag is included in the recommended tags if it co-occurs in the same synset with one of the context tags; otherwise it is discarded. For example, consider the situation in which the tags associated with an exhibit are “emigrants” (emigrant), “pescatore” (fisherman) and “giovane” (youth). Following this strategy, a tag which is a synonym of one of the three tags (for example, “ragazzo”, i.e., young man) will be recommended, while a tag which is not a synonym of any of the three tags (such as “garzone”, i.e. shop boy) will not be recommended.

The ‘semantic’ disambiguation, inspired by [Magnini et al. 2002], relies on the domains attached to the categories. Each exhibit inherits the domain labels associated with the category it belongs to (each exhibit belongs to only one category, in parallel with the actual arrangement of the exhibition) and with the exhibition itself. These domains provide the semantic context against which the proposed tags are filtered to eliminate the irrelevant ones. For example, consider the Italian word “quadro”. This word has two different meanings, “painting” and “control panel”, the first one associated with the “Art” domain in MultiWordNet, the second one associated with the “Electronics” domain. If the disambiguation occurs in the category “Painters and patriots” (associated, among others, with the “Art” domain), only the first meaning of the word “quadro” is considered, while the second one (with its synonyms and other related terms) is discarded because its domains do not match the category domains.
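
The two disambiguation strategies can be illustrated with a short sketch that mirrors the “ragazzo”/“garzone” and “quadro” examples. It is an illustration under assumptions rather than the portal's code: the `SYNSETS` table is hypothetical toy data (only the “Art” and “Electronics” labels for “quadro” come from the text), and the function names are ours.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

# Hypothetical toy lexicon: synset id -> (lemmas, domain labels).
SYNSETS: Dict[str, Tuple[FrozenSet[str], FrozenSet[str]]] = {
    "quadro#1": (frozenset({"quadro", "dipinto", "pittura"}),
                 frozenset({"Art"})),
    "quadro#2": (frozenset({"quadro", "pannello"}),
                 frozenset({"Electronics"})),
    "giovane#1": (frozenset({"giovane", "ragazzo"}),
                  frozenset({"Person"})),
    "garzone#1": (frozenset({"garzone", "fattorino"}),
                  frozenset({"Commerce"})),
}


def syntactic_filter(candidates: Iterable[str],
                     context_tags: Iterable[str]) -> Set[str]:
    """Keep candidates that share at least one synset with a context tag."""
    context = set(context_tags)
    kept: Set[str] = set()
    for candidate in candidates:
        for lemmas, _domains in SYNSETS.values():
            if candidate in lemmas and lemmas & context:
                kept.add(candidate)
                break
    return kept


def semantic_filter(lemma: str, item_domains: Iterable[str]) -> Set[str]:
    """Expand `lemma` using only the senses whose domains match the item's."""
    allowed = set(item_domains)
    kept: Set[str] = set()
    for lemmas, domains in SYNSETS.values():
        if lemma in lemmas and domains & allowed:
            kept |= lemmas
    kept.discard(lemma)
    return kept


if __name__ == "__main__":
    context = ["emigrants", "pescatore", "giovane"]
    print(syntactic_filter(["ragazzo", "garzone"], context))
    print(semantic_filter("quadro", ["Art"]))
```

In this sketch the item domains passed to `semantic_filter` are assumed to be the union of the labels inherited from the category and from the exhibition.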

Interactive Tag Recommendation. In order to let the user control the combination of the expansion and disambiguation techniques described above, the recommendation of tags is accomplished in an interactive fashion (see Fig. 2). If the user enters one or more tags in the system, an auto-completion function shows the possible words given the letters inserted so far; the user can then ask the system to propose new, related tags. If no tags have been inserted by the user, the recommendation takes as input the tags that are already associated with the current item (if there are none, the recommendation cannot be made). The amount of recommended tags is regulated by a slider: the user can move the slider from the “Few tags” position (the starting position) to the “Many tags” position, through intermediate positions. Each position corresponds to a different combination of tag expansion and filtering. Figure 2 shows how the cloud of recommended tags grows as the user moves the slider from left (“Few tags”) to right (“Many tags”), with two intermediate positions between the initial recommendation and the highest expansion of the user-inserted tag.

Although the interface allows the user to control the tag expan-
sion mechanism, the presence of hyponyms and hyperonyms may
still disorientate them, since their introduction in the set of recom-
mended tags may not be obvious, especially the first time the sys-
tem is used. In order to overcome this problem, a tag cloud presents
the recommended tags, so as to alert the user of the possible pres-
ence of unexpected tags. As the user moves the slider, the tag cloud
grows or shrinks, and the user can accept one or more of the rec-
ommended tags by clicking on them. In the suggested tags cloud,
the font size of each tag is given by a combination of two factors:
tag frequency in the folksonomy and in the language use. The use
of word frequency in language use (taken from a frequency lexi-
con, “Corpus e Lessico di Frequenza dell’Italiano Scritto”, CoLFIS
[Laudanna et al. 1995]) has the function of making more unusual
terms less visible in the cloud.
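
The text does not state how the two frequencies are combined. The following is a minimal sketch under the assumption of a weighted average of normalized frequencies; the weight, the pixel range and the function name are all hypothetical.

```python
def font_size(folk_freq, lang_freq, max_folk, max_lang,
              min_px=10, max_px=28, weight=0.5):
    """Map two frequencies to a font size (assumed linear combination).

    folk_freq: occurrences of the tag in the folksonomy
    lang_freq: frequency of the word in a frequency lexicon (e.g. CoLFIS)
    weight:    hypothetical share given to the folksonomy frequency
    """
    folk = folk_freq / max_folk if max_folk else 0.0
    lang = lang_freq / max_lang if max_lang else 0.0
    score = weight * folk + (1 - weight) * lang
    return round(min_px + (max_px - min_px) * score)


# With equal folksonomy frequency, the rarer word is drawn smaller.
print(font_size(5, 1, 10, 100), font_size(5, 90, 10, 100))
```

Any monotone combination would have the same qualitative effect: words that are unusual in the language are rendered smaller and are therefore less prominent in the cloud.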

Recommender Architecture. The architecture of the tag recom-
mendation system includes the following components:

• Lemmatizer: performs the morphological analysis of the user
tag, returning its uninflected form, which is needed to access the
lexical knowledge. For example, “persone” (people), the plural
form of “persona” (person), is converted into the singular
form. Since most tags are nouns, we chose to consider only
the plural-to-singular conversion. The latter is achieved by
using a database of forms, implemented in MySQL.

• Expansion Module: written in PHP, implements the expansion
of the user tags along the semantic relations incorporated in
MultiWordNet, as described above. Again, MultiWordNet is
stored in a MySQL database and is accessed through a set of PHP
APIs.

• Disambiguation Module: implements the context–based and
the semantic–based disambiguation strategies. This module
interacts with the site CMS to get the set of tags that have al-
ready been added to the item and the domain labels that are
associated with it.

• Tag Cloud Generator: this module determines the size of the
tags in the generated cloud based on the frequency of tags in
the folksonomy and in the lexicon.
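
As an illustration of the first component, the plural-to-singular conversion can be thought of as a lookup in a table of forms. The sketch below uses an in-memory dictionary in place of the database; the table contents and the function name are assumptions made for the example.

```python
# Hypothetical excerpt of a table of forms (inflected form -> lemma).
FORMS = {
    "persone": "persona",
    "quadri": "quadro",
}


def lemmatize(tag):
    """Return the singular form of a tag if it is listed, else the tag itself."""
    word = tag.strip().lower()
    return FORMS.get(word, word)


print(lemmatize("Persone"))  # looked up in the table of forms
print(lemmatize("arte"))     # not listed: returned unchanged
```

Returning the input unchanged when no form is found is a design choice of this sketch; it lets tags that are already singular pass straight through to the lexical lookup.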

Item recommendation.

The content recommendation function relies on two complementary
approaches: a collaborative filtering approach [Schafer et al. 2007]
and a semantic approach. So, the user is presented with two sets of


Figure 3: The semantic architecture of the web 3.0 portal.

recommendations, one of which is based on the preferences given
by the other users, and the other is based on the tags added to the
items.

The semantic–based recommendation selects the items to recom-
mend based on the shared tags with the current item. Items are
ranked according to the number of tags they have in common with
the given item. Items with the same ranking are re-ranked according
to the category to which they belong: items from the same category
(and the same exhibitions) as the current item are preferred.
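
The ranking just described can be sketched as a sort on two keys. The item representation (dictionaries with `id`, `tags`, `category` and `exhibition` fields) is an assumption of this example, as is the choice to drop items that share no tag with the current one.

```python
def rank_by_shared_tags(current, candidates):
    """Rank candidate items by the number of tags shared with `current`.

    Ties are broken in favour of items from the same category and
    exhibition as the current item.
    """
    scored = []
    for item in candidates:
        if item["id"] == current["id"]:
            continue
        shared = len(current["tags"] & item["tags"])
        if shared == 0:
            continue  # assumption: items without shared tags are not recommended
        same_place = (item["category"] == current["category"]
                      and item["exhibition"] == current["exhibition"])
        scored.append((shared, same_place, item["id"]))
    # More shared tags first; within a tie, same category/exhibition first.
    scored.sort(key=lambda s: (-s[0], not s[1]))
    return [item_id for _, _, item_id in scored]
```

For instance, with a current item tagged “history” and “italy”, an item sharing both tags is ranked first, and two items sharing one tag each are ordered so that the one from the same category comes before the other.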

The social recommendation is based on the preferences expressed
by the community of the users, and is inspired by the technique of
collaborative filtering [Xu et al. 2006; Sarwar et al. 2001]:

1. Given the current content, select its highest vote;

2. Select all the users who have given the same vote to that item;

3. For each of these users, select the items to which the user has
given the same (or higher) vote; if the set is empty, set the
vote to vote – 1;

4. Rank the selected items by their highest votes;

5. Select the first n contents.
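
The steps above leave some details open, so the following sketch fixes them by assumption: votes are stored as a `user -> {item: vote}` mapping, the "set" of step 3 is the union of the items selected over all the matching users, the vote is lowered until something is found or a minimum vote is reached, and ties are broken alphabetically. None of these choices is stated in the text.

```python
def social_recommendations(votes, current, n=5, min_vote=1):
    """Preference-based recommendation following the five steps above."""
    received = [v[current] for v in votes.values() if current in v]
    if not received:
        return []
    top = max(received)                                    # step 1
    peers = [u for u, v in votes.items()
             if v.get(current) == top]                     # step 2
    threshold = top
    best = {}
    while threshold >= min_vote and not best:              # step 3
        for user in peers:
            for item, vote in votes[user].items():
                if item != current and vote >= threshold:
                    best[item] = max(best.get(item, 0), vote)
        threshold -= 1                                     # "vote - 1"
    ranked = sorted(best, key=lambda i: (-best[i], i))     # step 4
    return ranked[:n]                                      # step 5
```

With votes `{"u1": {"x": 5, "b": 3}, "u2": {"x": 5, "c": 4}}` and current item `"x"`, no other item reaches the vote 5, so the threshold is lowered to 4 and `"c"` is recommended.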

The user is presented with the two sets of recommendations (tag–
based and preference–based); the difference between the two is
communicated by different labels, “150 Digit recommends” and
“Other schools recommend” respectively. In case the same item
appears in the two sets, the duplication is eliminated.

3.3 Preliminary Evaluation

Given the logs of the first six months of publication of the web
site, we conducted a preliminary evaluation of the users’ acceptance
of the site functionalities, and of the recommender system in
particular. Only front-end users were considered for the social
functionalities (content generation and tagging), while back-office
users were also taken into consideration for the semantic analysis
of tags, since they benefit from this kind of recommendation: they
are requested to tag exhibits as part of the publication process.

During the first six months, 347 users logged onto the site, 149
(42.93%) active teachers, 199 (57.35%) regular visitors. Of the
regular visitors, 61 users (31%) were associated with classes. It is
important to note that teachers and classes were explicitly contacted
by the committee in order to promote the portal. These teachers/classes
were randomly selected from a set of teachers/classes
who regularly participate in trials organized by the Ministry of Education.
This evaluation, conducted for prototype refinement and
experiment design, was split into two parts: the first concerning the
generic Web 2.0, i.e. social, functionalities and the second relating
to the Web 3.0, i.e. executable, functionalities.

Web 2.0 functionalities.

The users’ behavior in relation to social functionalities can be ana-
lyzed in terms of:

• their participation in content creation,

• their tagging activity,

• the quantity and the typology of inserted tags.

With regard to the uploading of user-generated content, 11 virtual
classes of the 51 registered classes (21.57%) inserted new contents
(a total of 29 new contents, while the institutional contents are 271).
In detail, 3 classes with the same teacher inserted more than half of the
contents (51.72%), one class inserted 13.79% of the contents and another
one inserted 10.34% of the contents. Thus 5 classes out of 11 (45.45%)
inserted 76% of the contents.

In total 404 tags (duplicates included), either freely inserted by
users or selected among the tags suggested by the system, have
been collected since the beginning of the experimentation. More
specifically 297 tags (73.51%) were proposed by the system and
just clicked on, while the remaining 107 (26.49%) were inserted
by users in free text. 28 users out of 347 (8.07%) inserted tags.
Of these 17 were teachers working with their virtual classes (61%)
and 11 were regular visitors (39%). In particular a teacher using
the site both as a regular visitor and with her 5 classes inserted al-
most half the tags (201 tags, 49.75%). Note that this teacher, and
her classes, were the same that inserted more than half the con-
tents. Another teacher both as visitor and with her class inserted
16.83% of tags (namely 68 tags), while another class created an
above average number of tags (31 tags, 7.67%). The remaining
classes (31.71%) inserted an average of 5.9 tags per class, while
regular visitors (32.14%) inserted an average of 5 tags per user. The
low user participation in both content creation and tag insertion con-
firms the results of [Nonnecke et al. 2004] and [Preece et al. 2004]
and replicates the dichotomy between “lurkers” and “posters” men-
tioned above.

The tagged contents did not receive the same number of tags across
exhibitions (see Table 1). The frequency of tags with respect to
exhibitions/sections needs to be balanced with the number of contents
present in each exhibition/section. In general, the number of tags
is proportional to the number of content items in the exhibitions,
with two exceptions: “La bella Italia” and the “Extra contents”
sections. “La bella Italia” received the least user attention in
terms of tags, despite its considerable number of contents; the “Extra
contents” section, i.e. the section containing school-generated
contents that are not strictly related to the main exhibitions, was
particularly successful. This success is not surprising, as the
contents generated by classes receive more attention from the classes
themselves. Moreover, the insertion of a new content implies the
insertion of tags.

It is interesting to compare the number of tags received by each ex-
hibition with the number of visits of the real and virtual exhibitions.
In the real world, the exhibition “Fare gli italiani” had the highest
number of visitors, followed by “La bella Italia”, “Stazione futuro”,
and “Il futuro nelle mani”. As for the number of visits received by the
exhibitions on the web site, “Fare gli italiani” is still the most visited
exhibition, followed by “La bella Italia”, “Il futuro nelle mani”, and
“Stazione futuro” (the last two being almost on a par). While the


Figure 4: Content recommendation in 150 Digit. On the left, the tag–based recommendations (“The system recommends you”); on the
right, the preference–based recommendations (“Schools recommend you”).

Exhibition/Section                      Number of tags      Number of contents
“Fare gli italiani”                     179 tags (44.31%)   135 (45%)
“Extra contents”                        70 tags (17.33%)    28 (9.33%)
“Il futuro nelle mani”                  37 tags (9.16%)     48 (16%)
“Stazione futuro”                       33 tags (7.92%)     30 (10%)
“La bella Italia”                       16 tags (3.96%)     38 (12.6%)
“The places (of current exhibitions)”   16 tags (3.96%)     9 (3%)

Table 1: The most tagged exhibitions/sections

trend regarding the former exhibitions has also been confirmed by
the 150 Digit taggers’ activity, the latter ones, in particular “La bella
Italia”, reveal a much lower number of tags with respect to their
virtual visits. An explanation could be that the sections/exhibitions
receiving more tags are those whose contents are more pertinent to
the topics covered by the study programme. Regarding the tagged
contents, 109 items have been tagged, with an average number of
3.71 tags per item. However, the distribution of tags per item is not
homogeneous. 10 items (9.17%) received a number of tags more
than twice the average, as detailed in Table 2. The other items
(90.83%) received a number of tags ranging from 1 to 7. More
specifically 4 items (3.67%) received 7 tags, 5 items (4.59%) re-
ceived 6 tags, 10 items (9.17%) received 5 tags, 7 items (6.42%) re-
ceived 4 tags, 17 items (15.60%) received 3 tags, 32 items (29.36%)
received 2 tags, 24 items (22.02%) received 1 tag. Notice that most
of the items received a low number of tags. The most used tags
(“history”, “unification”, “Italy”, “risorgimento”, “tradition”, etc.)
reflect the historical context of the web site contents (the celebra-
tion for the 150th anniversary of the unification of Italy), while the
others reflect the artwork content (“woman”, “women”, “food”),
namely subject related tags. The remaining 254 tags have been
used with these frequency values: 198 tags (49%) have been used
once, 47 tags (11.63%) have been used twice, 9 tags have been used
3 times. To sum up, we can conclude that only a few tags are used more
than once, while most of the tags are used once, or twice at most.
However these considerations must also take into account the lim-
ited sample of users involved in the trial.

From these data, we concluded that the users’ behavior with the
150 Digit system is coherent with the social functionalities largely
reported in the literature. Different groups (less active and more ac-
tive users) emerge for the quantity of uploaded contents and added
tags, with the distribution of tags featuring the most common tags,
in line with the themes of the exhibitions.

Web 3.0 functionalities.

The data set collected in the 6-month testing of the prototype sys-
tem was employed to conduct a preliminary evaluation of the ade-
quacy of our approach to the recommendation of tags. Our work-
ing hypothesis is that, if the approach is correct, the semantics of
the folksonomy should, to some degree, match the institutional categorization
of contents, thanks to the use of the semantic layer in
the recommendation of tags. In order to obtain a semantic description
of the folksonomy, comparable with the description of the categories
in the semantic layer, we adopted the very same resource employed
to encode the semantic layer itself, i.e., WordNet Domains
(see Section 3.2). To do so, tags were associated with the domains
of the corresponding terms in MultiWordNet, thus obtaining a pic-
ture (though a coarse one) of the semantics of the folksonomy. The
obtained representation can be straightforwardly compared to the
description of the categories encoded in the semantic layer, which
were entered with the relevant domains by the curators.

As a preliminary step of the analysis, duplicated tags were elim-
inated from the folksonomy; then each tag was associated to the
corresponding lemma in MultiWordNet and all the synsets in which
the lemma appears were collected. Following this, the relevant do-
mains for each tag were retrieved through the synset ids (in Multi-
WordNet, domains are associated with synsets).
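
The tag-to-domain mapping just described can be sketched as two lookups followed by a count. The two dictionaries stand in for the MultiWordNet tables; their contents, the synset identifiers and the function name are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical stand-ins for the lemma -> synsets and synset -> domains tables.
LEMMA_TO_SYNSETS = {"quadro": ["s1", "s2"], "guerra": ["s3"]}
SYNSET_TO_DOMAINS = {
    "s1": ["Art"],
    "s2": ["Electronics"],
    "s3": ["Military", "History"],
}


def domain_hits(tags):
    """Count, for each domain, the hits collected over the distinct tags."""
    hits = Counter()
    for tag in set(tags):                       # duplicated tags are eliminated
        for synset in LEMMA_TO_SYNSETS.get(tag, []):
            hits.update(SYNSET_TO_DOMAINS.get(synset, []))
    return hits


print(domain_hits(["quadro", "quadro", "guerra", "selfie"]))
```

Since no disambiguation is applied, a polysemous tag contributes one hit for every domain of every synset it belongs to; tags missing from the lexicon contribute nothing.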

With this mapping, the distribution of domain labels in the folksonomy
was investigated and compared with the institutional labels,
both at site and category level. In order to compare the domains
associated with the folksonomy with the institutional domains associated
with the website categories, the folksonomy domains were
ranked according to their frequency. Table 3 gives the ranking of
domains according to their frequency of association with the user tags
at site level. As expected, the top domain is “Factotum” (44.97%), the
domain assigned to lemmas having a generic meaning. Besides other
generic domains, such as “Quality” (5.35%) or “Person” (5.27%),
most of the domains in this ranking, such as “Geography” (5.46%),
“Military” (4.24%), “Politics” (4.04%) and “Art” (3.66%), are highly
relevant to the themes dealt with by the exhibitions, which narrate
Italy’s complex evolution after its unification in terms of political,
social and cultural aspects. Similarly, “Buildings” (5.64%) can be related
to the monuments and buildings that appear as pictures or
locations in many exhibits, while “Administration” (4.21%) refers to the
institutions and administrative regions mentioned throughout the narration
of Italy’s historical evolution.

In order to gather more insight, we extended the semantic compari-
son of tags and institutional domains to the highest level of detail of
the categories (each exhibition includes several categories). How-
ever, since most data in the tagset concerned the exhibition “Fare
gli Italiani”, we limited the category-level comparison to this exhibition.

Table 2: Most tagged items

Item                                             Exhibition/Section       Number of tags
“Adunata”                                        “Fare gli italiani”      34 tags (8.42%)
“Calendario 2011”                                “Fare gli italiani”      16 tags (3.96%)*
“Blog 150 anni insieme”                          “Fare gli italiani”      12 tags (2.97%)*
“Cibo per le feste”                              “Extra contents”         12 tags (2.97%)*
“L’Italia, Selargius, la Sardegna nei 150 anni”  “Extra contents”         12 tags (2.97%)*
“Carabiniere in alta uniforme e Corazziere”      “Il futuro nelle mani”   10 tags (2.48%)
“Il pane, simbolo dell’unita”                    “Stazione futuro”        9 tags (2.23%)
“Foggia e l’Unita’ d’Italia”                     “Extra contents”         8 tags (1.98%)*
“La classe....di una volta”                      “Fare gli italiani”      8 tags (1.98%)*
“Rete-Mondo Moda”                                “The places”             8 tags (1.98%)*

Rank  Domain           Hit number
1     Factotum         1092 (44.97%)
2     Buildings        137 (5.64%)
3     Geography        133 (5.46%)
4     Quality          130 (5.35%)
5     Person           128 (5.27%)
6     Military         103 (4.24%)
7     Administration   102 (4.21%)
8     Politics         98 (4.04%)
9     Art              89 (3.66%)
10    Sociology        84 (3.45%)

Table 3: Ranking of domains according to the matching tags.

“Fare gli Italiani” contains 20 categories, with an average
of about 3.68 labels for each category. By comparing the domains
associated with each category to those associated with tags added
to contents of the same category, we found a significant overlap.
The comparison showed that the average overlap between folkson-
omy domains and institutional domains is 61.02%, that is, 61.02%
of the domains extracted from the folksonomy of a certain category
matched the institutional domains in the corresponding category.
More interestingly, for each category the topmost ranked domain
in the folksonomy (the domain that was associated with most tags
in the category) always matches one of the institutional domains.
For example, the topmost ranked domain in the folksonomy for the
category “The Migrations” is “Geography”: this domain, together
with “Sociology”, was associated by the curator with that category.
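
One possible reading of the overlap measure is the share of distinct folksonomy domains of a category that also appear among its institutional domains, averaged over the categories. The sketch below follows that reading; the function names and the example counts are hypothetical and do not reproduce the data of the study.

```python
def category_overlap(folk_hits, institutional):
    """Share of the folksonomy domains that are also institutional domains."""
    if not folk_hits:
        return None  # no tags mapped to domains in this category
    matched = [d for d in folk_hits if d in institutional]
    return len(matched) / len(folk_hits)


def top_domain_matches(folk_hits, institutional):
    """True if the domain with most hits is one of the institutional domains."""
    top = max(folk_hits, key=folk_hits.get)
    return top in institutional


def average_overlap(per_category):
    """Average the overlap over the categories for which it is defined."""
    values = [category_overlap(f, i) for f, i in per_category]
    values = [v for v in values if v is not None]
    return sum(values) / len(values) if values else None


# Made-up hit counts for one category labelled with two institutional domains.
hits = {"Geography": 5, "Sociology": 3, "Art": 1, "Gastronomy": 1}
print(category_overlap(hits, {"Geography", "Sociology"}))
print(top_domain_matches(hits, {"Geography", "Sociology"}))
```

Counting distinct domains rather than hits is an assumption; weighting each domain by its hit number would give a different figure for the same data.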

The evaluation methodology described has two obvious limitations.
Firstly, the limited coverage of the folksonomy by MultiWordNet:
of the 1957 tags in the folksonomy, only 865 can be found in Multi-
WordNet (44.2%). Secondly, we did not perform any kind of dis-
ambiguation of tags, so the representation of the folksonomy in
terms of domains may reflect the ambiguity of tags. Notwithstand-
ing these limitations, we considered the overlap of the institutional
and the folksonomy domains satisfactory, so the tag recommenda-
tion strategy was maintained, extending it to work with no text input
by the user: in the current version of the website, the input to the
tag recommender can be given explicitly, by inserting one or more
tags, or implicitly, by requesting the system to suggest tags based
on the tags associated with the current item.

4 Experimental Evaluation

Over a period of a few months, students and teachers visited the large and
permanent exhibition “Fare gli italiani” (“Making Italians”), and
then attended one of the three post hoc laboratories where they
were asked to interact with the 150 Digit portal. These laborato-
ries constituted the basis for a thorough evaluation of the system

Laboratory                      Control   Experimental
On the trail of migrants        11        10
The suitcase of the historian   2         3
Making Italians E-book          2         3

Table 4: Laboratory sessions attended, with numbers of the control
and experimental groups, respectively.

with the goal of assessing the acceptance of the tag recommender
and the effectiveness of the Web 3.0 (a Social Semantic Web ap-
proach, borrowing the words of [Markines et al. 2009]) over the
Web 2.0 approach in tag recommendation. During the lab sessions,
the students were requested to analyze the items related to the ex-
hibits and to create and upload new materials, also adding tags in
the meantime. In the experimental group, the students interacted
with the system, receiving tag and content recommendations, while
for the control group, the recommendation module was disabled in
order to gather the control data. Notice that the users in the control
group could not use the tag recommender, but were able to see the
other users’ tags (in the style of Web 2.0), when adding new tags to
the contents.

In the following, we analyze the results, evaluating the impact of
our approach on

• the tagging activity of users, i.e. the quantity and the typology
of the inserted tags;

• the semantics they convey.

The latter point implies an analysis of the tag sets generated by the
two versions of the system: one is completely user-generated, while
the other one mixes user-generated tags and semantic recommenda-
tions.

4.1 Tag Analysis: Methodology and Results

Hypothesis. We hypothesized that the tag set generated by the ex-
perimental group would be quantitatively and qualitatively different
from the tag set generated by the control group. In particular, the
folksonomy of the experimental group, having benefited from the
tag recommender, should be larger and more heterogeneous.

Design. The first group of students (control group) interacted with a
modified version of the system, without the semantic recommender
(independent variable). In this phase subjects received only social
recommendations, namely the most used tags. The second group
(experimental group) of students interacted
with the regular version of the system, with the semantic
recommender enabled.

Participants. 15 classes (9 secondary school classes and 6 high
school classes) took part in the control phase, with 33 registered users³ on
the web site out of a total of 298 users participating in the laboratories.
16 classes (10 secondary school classes and 6 high school classes)
took part in the experimental phase, with 42 registered users out of a total of
255 users participating in the laboratories.

Apparatus and Materials. The laboratory rooms were equipped
with a set of computers (Windows-based PCs) and one interactive
whiteboard (mainly for the lab conductor). Users browsed the web
site using MS Internet Explorer 8. Users’ interactions were traced by means
of an ad hoc logging system. Users were given written instructions.

Procedure. The classes used the system during one of the follow-
ing laboratories:

• On the trail of migrants: Through multimedia workstations
students faced the journey of Italian emigrants of the early
twentieth century: the baggage, the passenger list, landing
at Ellis Island and the interrogation. Every student group
took the role of an (emigrant) character. Then, with a leap
through time, they relived the journey of new migrants from
Guinea, Afghanistan, Kurdistan and other countries different
from Italy. At the end of the laboratory students had to tag the
characters on 150 Digit;

• The historian’s suitcase: Before visiting the exhibition, by
means of 150 Digit, students were introduced to the narra-
tive choices made by the historians that curate the exhibition.
They then chose an exhibition category and looked for its his-
torical sources. At the end of the lab they uploaded (on 150
Digit) and tagged a document containing their work;

• Making Italians E-book: In groups students were assigned
specific roles to make observations, collect data and produce
images along the way. After the visit, in 150 Digit, each
group created a humorous and original page of text and animated
images using the Prezi editor, on the themes of the exhibition
or on how it was received. The e-book was then uploaded
into the system and tagged.

Table 4 summarizes which laboratories were attended during the
control and the experimental phase. Each class was free to attend
their preferred laboratory. In the first month of experimentation
each class interacted with the modified version of the system, with-
out the semantic recommender (control group). During the sec-
ond month of experimentation each class interacted with the regu-
lar version of the system, with the semantic recommender enabled
(experimental group). In both conditions, students were required
to analyze the items related to the exhibits and to create and up-
load new materials, adding tags while doing both these operations.
Users in the experimental group were asked to insert at least one of
the suggested tags.

It is important to note that users interacted with the system in
groups. Every class was split into 2-4 groups. So the number of
users registered into the system was less than the total number of
real users attending the laboratories.

Control group results. The 33 users of the control group inserted
a total of 133 tags, with an average of 4.03 tags per registered user,
and 8.06 tags per class. 10 users (24.24%) inserted almost half
the tags (49.62%), in this way confirming that some users in a
community are more active than others in content
contributions [Nonnecke et al. 2004]. The same consideration applies
to classes: 4 classes (26.67%) inserted more than half the tags
(55.64%). Table 5 shows the most used tags, namely all the tags

³ Users interact with the system in groups. Every class was split into 2-4
groups. Thus the number of registered users on the system was less than
the total number of users attending the laboratories.

Tags             Frequency   Percentage
Cosa Nostra      6           4.51%
mafia            6           4.51%
assault          5           3.76%
migration        5           3.76%
war              5           3.76%
emigration       4           3.01%
Clash            3           2.26%
collective act   3           2.26%
education        3           2.26%
Gold Rush        3           2.26%
immigration      3           2.26%
maid             3           2.26%
civil war        2           1.50%
conflict         2           1.50%
fight            2           1.50%
massacre         2           1.50%
murder           2           1.50%
Naples           2           1.50%
presentation     2           1.50%
Public Schools   2           1.50%
school           2           1.50%
Topic            2           1.50%
trench           2           1.50%

Table 5: Most used tags - Control Group

Categories                     Frequency   Percentage
Migrations                     46          34.59%
Mafias                         21          15.79%
The First World War            18          13.53%
The school                     15          11.28%
The power of the unity         10          7.52%
The Second World War           9           6.77%
The massmedia                  5           3.76%
The Church                     3           2.26%
The campaigns                  3           2.26%
The consumption                1           0.75%
Italy cities                   1           0.75%
Painters and Patriots (2011)   1           0.75%

Table 6: Most tagged categories - Control Group

used more than once. Notice that i) the two most used tags are syn-
onyms (i.e. “Cosa Nostra” and “mafia”); ii) the other top-ranked
tags, i.e. migration-emigration-immigration, and assault-war-clash,
share the same meaning; iii) these mentioned tags reflect the con-
tent of the most tagged categories, namely “Migrations”, “Mafia”,
and “The First World War”; iv) finally, a great number of tags (62,
which is 46.62% of the total number of tags) were used only once,
which shows great sparsity in the use of tags. The control group
classes uploaded 20 new contents in total (5 new contents per class).
Only 4 classes uploaded new contents since the “On the trail of mi-
grants” laboratory does not require users to upload material.
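
The sparsity figure quoted above (tags used only once, as a share of all inserted tags) can be computed from the raw list of tag insertions as in the following sketch. The function name and the toy input are assumptions; tags are compared as exact strings, with no normalization.

```python
from collections import Counter


def singleton_share(tag_insertions):
    """Return (number of tags used once, their share of all insertions)."""
    if not tag_insertions:
        return 0, 0.0
    per_tag = Counter(tag_insertions)
    once = sum(1 for count in per_tag.values() if count == 1)
    return once, once / len(tag_insertions)


# Toy input: "war" is used twice, the other two tags once each.
print(singleton_share(["war", "war", "maid", "trench"]))

# The control-group figure reported in the text: 62 of 133 insertions.
print(round(100 * 62 / 133, 2))
```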

Table 6 shows the most tagged categories, which clearly reflect the
themes of the most attended laboratories.

Experimental group results. The 42 registered users of the
experimental group inserted in total 214 tags, with an average of
5.09 tags per user, and 13.37 tags per class. 8 users (19.05%)
inserted almost half the tags (49.07%), thus confirming also in
this case the presence of active users in a community [Nonnecke
et al. 2004]. The same consideration applies to classes: 5 classes
(31.25%) inserted more than half the tags (54.87%). Table 7 gives
the most used tags, namely all the tags used more than twice.


Tags         Frequency   Percentage
farmer       8           3.74%
worker       6           2.80%
merchant     5           2.34%
illiterate   4           1.87%
Migrant      4           1.87%
Cameo        3           1.40%
maid         3           1.40%
miner        3           1.40%
mother       3           1.40%
poor         3           1.40%
shopkeeper   3           1.40%
unemployed   3           1.40%
Veteran      3           1.40%
well-off     3           1.40%
woman        3           1.40%
young        3           1.40%

Table 7: Most used tags - Experimental Group

The tags used once are 108 (50.47%), while tags used twice are
10.75%. Thus most tags (more than 60%) are re-used very little
or not at all. Concerning the most used tags, i.e. farmer, worker,
merchant, illiterate, migrant, etc., we should notice these are
mainly the tags reflecting the subjects presented in the “Migration”
category.
The experimental group classes uploaded in total 21 new contents
(3.5 new contents per class). Notice that just 6 classes uploaded
new contents, since the “On the trail of migrants” laboratory does
not require users to upload material. Table 8 shows the most tagged categories,
which reflect the themes of the most frequented laboratories.

As for the slider that regulates the number of recommended
tags, its use was analyzed in terms of the positions of the
cursor selected by users when they accepted the recommendations.
The hypothesis was that an even distribution of the cursor posi-
tions would confirm the users’ understanding and acceptance of the
slider:

• Tags from position 1 were selected 84 times (39.25%): in this
position, the expansion strategy considers the tighter seman-
tic relations encoded in MultiWordNet, i.e., synonymy, and
the candidate tags are disambiguated against the domains at-
tached to the current category;

• Tags from position 2 were selected 54 times (25.23%): in this
position, the expansion is extended to the hyponymy and hy-
peronymy relations, but the disambiguation is also extended
to take the context of the existing tags into account;

• Tags from position 3 were selected 40 times (18.69%): in this
position, the domain–based disambiguation is removed;

• Tags from position 4 were selected 36 times (16.82%): in this
position, both disambiguation methods are removed.

As shown by the usage of the slider tool, users seem to accept and
understand its usage correctly, since all the cursor’s positions on
the slider were employed, with an obvious prevalence of the initial
position.
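
The four slider positions listed above can be summarized as a small configuration table, together with the shares of the selections. This is a sketch: the class and field names are hypothetical, and keeping the position-2 relations unchanged at positions 3 and 4 is an assumption (the text only says which disambiguation methods are removed).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SliderPosition:
    relations: tuple      # semantic relations used for the expansion
    domain_filter: bool   # domain-based disambiguation enabled
    context_filter: bool  # context-based disambiguation enabled


EXPANDED = ("synonymy", "hyponymy", "hyperonymy")

POSITIONS = {
    1: SliderPosition(("synonymy",), domain_filter=True, context_filter=False),
    2: SliderPosition(EXPANDED, domain_filter=True, context_filter=True),
    3: SliderPosition(EXPANDED, domain_filter=False, context_filter=True),
    4: SliderPosition(EXPANDED, domain_filter=False, context_filter=False),
}

# Number of accepted tags per slider position, as reported in the text.
SELECTIONS = {1: 84, 2: 54, 3: 40, 4: 36}


def shares(counts):
    """Percentage of the selections made at each position."""
    total = sum(counts.values())
    return {pos: round(100 * c / total, 2) for pos, c in counts.items()}


print(shares(SELECTIONS))
```

The computed shares reproduce the percentages given in the list (39.25, 25.23, 18.69 and 16.82 over 214 selections).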

4.2 Tag Recommender Evaluation

Using the same approach described in Section 3.3, the semantic representations
of the two tag sets (one generated by the control group
and the other by the experimental group) were compared, expressed
Categories                     Frequency   Percentage
Migrations                     150         70.09%
The Futurist                   17          7.94%
The New Officine               11          5.14%
The mafia                      6           2.80%
Campaigns                      6           2.80%
Unification of Italy           4           1.87%
The First World War            4           1.87%
It began with their            4           1.87%
Consumption                    3           1.40%
Gallery of shops               3           1.40%
Painters and Patriots (2011)   2           0.93%
Officine Grandi Repairs        2           0.93%
The Second World War           2           0.93%

Table 8: Most tagged categories - Experimental Group

in terms of the domains associated with the tags they contain. For
each tag set, we computed the overlap of its domains with the insti-
tutional domains, category by category (since the association with
domains is at category level, i.e., each category was associated by
the curators with a set of domains).

For each tag, we collected the domains with which it is associated (using
the synset-domain mapping encoded in MultiWordNet). For
each category of the exhibition, we obtained a set of domains (the
folksonomy domains), ranked according to the number of tags with
which each domain is associated⁴. Each set of ranked domains constitutes
a rough semantic representation of the tag set in terms of the
taxonomy of domains encoded in MultiWordNet. Tables 10 and 9
report the top ten domain labels for each tag set. The control group
tag set refers to 44 different domains; the experimental group tag set
refers to 55 different domains. Each domain is accompanied
by the number of times it is associated with a tag in the folksonomy
(“Hit number” in the tables). For each set, some tags could
not be employed for this evaluation because they are not present in
MultiWordNet (for example, because they are proper nouns, neologisms,
etc.) or because they don’t have a domain label associated in
MultiWordNet.

The two sets of ranked domains were compared. First of all, in
order to assess whether the two sets of domains (obtained in the two
experimental sessions) were significantly different from the statistical
point of view, the χ² test was calculated. The statistic shows that the
difference between the two distributions is significant (χ²(67)=108.54,
p<0.001).
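
The text does not detail how the test was set up. A common choice for comparing two frequency distributions over the same labels is Pearson's χ² test on a 2×k contingency table (two groups, k domains), sketched below in plain Python; the function name and the toy counts are assumptions, and the sketch is not claimed to reproduce the reported statistic.

```python
def chi_square_2xk(counts_a, counts_b):
    """Pearson chi-square statistic and degrees of freedom for two
    distributions of hits over domains (dicts: domain -> hit count)."""
    domains = sorted(set(counts_a) | set(counts_b))
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    grand = total_a + total_b
    stat = 0.0
    for d in domains:
        obs_a = counts_a.get(d, 0)
        obs_b = counts_b.get(d, 0)
        column = obs_a + obs_b
        for observed, row_total in ((obs_a, total_a), (obs_b, total_b)):
            expected = row_total * column / grand
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    dof = len(domains) - 1  # (rows - 1) * (columns - 1) with two rows
    return stat, dof


# Toy counts, not the data of the study.
stat, dof = chi_square_2xk({"person": 10, "military": 20},
                           {"person": 20, "military": 10})
print(round(stat, 2), dof)
```

The p-value is then read from the χ² distribution with `dof` degrees of freedom, for instance through the survival function `scipy.stats.chi2.sf(stat, dof)` if SciPy is available.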

A more thorough comparison shows that the two sets have 29 do-
mains in common; also, it was observed that only 4 of the 29 shared
domains are in the ten topmost ranked domains of both sets (“Soci-
ology”, “Psychological features”, “Military”, “Person”).

By extending the comparison to the categories, the χ² test shows
that only the different frequencies of distribution of domains belonging
to “Migrations” and “The First World War” are significant,
obtaining respectively χ²(40)=91.39, p<0.001, and χ²(13)=52.60,
p<0.001.

Secondly, the overlap of each set of folksonomy domains with
the set of institutional domains was measured. Similarly to what
emerged from the dataset collected during the system tuning phase
(described in Section 3.3), the top ranked domains in the folkson-
omy overlap with the institutional domain labels. Moreover, by
observing the overlap at category level, we found that, in both tag

⁴ Again, notice that the same tag can be associated with different domains
due to its intrinsic polysemy (i.e., a tag belonging to more than one synset)
or because a synset is associated with more than one domain.


Domain                   Hit number
person                   113 (21.86%)
commerce                 21 (4.06%)
military                 21 (4.06%)
agriculture              17 (3.29%)
sociology                17 (3.29%)
psychological features   15 (2.90%)
quality                  15 (2.90%)
biology                  13 (2.51%)
medicine                 12 (2.32%)
animals                  10 (1.94%)

Table 9: Top ranked domain labels in experimental group.

sets, the topmost ranked domains match at least one of the institu-
tional domains assigned to that category, notwithstanding the low
number of domains associated with the categories by the curators
(an average of 3.68 domains per category). For example, in the
“Migrations” category, the domain in common between the folksonomy
and the institutional domains is “Sociology” (17 hits in the control
group and 10 in the experimental group), where the institutional
domains of the category are “Geography” and “Sociology”; for the
“The Second World War” category, the most frequent domain (9
hits in the control group and 6 in the experimental group) is “Military”,
which matches one of the institutional domains associated with that
category (“Diplomacy”, “History”, “Military”). The only exceptions
are given by some categories for which fewer than 5 tags were
collected. To summarize, although the two distributions of domains are
statistically different, they are not significantly different in terms of
their overlap with the institutional domain labels. However, it is
necessary to specify that a full comparison is impossible. Due to
the uneven distribution of tags across the categories and to the partial
coverage by MultiWordNet, the comparison was possible for only
6 categories out of 35; in all other cases, either the tags of the users
could not be found in MultiWordNet or there were no tags at all in
that category (for one of the two tag sets).

The analysis of the semantics of the folksonomy conducted through
the domains confirms the prototype analysis findings (Section 3.3),
i.e., that the folksonomy domains significantly overlap with the in-
stitutional domains. In summary, given the domain based analysis,
we can conclude that the assumption on which the recommendation
strategy relies, that the institutional domains match the semantics
of the site categories as perceived by the users, is confirmed by the
data.
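The category-level overlap check described above can be sketched as follows. This is a minimal illustration, not the procedure actually used in the evaluation: the tag-to-domain mapping and the tags are hypothetical toy data standing in for the MultiWordNet Domains lookup, and the function names are introduced here only for the example. The institutional domains of “Migrations” (“Geography”, “Sociology”) are the ones reported in the text.

```python
from collections import Counter

def rank_domains(tags, tag_domains):
    """Count domain hits for a tag set.

    A polysemous tag contributes one hit to each of its domains;
    tags missing from the lexicon are skipped.
    """
    hits = Counter()
    for tag in tags:
        for domain in tag_domains.get(tag, ()):
            hits[domain] += 1
    return hits.most_common()

def top_domain_matches(tags, tag_domains, institutional_domains):
    """Return True if a topmost ranked domain is an institutional domain.

    Returns None when no tag could be mapped, i.e. no comparison is possible.
    """
    ranked = rank_domains(tags, tag_domains)
    if not ranked:
        return None
    best = ranked[0][1]
    top = {domain for domain, count in ranked if count == best}
    return bool(top & set(institutional_domains))

# Hypothetical toy lexicon (NOT taken from MultiWordNet).
TAG_DOMAINS = {
    "emigrant": ["sociology", "person"],
    "community": ["sociology"],
    "journey": ["transport"],
}
migration_tags = ["emigrant", "community", "journey", "suitcase"]

print(rank_domains(migration_tags, TAG_DOMAINS))
print(top_domain_matches(migration_tags, TAG_DOMAINS, {"geography", "sociology"}))
```

Returning None for an empty ranking mirrors the cases in which the comparison was not possible because no tag of the category could be found in the lexicon.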

As for the effectiveness of the recommendation strategy with re-
spect to the baseline (no recommendation at all), no significant dif-
ferences in the overlap of folksonomy domains with institutional
domains were found between the control and the experimental
group; i.e., the overlap does not grow using the recommender. In
this respect, the tag recommendation may seem ineffective. However,
the limited size of the tag sets collected during the experimentation
makes the comparison difficult, and although a statistical difference
does exist, it cannot be traced back to the overlap with the institu-
tional domains. On the other hand, this result can be considered a
success in an educational context, because tag insertion in focused
trials driven by teachers leads students to select the tags in a very
accurate and focused way (with and without the recommenders),
and this may have contributed to reducing the difference between
the two situations.

5 Discussion

This section summarizes the results of the evaluation and analyzes
the tag sets of the experimental and control groups in further
detail, with the goal of explaining the difference between the two sets
that emerged from the domain-based analysis.

Domain                    Hit number
sociology                 52 (13.07%)
military                  37 (9.30%)
person                    34 (8.54%)
history                   23 (5.78%)
school                    17 (4.27%)
law                       15 (3.77%)
pedagogy                  14 (3.52%)
psychological features    10 (2.51%)
time period               8 (2.01%)
linguistics               7 (1.76%)

Table 10: Top ranked domain labels in control group.

As can be expected, users in the experimental group inserted more
tags than users in the control group (5.09 tags vs. 4.03 tags
per user), showing that the use of the recommender helps users
find more tags. Also, users in the experimental group tended to
insert more new tags compared to the control group: there are 85
(64%) distinct tags in the control group tag set, and 147 (69%)
distinct tags in the experimental group tag set. This finding is
positive for the evaluation of the tag recommender, because it
shows that its use can increase the variety of tags. However, this is
not clear-cut: since users were requested to insert tags at the end
of each laboratory session and to reason about which tag to insert
(e.g., they were explicitly asked to tag the discussed artworks and
the uploaded material), the number and the variety of inserted
tags may have been biased by these given instructions, blurring
the differences between the two groups. In addition to this, the
laboratory context and the presence of the teachers may have led
students to be more accurate in tagging.
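The shares of distinct tags quoted above can be checked with a short calculation. The sketch below assumes that the total number of tags in each group equals the corresponding column sum of Table 11 (214 for the experimental group, 133 for the control group); these totals are inferred for the example and are not stated explicitly in the text.

```python
# Distinct tag counts come from the text; the totals are an ASSUMPTION,
# obtained by summing the two columns of Table 11.
GROUPS = {
    "experimental": {"distinct": 147, "total": 214},
    "control": {"distinct": 85, "total": 133},
}

def distinct_share(distinct, total):
    """Share of distinct tags in a tag set, as a percentage."""
    return 100.0 * distinct / total

for name, counts in GROUPS.items():
    share = distinct_share(counts["distinct"], counts["total"])
    print(f"{name}: {share:.1f}% distinct tags")
```

Under this assumption the rounded shares agree with the 69% and 64% reported for the two groups.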

Concerning the emergent semantics of the generated folksonomy, the
number of domains referred to by the tags in the experimental tag
set is higher than the number of domains referred to by the tags
in the control tag set (55 domains vs. 44 domains). Even if these
results do not show a clear benefit in the use of the tag recommender
(again, also due to the structure of the experimental context), the
number and distribution of tags in the experimental group seem to
have been positively affected by the use of the tag recommender.

To better understand the meaning of the tags inserted by the users,
further semantic analysis may be required. To gain more insight,
two more investigations were conducted: first, the tags were
analyzed according to their lexical types, looking for differences
between the two tag sets; second, the distributions of tags over the
WordNet Domains in the two tag sets were compared, trying to
relate them to the use of the tag recommender.

To study the distribution of tags based on their lexical types, the
collected tags were clustered into WordNet categories. A semi-
automatic procedure5 performed a WordNet dictionary lookup to
obtain the top-level categories that could be deduced from the cor-
respondence with the lexicographer file organization6. Only tags
contained in the WordNet dictionary were mapped to WordNet cat-
egories. The obtained classification for control and experimental
group is detailed in Table 11.

5 Disambiguation was performed manually. Each time a tag belonged to
more than one category, the right meaning was manually checked by
comparing the tag with the tagged item.

6http://wordnet.princeton.edu/man/lexnames.5WN.html


Lexnames              Experimental group    Control group
adj.all               31                    2
adj.pert              1                     0
compound              7                     6
invented              12                    0
noun.act              19                    45
noun.animal           4                     0
noun.artifact         14                    7
noun.attribute        2                     1
noun.cognition        2                     10
noun.communication    3                     5
noun.event            3                     8
noun.feeling          7                     0
noun.food             1                     1
noun.group            6                     18
noun.location         7                     10
noun.object           0                     1
noun.person           77                    15
noun.plant            0                     1
noun.relation         1                     0
noun.state            12                    0
noun.substance        2                     1
noun.time             0                     1
NULL                  1                     1
verb.stative          2                     0

Table 11: Comparison between WordNet lexnames

The χ2 test was calculated in order to assess whether the difference
between the frequency distributions of tags obtained in the two
experimental sessions was significant. The statistics show that the
difference between the two distributions is significant (χ2(23)=125.47,
p<0.001). What seems to emerge from this classification and from the
comparison of the results of the two experimental conditions is that the
presence of the semantic recommender led users to insert adjectives
frequently (14.5%) and nouns denoting people (noun.person: 36%), such as
farmer, merchant, worker, etc., very frequently. The use of these
words may be partly explained by the topics covered in “On the
trail of migrants” lab, which focuses on people. However, this was
the most frequented laboratory, as well as the “Migrations” cate-
gory, in both groups (not only in the experimental one), and the
most used tags in the control group do not reflect this trend.
Indeed, students in the control group used nouns denoting acts
or actions (noun.act: 33.8%) and nouns denoting groups of people
or objects (noun.group: 13.51%) more often than nouns denoting
people (noun.person: 11.28%). From this observation we can conclude
that the semantic recommendations of tags promote a more
precise description of the subject represented in the artwork. This is
demonstrated by the wide use of tags belonging to the WordNet
person category. This finding also confirms the results obtained in the
Steve.museum project [Trant 2006], according to which users pre-
fer tags related to the subject of the artwork and to the description
of what is represented in it.
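The χ2 statistic for the lexname distributions can be recomputed directly from the counts in Table 11. The sketch below is a plain Pearson χ2 test of independence on the 24×2 contingency table, written without external libraries; it is offered as a check of the arithmetic and assumes that this is the variant of the test that was applied. The function name is introduced here only for illustration.

```python
# Counts copied from Table 11: lexname -> (experimental group, control group).
TABLE_11 = {
    "adj.all": (31, 2), "adj.pert": (1, 0), "compound": (7, 6),
    "invented": (12, 0), "noun.act": (19, 45), "noun.animal": (4, 0),
    "noun.artifact": (14, 7), "noun.attribute": (2, 1),
    "noun.cognition": (2, 10), "noun.communication": (3, 5),
    "noun.event": (3, 8), "noun.feeling": (7, 0), "noun.food": (1, 1),
    "noun.group": (6, 18), "noun.location": (7, 10), "noun.object": (0, 1),
    "noun.person": (77, 15), "noun.plant": (0, 1), "noun.relation": (1, 0),
    "noun.state": (12, 0), "noun.substance": (2, 1), "noun.time": (0, 1),
    "NULL": (1, 1), "verb.stative": (2, 0),
}

def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    rows = list(table.values())
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(col_totals)
    statistic = 0.0
    for row in rows:
        row_total = sum(row)
        for observed, col_total in zip(row, col_totals):
            # Expected count under independence of lexname and group.
            expected = row_total * col_total / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return statistic, dof

stat, dof = chi_square(TABLE_11)
print(f"chi2({dof}) = {stat:.2f}")
```

With 24 lexname rows and 2 groups the test has 23 degrees of freedom, and the statistic should come out close to the reported value.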

As illustrated in the previous section, the comparison of the two tag
sets conducted by analyzing their implicit semantics through Multi-
WordNet Domains confirmed the statistical difference between the
two sets, implying that the use of the tag recommender actually
affects the tagging behavior of users, although it does not prove
that the overlap of the folksonomy is increased with the use of the
recommender. In particular, we tried to relate the distribution of
domains in the tag set of the experimental group with the function-
ing of the tag recommender. Remember that, by using a slider, the
users can affect the behavior of the recommender, reducing or in-
creasing the number of recommended tags (see Section 3.2). In
the starting position, the recommendation strategy is more restrictive,
as it applies different disambiguation techniques, which are
progressively removed as the slider is moved to the following positions.
In parallel, the search for recommended tags is extended from
tighter semantic relations (synonymy) to looser ones (hyponymy
and hyperonymy). As a result, in the last slider position, no
disambiguation is performed and all available semantic relations are
considered.
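The slider behavior described above can be illustrated with a small sketch. It is a simplified model under explicit assumptions, not the implementation of the “150 Digit” recommender: the lexicon is invented toy data standing in for MultiWordNet, disambiguation is reduced to a single filter on the institutional domains, and the settings of the intermediate positions B and C are guesses, since the text only characterizes the first and the last position.

```python
# Hypothetical toy lexicon: each sense of a word lists its domains and
# related words. All entries are invented for illustration.
LEXICON = {
    "farmer": [
        {"domains": {"agriculture"}, "synonyms": ["grower"],
         "hyponyms": ["sharecropper"], "hypernyms": ["worker"]},
        {"domains": {"person"}, "synonyms": ["granger"],
         "hyponyms": [], "hypernyms": ["person"]},
    ],
}

# ASSUMED slider settings: (filter senses by institutional domain?, relations).
# Only the end points follow the text; B and C are illustrative guesses.
SLIDER = {
    "A": (True, ("synonyms",)),
    "B": (True, ("synonyms", "hyponyms")),
    "C": (True, ("synonyms", "hyponyms", "hypernyms")),
    "D": (False, ("synonyms", "hyponyms", "hypernyms")),
}

def recommend(tag, institutional_domains, position):
    """Suggest tags related to `tag` for the given slider position."""
    disambiguate, relations = SLIDER[position]
    suggestions = []
    for sense in LEXICON.get(tag, []):
        # Domain-based disambiguation: keep only senses whose domains
        # intersect the institutional domains of the category.
        if disambiguate and not (sense["domains"] & institutional_domains):
            continue
        for relation in relations:
            for word in sense[relation]:
                if word not in suggestions:
                    suggestions.append(word)
    return suggestions

for position in "ABCD":
    print(position, recommend("farmer", {"agriculture"}, position))
```

In this toy setting each slider step returns a superset of the previous suggestions, which matches the description of a recommendation list that grows as the slider is moved.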

By analyzing the distributions of domain labels that emerged from
the two sets of tags (control and experimental group), we note that
the difference resides mainly in two facts:

• In the experimental group, when the recommender was in use,
there were 26 more domains (such as “Exchange”,
“Administration”, etc.) than in the control group (there are 16 shared
domains in the two distributions). These domains tend to be
more specific than the shared ones;

• In the experimental group, the most abstract domains in the
hierarchy have more tags related to them than in the control group.
For example, “Person” moves from 34 related tags in the
control group tag set to 113 related tags in the tag set of the
experimental group.

These two complementary findings show that the growth observed
in the experimental tag set is related both to the diversity of
domains it contains (with more specific domains added) and to the
distribution of the tags over the domains, with more of the tags
associated with the generic domains. The users generated more tags with
the help of the recommender, and this increase affected both the
generic and the specific domains, which increased in number due
to the focalization brought by the use of the recommender.

These findings are consistent with the use of the tag recommendation
slider. The increased number of specific domains can be
attributed to the 39.25% of cases in which the users accepted the
recommended tag with the slider in position “A”: in this position, tags
are constrained to the institutional domains, which are generally
quite specific, such as “Telecommunications” or “Transport”. The
higher number of tags related to generic and abstract domains
(such as “Psychological features” or “Person”) is consistent with
the fact that the remaining positions of the slider expand the existing
tags along the hyperonymy relations encoded in MultiWordNet; in
16.82% of cases, no disambiguation was conducted (with
the slider in position D, where the highest number of recommended
tags is generated by the system).

Last, but not least, a difference between the two tag sets lies in
the domain/tag relationship. For example, if we consider the most
tagged category, “Le migrazioni” (Migrations), we observe that in
the tag set generated by the experimental group, tags and categories
tend to increase proportionally. Notwithstanding this growth, the
set of domains emerging from this tag set remains consistent with
the actual semantics of the category: for example, consider the domain
“metrology”, brought in by the tag related to the distances
covered by the migrants, or “religion”, which is related to the
description of the ethnic groups involved in the migrations.

Given the analysis conducted, we can conclude that a mixed
approach, which relies on leveraging the curatorial knowledge
(encoded in semantic format) to support social functions such as
tagging, is suitable for the development of active learning environments.
The use of the tag recommender contributes to keeping the tagging
activity of the educational users aligned with the categorization of
contents provided by the curators, at the same time sustaining the
growth of the obtained folksonomy and its focalization in terms of
variety. By promoting the linguistic reflection on the use of tags,
the recommender helps the teacher focalize the students’ work with
the beneficial input of the curatorial knowledge.


6 Conclusions

In this paper we described “150 Digit”, a web portal on the 150th
anniversary of the Unification of Italy, which has been designed and
implemented with a Social Semantic Web approach in mind. The
150 Digit portal displays the contents of the exhibitions organized
in 2011 for the anniversary, and encourages the active participation
of the educational users through social functionalities such as tag-
ging, voting, and commenting, and through the creation and upload
of new contents. A 3D visit contributes to making the environment
more compelling for students.

In particular, we described the semantic-based tag recommenda-
tions incorporated in “150 Digit”. The users’ tagging activity intro-
duces new connections over the site contents, expanding and com-
plementing the categorization of the exhibitions provided by the cu-
rators. The tag recommender exploits the semantic description of
the curatorial categorization of contents to sustain the growth of the
folksonomy, keeping it aligned with the institutional perspective,
with benefits for the focalization of the activities in the schoolwork.

After describing the design of “150 Digit”, mainly based on an
iterative, user–centered methodology, we illustrated the evaluation
of the tag recommendation function, discussing its
results and possible benefits for learning environments.

Given the users’ activity logs collected for 150 Digit, which
include tagging, commenting, uploading and saving content, current
methodologies (for example, see [Ferreira-Satler et al. 2011])
allow the use of this information to generate more adaptive
recommendations. As future work, we envisage the improvement of the
tag recommender by extracting user profiles from the users’
behavior.

Acknowledgments

This work was partially supported by the project “150digit. L’Italia
delle scuole” funded by Comitato Italia 150. The authors would like
to thank: Comitato 150 and Esperienza Italia 150 (especially
Claudia Cugnasco and Marina Bertiglia); CSL – Università di Firenze;
Indire – MIUR, Firenze; CIRMA – Università di Torino. This work
was carried out with Virtual Reality & Multi Media Park S.p.A.
(Fabrizio Nunnari, Marco Squeo, Shanti May, Davide Di Giannantonio).
We also thank the users who took part in the focus group and
in the tests and the anonymous reviewers for their help and advice.

7 Authors

Rossana Damiano, PhD, is an Assistant Professor at the Depart-
ment of Computer Science of the University of Torino, Italy. Her
interdisciplinary research activity is centred on intelligent applica-
tions for cultural heritage, ranging from interactive multimedia sys-
tems to applications of virtual agents. In particular, she is interested
in the role of semantic models in new media production and appli-
cations.

Vincenzo Lombardo, PhD, is an Associate Professor of Informatics
at the University of Torino, Italy. He works on models and
applications of knowledge-based systems to multimedia communication.
He also carries out multimedia production activity,
hosted by events at the international level.

Cristina Gena is an Assistant Professor at the Department of Com-
puter Science of the University of Torino, working in the area of
intelligent user interfaces. She completed her Ph.D. in Communi-
cation Science (University of Torino) in 2003, with a thesis on the
evaluation of user-adaptive systems. Her current research activities
address user modeling, adaptive web systems and their evaluation,
context-aware systems, semantic web, web 2.0, usability and
interaction design. Her contribution is based on experiences gained
within these fields.

References

AMES, M., AND NAAMAN, M. 2007. Why we tag: motivations
for annotation in mobile and online media. In CHI, ACM, M. B.
Rosson and D. J. Gilmore, Eds., 971–980.

BATEMAN, S., BROOKS, C., MCCALLA, G., AND
BRUSILOVSKY, P. 2007. Applying collaborative tagging
to e-learning. In Proceedings of the Workshop on Tagging and
Metadata for Social Information Organization (WWW’07).

BENTIVOGLI, L., FORNER, P., MAGNINI, B., AND PIANTA, E.
2004. Revising the wordnet domains hierarchy: semantics, cov-
erage and balancing. In Proceedings of the Workshop on Mul-
tilingual Linguistic Ressources, Association for Computational
Linguistics, 101–108.

BURIGAT, S., AND CHITTARO, L. 2007. Navigation in 3d virtual
environments: Effects of user experience and location-pointing
navigation aids. International Journal of Human-Computer
Studies 65, 11, 945–958.

CANTADOR, I., KONSTAS, I., AND JOSE, J. M. 2011. Categoris-
ing social tags to improve folksonomy-based recommendations.
J. Web Sem. 9, 1, 1–15.

DAMIANO, R., GENA, C., LOMBARDO, V., NUNNARI, F., SUP-
PINI, A., AND CREVOLA, A. 2011. 150 digit. integrating 3d
visit and social functions into a web 3.0 learning-oriented ap-
proach. In Broadband and Wireless Computing, Communication
and Applications (BWCCA), 2011 International Conference on,
IEEE, 136–143.

DAMIANO, R., LOMBARDO, V., GENA, C., AND NUNNARI, F.
2012. Guidance for web 3d in cultural heritage dissemination.
In Proceedings of the 17th International Conference on 3D Web
Technology, ACM, 186–186.

DEGEMMIS, M., LOPS, P., AND SEMERARO, G. 2007. A
content-collaborative recommender that exploits wordnet-based
user profiles for neighborhood formation. User Model. User-
Adapt. Interact. 17, 3, 217–255.

DJUANA, E., XU, Y., LI, Y., AND JOSANG, A. 2011. Ontol-
ogy learning from user tagging for tag recommendation making.
In Proceedings of the 2011 IEEE/WIC/ACM International Con-
ferences on Web Intelligence and Intelligent Agent Technology
- Volume 03, IEEE Computer Society, Washington, DC, USA,
WI-IAT ’11, 310–313.

FERREIRA-SATLER, M., ROMERO, F., MENENDEZ-
DOMINGUEZ, V., ZAPATA, A., AND PRIETO, M. 2011.
Fuzzy ontologies-based user profiles applied to enhance e-
learning activities. Soft Computing-A Fusion of Foundations,
Methodologies and Applications, 1–13.

FURNAS, G., LANDAUER, T., GOMEZ, L., AND DUMAIS, S.
1987. The vocabulary problem in human-system communica-
tion. Communications of the ACM 30, 11, 964–971.

GANGEMI, A., GUARINO, N., MASOLO, C., OLTRAMARI, A.,
AND SCHNEIDER, L. 2002. Sweetening ontologies with dolce.
Knowledge engineering and knowledge management: Ontolo-
gies and the semantic Web, 223–233.


GENA, C., CENA, F., VERNERO, F., AND GRILLO, P. Accepted
for publication. The evaluation of a social adaptive web site for
cultural events. User Model. User-Adapt. Interact.

HOTHO, A., JÄSCHKE, R., SCHMITZ, C., AND STUMME, G.
2006. Folkrank: A ranking algorithm for folksonomies. In
LWA, University of Hildesheim, Institute of Computer Science,
K.-D. Althoff and M. Schaaf, Eds., vol. 1/2006 of Hildesheimer
Informatik-Berichte, 111–114.

LANIADO, D., EYNARD, D., AND COLOMBETTI, M. 2007. A
semantic tool to support navigation in a folksonomy. In Proceed-
ings of the eighteenth conference on Hypertext and hypermedia,
ACM, New York, NY, USA, HT ’07, 153–154.

LAUDANNA, A., THORNTON, A., BROWN, G., BURANI, C., AND
MARCONI, L. 1995. Un corpus dell’italiano scritto contempo-
raneo dalla parte del ricevente. III giornate internazionali di
analisi statistica dei dati testuali 1, 103–109.

MAGNINI, B., STRAPPARAVA, C., PEZZULO, G., AND GLIOZZO,
A. 2002. The role of domain information in word sense disam-
biguation. Natural Language Engineering 8, 04, 359–373.

MARKINES, B., CATTUTO, C., MENCZER, F., BENZ, D.,
HOTHO, A., AND STUMME, G. 2009. Evaluating similarity
measures for emergent semantics of social tagging. In Proceed-
ings of the 18th international conference on World wide web,
April, Citeseer, 20–24.

MILLER, G. 1995. Wordnet: a lexical database for english. Com-
munications of the ACM 38, 11, 39–41.

NILES, I., AND PEASE, A. 2003. Mapping WordNet to the SUMO
ontology. In Proceedings of the IEEE International Knowledge
Engineering conference, 23–26.

NONNECKE, B., PREECE, J., AND ANDREWS, D. 2004. What
lurkers and posters think of each other. Hawaii International
Conference on System Sciences 7, 70195a.

O’DONOVAN, J. 2009. Capturing trust in social web applica-
tions. In Computing with Social Trust, J. Golbeck, Ed., Human-
Computer Interaction Series. Springer London, 213–257.

PAZZANI, M. J., AND BILLSUS, D. 2007. Content-based recom-
mendation systems. In The Adaptive Web, 325–341.

PIANTA, E., BENTIVOGLI, L., AND GIRARDI, C. 2002. Develop-
ing an aligned multilingual database. In Proc. 1st Intl Conference
on Global WordNet.

PREECE, J., NONNECKE, B., AND ANDREWS, D. 2004. The top
five reasons for lurking: improving community experiences for
everyone. Computers in Human Behavior 20, 2 (March), 201–
223.

SARWAR, B., KARYPIS, G., KONSTAN, J., AND RIEDL, J. 2001.
Item-based collaborative filtering recommendation algorithms.
In Proceedings of the 10th international conference on World
Wide Web, ACM, 285–295.

SCHAFER, J., FRANKOWSKI, D., HERLOCKER, J., AND SEN, S.
2007. Collaborative filtering recommender systems. The adap-
tive web, 291–324.

SZOMSZOR, M., CATTUTO, C., ALANI, H., O’HARA, K., BAL-
DASSARRI, A., LORETO, V., AND SERVEDIO, V. D. 2007.
Folksonomies, the semantic web, and movie recommendation.
In Proceedings of the Fourth European Semantic Web Confer-
ence, 71–85.

TANG, J., HUI, S., ZHOU, B., FONG, A. C. M., AND HONG, G.
2012. Generation of personalized ontology based on consumer
emotion and behavior analysis. IEEE Transactions on Affective
Computing 3, 2, 152–164.

TRANT, J. 2006. Exploring the potential for social tagging and
folksonomy in art museums: Proof of concept. New Review of
Hypermedia and Multimedia 12, 1, 83–105.

TRANT, J. 2009. Studying social tagging and folksonomy: A re-
view and framework. Journal of Digital Information 10, 1.

VESIN, B., IVANOVIĆ, M., KLAŠNJA-MILIĆEVIĆ, A., AND
BUDIMAC, Z. 2012. Protus 2.0: Ontology-based semantic
recommendation in programming tutoring system. Expert Systems
with Applications 39, 15, 12229–12246.

WANG, Y., STASH, N., AROYO, L., HOLLINK, L., AND
SCHREIBER, G. 2009. Semantic relations for content-based
recommendations. In K-CAP, 209–210.

XU, Z., FU, Y., MAO, J., AND SU, D. 2006. Towards the seman-
tic web: Collaborative tag suggestions. In Collaborative web
tagging workshop at WWW2006, Edinburgh, Scotland, Citeseer.