How to assess the impact of an electronic document? And what does impact mean anyway? Reliable usage statistics in heterogeneous repository communities

This document is a preprint of the formal publication with the same title, written by the same authors, in: OCLC Systems & Services 26 (2), pp. 133-145, www.emeraldinsight.com/10.1108/10650751011048506, DOI: 10.1108/10650751011048506. Publisher: Emerald Group Publishing Limited.

Authors:
Ulrich Herb - Saarland University and State Library, Saarbrücken, Germany (corresponding author)
Eva Kranz - Saarland University and State Library, Saarbrücken, Germany
Tobias Leidinger - Saarland University and State Library, Saarbrücken, Germany
Björn Mittelsdorf - Saarland University and State Library, Saarbrücken, Germany

Purpose
The impact of research and researchers is usually quantified using citation data: either journal-centred citation data, as in the case of the Journal Impact Factor (JIF), or author-centred citation data, as in the case of the Hirsch index (h-index). The paper discusses a range of impact measures, especially usage-based metrics. Furthermore, the authors report the results of two surveys. The surveys focused on innovative features for open access repositories, with an emphasis on functionalities based on usage information.

Design/methodology/approach
The first part of the article analyses both citation-based and usage-based metrics. The second part is based on the findings of the two surveys: one in the form of a brainstorming session with information professionals and scientists at the OAI6 conference in Geneva, the second in the form of expert interviews, mainly with scientists.

Findings
The results of the surveys indicate an interest in the social aspects of science, such as visualisations of social graphs both for persons and their publications. Furthermore, usage data is considered an appropriate measure to describe the quality and coverage of scientific documents, although the consistency of usage information among repositories has to be kept in mind. The scientists who took part in the survey also asked for community services, assuming these might help to identify relevant scientific information more easily. Other topics of interest were personalisation and easy submission procedures.

Originality/value
This paper delineates current discussions about citation-based and usage-based metrics. Based on the results of the surveys, it depicts which functionalities could enhance repositories, what features are required by scientists and information professionals, and whether usage-based services are considered valuable. These results also outline some elements of future repository research.

Acknowledgments
The authors would like to thank Philipp Mayr, Sven Litzcke, Cornelia Gerhardt, the experts who prefer to remain anonymous, and all participants of Breakout Group 6 at the OAI6 conference.

Introduction
As Harnad (2008) explains, the meaning of an impact measure can only be determined by correlating said measure with either another measure (construct validity) or an external criterion (external validity). But which data should be employed to check impact measures like the Journal Impact Factor or the Hirsch index? The range and divergence of potential validating data sets, in terms of their object selection, object granularity, and complexity of calculation instructions, reveal that the scientific value of a document has multiple dimensions (Moed 2005b).
The actual choice depends on the perspective from which the impact (usefulness) question is asked. Galyani Moghaddam & Moballeghi (2008) give an extensive overview of possible methods. A matter seldom addressed, however, is the concrete motivation for impact measurement, a question that can help to define what impact should mean in a specific context. Statistical predictions and especially quality assessments can become self-fulfilling prophecies, especially if the numbers are already in use. If we used the height of academics as a quality criterion when appointing new staff members, academic teams would naturally become taller. A later study of height and institutional quality would find a high correlation between quality and height, not because of the inevitable working of things, but because this relation was man-made and the variables were confounded to begin with. Nicholas addresses this issue, commenting on the Journal Impact Factor in an interview conducted by Shepherd (2007).

Scientometric perspective: Eugene Garfield devised the Journal Impact Factor (JIF) as a filter criterion to determine whether a journal should be included in the Science Citation Index (SCI) sample (Garfield 2006). At that time each journal in the sample meant a serious amount of work, so the restriction to a finite number of journals was not only a matter of quality but also of practicability. The assumption is that a higher rate of citations indicates a higher importance/quality/impact of an article and, more importantly, of the journal. In the context of journal assessment the JIF is presumably superior to simple publication counts, as quantity does not depend on quality; but it can be argued that the JIF rather describes a journal's prestige, concentrates on established topics, and depends on a certain amount of honesty, while it can be easily misunderstood by the naive or corrupted by the dishonest (Brumback 2009).

Evaluation perspective: "The use of journal impacts in evaluating individuals has its inherent dangers. In an ideal world, evaluators would read each article and make personal judgements." (Garfield 2006) There are two main problems in evaluation: scientific quality and researcher value are assumed to be multidimensional, including characteristics outside the publication activity (Moed 2005b), and subjective or personal statements, i.e. evaluation by peers, are not reproducible and nowadays have an air of bias and distortion about them. Jensen et al. (2009) investigate career predictors for French scientists in the national research organisation CNRS, where promotions are decided by a peer committee. The correlation between promotion on the one hand and publication and citation measures on the other is highest for the Hirsch index. Nevertheless, the share of promotions correctly predicted by h is only 48%. This gap might result from human failings that h cannot predict, or from measurement bias that the experts do not succumb to, but presumably from a mixture of both. Therefore the questions are: Which variables should be collected in addition to citation metrics? How should the variables be weighted? How can fairness and openness be maximised, and can objective measures and human-made recommendations be synchronised? The need for multidimensional evaluation is shown by Shepherd (2007), who reports that over 40% of the web survey sample perceive the JIF as a valid measure, while over 50% regard the JIF as over-used.
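For reference, the h-index discussed here is the largest number h such that h of an author's publications have each received at least h citations. A minimal sketch of this definition, with purely illustrative citation counts rather than data from any study cited above:

def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h

# Five papers with these citation counts yield h = 3.
print(h_index([12, 7, 3, 2, 1]))  # -> 3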
Journal perspective: The motivations of journal editors can be assumed to be largely economic, as only economically sound journals can compete with economically managed journals in a spiral of competition. Mabel et al. (2007) investigated the attitudes of medical editors towards the JIF and their handling of independent variables which are likely to increase their journal's JIF rating. Editors relied chiefly on improving the quality of their staff to boost author recruiting, article selection, and author support. The JIF was accepted as the status quo, but editors expressed their concern that the JIF is not useful for impressing their practising readership; thus, they could not rely solely on optimising their JIF scores. They hope that complementary metrics representing clinical impact or public advancement will be implemented. Empirical analyses of the interrelation of journal characteristics and journal performance (McWilliams et al. 2005) seem to contradict some of the medical editors' statements. It is rather likely that different circumstances in management science and medicine account for these discrepancies. Further assessment of the properties of different disciplines will improve the transfer of insights.

Library perspective: Electronic documents fundamentally change the mechanisms in libraries. Whereas in former times the library was the keeper of objects and controlled and monitored, sometimes even created, the processes of search, localisation, and usage, it has become an intermediate agent nowadays. These changes might be as trivial as people being able to receive a text without physically visiting the library, leading to bulletin boards no longer being read. Librarians have to adapt by offering telematic versions of their services (Putz 2002). On the other hand, the easily adaptable electronic reception desk offers opportunities for personalisation, customisation and portalisation. Ghaphery & Ream (2000) and Ketchell (2000) warn that personalisation appeals rather to the professional or heavy user, whereas customised views, centred on a specific topic or even specific academic classes, aid the average student user, who shuns high investments (e.g. login procedures, training periods) and has to change the focus of interest quickly. Additionally, metrics can aid subject librarians in the compilation of resources. Another issue is billing. As libraries no longer have control over objects, they have to rely on external server statistics provided by the publishers and hosts to make licensing and acquisition decisions. The COUNTER standard (Project COUNTER 2008) is widely used to generate usage reports, but its granularity is rather unfit to support acquisition decisions: usage is reported per journal, not per volume, making it impossible to identify irrelevant vintages that would better be bought article by article on demand rather than included in a flat-rate licensing bundle. COUNTER also tends to distort the actual usage; for example, Davis & Price (2006) report how interface characteristics of the publisher portal can unjustly increase usage frequency.

Educational perspective: Educational science research in general focuses on the classroom aspects of digitisation. Collaborative work environments, online instruction, and testing environments are designed and evaluated to enhance the lecturers' efficiency, for example with homework management, and on the other hand to boost student-to-student and student-to-tutor communication (Appelt 2001).
Electronic resources are produced by students or prepared by the lecturer to be stored and possibly versioned in the system. Course reserve collections are often created independently from library activities, as many coursework software systems are designed as closed applications which cannot easily be connected with other services. The aim of education is on the one hand to teach the curricula, but on the other hand emphasis is placed on teaching information and communication technology (ICT) competence (Sikkel et al. 2002). As education relies heavily on textbooks, contemporary citation measures are not applicable here.

The usability perspective is a specialised point of view that can complement the paradigms described above. It is most obvious in education, as most education research explicitly includes investigations of ease of use and practicability. On the other hand, institutions and organisations in a competitive environment (libraries, universities, and publishers) can improve their strategic position by increasing user efficiency. These can be purely technical aspects (e.g. a user achieving his goal with fewer steps requires less server computation time), but in general it has to be discussed whether the service fulfils the request at all and whether it meets the user's needs. Much-discussed aspects of usability in information dissemination are recommender services, though their main application is in the commercial area (Montaner et al. 2003). The vast amount of works already available and the increasing growth rate can be assumed to overload the faculties of non-elite information hunters and gatherers (i.e. most students, practitioners, interested private persons, and persons concerned). Even a professional academic researcher can overlook an article and be informed in peer review about his non-optimal library search. But recommenders do not only help to clarify a topic: content providers are very interested in recommenders that show the visitor alternative objects of interest, hoping that he spends more time with the provider's products. This can be a benefit in itself, as straying users increase the number of views and visits, which is reflected in COUNTER statistics as well as in revenues for paid advertisements. Other aspects of usability include, among others, the visualisation of data and personalisation, including user notes and comments being saved for later visits; see the sections Expert Interviews and Brainstorming Session later in this article for further examples.

Valid usage statistics, however, are valuable to all of the perspectives. To scientometrics they are an additional database that enables research into construct validity and sociological aspects of citation habits, though it has to be emphasised that there is no mono-variant relation between usage and citation (Moed 2005a); possibly citations and usage are independent dimensions of multi-dimensional impact. Access to an electronic resource can be measured in real time and, to a certain extent, in-house. This should appeal to evaluation committees as well as to developers (and usability testers) of educational methods and academic services. Methodologically speaking, access and to a lesser degree usage are observable, whereas questionnaires and even references/citations are susceptible to bias and distortions based on human language, beliefs and self-consciousness (Nicholas et al. 2005). Libraries and publishers have always counted their users' activity; it is a simple result of billing.
And of course these numbers were used to advertise the journal, following the logic of the tyranny of the majority: the journal read by many should also be read by you.

There are problems that have to be addressed. The observable event in a repository of digitised objects reached via HTTP is the client computer's request for said object to the web server. Neither a human intention nor successful delivery is strictly necessary: there are visits that result from search engines updating their search indices, from errantry, and from prefetching. Attributing requests to different individuals is further hampered by technologies like thin clients and proxy servers, but also by public search terminals. Thin clients allow their users to interact with software located and executed on a central infrastructure; in the case of web browsers this implies that all browser instances serving one thin client cluster are routed via one IP address. Intransparent proxies, nowadays mainly an important aspect of network security, pose the same problem. The obvious solution is to identify a unique user not only via the request IP address but also by utilising session identifiers transmitted as cookies or dynamically created URL arguments. However, there is no reliable way to tell apart visitors who use the same physical machine and account. This is common in educational facilities with search terminals located, for example, in libraries. It would be necessary to clean the browser's cache each time before a successor begins his work, or to mark the account as belonging to multiple persons, for example by an appendix to the user-agent header field. Furthermore, aggregated statistics (e.g. author statistics) suffer from multiple instances of one document, but also from print-outs and the private sharing of articles, making it very hard for the statistics provider to produce an ecologically valid parameter (Nicholas et al. 1999); see Stassopoulou & Dikaiakos (2007) for a dynamic approach to robot detection. The heterogeneity of perspectives strongly indicates that a single measure, even a single method, is hardly a reliable basis for decisions. Furthermore, this diversity implies that even if one perspective were to reject usage analysis for scientifically valid reasons, this cannot automatically extend to other motivations.

Open Access Statistics (OA-S) and similar projects
Interoperable Repository Statistics (IRS) is a British project tailored to the British repository context. Utilising Perl scripts and the software tool AWStats, access logs of EPrints and DSpace repositories can be analysed. Its strength lies in the well-prepared presentation possibilities, offering various kinds of graphs and granularities (IRS 2007). MESUR is a research project which has established a large set of usage data by logging activities on publisher and link resolver servers. It aims at the creation and validation of usage-based metrics as well as validity testing of other impact measures (Bollen et al. 2008). The PEER project investigates the relation between open access archiving, research, and journal viability (Shepherd & Wallace 2009a). To this end PEER measures the usage of a closed set of journals made open access by publishers for the project's duration. PIRUS is developing a standard for article-level usage reports that adheres to the COUNTER Codes of Practice. A prototype and an abstract description were created to enable document hosts to report the raw access data to an aggregating server or to process it themselves (Shepherd & Needham 2009b).
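To make the filtering and aggregation steps discussed above more concrete, the following minimal sketch counts article-level requests after discarding obvious robot traffic and collapsing rapid repeat requests from the same session. It is not the PIRUS or OA-S implementation; the event fields, the robot markers, and the 30-second double-click window are assumptions chosen purely for illustration.

from collections import defaultdict
from datetime import datetime, timedelta

# Illustrative raw access events as a host might extract them from its web
# server log: (timestamp, session key, article identifier, user agent).
ROBOT_MARKERS = ("googlebot", "slurp", "bingbot", "crawler", "spider")

def is_robot(user_agent):
    ua = user_agent.lower()
    return any(marker in ua for marker in ROBOT_MARKERS)

def article_level_counts(events, double_click_window=timedelta(seconds=30)):
    """Count requests per document, dropping known robots and collapsing
    rapid repeat requests from the same session."""
    last_seen = {}                     # (session, article) -> last counted time
    counts = defaultdict(int)          # article -> filtered request count
    for timestamp, session, article, user_agent in sorted(events):
        if is_robot(user_agent):
            continue
        key = (session, article)
        previous = last_seen.get(key)
        if previous is not None and timestamp - previous < double_click_window:
            continue                   # treat as a double click, not a new use
        last_seen[key] = timestamp
        counts[article] += 1
    return dict(counts)

events = [
    (datetime(2009, 7, 1, 10, 0, 0), "s1", "oai:repo:123", "Mozilla/5.0"),
    (datetime(2009, 7, 1, 10, 0, 5), "s1", "oai:repo:123", "Mozilla/5.0"),
    (datetime(2009, 7, 1, 11, 0, 0), "s2", "oai:repo:123", "Googlebot/2.1"),
]
print(article_level_counts(events))   # {'oai:repo:123': 1}

In a centralised setting such as the one described next, a host would transmit the raw events and a service of this kind would run on the aggregating server; in a local setting it would run on the repository itself.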
Open Access Statistics (OA-S, http://www.dini.de/projekte/oa-statistik/english/) is a project funded by the German Research Foundation (Deutsche Forschungsgemeinschaft, DFG) and conducted by the project partners State and University Library Göttingen (Georg-August-Universität Göttingen), the Computer and Media Service at Humboldt University Berlin, Saarland University and State Library, and the University Library Stuttgart. OA-S aims to (1) establish protocols and algorithms to standardise the calculation of usage frequency for web-technology-based open access repositories, (2) create an infrastructure to collect the raw access data and process it accordingly, and (3) supply the participating repositories with the usage metrics. In contrast to IRS, statistical parameters are not calculated locally, so in addition to article-level measurements, parameters beyond document granularity can be implemented, such as author-centred statistics or usage aggregation over different versions or multiple instances of the same publication (preprint vs. postprint, self-deposit vs. repository copy). This flexibility in scope should ease the combination and comparison of usage statistics and bibliometric indices. Methods and experiences are similar to those of the PIRUS project, but OA-S concentrates on the centralised aggregation strategy and faces an even more diverse repository software ecosystem. In addition to the bibliometric perspective, the project will specify functionalities to enhance the usability of repositories, e.g. quality valuations and document recommendations, based among other things on usage data. In order to focus on the services most yearned for, a questionnaire survey will be conducted to determine actual user priorities. It should be noted that neither the interviews nor the brainstorming were limited to a special perspective or methodology; all ideas were accepted equally.

Expert Interviews
The expert interviews were conducted according to the guidelines given by Bogner et al. (2005). Most experts were identified through their publications. To add the publisher and user perspective, persons who are involved in journal production and are situated at Saarland University were contacted, too. Five out of ten candidates agreed to participate in a loosely structured interview. Interview length ranged from 12 to 57 minutes. No person-centred presentation of results is given, to ensure privacy. Most interviews were conducted via phone. All were recorded with explicit consent from the participants and afterwards transcribed to text. The following list consists of the experts' ideas and inspirations from the interviews:
1. Recommender (high-usage-quota-based)
2. Freshness recommender (recent-publication-based)
3. Minority recommender (low-usage-quota-based)
4. Profile recommender (based on profile similarities)
5. Subject recommender (thematic-proximity-based)
6. Usage-similarity recommender (clickstream-similarity-based)
7. Citation recommender (citation-intersection-based)
8. Favourites recommender (based on users' favourites lists)
9. Recommendation of central authors
10. School-of-thought recommender (scientific social network graph)
11. Author-centred usage statistics
12. Repository-centred usage statistics
13. Subject-centred usage statistics
14. User-centred usage statistics
15. Reordering links (usage-quota-based)
16. Collapsing links in large result sets (usage-quota-based)
17. Re-rendering result list layout
18. Dead link identification
19. Users' quality statements, i.e. comments (free text)
20. Users' quality statements (rating)
21. Quality statements (usage-based)
22. Ensuring document accessibility (bridging the gaps between different storages)
23. Automated retro-digitisation requesting
24. Automated translation requesting
25. Feed notifications
26. Notifying friends manually (e.g. via e-mail)
27. Search phrase recommender
28. Search result commenting

Brainstorming Session
A brainstorming session was conducted as part of Breakout Group 6 at the OAI6 conference in Geneva (Mittelsdorf & Herb 2009). In contrast to the expert interviews, many proposals were concerned with interface design and with data visualisation and presentation. This possibly results from the fact that many participants are situated in libraries. The ideas are grouped and preceded by buzzword labels (Arabic numbers) to improve readability.
1. Authority/Standardisation
   a. Central unique author identification
      i. Author
      ii. Identification/Profile
      iii. Picture
      iv. Projects
      v. Competence
   b. Network of authors
      i. Social
      ii. Professional
      iii. Expertise
      iv. Field of interest
2. Visualisations/Indexing dimensions
   a. Paper's context
   b. Visual social graph
   c. Show development of ideas (network graph displaying publication times for a document set)
   d. Visualisation of publication's geo-location
   e. Position publication in the "landscape of science"
   f. Project's social map
   g. Visualise data and connections
   h. Semantic classification
   i. Numerical semantics (speech independence)
3. Barrier reduction
   a. Connect to the world (link between science and application)
   b. Publication-news binding
   c. Solicit further research
      i. Need stack
      ii. Wish list
      iii. Request notification
   d. Practicable access to repositories not only via modern PC capabilities and resolution (e.g. mobile phones, handhelds, OLPC, etc.)
4. Reception tracking
   a. Consistent access statistics
   b. Re-use tracking
   c. Enhanced (complex) metrics (for better evaluation)
5. Assistance (author and user)
   a. Automatic update/linking of pre-print and new version
   b. Thumbnail/snapshot creation (first page display)
   c. Integrate everything (information, processes, and results) into one seamlessly working whole
   d. Modification of the document catalogues' structures
6. Assistance (author)
   a. Real-time assistance in form fill-in
   b. Automatic metadata creation/lookup
   c. Reduce redundant work (intelligent submission)
   d. Dynamic publication (version management of production and reception; collaborative production)
   e. Easy submission process
   f. Dynamic publication list (exportable)
   g. Bonus point system
   h. Easy feedback from authors to repository
   i. Repository as workspace
   j. Repository as research/production environment
   k. Educational assistance/encouragement for new authors (how-tos)
   l. Automatic/easy classification/positioning of new publications
   m. Automatic citation generation
7. Assistance (user)
   a. Track/pursue other searchers' way through the repository
   b. User recommendations as part of the repository
   c. Graph/image extraction from papers
   d. Dataset extraction
   e. Assign personalised searchable attributes
      i. Personal comments
      ii. Pictures as bookmarks
      iii. Memory aids
      iv. Relevance statements
   f. Transparent relevance criteria for result display

Comparison
Simple numbers in this section indicate items from the expert interview set, while number-letter combinations belong to the brainstorming set. Both samples expressed a strong awareness of, and interest in, the social aspects and laws shaping modern science, calling for social graphs of publications and authors (10. and 2.a.-f.).
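As an illustration of the kind of social graph requested in these items, the following sketch derives a weighted co-authorship graph from repository metadata. The record structure and names are assumptions made for this example; a real service would additionally rely on author identification or authority files (1.a.) to disambiguate names.

from collections import defaultdict
from itertools import combinations

# Hypothetical repository metadata: each record lists the authors of one
# publication. The data below is purely illustrative.
records = [
    {"id": "doc1", "authors": ["A. Author", "B. Author"]},
    {"id": "doc2", "authors": ["B. Author", "C. Author"]},
    {"id": "doc3", "authors": ["A. Author", "B. Author", "C. Author"]},
]

edges = defaultdict(int)   # (author, author) -> number of joint publications
for record in records:
    for pair in combinations(sorted(record["authors"]), 2):
        edges[pair] += 1

# Each weighted edge can be handed to a graph layout or visualisation tool.
for (author_a, author_b), weight in sorted(edges.items()):
    print(f"{author_a} -- {author_b}: {weight} joint publication(s)")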
Statistics were perceived as an information source to judge quality and coverage (11.-14.), whereas the brainstorming group emphasised inter-repository consistency of statistics as a precondition (4.a.) and possible benefits to evaluation (4.c.). Many ideas revolved around community sharing, assuming a positive shift in the amount of work required to identify an interesting paper (7.a., 7.b. and 27.). These trends are probably inspired by the widely perceived impact of user-generated content and community content management. The same is probably true of 7.b., 7.e.i-iv, 28., 19., 20., and 8., but in addition this strong demand for personalisation mechanisms implies that users have the impression of many redundant steps in repository handling (e.g. searching for a specific piece of text in a previously read article). Overall, the experts accepted the repository interface as it is, in contrast to the brainstorming group. Most technical and bureaucratic proposals came from the latter, possibly because a majority of them is employed in the library/knowledge management sector. The experts interviewed, on the other hand, emphasised that not only the number of services is important but also each service's success rate: all of them would tolerate recommender systems with an accuracy of 90% or more, but would rather not be bothered by the noise produced by an inaccurate service. There seems to be a demand for complex measures and the unfiltered presentation of complex interrelations instead of simplifications. The persons interviewed no longer believed in the promise of simple numbers describing the world without loss of information.

Future Research
To investigate the desirability order of the collected ideas, quantitative methods will be used. The questionnaire will have three logical parts:
1. demographic questions, for identifying user subcultures and for later data re-use;
2. general "attitude towards different kinds of service" questions; these are filter questions that cause blocks of questions for specific services to be asked;
3. specific questions; a participant will have to answer a number of thematic blocks based on his general attitudes.
The questionnaire will be a set of HTML forms. Adaptive testing is easily implemented using dynamically generated HTML pages. Adaptive testing reduces the number of items presented, which helps to prevent participants from giving random answers to questions they are not interested in. In electronic testing there is also no need to manually transcribe answers from hard-copy forms into the computer, thus eliminating the risk of transcription errors. Execution via HTML forms is today the cheapest and most efficient way to conduct a survey targeting a large and international sample. There will be at least a German and an English version.

Conclusion
The ideas presented in this paper provide especially those persons concerned with usability improvement and the creation of new services with valuable hints from the library or interface perspective. The informative value will greatly increase once the results of the questionnaire survey can be quantitatively interpreted. The benefit to the other perspectives should not be underrated. Aside from designing specialised tools for evaluators, the data needed to implement added-value services and the data generated by visitors utilising these services can be integrated with established data sources, increasing validity and the amount of variance explained. Usage data can be used to analyse the validity of bibliometric constructs.
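One way to carry out such a validity analysis is to correlate download counts with citation counts for a set of documents, in the spirit of Moed (2005a). The following minimal sketch computes a Spearman rank correlation over purely illustrative numbers; the figures are not taken from any study cited in this article, and the rank correlation is only one of several reasonable choices for skewed usage data.

# Purely illustrative counts for five documents.
downloads = [120, 45, 300, 10, 80]
citations = [8, 2, 15, 0, 5]

def ranks(values):
    """Average ranks (1-based), handling ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1          # average of tied positions
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def spearman(xs, ys):
    """Pearson correlation of the rank-transformed values."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mean = (n + 1) / 2
    cov = sum((a - mean) * (b - mean) for a, b in zip(rx, ry))
    var_x = sum((a - mean) ** 2 for a in rx)
    var_y = sum((b - mean) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

print(round(spearman(downloads, citations), 2))   # 1.0 for these toy numbers

A weak or unstable correlation in real data would support the view, stated above, that usage and citation are independent dimensions of a multi-dimensional impact rather than substitutes for one another.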
New modes of synchronous and asynchronous communication can help libraries and universities, and even publishers, to tailor their stock to their clients' demands and, for example, to rectify content or reference structures. A stronger awareness of the social aspects of the publishing process can renew peer communication and make peer review more transparent, if not completely open. Educational as well as non-academic personnel is not only a beneficiary but, as shown in the brainstorming, can be a source of major transformations, assuming that it is supported by students, academics, and bureaucrats. Additionally, the use of open protocols and standards for object description and data transfer is strictly necessary: different solutions can aid the innovation process, but this should not be an excuse for implementing the same algorithm on a different set of objects without retaining interoperability with other providers. The OAI standards as well as standards such as IFABC, ORE, and OpenURL ContextObjects need to be employed and further refined.

References
Appelt, W. (2001), "What groupware functionality do users really use? Analysis of the usage of the BSCW system", in Parallel and Distributed Processing 2001, Mantova, 2001, IEEE, pp. 337-341.
Bogner, A., Littig, B. and Menz, W. (2005), Das Experteninterview: Theorie, Methode, Anwendung, VS Verlag für Sozialwissenschaften, Wiesbaden.
Bollen, J., Van de Sompel, H. and Rodriguez, M.A. (2008), "Towards usage-based impact metrics: first results from the MESUR project", in Proceedings of the 8th ACM/IEEE-CS Joint Conference on Digital Libraries, Pittsburgh, 2008, ACM, New York, pp. 231-240.
Brumback, R.A. (2009), "Impact Factor Wars: Episode V – The Empire Strikes Back", Journal of Child Neurology, Vol. 24 No. 3, pp. 260-262.
Davis, P.M. and Price, J.S. (2006), "eJournal interface can influence usage statistics: Implications for libraries, publishers, and Project COUNTER", Journal of the American Society for Information Science and Technology, Vol. 57 No. 9, pp. 1243-1248.
Galyani Moghaddam, G. and Moballeghi, M. (2008), "How Do We Measure Use of Scientific Journals? A Note on Research Methodologies", Scientometrics, Vol. 76 No. 1, pp. 125-133.
Garfield, E. (2006), "The History and Meaning of the Journal Impact Factor", Journal of the American Medical Association, Vol. 295 No. 1, pp. 90-93.
Ghaphery, J. and Ream, D. (2000), "VCU's My Library: Librarians Love It. ...Users? Well, Maybe", Information Technology and Libraries, Vol. 19 No. 4, pp. 186-190.
Harnad, S. (2008), "Validating Research Performance Metrics Against Peer Rankings", Ethics in Science and Environmental Politics, Vol. 8, pp. 103-107.
IRS (2007), "IRS: Interoperable Repository Statistics", available at: http://irs.eprints.org/ (accessed 17 July 2009).
Jensen, P., Rouquier, J.-B. and Croissant, Y. (2009), "Testing bibliometric indicators by their prediction of scientists' promotions", Scientometrics, Vol. 78 No. 3, pp. 467-479.
Ketchell, D.S. (2000), "Too Many Channels: Making Sense out of Portals and Personalization", Information Technology and Libraries, Vol. 19 No. 4, pp. 175-179.
Mabel, C., Villanueva, E.V. and Van Der Weyden, M.B. (2007), "Life and times of the impact factor: retrospective analysis of trends for seven medical journals (1994-2005) and their Editors' views", Journal of the Royal Society of Medicine, Vol. 100 No. 3, pp. 142-150.
McWilliams, A., Siegel, D. and Van Fleet, D.D. (2005), "Scholarly Journals as Producers of Knowledge: Theory and Empirical Evidence Based on Data Envelopment Analysis", Organizational Research Methods, Vol. 8 No. 2, pp. 185-201.
Mittelsdorf, B. and Herb, U. (2009), "Breakout group 6. Access Data Mining: A new foundation for added-value services in full text repositories", available at: http://indico.cern.ch/contributionDisplay.py?contribId=72&confId=48321 (accessed 17 August 2009).
Moed, H.F. (2005a), "Statistical Relationships Between Downloads and Citations at the Level of Individual Documents Within a Single Journal", Journal of the American Society for Information Science and Technology, Vol. 56 No. 10, pp. 1088-1097.
Moed, H.F. (2005b), Citation Analysis in Research Evaluation, Springer Netherlands, Dordrecht.
Montaner, M., López, B. and de la Rosa, J.L. (2003), "A Taxonomy of Recommender Agents on the Internet", Artificial Intelligence Review, Vol. 19 No. 4, pp. 285-330.
Nicholas, D., Huntington, P., Lievesley, N. and Withey, R. (1999), "Cracking The Code: Web Log Analysis", Online & CD-ROM Review, Vol. 23 No. 5, pp. 263-269.
Nicholas, D., Huntington, P. and Watkinson, A. (2005), "Scholarly journal usage: The results of deep log analysis", Journal of Documentation, Vol. 61 No. 2, pp. 248-280.
Project COUNTER (2008), "COUNTER Codes of Practice", available at: http://www.projectcounter.org/code_practice.html (accessed 17 July 2009).
Putz, M. (2002), Wandel der Informationsvermittlung in wissenschaftlichen Bibliotheken, University of Applied Sciences for Library and Information Management, Eisenstadt (accessed 17 July 2009).
Shepherd, P. (2007), "Final Report on the Investigation into the Feasibility of Developing and Implementing Journal Usage Factors", available at: http://www.uksg.org/sites/uksg.org/files/FinalReportUsageFactorProject.pdf (accessed 17 July 2009).
Shepherd, P. and Wallace, J.M. (2009a), "PEER: a European project to monitor the effects of widespread open access archiving of journal articles", Serials, Vol. 22 No. 1, pp. 19-23.
Shepherd, P. and Needham, P.A.S. (2009b), "PIRUS Final Report", available at: http://www.jisc.ac.uk/media/documents/programmes/pals3/pirus_finalreport.pdf (accessed 17 August 2009).
Sikkel, K., Gommer, L. and Van der Veen, J. (2002), "Using Shared Workspaces in Higher Education", Innovations in Education and Teaching International, Vol. 39 No. 1, pp. 26-45.
Stassopoulou, A. and Dikaiakos, M.D. (2007), "A Probabilistic Reasoning Approach for Discovering Web Crawler Sessions", Lecture Notes in Computer Science, Vol. 4505, pp. 265-272.

Biographical Notes
Ulrich Herb studied sociology at Saarland University, Germany. He is a member of the electronic publishing group of Saarland University and State Library.
Affiliation: Saarland University and State Library, Saarbrücken, Germany
Eva Kranz is studying bioinformatics at Saarland University, Germany, where she has also been working as a student assistant for Open Access Statistics since 2008. Ms Kranz is actively involved in the open source project Collabtive, where she is responsible for development, documentation and community management.
Affiliation: Saarland University and State Library, Saarbrücken, Germany
Tobias Leidinger is studying computer science at Saarland University, Germany. He has been working for several electronic publishing projects at Saarland University and State Library (e.g. OPUS 4 and Open Access Statistics) since 2006.
Affiliation: Saarland University and State Library, Saarbrücken, Germany
Björn Mittelsdorf has been a member of Open Access Statistics since 2008. Previously he spent two years at the Institute for Psychology Information, Trier, Germany, where he was involved in the digital preservation of primary research data.
Affiliation: Saarland University and State Library, Saarbrücken, Germany