key: cord-0574772-oxd63g3c
authors: Derzsy, Noemi; Majumdar, Subhabrata; Malik, Rajat
title: An Interpretable Graph-based Mapping of Trustworthy Machine Learning Research
date: 2021-05-13
journal: nan
DOI: nan
sha: 125c2478b12e9a5fbfd1bdc03b9f5c68a0bd5691
doc_id: 574772
cord_uid: oxd63g3c

There is an increasing interest in ensuring machine learning (ML) frameworks behave in a socially responsible manner and are deemed trustworthy. Although considerable progress has been made in the field of Trustworthy ML (TwML) in the recent past, much of the current characterization of this progress is qualitative. Consequently, decisions about how to address issues of trustworthiness and future research goals are often left to the interested researcher. In this paper, we present the first quantitative approach to characterize the comprehension of TwML research. We build a co-occurrence network of words using a web-scraped corpus of more than 7,000 recent peer-reviewed ML papers, both related and unrelated to TwML. We use community detection to obtain semantic clusters of words in this network, from which the relative positions of TwML topics can be inferred. We propose an innovative fingerprinting algorithm to obtain probabilistic similarity scores for individual words, then combine them to give a paper-level relevance score. The outcomes of our analysis yield a number of interesting insights on advancing the field of TwML research.

With the unprecedented increase in the deployment of machine learning (ML) systems in the real world, there is an increasing need to ensure that such systems behave in a socially responsible manner. Responding to this challenge, in the recent past there has been a surge of interest from ML researchers and practitioners in developing algorithms and models that embody qualities such as fairness, explainability, privacy, and robustness. This sub-field of ML is often referred to by umbrella terms such as Responsible ML or Trustworthy ML [3, 30, 32].

Scientific literature on trustworthy ML (TwML) has grown rapidly in the past few years. While this presents tremendous opportunities for future technical work, there is a lack of codification and characterization of this knowledge base. Considering the interdisciplinary nature of the area and its major venues of publication (e.g. FAccT and AIES), with participation from fields like social sciences and public policy, such mapping is important not only to summarize existing work but also to inform new application areas of their relevance to TwML. To this end, there are a number of quality review and summary articles [3, 5, 16], as well as books [11]. However, by nature these are qualitative and static, and leave it to the reader to judge whether a topic of interest is relevant to TwML.

* Alphabetical authors. arXiv:2105.06591v1 [cs.SI] 13 May 2021

In this paper, we take a quantitative approach to characterize the comprehension of trustworthy ML research and its position with respect to contemporary scholarly work in the broader field of ML. Our text analytics approach is based on a weighted co-occurrence network of words which occur in the text of more than 7,000 peer-reviewed ML papers published in the last 5 years, both related and unrelated to TwML. We use this network for two purposes. First, we use community detection to obtain semantic clusters of words and infer the relative position of TwML topics. Second, we propose a novel relevance score to quantify the 'closeness' of individual words to TwML concepts, then combine these word-level scores in an interpretable manner to obtain paper-level relevance scores.

Related Work. Easy availability of bibliographic data has enabled a number of recent studies that aim to map scientific research spaces based on past scholarly work. Such studies span both general science [7, 34] and specific domains like Physics [4, 19] and Bioinformatics [13], as well as rapidly emerging interest areas like COVID-19 [33]. The use of complex network models is extremely popular in these studies. In previous work, network models have been built on data from citations [21], author information and collaborations [4, 6], pre-categorized research topics, such as Physics and Astronomy Classification Scheme (PACS) codes [4, 19], or words in the text of papers [13].

Popular methods for constructing scientific knowledge fall in two broad categories: embedding-based methods, and co-occurrence networks/knowledge graphs. Embedding-based methods, such as [4], typically map entities like words, sentences and documents to a high-dimensional numerical space (using techniques such as Word2vec), then form edges between two entities (such as words, authors, citations) based on their similarity as measured by some similarity metric. The second category of methods, on the other hand, builds a network directly from the connectivity patterns of entities. Edges or edge weights in the graph may correspond to specific relationships between entities for knowledge graphs [2], or to measures such as co-occurrence indicators/weights [13, 22].

To the best of our knowledge, the literature lacks a study which maps the research landscape and characterizes existing knowledge of TwML. In this paper we aim to fill this gap.

Goals and contributions. Contrasting with the mostly exploratory and inferential nature of studies on other fields of research, we aim for both inference and prediction. Specifically, our goal is to answer the following research questions in a data-driven manner:

Q1: Within the co-occurrence network of words in recent ML literature, can we characterize the relative position of words and concepts as they relate to TwML?

Q2: Can we predict which scientific papers that are not on TwML topics may be relevant to this sub-field?

Q3: For words or terms not directly pertaining to TwML, can we infer their context similarity with words/terms that do?

To address these questions, our contributions are:

1. We map the space of recent ML research using a word-level network generated from papers that cover general ML, as well as work specifically on TwML.

2. We propose a probabilistic fingerprinting method that quantifies the relevance of a word/paper to TwML. Word-level relevance scores are aggregated in an interpretable manner to obtain the paper-level scores.

3. Taking a broader point of view, we explore the network of words to identify words that, even though not overtly related to TwML (i.e. unlike terms such as fairness, transparency), are conceptually related to TwML.

As the nascent area of Trustworthy ML matures, through this paper we aim to initiate the quantitative study of this body of literature to channel future work in interesting directions, as well as create connections with new application areas.

To begin with, we scraped 7107 papers from the Proceedings of Machine Learning Research (PMLR) website. These papers have been presented at peer-reviewed conferences and workshops, and contain ML research covering a breadth of topics. Including this corpus in our analysis ensures we have a diverse dictionary of words and their co-occurrences. Due to the wide scope of the PMLR corpus, papers that specifically focus on TwML are mixed together with ones that do not. To create a distinctive set of papers that focus on TwML, we used a two-pronged approach. First, we obtained 221 papers from the ACM Digital Library that were published in past FAccT conferences (2018-2020), and labeled all of them as TwML-focused. Second, we curated a set of 74 words and terms related to TwML, starting from a list of obvious 'seed' words (e.g. 'bias', 'fairness') and manually iterating to include their variants relevant for TwML (e.g. 'algorithmic bias' but not 'biased coin') present in the FAccT corpus. We then label a paper in the larger PMLR corpus as TwML-focused if it contains at least one occurrence of any of these words, resulting in 263 more such papers. We label all the other PMLR papers as non-TwML-focused.

Pre-processing. In a scientific paper, raw data, tables, plots, proofs, and references can all contribute to a noisy dictionary. Authors spend considerable time deciding titles and writing abstracts to make them stand out more. Further, abstracts often present a high-level summary of the research problem and methodology [31]. Therefore, for our analysis, we restrict our corpus to only include titles, keywords (when present), and abstracts. We start with standard text pre-processing steps: convert text to lowercase, remove special and numeric characters, tokenize, remove stop words and single-character words, and finally stem words using the Snowball stemmer. After pre-processing, we use simple random sampling to split the corpus, assigning 90% of all papers (i.e., PMLR+FAccT) to the training set and the remainder to the test set.
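The pre-processing steps above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the stop-word list is a tiny placeholder, the `stem` argument is a hypothetical hook where a Snowball stemmer from an external library could be plugged in (it defaults to the identity so the sketch stays self-contained), and the function names and seed are assumptions made for illustration.

```python
import random
import re

# Tiny illustrative stop-word list (assumption; a real run would use a fuller list).
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "to", "is", "we", "for", "on", "with"}


def preprocess(text, stem=lambda word: word):
    """Lowercase, strip non-letters, tokenize, drop stop and 1-char words, then stem."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)  # remove special and numeric characters
    tokens = text.split()
    tokens = [t for t in tokens if t not in STOP_WORDS and len(t) > 1]
    return [stem(t) for t in tokens]  # `stem` is a placeholder for a real stemmer


def split_corpus(papers, train_fraction=0.9, seed=0):
    """Simple random sampling into a training set and a test set."""
    shuffled = list(papers)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


tokens = preprocess("We study Fairness in 3 ML models: a survey!")
train, test = split_corpus(range(20))
```

With the identity stemmer the tokens stay unstemmed; passing a real stemmer as `stem` would produce stemmed forms such as 'privaci' or 'trustworthi' that appear later in the text.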

Our methodological work has 3 components: (1) building a network of words and detecting communities of similar words, (2) fingerprinting papers as relevant to TwML, and (3) discovery of non-TwML words potentially relevant to the area of TwML. Figure 1 illustrates these processes, and we detail them below.

Network of words. Using the pre-processed text from our training corpus, we build a word co-occurrence network by connecting each pair of stemmed words that appear in the same abstract. Connections between words (i.e., nodes) are represented with a weighted edge. The weight reflects co-occurrence: the number of times the pair of words appeared together in an abstract. This construction scheme generates an undirected, weighted network of words. Next, we detect communities in this network and identify which communities our predefined list of TwML words occur in. To perform community detection effectively, we apply additional cleaning steps that denoise the graph by removing very high-frequency words. We apply a differential cutoff: we remove the top 10% highest-connectivity non-TwML words and the top 25% highest-connectivity TwML words that originate from splitting compound words (e.g., 'algorithmic bias' → 'algorithm' and 'bias'). Note that this splitting also converts the 74 TwML-specific words into 41 individual stemmed words. Finally, we use the Louvain community detection algorithm [1] to identify densely connected communities within the above network.
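The network construction and denoising described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: a pair is counted once per abstract in which both words appear, 'connectivity' is measured by weighted degree, and the function names are hypothetical. The Louvain step is not shown; an existing community detection implementation would be applied to the resulting edge weights.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(abstracts):
    """Map each unordered word pair to the number of abstracts containing both."""
    weights = Counter()
    for tokens in abstracts:
        # Assumption: each pair is counted once per abstract (set-based counting).
        for u, v in combinations(sorted(set(tokens)), 2):
            weights[(u, v)] += 1
    return weights


def weighted_degree(weights):
    """Sum of incident edge weights for every word in the network."""
    degree = defaultdict(int)
    for (u, v), w in weights.items():
        degree[u] += w
        degree[v] += w
    return dict(degree)


def drop_top_connected(weights, candidates, fraction):
    """Remove the `fraction` most connected words among `candidates`."""
    degree = weighted_degree(weights)
    ranked = sorted((w for w in candidates if w in degree),
                    key=lambda w: (-degree[w], w))
    removed = set(ranked[: int(len(ranked) * fraction)])
    kept = {pair: w for pair, w in weights.items() if not (set(pair) & removed)}
    return kept, removed


abstracts = [["fair", "model", "bias"], ["fair", "model"], ["model", "privaci"]]
weights = build_cooccurrence(abstracts)
kept, removed = drop_top_connected(weights, {"bias", "fair", "model", "privaci"}, 0.25)
```

In the setting of the text, `drop_top_connected` would be called twice with different candidate sets and fractions (0.10 for non-TwML words, 0.25 for TwML words that come from split compounds).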

Bi-level fingerprinting. We use a novel fingerprinting algorithm (Algorithm 1) to obtain probabilistic similarity scores for individual words or papers. To begin with, we score each word individually based on its weighted shortest path distance from TwML words, computed using Dijkstra's algorithm. We calculate the relevance score for a full paper as the weighted average of the word-level scores of all words in that paper, assigning larger weights to words that belong to the same community as TwML words:

(1)

Here, s_i > 0 is the relevance score of the i-th word in the paper. Given weights w_1 > w_2 ≥ 0, the contribution of a word to the paper-level score is w_1 s_i if it belongs to either of the two communities rich in TwML words (Table 1; denoted TwML community, or TC), and w_2 s_i if it belongs to any other community (denoted non-TwML community, or NTC). To score a paper, we consider the N words in its abstract that yield non-zero scores through Algorithm 1. Finally, the denominator normalizes a paper-level score by the maximum possible value, and the score is set to 0 if all word-level scores in a paper are 0. If the relevance score of a paper is ≥ 0.5, we flag the paper as potentially TwML-related. We use grid search to find optimal values of the weights: w_1 = 3, w_2 = 0.5.

Because of how the paper-level relevance scores (Eq. 1) are calculated, our probabilistic fingerprinting method is inherently interpretable. By analyzing the breakdown of a paper-level score into its constituent word-level scores, the user can obtain potential reasons why a paper may or may not be highly relevant to TwML. We discuss this in Sections 3 and 4.

Table 1: TwML words in communities. The first two rows contain the bulk of TwML-related words. The second row can be interpreted as a community relating to differential privacy.
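The two levels of the fingerprinting procedure can be sketched in Python under explicit assumptions, since Algorithm 1 and Eq. 1 are not reproduced in this text. The first function is a standard multi-source Dijkstra search that returns the shortest weighted distance from every reachable word to the nearest TwML word; the edge costs are an assumption (for example, the inverse of the co-occurrence weight), and the mapping from distance to a word-level score is left out because it is not specified here. The second function aggregates given word-level scores into a paper-level score, assuming the normalizer is the sum of the applied weights, which is one reading of 'weighted average' and 'maximum possible value'. All function and argument names are hypothetical.

```python
import heapq


def distances_from_seeds(adjacency, seeds):
    """Multi-source Dijkstra: shortest weighted distance from each node to any seed.

    `adjacency` maps node -> {neighbor: non-negative edge cost}. The choice of
    cost (e.g. 1 / co-occurrence weight) is an assumption of this sketch.
    """
    dist = {s: 0.0 for s in seeds if s in adjacency}
    heap = [(0.0, s) for s in dist]
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, cost in adjacency.get(u, {}).items():
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def paper_score(word_scores, in_twml_community, w1=3.0, w2=0.5):
    """Weighted average of the non-zero word scores of one paper.

    Words in a TwML-rich community get weight w1, all others w2.
    Assumption: the denominator is the sum of the applied weights.
    """
    numerator = denominator = 0.0
    for word, score in word_scores.items():
        if score <= 0:
            continue  # only words with non-zero scores are considered
        weight = w1 if in_twml_community.get(word, False) else w2
        numerator += weight * score
        denominator += weight
    return numerator / denominator if denominator > 0 else 0.0


adjacency = {
    "bias": {"model": 0.5},
    "model": {"bias": 0.5, "loss": 1.0},
    "loss": {"model": 1.0},
    "island": {},
}
dist = distances_from_seeds(adjacency, ["bias"])
score = paper_score({"fair": 0.8, "loss": 0.2, "zero": 0.0}, {"fair": True})
flagged = score >= 0.5  # threshold stated in the text
```

Because the score is a weighted sum of per-word terms, printing each `weight * score` term alongside the word gives the kind of word-level breakdown that the text describes as the source of interpretability.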

Contextual similarity of non-TwML words. Our last goal is to expand the existing list of TwML words with additional words that are conceptually related to TwML. The reason for doing this is two-fold. Firstly, in the current work we rely solely on the TwML words as an initial seed list of mostly technical words that are used for multiple purposes. However, expanding this existing list with additional contextually similar words would result in a more inclusive set that can improve the fingerprinting process. Secondly, we wish to identify broad areas of interest for future research using these conceptually similar words.

To this end, we utilize the connectivity information of non-TwML words with TwML words. We extract all the direct connections of TwML words, along with their corresponding edge weights, which indicate the strength of their connection. In addition, we score each direct neighbor using Algorithm 1, which informs us about the overall connectivity of that word with TwML words as a whole. Finally, we apply threshold cutoffs on edge weights and word relevance scores, and identify words above both thresholds as potentially of interest.

Network of words. Only about 7% (484 out of 7328) of all papers are TwML-related. Previous studies have empirically observed that complex methods such as knowledge graphs or high-dimensional numeric embeddings are less reliable for characterizing rare concepts or terms [15, 29]. Because of this rarity of TwML papers, we use a word co-occurrence network in place of more sophisticated methods. The resulting network contains 10,698 nodes and 254,347 edges.

The community detection algorithm generated 25 communities, with a modularity score of 0.33. As given in Table 1, TwML-related words are concentrated in two communities. Among them, seven words that are mostly related to Differential Privacy (DP) separate from the rest into one community (second row in Table 1). For convenience we shall refer to these communities as the DP and non-DP community, respectively. The remaining 8 TwML words, which are mostly ambiguous (such as 'metric' or 'procedur') or general (such as 'trustworthi'), are distributed across 6 communities. Figure 2 visualizes the overall network, focusing on the two TwML-specific communities. We categorize the TwML words into four subject-based categories:

- Privacy: 'privaci', 'differenti', 'privat', 'guarantee', 'concern', 'preserv',
- Interpretability: 'transpar', 'interpret', 'account',
- General: 'trustworthi', 'mechan', 'algorithm', 'data',
- Fairness: all others.

From the relative position of words in each category in Figure 2, it is evident that a number of privacy-specific and fairness-specific words cluster together, and these two clusters are well-separated from each other.
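For readers unfamiliar with the modularity score reported above, the following sketch computes the standard Newman modularity of a partition of a weighted undirected graph, Q = Σ_c [L_c / m − (d_c / 2m)²], where m is the total edge weight, L_c the total weight of edges inside community c, and d_c the summed weighted degree of its nodes. The function name and the toy graph are illustrative assumptions; self-loops are not handled because a co-occurrence network has none.

```python
from collections import defaultdict


def modularity(weights, community_of):
    """Newman modularity of a partition of a weighted undirected graph.

    `weights` maps (u, v) pairs to edge weights; `community_of` maps each
    node to a community label.
    """
    m = sum(weights.values())
    if m == 0:
        return 0.0
    internal = defaultdict(float)    # L_c: edge weight inside each community
    degree_sum = defaultdict(float)  # d_c: summed weighted degree per community
    for (u, v), w in weights.items():
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += w
        degree_sum[cv] += w
        if cu == cv:
            internal[cu] += w
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)


# Toy graph: two disjoint triangles.
triangles = {("a", "b"): 1, ("b", "c"): 1, ("a", "c"): 1,
             ("d", "e"): 1, ("e", "f"): 1, ("d", "f"): 1}
split = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
merged = {node: 0 for node in split}
q_split = modularity(triangles, split)
q_merged = modularity(triangles, merged)
```

Placing each triangle in its own community gives a positive score, while merging everything into one community gives zero, which is the sense in which a higher value indicates a cleaner community structure.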

Fingerprinting of papers. Because of the probabilistic nature of our fingerprinting process, it can be used to classify whether or not a paper is related to TwML. Our method exhibits good recall values across the two corpora. The precision in the PMLR corpus, and hence the overall precision, as PMLR forms a large proportion of the overall set of papers, is low. This is an indication that there are probably a number of papers that do not contain our pre-specified TwML words, but may be related to this subject based on their contents. Note that since all papers in the FAccT corpus are labeled as TwML-related, area under curve (AUC) does not exist for this category, and it exhibits perfect precision.
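As a small illustration of how such a classification can be evaluated, the sketch below thresholds relevance scores at 0.5 and computes precision and recall against binary labels. The scores and labels are made-up toy values, not results from the paper, and the function name is hypothetical.

```python
def precision_recall(scores, labels, threshold=0.5):
    """Precision and recall of the rule 'flag a paper if score >= threshold'."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y)
    fp = sum(1 for p, y in zip(predicted, labels) if p and not y)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy mixed corpus: some flagged papers are not labeled TwML, so precision drops.
mixed = precision_recall([0.9, 0.6, 0.4, 0.7], [True, False, True, False])

# Toy corpus where every paper is labeled TwML: no false positives are possible.
all_positive = precision_recall([0.9, 0.2], [True, True])
```

The second call mirrors the remark about the FAccT corpus: when every paper carries a positive label, any flagged paper is a true positive, so precision is 1 whenever at least one paper is flagged.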

We present non-TwML papers with highest relevance scores in Table 3, and the word-level relevance scores for selected papers in Figure 3. A number of papers in Table 3 are on topics that have received less attention in TwML literature [9, 16], such as reinforcement learning, active learning, bandit algorithms, and outlier detection. The word-level scores (Figure 3) give interpretability to the paper-level fingerprinting. As an example, paper 7 in Table 3 [18] gets a high score because of the word 'movi' from the non-DP community, and 'fisher', which belongs to neither of the two TwML-word rich communities. Contextualizing these words, a Fisher information-based approach similar to [18] may be relevant for obtaining fairly calibrated movie ratings and recommendations [28].

Contextual similarity. To expand the existing list of TwML words with additional conceptually related words, we use the edge weights and relevance scores of words that are direct neighbors of a TwML word in our co-occurrence network to identify the appropriate threshold cutoff. Filtering for words that share at least one edge of weight ≥ 100 with a TwML word, and have a relevance score ≥ 0.5, resulted in a subset of 290 words. Words in this list can be further assessed for their significance. Table 4 highlights 10 such words.
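The filtering rule just described can be sketched as below. The thresholds (edge weight ≥ 100, relevance score ≥ 0.5) come from the text; the data structures, function name, and toy values are assumptions made for illustration.

```python
def similar_words(weights, twml_words, relevance, min_weight=100, min_score=0.5):
    """Non-TwML words with a strong edge to some TwML word and a high relevance score.

    `weights` maps (u, v) pairs to co-occurrence weights; `relevance` maps a
    word to its word-level relevance score.
    """
    twml = set(twml_words)
    strongest = {}  # heaviest edge from each non-TwML word to any TwML word
    for (u, v), w in weights.items():
        for word, other in ((u, v), (v, u)):
            if word not in twml and other in twml:
                strongest[word] = max(strongest.get(word, 0), w)
    return sorted(
        word for word, w in strongest.items()
        if w >= min_weight and relevance.get(word, 0.0) >= min_score
    )


# Toy values, not taken from the paper.
toy_weights = {
    ("bias", "race"): 150,
    ("fair", "race"): 20,
    ("bias", "tree"): 300,
    ("fair", "skin"): 90,
    ("race", "skin"): 500,  # edge between two non-TwML words is ignored
}
toy_relevance = {"race": 0.7, "tree": 0.3, "skin": 0.9}
candidates = similar_words(toy_weights, {"bias", "fair"}, toy_relevance)
```

Requiring both conditions keeps words that are strongly tied to a specific TwML word and also close to TwML words as a whole, which is the combination the text uses to shortlist candidates for manual assessment.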

A number of interesting insights come out from the above analysis.

Network of words. The differential distribution of TwML words within communities, as observed in Table 1, indicates that TwML papers tend to focus more on certain lines of research, methods or applications than others. In the context of ML bias and fairness, this is echoed by the review article of [16]. They observed that addressing group fairness in classification problems has received disproportionately high interest compared to other fairness categories (e.g. individual fairness, subgroup fairness) and types of methods (e.g. clustering, graph embedding); see Table 7 therein. Within the TwML words, Differential Privacy (DP)-specific words and those related to fairness and transparency group separately into two different communities. A potential reason for this may be that DP is a comparatively older research area, and has seen more theoretical developments than relatively new topics like fairness or transparency.

Paper-level fingerprinting. All papers in Table 3 with high relevance scores are on comparatively complex algorithms. A number of these areas have been heavily researched of late, such as reinforcement learning (RL; papers 1, 14, 19), bandit problems (2, 4, 18), anomaly detection (2, 9, 10, 11), representation learning (11, 13, 15), multitask problems (2, 8, 10, 13), Dirichlet process (3, 22), and nonconvex optimization (24, 25). The word-level breakdown of relevance scores (Figure 3) gives further insights into how the concepts in these papers may be related to TwML. Top scores for papers 7 and 19 come from TwML words that belong to the non-DP community. Looking into their subject matters, paper 7 [18] studies statistical ranking for dependent network data. Interestingly, a very recent paper that is not in our analyzed corpus studied the problem of applying fairness constraints on node ranks in a graph [12]. Paper 19 is on safe policy improvement in RL [27]. Safe policies in RL refer to policies that maximize expected return in problems where ensuring certain safety constraints is important alongside satisfactory performance [8]. In the context of ML fairness, safe policies can potentially be policies that satisfy equitable performance guarantees for sensitive demographic subgroups.

In Table 5, we summarize the words with highest scores among words that occur in any of the 25 papers in Table 3, and belong to either the DP or non-DP community. Among words belonging to the DP community, 'multiclass' is interesting. After a small number of papers in the early 2010s [20, 24], multiclass problems in DP have started to receive more attention recently [25]. Words in the non-DP cluster, on the other hand, refer to methods or algorithms: 'mdps' is Markov Decision Processes, 'vb' is variational Bayes, and 'triplet' is triplet loss. Each of these categories is relevant to ML fairness or explainability. For example, [10, 14] incorporate causality and fairness notions in ML models using variational inference. Russell and Santos [23] explain reward functions in MDPs by building a classification model with rewards as outputs. A recent preprint [26] applies the triplet loss in the context of fairness.

Contextual similarity. A large number of 'similar' words that are heavily connected with TwML words neither (a) pertain to algorithms or methods, nor (b) belong to the DP community. Table 4 presents ten such words. In contrast to words highly important to fingerprinting of papers (Table 5), these similar words mostly refer to application aspects of fairness ('race', 'stereotyp', 'facial', 'skin'), privacy and security ('tamper', 'membership', 'secur'), as well as other practical issues ('drug', 'physiolog', 'censor'). This potentially suggests two things. Firstly, application-oriented keywords are closely associated with TwML terms, and should be used to characterize the research landscape of this interdisciplinary field. Secondly, such application areas may foster new connections with TwML topics, especially the ones each such word relates to.

In this paper, we present the first quantitative study of the trustworthy ML research space. Using network analysis methods we identify the similarity and clustering patterns of TwML vs. non-TwML words, propose a novel fingerprinting method to predict which papers may be related to TwML, and provide word-level contextual similarity insights. As indicated by Table 3, Figure 3, and Table 5, potential areas of future exploration include multiclass problems in differential privacy, and work that focuses on fairness and transparency aspects of newer research areas in broader ML. Contextually similar non-TwML words in Table 4 suggest the need for more practice-oriented work in this field, which recent studies have acknowledged [3, 17].

Through this work, we hope to motivate further quantitative characterization of TwML literature. For example, a larger portion of each paper's content, rather than only the title, abstract, and keywords, may be used. The document corpus being analyzed can be specifically tailored to the end goals of the analysis (e.g. inference vs. prediction, exploring new connections between theoretical vs. applied topics). Such explorations will facilitate and guide future ML research by identifying methodological gaps, as well as create novel opportunities for applying existing analytical techniques to new practical problems.

[1] Fast unfolding of communities in large networks

[2] Mining scholarly data for fine-grained knowledge graph construction

[3] Socially responsible AI algorithms: Issues, purposes, and challenges

[4] Mapping the physics research space: a machine learning approach

[5] A Snapshot of the Frontiers of Fairness in Machine Learning

[6] Investigating the interplay between fundamentals of national research systems: Performance, investments and international collaborations

[7] Science of science

[8] A Comprehensive Survey on Safe Reinforcement Learning

[9] A Survey on Differentially Private Machine Learning

[10] Improving fair predictions using variational inference in causal models

[11] The Ethical Algorithm: The Science of Socially Aware Algorithm Design

[12] Applying Fairness Constraints on Graph Node Ranks Under Personalization Bias

[13] Co-Occurrence Network of High-Frequency Words in the Bioinformatics Literature: Structural Characteristics and Evolution

[14] Fairness through causal awareness: Learning causal latent-variable models for biased data

[15] Foundations of Statistical Natural Language Processing

[16] A Survey on Bias and Fairness in Machine Learning

[17] Six Steps to Bridge the Responsible AI Gap

[18] Enhanced statistical rankings via targeted data collection. ICML-2013

[19] Where is your field going? A machine learning approach to study the relative motion of the domains of physics

[20] Large Margin Multiclass Gaussian Classification with Differential Privacy

[21] Leveraging citation networks to visualize scholarly influence over time

[22] Novel keyword co-occurrence network-based methods to foster systematic reviews of scientific literature

[23] Explaining reward functions in Markov decision processes

[24] Combining Binary Classifiers for a Multiclass Problem with Differential Privacy

[25] Differentially private image classification using support vector machine and differential privacy

[26] SensitiveLoss: Improving Accuracy and Fairness of Face Representations with Discrimination-Aware Deep Learning

[27] Safe Policy Improvement with Baseline Bootstrapping in Factored Environments. AAAI-2019

[28] Calibrated recommendations

[29] Novel keyword co-occurrence network-based methods to foster systematic reviews of scientific literature

[30] The Relationship between Trust in AI and Trustworthy Machine Learning Technologies. In: FAT-2020

[31] Writing the title and abstract for a research paper: Being concise, precise, and meticulous is the key

[32] Towards a Robust and Trustworthy Machine Learning System Development

[33] Navigating the landscape of COVID-19 research through literature analysis: A bird's eye view

[34] The science of science: From the perspective of complex systems