The Data-Production Dispositif
Milagros Miceli; Julian Posada
2022-05-24

Machine learning (ML) depends on data to train and verify models. Very often, organizations outsource processes related to data work (i.e., generating and annotating data and evaluating outputs) through business process outsourcing (BPO) companies and crowdsourcing platforms. This paper investigates outsourced ML data work in Latin America by studying three platforms in Venezuela and a BPO in Argentina. We lean on the Foucauldian notion of dispositif to define the data-production dispositif as an ensemble of discourses, actions, and objects strategically disposed to (re)produce power/knowledge relations in data and labor. Our dispositif analysis comprises the examination of 210 data work instruction documents, 55 interviews with data workers, managers, and requesters, and participant observation. Our findings show that discourses encoded in instructions reproduce and normalize the worldviews of requesters. Precarious working conditions and economic dependency alienate workers, making them obedient to instructions. Furthermore, discourses and social contexts materialize in artifacts, such as interfaces and performance metrics, limiting workers' agency and normalizing specific ways of interpreting data. We conclude by stressing the importance of counteracting the data-production dispositif by fighting alienation and precarization, and empowering data workers to become assets in the quest for high-quality data.

Many machine learning (ML) models are built from training data previously collected, cleaned, and annotated by human workers. Companies and research institutions outsource several of these tasks through online labor platforms [64] and business process outsourcing (BPO) companies [53]. In these instances, outsourcing organizations and their clients regard workers as independent contractors, considering them factors of production and their labor a commodity or a product subject to market regulations [87]. They are paid as little as a few cents of a dollar per task, usually lack the social protection traditionally tied to employment relations, and are subject to systems of control and surveillance [30, 37, 81]. Their assignments broadly comprise the interpretation and classification of data, and their work practices involve subjective social and technical choices that influence data production and have ethical and political implications. Workers interpreting and classifying data do not do so in a vacuum: their labor is embedded in large industrial structures and deeply intertwined with naturalized profit-oriented interests [44]. This paper presents an investigation of data production for ML as carried out by Latin American data workers mediated by three platforms operating in Venezuela and a business process outsourcing (BPO) company located in Argentina. To study data work for machine learning, which we define as the labor involved in the collection, curation, classification, labeling, and verification of data, we lean on Foucault's notion of dispositif and apply the method of dispositif analysis [39]. A dispositif is an ensemble of objects, subjects, discourses, and practices, as well as the relations that can be established between them [25]. Examples of dispositifs include prisons, police, and academia.
These heterogeneous ensembles of discursive and non-discursive elements constitute what is perceived as reality and, as such, what is taken for granted. The decision to lean on Foucault's notion of dispositif is methodological rather than theoretical. This notion and the method of dispositif analysis enable the study of data production as embedded in social interactions and hierarchies that condition how data is constructed and how specific discourses are reproduced. As we will describe in Section 3.1, this method also allowed us to integrate diverse qualitative data and focus on the relationships between them. We define the data-production dispositif as the network of discourses, work practices, hierarchies, subjects, and artifacts comprised in ML data work (see Figure 3) and the power/knowledge relationships that are established and naturalized among them. The data-production dispositif determines the realities that ML datasets can reflect and the ones that remain erased from them. It has a crucial effect on the outputs that ML models will consider to be true. Dispositif analysis interrogates means of reality making, with a special focus set on the meanings that become dominant and those that are marginalized - "the said as much as the unsaid" [25]. Our dispositif analysis explores the sites where the production of ML data is outsourced. It comprises the investigation of (1) linguistically performed elements (what is said/written), (2) non-linguistically performed practices (what is done), and (3) materializations (how linguistically and non-linguistically performed practices translate into objects) [39]. These elements and research questions relate specifically to the outsourcing of ML data-production tasks and can be structured as follows:
• Linguistically performed elements: What discourses are present in task instructions provided to outsourced data workers? (RQ1) → We analyzed a corpus of 210 instruction texts for data-related tasks requested by ML practitioners and outsourced to data workers.
• Non-linguistically performed practices: How do outsourced data workers, managers, and requesters interact with each other and with instruction documents to produce data? (RQ2) → To explore how linguistically performed elements translate into practice, we conducted 41 interviews with data workers and inquired how they interpret the instructed tasks. In addition, we conducted interviews with six managers and eight ML practitioners (in their role as data-work requesters).
• Materializations: What artifacts support the observance of instructions, and what kind of work do they perform? (RQ3) → Through participant observation, we account for some of the material elements in which the data-production dispositif manifests, such as platforms and interfaces, tools to surveil workers, and documents that record the decisions made between service providers and service requesters.
To summarize, this paper, its contribution, and the extensive analysis it comprises can be described as follows: we start by exploring Foucault's notion of dispositif and defining key related concepts. In colloquial French, "disposition" is used in the sense of being "at someone's disposal." This way, Link highlights the power element comprised in dispositif as the separation between those who are "at the disposal" (those who are instrumental) and those who have the influence to determine the strategy used to meet a need [9]. Those who are "at the disposal" and those who "dispose" are part of the dispositif's strategy.
In sum, the concept of dispositif comprises the knowledge that is built into linguistically performed practices (what is said, written, and thought), non-linguistically performed practices (what is done), and materializations (the objects) [39, 41]. A dispositif can therefore be defined as a constantly changing network of objects, subjects, discourses, and practices that shape each other, producing new knowledge and new power. Previous HCI and CSCW research has engaged with Foucauldian theory. For instance, Harmon and Mazmanian [34] follow Foucault's understanding of discourse to explore how US residents talk about smartphones and smartphone users. Kou et al. [46] use the Foucauldian concepts of power, knowledge, and self to explicate human-technology relationships. And Bardzell et al. [4] draw on Foucault's Theory of Identity to investigate social practices within the virtual world Second Life. In terms of methodology, Kannabiran et al. [42] use Foucauldian Discourse Analysis (FDA) to study the rules and mechanisms involved in HCI discourses on sexuality, while Spiel [78] combines Actor-Network Theory with FDA into a "critical experience" framework to evaluate how children on the autism spectrum interact with technologies. Despite the wide application of Foucauldian theory in HCI and CSCW, the dispositif notion and the method of dispositif analysis have not been applied to the study of data production and data work. To that end, we believe that our contribution could be seen as methodological, in the sense that we intend to show a novel and comprehensive mode of analysis to approach data work. The data-production dispositif analyzed in this paper comprises the "infrastructure" that enables the (re-)production and circulation of specific discourses in and through ML data work. As Foucault argues, the emergence of each dispositif responds to an "urgent need." The data-production dispositif responds to the growing demand for data and labor in the AI industry. We define data work as the human labor necessary for data production, in this case, for machine learning. Data work involves the collection, curation, classification, labeling, and verification of data. Users, developers, and outsourced workers carry out these tasks at any point in the development and deployment of AI systems. Examples include medical professionals in the case of AI for healthcare [6, 55, 79], education professionals [50], or internet users when answering ReCAPTCHA tests [40]. This paper will employ the term "data work" to refer exclusively to the labor outsourced through crowdsourcing platforms and specialized business process outsourcing (BPO) companies, instead of the broader data work carried out by other professionals and users, while acknowledging the role of the latter within the dispositif as requesters. Platforms are one of the two significant ways of outsourcing data work. The rise of alternative forms of work different from traditional employment [43] and the expansion of the "gig economy," or casual employment mediated through platforms [89], gave rise to "crowdsourcing," "crowdwork," or "digital piecework" platforms where geographically dispersed workers are allocated many fragmented tasks, which are carried out online from their homes. Platforms are hybrid organizations that combine traits of firms and multi-sided markets [11].
They serve as infrastructures that "facilitate and shape personalised interactions among end-users and complementors, organised through the systematic collection, algorithmic processing, monetisation, and circulation of data" [63] . Platforms thrive in digital environments because they respond to deficiencies in markets and enterprises that fail to extract and appropriate data and allocate resources efficiently [11] . The second primary form of outsourced data work for ML is provided by business process outsourcing (BPO) companies. Conversely to crowdsourcing platforms, where hierarchies are primarily managed by algorithms, BPOs show rather traditional management structures. BPO is a form of outsourcing that involves contracting a third-party service provider to carry out specific parts of a company's operations, in the case of our investigation, data-related tasks. These service providers often specialize in one type of ML data service (e.g., semantic segmentation) or application domain (e.g., computer vision), contrary to platforms specializing in one or a few application domains but with more diverse data services. While prices per piece are significantly higher than those offered by platforms, many machine learning companies prefer to outsource their data-related projects with BPOs because of the perceived higher quality of data [54] . This is due to the companies' domain specialization and traditional managerial structures that allow more direct and personal communication. The intervention of humans in processes of data production has been addressed by a large body of CSCW and HCI research [19, 26, 53, 57, 58, 61, 62, 77, 80] . Some investigations have explored the role of worker subjectivity on datasets [7, 12, 29, 84] and have proposed ways to recognize and address worker bias [3, 28, 36, 84] . In contrast, other researchers have documented the "practices, politics, and values of workers in the data pipeline" [73, 74] and the sociotechnical organization of data work that privileges speed, scale, and scalability over worker wellbeing [37] , low wages [18, 33] , dependency [71] , and the power asymmetries vis-à-vis requesters [38, 52, 53, 72] . As we argue, the "urgent need" addressed by the data-production dispositif is the exponential need for cheaper and more profitable data, which is also the exploitation of surveillance [90] , natural resources [15] , and other types of labor [13] . Previous research has highlighted the role of these elements to guarantee a façade where AI is seen as neutral, unbiased, and efficient due to the lack of human intervention -and error -while keeping workers and factors of production hidden from the public lens [10, 30, 37] . These elements show the wide extension of the "heterogeneous ensemble" that constitutes the discursive, non-discursive, and material elements of data production. Because the data-production dispositif is too vast to explore in one academic paper, we circumscribe its analysis around outsourced data work for ML as one of its crucial components. We lean on dispositif analysis to investigate the discourses implicit in annotation instructions, the non-discursive practices involved in the production of ML datasets, and how both materialize in artifacts. Often described as an extension of discourse analysis [9] , dispositif analysis expands the field of inquiry beyond texts to include actions, relationships, and objects. 
Dispositif analysis rests on the notion of knowledge (and power) as the connecting force between discursive and non-discursive components. It accounts for hierarchies and power structures in societal fields and organizations that shape the construction of meaning in discourse [68] . Thus, our dispositif analysis crucially focuses on the relationship between discourse, practice, and objects in data production, and the power created through their interaction. Foucault never outlined an explicit methodology of dispositif analysis. Several authors, most prominently Sigfried Jäger [39, 41] , have explored ways of operationalizing the complex Foucauldian notion of dispositif into a method of inquiry. Dispositif analysis has thus been in constant evolution since the mid-1980s. Caborn [9] mentions four steps comprised in this methodology: (1) identifying the elements that constitute the dispositif, (2) determining which discourses they embody and their entanglement with other discourses, (3) interrogating power by "considering who or what is at the disposal of whom", and (4) analyzing non-discursive practices associated with the dispositif's discourses. However, to our knowledge, a comprehensive guide or method of how to conduct a dispositif analysis has not yet been developed. As Jäger and Maier describe, dispositif analysis remains "a flexible approach and systematic incitement for researchers to develop their analytic strategies, depending on the research question and type of materials at hand" [39] . The study presented in this paper follows the experimental spirit of Jäger and Maier's invitation to develop our analytical strategy, in this case, to study data production. Here, we combine methodological elements discussed by several authors in terms of the operationalization of power, knowledge, and discourse [8, 9, 47, 48, 60] , apply them to our fieldwork on data production through platforms and at a BPO, and follow the examples provided by previous research that has successfully applied variations of dispositif analysis [9, 31, 51, 85, 86] . We followed the four steps outlined by Caborn mentioned above and based our analysis on the three-dimensional framework described by Jäger and Maier [39] as follows: • The analysis of linguistically performed elements: which aims at reconstructing the knowledge built into what is said and written through discourse analysis. In terms of our investigation, this phase comprised an examination of the discourses encoded in the instruction documents received by data workers. • The analysis of non-linguistically performed practices: which aims at reconstructing the knowledge that underlies linguistically performed practices and how they translate into action. In this phase, we investigated how workers make sense of the task instructions and their work in general, the interactions between workers, managers, and clients, and the labor conditions that structure these practices. We studied these elements through interviews conducted with data annotators who perform tasks guided by such instructions, machine learning practitioners who compose annotation instructions, and managers who oversee the process. • The materializations: This phase of analysis consisted of identifying the knowledge that is built into physical and digital artifacts, i.e., discursive materialization, whose existence is coherent with the discourses they encode. 
Through this lens, we set the focus on the platforms and interfaces used to perform data work, documents (as artifacts and not as texts) that record decision-making processes, and tools used to surveil workers and quantify their performance. Our analysis of these materializations is based on participant observations and the above-mentioned interviews. Making researchers' positionality explicit is key to situating the standpoint from which an investigation has been conducted. Positionality statements are relevant to all types of studies, especially qualitative and exploratory investigations such as this one. Moreover, given the flexible character of dispositif analyses, it seems appropriate to disclose some elements of the authors' backgrounds that might have informed the analysis presented in this paper. Both authors are multiracial researchers born in different countries of Latin America. Both are first-generation academics working in institutions located in the Global North, where they live under immigrant status. Both have a background in Sociology and Communication. Their first language is Spanish. One of the authors identifies as female and the other as male. Both are cisgender. Despite being born and raised within working-class families and in the same regions as the data workers interviewed, the authors acknowledge that their class-related experiences differ from those of the interview partners and that their position as researchers living and working in the Global North provides them with privilege that the study participants do not hold. Throughout data collection and analysis, and while considering the implications of this investigation, the authors have put much effort into remaining reflexive and acknowledging their position regarding the study participants and the field of inquiry. This investigation comprises several weeks of participant observation, a total of 55 interviews, and the analysis of 210 instruction documents. These data were collected during several months of fieldwork, from 2019 to 2021, online and in person, at two sites (see Table 1):
• virtually, studying three crowdsourcing platforms operating in Venezuela and the experiences of platform workers, and
• in a hybrid format, at a business process outsourcing (BPO) company located in Buenos Aires, Argentina, where data workers perform tasks related to the collection and labeling of data for machine learning.
At both fieldwork sites, we conducted participant observations and semi-structured interviews. To complement these data, we conducted a series of expert interviews with managers at other BPOs and with ML practitioners in their role of data-work requesters (see Section 3.3.2 and Table 3 for a detailed account of the interview participants). Fieldwork in Venezuela was carried out virtually between July 2020 and June 2021 due to restrictions related to the coronavirus pandemic. For the first phase of this research, we signed up and completed tasks for the platforms to understand the tasks available, working conditions, and interfaces. While some of these platforms presented similar tasks, they differed considerably in their general availability, interfaces, labor processes, and task applications. Initially, we contacted the platform workers using convenience sampling, since this population is invisible, meaning that it is difficult to approach them without the support of the platforms, which we did not have for this study.
We sought permission from the moderators of the most popular worker groups on Facebook and Discord to post a call for study participants. Thanks to this initial approach, we were able to use snowball sampling to contact further participants. We conducted in-depth interviews and asked workers about their experience working for the platforms. Additionally, we asked workers if they could share information about the instructions they received. Some workers also shared guides created by colleagues to understand and answer the tasks efficiently. We include those in our analysis as well. Moreover, we also searched the internet to find additional annotation instructions online. Our approach, notably the use of convenience and snowball sampling, presents several limitations in terms of reproducibility and bias towards participants belonging to similar social circles. We have mitigated these issues by comparing our results with similar studies (see Section 2.2) and with the workers at the BPO company. Fieldwork at Alamo, the Argentine business process outsourcing (BPO) company, was carried out in person between May and June 2019 in Buenos Aires and continued online between August 2020 and February 2021. At the time of this investigation, the company was a medium-sized organization with branches in several Latin American countries. Besides data work, Alamo conducts content moderation and software testing projects. The company is an impact sourcing type of BPO, which refers to a branch of the outsourcing industry that purposely employs workers from poor and marginalized populations to offer them a chance in the labor market and to provide information-based services at lower prices. We contacted the company via e-mail to request field access. After several months of inquiry, a meeting with the company's management took place in which the researcher on site signed a non-disclosure agreement that specified several elements that we are not allowed to disclose in this or other papers. Most of these elements concern the identity of clients and specific details about their ML models. After this meeting, fieldwork was allowed to commence, and we were able to observe several projects related to the collection and annotation of data for ML. Apart from shadowing workers, we were granted access to team meetings, meetings with clients, workers' briefings, and QA analyses related to three projects carried out by the company in 2019, involving the collection and labeling of image data. We complemented the observations with in-depth interviews with data workers, managers, and QA analysts. In total, we collected 210 annotation instruction documents from the platforms and the BPO. The analysis of the instructions was carried out by both authors. We used critical discourse analysis [39] to explore the instruction texts. The analysis comprised three stages: (1) the structural analysis of the corpus, (2) a detailed analysis of discourse fragments, and (3) a synoptic analysis. These steps (especially the synoptic analysis) included several iterations that allowed us to discover connections between different levels of analysis, collect evidence to support our interpretations, and develop arguments. Table 2 offers an overview of the codes used for the discourse analysis of the instruction documents, their evolution throughout the three phases of analysis, and explanatory memos that reflect our understanding of each code.
The goal of the structural analysis is to code the material to identify significant patterns and recurring themes and sub-themes comprised in the instructions. By the end of the structural analysis phase, we were able to identify elements of the text structure, regular tasks, and stylistic devices that appeared in the instruction documents. These elements helped us identify "typical" texts and representative discourse fragments for the following analysis step. The detailed analysis comprised an examination of selected text fragments. We focused on identifying typical representations and their variations and interrogated the elements highlighted in the instruction documents and the contextual knowledge that is taken for granted and, thus, neglected in them. We also paid special attention to binary reductionisms, presupposition and attribution, examples, and visualizations. A critical aspect of the analysis focused on the taxonomies that structure the labels instructed by requesters. Finally, the synoptic analysis included the overall interrogation of the observations that emerged from the structural and detailed analyses. This phase included an intensive exchange between both authors to reflect upon our shared understanding of the identified discourse strands. We contrasted the selected fragments and the identified elements with the interview and observation material. This approach helped us understand the role of work and managerial practices in legitimizing specific discourses. To reconstruct the knowledge that underlies the practices that constitute the data-production dispositif, we turned to the experiences of those actors who interact with the instructions regularly. With this aim, we conducted a total of 55 interviews with dataworkers located in Venezuela and Argentina, BPOs managers and founders, and ML practitioners who regularly outsource data-related tasks. Table 3 shows a detailed overview of the interview partners, including their role within their organizations, location, type of interview, and language. Due to the restrictions related to the COVID-19 pandemic, the interviews with platform workers were conducted online through video calls, while those with BPO workers were conducted in person before the pandemic. The interviews with the data workers were conducted in Spanish, which is the native language of the interviewers and the participants. We conducted the expert interviews in English. While most platform workers had tried several platforms, they usually focused on one, except for two workers who worked simultaneously for Tasksource and Workerhub. All interview partners were asked to choose a code name or were anonymized post-hoc to preserve their identity and that of related informants. The goal of the interviews was to reveal practices and perceptions and obtain additional information about the organizational relations and structures that inform how data-related tasks come about, how instructions are communicated, and how workers execute them. For instance, hierarchical structures can have an essential effect on meaning-making practices as enacted through the annotation instructions without being referred to explicitly or implicitly in the instruction documents. The in-depth interviews with data workers include accounts of specific work situations involving the interpretation of data. Moreover, they cover task descriptions, widespread routines, practices, working conditions, lived experiences, and general views on their work and the local labor market. 
It is essential to mention that the differentiation between in-depth and expert interviews refers to the interview method chosen for each situation and informant and was not based on informants' occupational status or position. Our priority was engaging in in-depth conversations with data workers to discuss and learn from their experiences in and beyond data work. Conversely, we used the expert-interview method to conduct focused exchanges with actors who possessed a broad overview of the machine learning pipeline. The expert interviews covered the topics of data work and the relationship between BPO/platform and requesters. Dispositif analysis allowed us enough flexibility to obtain valuable insights from the interviews by combining inductive and deductive coding. Some of the topics that we identified through discourse analysis in the instruction documents helped us build categories to code the interviews. This form of deductive coding was oriented towards finding additional evidence for phenomena identified in the instruction texts and understanding the contexts in which instructions are formulated and carried out. In addition, there was room for inductive category formation so that several codes could emerge directly from the interviews during coding. This approach helped us identify valuable observations that would otherwise have been lost. Through this form of analysis, we aimed at identifying patterns. Those patterns were later confronted with the elements identified in the instruction texts and complemented with participant observations. The development of coding schemes for the analysis and the coding process itself was carried out in iterations involving cross-coding between both authors. The interview transcripts were analyzed in their original language (Spanish or English). The excerpts included in Section 4 were translated by us when writing this paper and only after the analysis phase. Our emergent understanding evolved throughout numerous discussions and several iterations until reaching the set of findings that we present in Section 4. Through fieldwork at the BPO and the platforms, we were able to observe interactions among data workers and between them and clients, using in-person observation in the Argentinian case and digital ethnography in the Venezuelan one [35]. Furthermore, we observed workers' interactions with crowdsourcing platforms and the software interfaces used to complete annotation tasks. Special attention was paid to the interaction of workers with task instructions. It is important to mention that the instruction documents underwent a twofold form of analysis: on the one hand, we analyzed instruction documents as texts through discourse analysis as described in Section 3.3.2. On the other hand, we used the observations conducted to analyze these documents as artifacts or materializations of the data-production dispositif. For the latter form of analysis, the focus was set on the documents' function, provenance, and the interactions they allow or constrain. The level of involvement regarding observations varied from shadowing to active participant observation. In some cases, we had the opportunity to observe and try the interfaces and perform data annotation tasks for several hours. All observations were recorded as jottings taken in real time. Those jottings comprised descriptions of briefings, meetings, tasks, documents, communication channels, and interfaces, as well as their advantages and limitations.
In parallel, reflections on the researchers' impressions and perceptions, including explicitly subjective interpretations, were noted. Simple sketches and, when permitted, photos helped to complete the observations registered. The information gathered in keywords or bullet points was later transformed into complete texts and integrated into more consolidated field notes. In the analysis phase, we combined these field notes with the interview transcripts and coded them following the steps described in the previous subsection. The presentation of our findings comprises a descriptive subsection (4.1) and three analytical parts (4.2, 4.3, and 4.4). In 4.1, we present several examples of the different tasks carried out by data workers at each one of our four fieldwork sites. We include details of how task instructions are formulated and communicated to workers and how workers follow or interrogate instructions. Through these descriptions, we seek to locate our analysis in specific settings with specific ways of doing things. Next, we move into dissecting the described tasks, instructions, and practices while outlining specific characteristics of the data-production dispositif. In 4.2, 4.3, and 4.4, we center the findings of the dispositif analysis around our three research questions. Following RQ1, we describe discursive practices such as those involved in the taxonomies used to collect and classify data and the warnings and threats included in them. Following RQ2, we describe non-discursive practices and social contexts such as the obedience to instructions, the dependence of Latin American workers on precarious work, and moments of interrogation and solidarity among workers. Finally, following RQ3, we describe some of the dispositif's materializations, such as documents, work interfaces, and tools to measure workers' performance and surveil them. Before moving towards answering our three research questions, we will describe in this section the different tasks available in data work and explored in this study. We use the framework proposed by Tubaro et al. [81] to differentiate the tasks (see Table 4).
Table 4. Types of tasks in outsourced data work, based on Tubaro et al. [81], with examples from our fieldwork:
• Data Generation: the collection of data from the worker's environment. Example: "You can earn $2.5 by completing the task 'Do you wear glasses?' Upload a picture of a document with your prescription values now."
• Data Annotation: the classification of data according to a predefined set of labels. Example: "Based on the text in each task, select one of these three options: Sexually Explicit, Suggestive, Non-Sexual."
• Algorithmic Verification: the evaluation of algorithmic outputs. Example: "You'll be shown two lists of up to eight search suggestions each. Your task is to indicate which list suggestion is better."
• AI Impersonation: the impersonation of an artificial agent. Example: as the assistant, the "user will initiate the conversation... you need to use the facts to answer the user's question."
While this framework was initially conceived to analyze digital platform labor, we think it can also be applied to BPOs in the broader field of data work due to the similar types of tasks available. Tubaro et al. define three moments in outsourced AI production: "artificial intelligence preparation," "artificial intelligence verification," and "artificial intelligence impersonation." The authors divide AI preparation into the collection of data and its annotation. AI verification involves the evaluation of algorithmic outputs.
Finally, AI impersonation, often seen in the corporate and AI-as-a-service sector [59], refers to the non-disclosed "'human-in-the-loop' principle that makes workers hardly distinguishable from algorithms" [81]. Platform workers are directed to collect data from websites or to produce media content (e.g., text, images, audio, and video) from their devices. For example, a task on Clickrating instructed workers to find information online from companies in the United States, including their address and telephone number. Workerhub required workers to take photos of themselves in certain poses or pictures of family members (including children) and enter attributes of the subjects in these images, including their age and gender. While tasks involving data collection from the web were paid a few cents per assignment, those that involved capturing photos, video, and audio were compensated with a few dollars per file generated. Interviewed workers found the latter type of task attractive because they were the best remunerated. In this case, financial need overcame any privacy concern. At the BPO, one of the data generation projects produced an image dataset to train a computer vision algorithm capable of identifying fake ID documents. For this purpose, workers were instructed to use their IDs and those of their family members. They took several pictures of the documents and used some of those pictures to create different variations of the ID document (changing the name, the address, the headshot). This way, they produced imagery of authentic as well as fake IDs. The requester of this task was a large e-commerce corporation and Alamo's most important client. The task is just one of many data-related projects Alamo has conducted for this corporation in the last four years. Because of their ongoing service relationship, the client invested considerable time and money in training Alamo's workers for each project. One particular characteristic of this relationship is that instruction documents are not created by the client unilaterally but co-drafted with Alamo's project managers and team leaders. Managers and leaders then serve as support to answer data workers' questions, should they arise after reading the instructions or while completing the tasks. Because of the sensitivity of the data involved in the ID project, a special interface with several security measures was created to work on the images and store them. Furthermore, all workers had to sign a consent form, allowing the use of their ID document. The request to use their ID and those of family members caused some unease and raised several questions among workers, as one of Alamo's managers reported. Then, it was the managers' job to "convince them that their IDs were not going to be used for anything bad, no crime or something. And to do that without revealing too much of the client's idea because we had signed an NDA." The most common task in data work is data annotation. Projects of this kind were available in all of the studied platforms and the BPO. Tasksource and Workerhub provided many image segmentation and classification tasks as platforms specializing in computer vision projects. Some of these tasks included the classification and labeling of images of people according to gender, race, or age categories. Since segmentation tasks required around one hour depending on the size of the image and the number of labels, these tasks are paid more than classifying entire pictures according to a set of categories.
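The difference in effort between these two task types is easier to see in the shape of the data each produces. The following is a minimal, hypothetical sketch in Python (field names, labels, and coordinates are our own illustration, not taken from any of the studied platforms): whole-image classification yields a single label drawn from a small closed set, while semantic segmentation yields one polygon per object, each assigned a class from a much larger taxonomy.

# Hypothetical annotation outputs, for illustration only.

# Whole-image classification: one label from a small, closed set.
classification_result = {
    "item_id": "img_0001",
    "label": "Suggestive",  # e.g., one of: Sexually Explicit / Suggestive / Non-Sexual
}

# Semantic segmentation: one polygon per object, traced point by point,
# each assigned a class from a taxonomy with dozens of entries.
segmentation_result = {
    "item_id": "road_0001",
    "objects": [
        {
            "category": "cars",
            "label": "police car",
            "polygon": [(102, 431), (110, 425), (131, 428), (129, 450)],  # often hundreds of points in practice
        },
        {
            "category": "pedestrians",
            "label": "pedestrian",
            "polygon": [(310, 520), (340, 515), (352, 548), (322, 548)],
        },
    ],
}

Tracing such polygons for every object in a scene, against dozens of possible classes, is what stretches a single segmentation assignment to roughly an hour, whereas assigning one label to an entire image is a much shorter operation.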
Most of the segmentation tasks were designed for training self-driving cars and devices that are part of the internet of things. At the same time, image classification has multiple uses, including content moderation (notably for hate speech and sexual content), healthcare, facial recognition, retail, and marketing. For instance, the same BPO workers that collected pictures of ID documents were subsequently asked to classify and label them as "authentic" or "fake." In addition, workers had to segment the "fake" ID documents and mark the part of the image they had modified. Finally, they had to annotate the type of modification the image had undergone (e.g., "address has been modified" or "headshot is fake"). Outside of computer vision, workers of Workerhub were asked to identify hate speech and sexual content in text, notably for social media. For example, in an assignment titled "Identify Racism," workers were asked to read social media posts and identify whether or not the content included racism or if this judgment was not possible. In another task titled "$exxybabe69, " workers had to judge if usernames included examples of child exploitation, general sexual content, or none of the previous categories. Video and audio annotation were also present in this platform. For example, in the task "How dirty is this," workers had to identify if there was sexual or "racy" content in different media types, including audio files and videos. In section 4.2, we will revisit some of these tasks to describe how workers navigated the different taxonomies they encountered to perform data annotation tasks. Algorithmic verification involves the assessment of algorithmic outputs by workers. This type of task was observed primarily on the annotation platform Tasker and accessed through Clickrating. Tasker is the internal platform of a major technology company that develops a search engine. They use Clickrating to recruit their workers who, depending on the project, have to sign special contracts and non-disclosure agreements. Algorithmic verification tasks, for example, include assessing how the search engine has responded to a user query, the objects that accompany a search result (e.g., images, maps, addresses), or whether the search result contains adult content or not. In many cases, these assessments include comparing search results with a competitor search engine and assessing which one is more accurate and substantial. In another example of algorithmic verification, one of the tasks conducted by the BPO Alamo for its largest client (the same e-commerce corporation behind the "ID project") consisted in verifying the outputs of a model used by the client to moderate user-generated content in their marketplaces. In this case, the task consisted of reviewing content flagged as inappropriate by an algorithm and confirming or correcting the output. For this purpose, the client had provided handbooks that contained each marketplace's terms and conditions and examples of the specific forms a violation could take. For workers, this task often involved being exposed to disturbing images and violent language, which several interview partners described as "tough. " Impersonation is the rarest type of task, and it was only observed once in the platform Clickrating. Tubaro et al. [81] describe it as a task that occurs "whenever an algorithm cannot autonomously bring an activity to completion, it hands control over to a human operator." 
The task that we encountered, developed by a major social media company, asked workers to dialogue with users and respond to their queries according to a set of predefined "facts, history, and characteristics. " If the worker couldn't answer the user's query, they were asked to say, "Sorry, I don't know about that, but can I tell you about... " and then they had to "insert fact related that may be of interest to user. " The platform instructed workers to complete dialogues in the least amount of time, and they have to be logged into the platform in specific 3-hour sessions. RQ1: What discourses are present in task instructions provided to outsourced data workers? To explore our first research question, we analyze the task instructions as text. Our analysis focuses on the categories and classes used for collecting, sorting, and labeling data as contained in the task instructions. We describe three recurrent elements: (1) the normalization of conventions from the Global North and oriented towards profit maximization, (2) the use of binary classifications and the inclusion of residual categories such as "other" or "ambiguous," and (3) the discursive elements that aim at constraining data workers' agency in the performance of data-related tasks. Taxonomies are the main component of task instructions for data work. They consist of classification systems comprising categories and classes used to collect, sort, and label data. Definitions, examples, and counterexamples usually accompany taxonomies. The number of classes depends on the task and varies according to each platform or company. For instance, assignments on the platform Workerhub usually present a smaller set of labels, ranging from two to a dozen maximum. At the same time, jobs in Tasksource usually feature dozens of classes classified in several categories (e.g., for the semantic segmentation of a road, the category "cars" included labels like "police car" or "ambulance"). The taxonomies that we observed in instructions carry self-evident meanings to the clients but are not necessarily relevant to the annotators or communities affected by the ML system. For instance, the label "foreign language" refers to languages other than English, and the category "mail truck" only comprises examples of USPS vehicles. One of the projects conducted at the BPO Alamo consisted in analyzing video footage of a road. A particularity of this project is that the requester did not predefine the labels, but workers were asked to come up with mutually exclusive classes to label vehicles in Spanish. One of the main difficulties of the project, however, lay in the nuances of the Spanish language: Probably oriented towards targeting a broader market, the requester wanted the labels to be formulated in español neutro ("neutral Spanish"), i.e., without the idioms that characterize how Argentines and most Alamo workers speak. This contrast led to many instances of discussion among workers and managers about which vehicles' designations the client would consider "neutral. " The mismatch between the classifications that requesters and outsourced data workers consider self-evident becomes critical in cases of social classification. 
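One reason such mismatches are hard for workers to contest is that taxonomies are typically hard-coded into the task before any annotator sees the data: the interface accepts only labels from the requester's closed list, and whatever does not fit must be forced into a residual class. The sketch below, in Python, is purely hypothetical (our own illustration, not any platform's actual implementation); its label set mirrors the racial categories of the instruction document discussed next as Example 1.

# Hypothetical sketch of how a requester-defined taxonomy is enforced
# by an annotation interface; illustration only.
ALLOWED_LABELS = {
    "White", "African American", "Latinx or Hispanic",
    "Asian", "Indian",
    "Ambiguous",  # residual class: "ONLY if you cannot identify the RACE of the person"
}

def submit_annotation(item_id: str, label: str) -> dict:
    # Labels outside the requester's taxonomy are rejected; there is no field
    # for comments, disagreement, or proposing additional categories.
    if label not in ALLOWED_LABELS:
        raise ValueError(f"'{label}' is not part of this task's label set.")
    return {"item_id": item_id, "label": label}

From the worker's side, the only alternatives to the requester's worldview are the residual class or an answer that risks being marked as wrong.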
For instance, in the task shown in Example 1, workers were asked to label individuals' faces for facial recognition according to predefined racial groups that included "White, African American, Latinx or Hispanic, Asian, Indian, Ambiguous," where the last category should be selected "ONLY if you cannot identify the RACE of the person in the image."
Example 1: In this task you will be determining the race of the persons in the images. You should select only one of the following categories: White, African American, Latinx or Hispanic, Asian, Indian, Ambiguous.
Beyond the already problematic situation of being asked to "guess someone's race," for many interviewed Latin American workers, this type of classification did not make sense, as it was conceived by a US company with a US-centric conception of racial classification, i.e., with little regard for the cultural and racial complexities in Latin America. Similarly, Gonzalo, a data worker for Workerhub, declared having trouble with a task that asked him to label "hateful speech" in social media posts:
They give you many examples of what they consider "hateful" and not. But, once you're doing the task, you don't encounter basic examples, and it's up to you as a worker to interpret the context and decide what counts as "hate." Clearly, [the requesters] have their parameters and, if they don't consider something hateful, they will mark your work as wrong... For example, a sentence like "kick the latinos out" would not be regarded as "hateful," and you will not interpret it the same way as a Latino.
These examples of social classification and conceptualization are not just about cultural differences between requesters and data workers; they reflect the prevalence of worldviews dictated by requesters and considered self-evident to them. Furthermore, the taxonomies respond to the commercial application of the product that will be trained on the data that outsourced workers produce. For example, the instructions to categorize individuals for facial recognition technologies in Example 1 are based on a US-centric definition of "protected group." Such definitions suggest requesters' efforts to prioritize the mitigation of legal risks, neglecting the safety of users from social minorities and groups facing discrimination in other contexts, such as ethnic groups with defined caste systems or social categories not protected by US law, such as economic class. In another example that shows the inscription of requesters' profit orientation, workers of a major social media app, working through Clickrating, were instructed to evaluate if a post was "building awareness." Here, "awareness" referred to posts with commercial content and discarded anything personal or political in nature (see Example 2).
Example 2: Building awareness means that the purpose of the [post] is to give information about a brand, product, or other [sic] object. For example, "@Restaurant is awesome for karaoke and the food is delicious!" is building awareness. A story is not building awareness if it's primarily about the author's life. A story that mentions the name of a business or service without providing much additional context, or a story that refers to a product in passing while its author is sharing one of their experiences, is not building awareness. For example, if the author says, "I'm eating in @restaurant", the story is not building awareness for the restaurant.
Requesters' profit orientation is implicit in task instructions and gets inscribed in the ways data workers approach their tasks.
For instance, when asked what the procedure would be if they were unsure about annotation instructions, most of the workers at the BPO answered that they would abide by the requesters' opinion because "their interpretation is usually the one that makes more sense as they know exactly what kind of system they are developing and how they plan to commercialize it. " While complex taxonomies are common in the BPO, the most common classification tasks in platforms are binary. For example, for tasks described as "Does this link title use sensationalist phrasing or tactics?, " "Who is the author of this story?, " and "You will be looking at a bounding box and choosing whether it is around a primary face, " the possible labels were "yes or no, " "human or non-human, " and "yes or no" respectively. As mentioned above, many tasks, especially in Workerhub, are based on categories protected by the United States legislation when defining what counts as hate speech, racism, and other forms of discrimination (See Example 3). However, the classification of whether a text contains hate speech is often reduced to a binary decision without considering the context. Example 3 In this task, you will be identifying messages that contain hate speech. Based on the text, you must select: • Hate Speech: if the username contains hateful content • None: if there is NO hateful or abusive language in the given [sic] Definition: Select Hate Speech if the text contains any of the following: • Discrimination, disparagement, negativity, or violence against a person or group based on a protected attribute • References to hate groups, or groups that attack people based on a protected attribute This form of binary classification often ignores ambiguity or uncertainty, such as when workers are confronted with contexts that are ambiguous or separate from their cultural setting. Moreover, given the impossibility of platform workers to send feedback to requesters, many omissions (involuntary or not) remain unquestioned. For instance, a task on Workerhub asked workers to classify images as "racist" or not; we observed an image representing several copies of the "crying Wojak" meme wearing kippahs inside a heating oven while the meme "Pepe the Frog" is watching outside. The text above reads: "Changed the wooden doors today frens [sic] , this is actually working as intended now!" While this image was, for us, a clear example of antisemitism, a form of racism condemned worldwide, including by the United Nations [83] , requesters instructed workers not to consider this type of hateful content as "racist. " In several of the cases where binary categories were prescribed, we encountered a third label, usually called "other" or "ambiguous," not to designate an additional class that would break the binary classification but to merge errors or instances where the worker cannot apply one of the two labels (see Example 4) . Workers are also encouraged to ignore ambiguity altogether. Some Clickrating tasks acknowledge the limits of this binary classification and urge workers to ignore other possible attributes when categorizing data. For example, in an assignment where workers tagged queries for a search engine, the instructions referred to "Queries with Multiple Meanings" for "queries [that] have more than one meaning. For example, the query [apple], in the United States might refer to the computer brand, the fruit, or the music company." 
In this case, the task instructed workers to "keep in mind only the dominant or common interpretations unless there is a compelling reason to consider a minor interpretation." The "dominant" or "common" interpretation of a term in the US may be different from the one in Latin America. Still, we repeatedly encountered similar instructions to deal with instances of ambiguity in many tasks.
Example 4: Overview. Select:
• Male: if the boxed face is a male
• Female: if the boxed face is a female
• Other: if there is no face in the box
Very often, instruction documents were dated, and requesters provided several updates. For instance, in the task "How dirty is this image/video" on the platform Workerhub, workers were initially instructed to label adult content in images. Later on, the requesters provided videos without updating the documents, confusing some workers because they could not apply the original instructions to the video footage. As a result, the requesters had to update the instructions to include details about video annotation. In another example, a document that asked workers to classify elements in a road revised its definition of "Pedestrians sitting on the ground" to also include those "laying [sic] on benches and laying [sic] or sitting on the ground" (see Example 5). This example also shows that the "other" category that we described above does not represent "everything else" included in the classification but that there are elements deliberately or accidentally left out. In this latter example, the requesters failed to see that people in the streets are not necessarily always "walking" or "sitting," but that a segment of the population lies or sleeps on them.
Example 5: Pedestrians sitting on the ground. Use the "pedestrian" label if a pedestrian is sitting on the ground, bench, ledge, then use the "pedestrian" label. UPDATE!! Use the "PEDESTRIAN LYING DOWN" label for pedestrians laying [sic] on benches and laying [sic] or sitting on the ground.
As we will argue in the following sections, the influence and preferences of powerful actors in data work are stabilized through narrow task instructions, specially tailored work interfaces, managers and quality assurance analysts in BPOs, and algorithms in crowdsourcing platforms. Some of these processes are part of the tacit knowledge workers have about their position (i.e., it goes without saying that workers must carry out tasks according to the preferences of requesters). However, task instructions often make explicit reference to the power differentials between workers and requesters, as they include threatening warnings such as "low quality responses will be banned and not paid" or "accurate responses are required. Otherwise you will be banned" (see Example 6). Platform workers risk being banned and even expelled from the platform if they fail to follow task instructions. At the BPO Alamo, the communication of instructions is mediated by project managers and team leaders. For this reason, such warnings are not explicit in written documents but are present in reviewing instances and evaluations of workers' performance. Here, any concerns expressed at the workers' end are filtered through hierarchical managerial structures and hardly ever reach requesters.
Example 6: This is a high paying job, a special job, but to gain access to it and to keep access to it after passing the qualification test, we require patience and VERY careful [sic] thought out and accurate responses.
Otherwise, you will, unfortunately be banned from the job :( The picture above shows SHIFTING DATA which means that the LiDAR points for stationary objects move or slide around throughout the scene. Any [Project] Tasks with shifting data are not usable by the customer & have to be cancelled. You will NOT get paid for working on task [sic] with shifting data!!!!!!!!!! Every time you get a [Project] Task (before you start working) always turn on dense APC and look around the entire scene to check for shifting data. You will be able to tell that the shift is big enough to be cancelled if it makes any object 0.3m plus larger than it's [sic] normal size (or if it makes a flat wall 0.3m plus thick) and effects [sic] multiple cuboids. We value your individual opinion and review each result, so please provide us with your best work possible. We understand that this can be a tiring task, so if you are in any way unable to perform your best work, please stop and come back once you are refreshed. You may also see multiple queries with the same kind of visual treatment. Please keep your judgments consistent UNLESS you feel that there is some difference in the two that would result in a change of overall score. Judges providing low quality responses will be banned and not paid. These messages encode a precise definition of "accurate responses": Accuracy is classified according to what the client believes to be an accurate truth value, while divergence from that value is considered inaccurate. Here, too, the classifications that make sense to requesters have prevalence. This is why workers at the Argentine BPO are permanently encouraged by management to think in terms of "what the client might want and what would bring more value to them. " Given the social and economic contexts in which the outsourcing of data work occurs, warnings and threats of being banned or fired reinforce the hierarchical structure of the annotation process and compel workers to follow the norms as instructed or risk losing their livelihood. In the next section, we will present evidence of how the social contexts of workers and the fear of losing their job shapes how assignments are carried out. This section explores our second research question. Here, we focus on analyzing interviews to describe the contexts in which data-related tasks are carried out and the interactions they enable. We describe (1) the social contexts of Latin American workers that lead to their dependence on data work regardless of the labor conditions, (2) the elements that contribute to the unquestioning obedience to instructions, and (3) moments of subversion of rules as well as workers' organization and solidarity. Being an impact sourcing company, Alamo employs around 400 young workers who live in slums in and around Buenos Aires. As stated on its website, Alamo specifically recruits workers from poor areas as part of its mission. As Natalia, one of the BPO's project managers, describes, this is a population that does not receive many opportunities in the Argentine labor market: They are very young, and a bit, you know. . . Alamo works with people another company wouldn't hire, so people who live in areas. . . slums with difficulties, with a very low socioeconomic level. That's something the company pays attention to when it comes to recruiting, and if during the interview we detect that the candidate could have an opportunity somewhere else, we prefer not to hire that person and hire someone else. 
One particularity of Alamo is that it provides workers with a regular part- or full-time salary. This form of employment contrasts with the piece-wage model that is widespread on platforms. The salary Alamo's workers received in 2019 was the equivalent of 1.70 US dollars per hour, which was the legal minimum wage in Argentina. Despite the low wages and exhausting tasks such as semantic segmentation or content moderation, all interviewees were satisfied because the company offers better conditions than their previous jobs. According to a report published in November 2020 by the national Ministry of Production [76], the unemployment rate in Argentina is 10%, and 35% of the employed labor force is not registered. Argentina has a long tradition of undeclared labor. This way, employers avoid paying taxes while workers remain without protection or benefits. Behind the numbers are people like Nati, who did different types of precarious work before working for Alamo. She started at the BPO as a data annotator and quickly became a reviewer before being offered a position as an analyst in the company's QA department. Like other Alamo workers, she acknowledges the difficulty of securing a desk job somewhere else. Moreover, many of our research participants mentioned being proud of the work they do at Alamo because a desk job has "a different status." For several of them, working at Alamo means finally having a steady income and breaking with generations of informal gigs, for example, in the cleaning or construction sectors. As Nati explains, what Alamo offers is better than the alternatives: That was the situation at home; we were going through a rough time. My mother was out of work because her former boss had found someone else to clean, and I had lost my job too. So I needed a job and when I found this one I was surprised to work at a friendly place for a change! Now I have a desk, a future, and I feel appreciated. This is new to me.

The platforms, in turn, have thrived in the Venezuelan economy, which is characterized by the highest inflation levels in the world, averaging 3,000% in 2020 [1]. All participants that we interviewed from Venezuela stated that the "situación país" [country's situation] was the main reason they resorted to online work. Workers reported difficulty finding employment in the local labor market, especially work paid in anything other than the national currency, the bolivar, which devalues quickly. For example, Rodrigo, a Clickrating worker, quit his job as an information technology consultant because online platforms were the only way he could earn US dollars. He explains the monetary situation of his country as follows: There are two types of currency exchange rates: the official rate dictated by the government and the one used on the black market, which everyone uses. Everyone knows this black-market exchange rate. It's an Instagram profile that posts the average exchange rates of several independent currency exchange websites. They make this average and post the fluctuation several times per day, which is the exchange rate that we use today. Platforms' low entry barriers make outsourced data work an attractive - and sometimes the only - source of employment during social, economic, and political crises. Data workers earn 15 to 60 US dollars per week, the average being around 20 dollars, which is substantially higher than the minimum wage in Venezuela, reported by workers to be around 1 US dollar per month in March 2021.
Dependency on the platforms is exacerbated by the high unemployment levels and reduced government support during the COVID-19 pandemic. In this situation, workers have limited access to subsidies and pay for services such as healthcare from their income [65] . For example, Olivia, one of the Tasksource workers, was diagnosed with diabetes and has to self-fund the costs of insulin. This dependency affects the labor process as well. Workers usually do not choose which tasks to perform, even if they disagree ethically with their assignments. When asked what criteria Carolina, a Clickrating worker, uses to choose a task over another, she answered: My priority is to get the tasks that pay the best. But I don't even have that choice. The platform restricts which jobs are available here in Venezuela, so I have to make the most of it to earn the minimum and get paid as soon as I get one task. By "minimum, " Carolina refers to the minimum income workers can transfer out of the platforms, which is another form of creating dependency. Platforms establish a minimum of 5 to 12 US dollars before they make payments and, if a worker cannot achieve this threshold, they have to wait for a week before withdrawing their salary. This payment process is a form of institutionalized wage theft implemented by the platforms. Workers lose the money they have worked for if they get banned before reaching the threshold for payment. In its outsourcing capacities, Alamo focuses on data-related services ordered mainly by machine learning companies. Even if they display some similarities, each of those projects is different from the previous ones, and workers need to be briefed regularly. Depending on the difficulty and extension of the task, briefings can be more or less sophisticated and involve more or fewer actors, meetings, and processes. Sometimes, the instructions for new projects are sent by the requester via email in a PDF document. One of the area managers receives that information and transmits it to a project manager, who would then put together a team and work closely with their leader. Depending on the degree of difficulty, one or more meetings with the team will be held to explain the project, answer questions, and supervise the first steps. When handling large projects from multinational organizations, Alamo invests a considerable amount of resources in the briefings. No matter how big or small the requester, briefings at Alamo consist of getting the workers acquainted with the expectations of the requesters and are a way of making sure workers are on the same page and thinking similarly: The information from the client usually reaches the team leader or the project manager first, and, at that moment, what we do is to have a meeting for criteria alignment. . . that is generally what we do. The team meets to touch base and see that we all think in the same way. (Quality assurance analyst with Alamo) These briefings give workers a framework for new projects and are instrumentalized by the company as the first instance of control, aiming at reducing room for subjectivity. Further control instances, aiming to ensure that data work is done uniformly and according to requesters' expectations, take place in numerous iterations where reviewers and team leaders review and revise data and go back to the instruction documents or contact the requester to clarify inconsistencies. In companies like Alamo, data quality means producing data precisely according to the requester's expectations. 
According to Eva, a BPO manager in Bulgaria, this view on data quality is commonplace in data services companies. In the following excerpt, she summarizes the importance and main function of instructions and further instances of control, i.e., making sure that the workers interpret the data homogeneously: Normally, issues in data labeling do not come so much from being lazy or not doing your work that well. They come from a lack of understanding of the specific requirements for the task or maybe different interpretations because a lot of the things, two people can interpret differently, so it's very important to share consistency and, like, having everyone understand the images or the data in the same way. The interviews we conducted with requesters show that the priority behind the formulation of task instructions is producing data that fits the requester's machine learning product and the business plan envisioned for that product. What does not match the requester's instructions is considered low-quality data or "noise. " Dean is a machine learning engineer working for a computer vision company in Germany. He reported on this widespread view as follows: Dean: Noise is what doesn't fit your guidelines. Interviewer: And where do those guidelines come from? Dean: We say, "actually we want to do this, we want to do that," and then, of course, since the client is the king, we translate that business requirement into something like. . . into a requirement in terms of labels, what kind of data we need. As described in Section 4.2.3, compliance with requesters' views is made explicit in instruction documents in the form of warnings for workers. Those documents are usually the only source of information and training platform workers have to complete their tasks. However, the case of the platform Tasksource is slightly different, as it employs Latin American coaches to brief and explain to workers how to interpret instructions and annotate tasks. This approach is similar to the one used by the BPO Alamo and described above. However, at Tasksource, briefings take place in week-long unpaid digital courses called "boot camps" and later evaluation periods called "in-house. " The use of the military and correctional term "boot camp" could be interpreted as reflecting this training's purpose: conditioning workers to obey tasks without question. Ironically, even though the platform employed workers to help train artificial agents, they were supposed to behave like "robots, " according to a Tasksource worker named Cecilia: When you start, they tell you: "To be successful in this job, you have to think like a machine and not like a human." After that, they explain to you why it has to be like that. For example, you are teaching a [self-driving] car how it has to behave. When you segment an image, there is a police car, and you label it like a regular car, the [self-driving] car will think it's a regular car and, if it crashes against it, something terrible can happen. The mistake was not of the car that crashed into the police vehicle, but it's yours as a tasker, as a worker, who taught the car to behave like that. Platform workers serve a similar role as BPO employees in reinforcing the primacy of instructions and requester intent to complete tasks effectively, producing data that fits model and revenue plan while shifting the responsibility for failures on workers. In this context, obedience to instructions is critical for data workers to keep their job and make a living. 
The fear of being fired, banned from the job, or not being paid for the task reinforces the disposition of workers to being compliant, even when instructions look arbitrary. This is what Rodolfo, Tasksource worker, reported: That is why I don't like that platform very much. Because they give us the instructions and we have to follow. And there are many cases where, if you don't complete the task really to perfection, according to what they want or what they think is right, they just expel you. Just like that, even if you followed the instructions thoroughly. Not everything is imposition and obedience in data work. There are also several expressions of workers organizing to improve working conditions and help each other deal with tasks and make the most out of them. For instance, Alamo's employment model that includes data workers as part of the company's permanent staff instead of having them as contractors results from workers organizing to demand receiving a fixed salary and benefits. As reported by one of Alamo's reviewers, Elisabeth, in 2019, further workers' demands were being negotiated with the company: We asked for a couple of things like the possibility of home office and a better healthcare plan. We are organizing many things. It's being negotiated. In 2020, probably also motivated by the Covid-19 pandemic, Alamo's data workers were finally allowed to work remotely. It is worth mentioning that before 2020, every other company's department and management were allowed to work at least some days of the week remotely while the data workers could not. In the case of the platforms, data workers organize in virtual groups and fora. The existence of virtual and local groups of workers that provide solidarity and support has been reported in other examples of platform labor [16, 67, 88] . In the case of Venezuelan data workers, we observed similar situations. Because we used convenience and snowball sampling and worker groups on social media as a starting point, all the interviewed participants were directly associated with them. Participants use these independent and worker-led spaces to exchange information about which tasks pay more and are less challenging to complete and warn each other about non-reliable requesters. One of the aspects that workers paid significant attention to was the presence of bugs in the tasks. When asked about their existence, Yolima, a worker with the platform Tasksource, said to us: Errors occur all the time. But, since we are in groups on Facebook and Whatsapp, we alert each other and say, "Hey, don't do this task because it has a bug. It will flag you as mistaken even if you have done everything ok. " Some smaller groups, with high entry barriers to ensure privacy and trustworthiness among members, recommend specific tasks over others. For example, when describing tasks with sexual or violent content, Estefanía, one of Clickrating's workers, stated: I don't like those tasks with pornographic content. I do them only when my friends from the groups say, "look, this is a good task, here's the link. " I don't have to look for good tasks, and that's great. I just have to log into my account and do the annotation without worrying about which tasks to do. Some users of these smaller groups also craft guides to explain the instructions to their peers. Most interviewed workers stated that their knowledge of the English language was limited. 
Since Tasksource and Clickrating only presented instructions in that language, and Workerhub provided automated translations with errors, these guides in Spanish are a fundamental tool for workers. They are written by workers for their peers and contain Spanish translations of taxonomies, definitions, and examples. They also provide further explanations about the contexts in which workers can apply the taxonomies, avoid being banned by the algorithm, and maintain high accuracy scores. For example, in the introduction of a guide for a task to annotate hate speech in text for Workerhub, a user wrote: Example 9 IMPORTANT INFORMATION What I'm sharing in this guide is based on my experience with the task. I'll try to explain as best as I can the tips that I consider are the most important to avoid being banned and the essential information to understand the task. BE CAREFUL The task "No Hatred" is not available on all accounts. You must have been paid AT LEAST ONCE. IT'S IMPORTANT THAT YOU CONSIDER THIS GUIDE FOR WHAT IT IS: A "GUIDE" made for you to understand the task better. You must earn real experience by doing the task with perseverance and dedication. Work practices in the data-production dispositif are not informed exclusively by the relationships between requesters, intermediaries (platform or BPO), and individual workers. They are also dependent on the networks formed by the latter group. This can be observed in BPOs where data workers share the same office space and constantly consult and advise each other on conducting projects more quickly and easily. Among platform workers, online groups help to choose what tasks to carry out, and such decisions are influenced by recommendations and guides from peers who evaluate instructions from requesters. RQ3: What artifacts support the observance of instructions, and what kind of work they perform? In this section, we focus on the third research question. Based on the observations conducted at the crowdsourcing platforms and the BPO company, we present three of the many possible materializations of the data-production dispositif: (1) the function of diverse types of documents that embody the dispositif's discourses, (2) the platforms and interfaces that guide and constrain data work, and (3) the tools used by managers and platforms to surveil workers and quantify their performance. In Section 4.2, we focused on the content of instruction documents to describe the discursive elements comprised in them. Here, instead, we look into a variety of documents -instructions included -to analyze them as artifacts, focusing on their form, function, and type of work they perform. One common document related to data work at BPOs is that containing metadata and project documentation. Alamo, for instance, records the details of each project in several documents that vary in form and purpose according to the task and the requester. Often, that documentation aims to preserve the evolution of task instructions, registering changes requested by clients. Keeping this type of documentation functions as a form of "insurance" for Alamo and can help resolve discrepancies if requesters are not satisfied with the service provided. In those cases, project documentation serves as proof that data was produced as instructed. Documents containing project details can also serve the purpose of preserving situated and contingent knowledge that would otherwise get lost and could help improve future work practices [53] . 
Sometimes, these documents become artifacts that cross Alamo's boundaries and reach the requesters. For them, the documents might have a factual function (in terms of the information they want or need) or a symbolic one (to reassure clients that Alamo is at their disposal). Alamo's QA analyst Nati describes this as follows: We send a monthly report to the clients, including what was done and problems we encountered; we set objectives for the following month and send an overview of the metrics. Some clients don't even look at the report but insist on receiving it every month. Others value it and use the information to report to leadership or investors within their organization. The documents produced by the BPO are tailored to be valuable for requesters. Conversely, the documents formulated by requesters often remain unintelligible for data workers, even if they are the primary addressees, as in the case of instruction documents. In many cases, language is the main issue hindering the intelligibility of documents: Most of the workers we interviewed have limited knowledge of the English language and reported using translation services, notably Google Translate, to understand the instructions provided by requesters. As mentioned in the previous section, one of the main reasons platform workers resort to guides written by peers is that they are written in Spanish. But beyond language differences, elements of the taxonomies used in documents can also be confusing, as explained by Tasksource's worker Yolima and described in Section 4.2.1: For the [categories], they are made in the United States, I think. I don't know what they would call a laundry sink 1 , a shower, or parts of the bathroom. Most of the time, my mistakes were with parts of the bathroom, especially around the shower, the tap, and those things. That was confusing because that was a shower for me, but it was something else for [the platform]. The confusion produced by the different languages is not merely a matter of cultural bias. Looking for cheap labor, platforms and requesters target the Venezuelan market but ignore the language barriers and formulate instructions in English. Moreover, further documents that workers encounter in their work, such as privacy policies, contracts, and non-disclosure agreements, are also prepared in English and remain, partially or totally, unintelligible for them. Workers usually sign these documents without understanding the full scope of their contractual relationship with their employers. Along with instructions, these documents embody the data-production dispositif. They are a materialization of normalized discourses and practices that shape data workers to be dependent and, therefore, obedient, while their subjectivities as Spanish-speaking Latin American workers are ignored and erased. In BPOs like Alamo, choices regarding which platform will host the data and will be used as a tool are made by clients. In many cases, the requester has developed an annotation software specifically tailored to the needs of their business and the dataset to be produced. In other projects, the company uses a commercial platform designed by a third party. In this case, the client would suggest the tool that best fits their needs among several choices available on the market. The choice of a specific tool comes with limitations that, in one way or another, constrain data workers' agency to interpret and sort data. 
The most salient one is that the taxonomies contained in instruction documents are also embedded in the software interfaces that workers use to collect, organize, segment, and label data. Workers usually interact with a drop-down menu containing all the classes or attributes they are allowed to apply to data. Most interfaces do not allow workers to add further options to the list of pre-defined labels that they receive from requesters. This is most prominent in software interfaces specially designed by requesters and tailored to specific projects. In those cases, the software interfaces that mediate between workers and data are designed to ensure that tasks are completed according to particular parameters pre-defined by requesters and made explicit in the instruction documents. In the case of regular data work tools for commercial use, the impossibility of changing the predefined categories or adding more classes is perceived as a limitation that makes data work harder at BPOs and requires communication up the hierarchical structure until the requester is reached. Jeff, one of the managers leading a BPO in Iraq, reports on this issue: There was a limitation on the annotation tool that they were using. They were relying on an open-source platform that doesn't have that feature that lets you add or create predefined attributes, which makes the work many times easier. Some of these generic tools give the project owner - generally the requester or a BPO's project manager - the ability to allow workers to add further options to the classification system (see Figure 1). However, this does not seem to be a widespread practice at Alamo. Among the many projects that we had the opportunity to observe during fieldwork, only once were data workers allowed to co-create the taxonomy around which data was organized and annotated.

Of the crowdsourcing platforms that we studied, only Clickrating relied on external interfaces, meaning that workers had to log into clients' internal annotation platforms, notably in the case of tasks requested by major technology companies. For Tasksource and Workerhub, workers interacted with data annotation interfaces developed by these companies. In both cases, the screen displayed a top bar with an accuracy score, that is, the percentage of tasks submitted by the worker that the platform judged accurate. For Workerhub, the top bar also showed the number of annotations completed for the assignment, the earnings, the time spent per task, and a button to display the instructions (see Figure 2). On both platforms, the labels were available in the right sidebar alongside tools to zoom in and out and configure the visibility of the data. On all three platforms, workers could neither change the predefined labels nor suggest changes. The interfaces present in the BPO and the platforms feature gamification elements (scores and timing) to speed up the labor process and keep workers focused on the tasks at hand. The overreliance on speed privileges action over reflection and increases the alienation between workers and the production process. Even tasks that ask for workers' judgment, such as those present in Clickrating, are timed and reward fast thinking. That said, unlike the other platforms, Clickrating offers room for comments in the evaluation of algorithmic outputs; these comments can be substantial, creating more engagement for workers beyond narrow annotation tasks.
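To make these constraints concrete, the following minimal sketch illustrates how a task of the kind described above might be configured. It is our own hypothetical illustration: the field names, labels, and the validate_annotation helper are invented for the example and are not code from any of the platforms or tools we studied.

```python
# Hypothetical sketch of a task configuration in which the label taxonomy is fixed
# by the requester and the interface offers workers no way to extend or question it.

SEGMENTATION_TASK = {
    "labels": ["car", "pedestrian", "pedestrian lying down", "road", "other"],
    "allow_custom_labels": False,    # no option to add or suggest new categories
    "allow_worker_comments": False,  # no channel to question the taxonomy
    "show_accuracy_score": True,     # top bar: share of submissions judged "accurate"
    "show_timer": True,              # gamified speed pressure
}

def validate_annotation(label: str, task: dict = SEGMENTATION_TASK) -> bool:
    """The interface only accepts labels from the predefined drop-down list."""
    return label in task["labels"]

# A label outside the requester's taxonomy is simply rejected:
assert validate_annotation("pedestrian")
assert not validate_annotation("person sleeping on the street")
```

Under such a configuration, a worker who notices a missing category, for instance people lying or sleeping on the street, has no way to record that observation through the interface; it is simply lost.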
To differentiate itself in the very competitive market of outsourced data services, the BPO Alamo makes a selling point out of its performance metrics and quality assurance mechanisms. The company puts much effort into developing more and better ways of measuring performance and quality, and transforming those into numbers and charts the client may perceive as valuable. In response to market demands, quality controls intensify, which results in more pressure on and surveillance of workers. Moreover, the need for quantifiable data to translate "quality" into a percentage exacerbates the standardization of work processes, which, once more, results in less room for workers' subjectivity. Alamo has highly standardized processes that include a team leader and several reviewers per team and a quality assurance (QA) department using several metrics to ensure that projects are conducted in accordance with the requesters' expectations. In addition, team leaders and the QA department use metrics to quantify workers' labor. As a token of transparency, sometimes workers' scores are shared with clients. Noah, one of the BPO's team leaders, describes the function of metrics within Alamo and concerning its clients: We have metrics for everything. They can be individual, for personal output, or they can be general in the project. We have some to measure correct and incorrect output, there we see where we fail, where we can give more support to the team so that those errors are corrected, how we can solve those problems. In QA, what they do is metrics. Metrics, and ensure that the quality provided to the client is high.

On the platforms, workers are also constantly evaluated with accuracy and speed metrics. Instead of being managed by company employees, as in the case of Alamo, platform workers are assessed and controlled by algorithms. All platform workers we interviewed reported often being banned from tasks because the algorithms negatively evaluated their performance. Of course, this represents a serious obstacle to maintaining a stable income, especially when the ban is permanent. This is what Juan, a worker of Workerhub, reported: Juan: The platform pays every Tuesday. Once they ban you, you lose all your credits, in the sense that, without an account under your name and email, you can't open a new account and access the money you've earned. Interviewer: Did they tell you why? Juan: No. I could have asked in the [platform managed] Discord channel, but if you ask anything, you get banned. They are the ones who command... they are the ones who decide. I was banned without cause because my accuracy was high. I never knew why they expelled me. Interviewer: How did you realize you were banned? Juan: One day, I couldn't access my account... I created another account with the same email, worked for a week, and they banned me again. They didn't pay me. Some of my colleagues from the same neighborhood and cousins who work for the platform told me: "Don't create an account with the same email because they won't let you. They will let you open it, but then they won't pay you."

The algorithms that assess worker performance in the three platforms that we have studied follow the same three-step process. First, workers have to work with the same data again after some time: if the task involves, for example, categorizing photographs of flowers according to their colors and a worker marks the same image differently the second time, the algorithm will consider it "spam." The second mechanism is to verify workers' answers against previously labeled data. If there is a mismatch, the algorithm will assume that the worker is not performing their activities "accurately." Finally, based on interviews with workers and previous observations of Amazon Mechanical Turk [56], a platform that is not well established in Latin America outside of Brazil and, therefore, not the focus of this study, the third method used by the algorithms is to compare workers' answers with those of their peers and assume that the most common answer is the correct one. Many of the workers' groups that we encountered provide guides so that workers do not diverge from the responses of the majority and, thus, keep high levels of accuracy from the perspective of the algorithms.
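Taken together, workers' accounts suggest quality-control logic along the lines of the following sketch. This is our own schematic reconstruction of the three checks, offered only as an illustration: the data structures, function names, and the 0.8 threshold are assumptions, not actual platform code.

```python
from collections import Counter

# Schematic reconstruction of the three checks described by workers (illustrative only).
# answers: {item_id: [labels given by this worker, in order]} (an item may be served twice)
# gold:    {item_id: label} for a hidden subset of previously labeled data
# peers:   {item_id: [labels given by other workers]}

def passes_repeat_check(answers):
    """Check 1: the same item is re-served later; diverging answers are treated as spam."""
    return all(len(set(labels)) == 1 for labels in answers.values() if len(labels) > 1)

def accuracy_against_gold(answers, gold):
    """Check 2: share of answers matching previously labeled ('gold') data."""
    scored = [i for i in gold if i in answers]
    if not scored:
        return 1.0
    return sum(answers[i][-1] == gold[i] for i in scored) / len(scored)

def agreement_with_majority(answers, peers):
    """Check 3: share of answers matching the most common answer among peers."""
    scored = [i for i in answers if peers.get(i)]
    if not scored:
        return 1.0
    return sum(
        answers[i][-1] == Counter(peers[i]).most_common(1)[0][0] for i in scored
    ) / len(scored)

def worker_is_flagged(answers, gold, peers, threshold=0.8):
    """A worker falling below any check risks being banned from the task."""
    return (
        not passes_repeat_check(answers)
        or accuracy_against_gold(answers, gold) < threshold
        or agreement_with_majority(answers, peers) < threshold
    )
```

Under such logic, a divergent but well-reasoned judgment is indistinguishable from an error, which helps explain why peer-produced guides steer workers toward the majority answer.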
" The second mechanism is to verify workers' answers with previously labeled data. If there is a mismatch, the algorithm will assume that the worker is not performing their activities "accurately." Finally, from interviews with workers and previous observations in Amazon Mechanical Turk [56] , a platform that is not well established in Latin America outside of Brazil and, therefore, not the focus of this study, the third method used by algorithms is to compare workers' answers with those of peers and assume that the most common answer is the correct one. Many of the workers' groups that we encounter provide guides, so workers do not diverge from the responses of the majority and, thus, keep high levels of accuracy from the perspective of the algorithms. In concurrence with previous work [27, 53, 54, 75] , we have observed that workers collecting, interpreting, sorting, and labeling data do not do so guided solely by their judgment: their work and subjectivities are embedded in large industrial structures and subject to control. Artificial intelligence politics are inextricably connected to the power relations behind data collection and transformation and the working conditions that allow preconceived hegemonic forms of knowledge to be encoded in machine learning algorithms via training datasets. Labor conditions and economic power in the production of ML datasets manifest in decisions related to what is considered data and how each data point is interpreted. While task instructions help data workers complete their tasks, they also constitute a fundamental tool to assure the imposition of requesters' worldviews on datasets. Sometimes, the meanings and classifications comprised in data work instructions appear self-evident to workers, and a shared status quo is reproduced on the dataset. Often, however, the logic encoded in the instructions does not resonate with them. This could be due to cultural differences between requesters and data workers, lack of contextual information about the dataset's application area, perceived errors that cannot be reported, or simply because the tasks appear ethically questionable to workers. In such cases, another form of normalized discourse persists: that of a hierarchical order where service providers are conditioned to follow orders because "the client is always right" and workers should "be like a machine. " According to Foucault, discourse organizes knowledge that structures the constitution of social relations through the collective understanding of the discursive logic and the acceptance of the discourse as a social fact. A normalized discourse is, therefore, what goes without saying. This way, the prevalence of requesters' views and preferences does not need to be explicitly announced to workers. Instead, such implicit knowledge influences how the outsourced data workers that we observed and interviewed perform their tasks: carefully following instructions, even when they do not make sense to them or when they do not agree with the contents and taxonomies in the documents. The context of poverty and lack of opportunities in the regions where data production is outsourced is also fundamental as it makes workers dependent on requesters and, thus, obedient to instructions. Finally, artifacts such as narrow work interfaces with embedded predefined labels, platforms that do not allow workers' feedback, and metrics to assess workers' "accuracy" (understood as Accurate responses or be banned from the job. The client is always right… Fig. 3 . 
All these elements combined - the predefined truth values encoded in instructions, the work practices and social positions of workers, and materializations such as interfaces - constitute the data-production dispositif. Without any of these elements, the dispositif would not be able to function as such. As Foucault puts it, dispositifs respond to an "urgent need" [25] that is historically and geographically contingent. The data-production dispositif responds to the voracious demand for more, cheaper, and increasingly differentiated data to feed the growing AI industry [5, 13]. Its goal is to produce subjects that are compliant with that need. The Foucauldian notion of subject has a twofold meaning, with subjects, on the one hand, being producers of discourse and, on the other hand, being created by and subjected to dispositifs. All subjects are entangled in dispositifs and have, therefore, tacit knowledge of how to do things within specific contexts. This tacit knowledge includes "knowing one's place" and what is expected from each subject depending on their position. Thus, data workers know that subjects in their social and professional position are implicitly expected to comply with clients' requests. This way, dispositifs normalize and homogenize the subjectivities of those they dominate, producing power/knowledge relationships that shape the subjects within the dispositif according to certain beliefs, actions, and behaviors that correspond to the dispositif's purpose [21, 23]. Following Foucault's perspective, we argue that the goal of the data-production dispositif is creating a specific type of worker, namely, outsourced data workers who are kept apart from the rest of the machine learning production chain and, therefore, alienated. Data workers are recruited in impoverished areas of the world, often under the premise of "bringing jobs to marginalized populations," but are not offered opportunities to rise socially or professionally in terms of salary and education. They are workers who are surveilled, pushed to obey requesters and not to question tasks, and constantly reminded of the dangers of non-compliance. Data production cannot be a dignified type of work if it does not provide workers with a sustainable future. The implications of this data-production dispositif, designed to constrain workers' subjectivities and perpetuate their alienation, precarization, and control, will be unpacked in the following subsection.

As the extensive corpus of research literature dedicated to mitigating bias in crowdsourcing suggests, controlling workers' subjectivities is considered essential to avoid individual prejudices being incorporated in datasets and, subsequently, in machine learning models. However, as we have shown with our findings, unilateral views are already present at the requesters' end in the form of instructions that perpetuate particular worldviews and forms of discrimination that include racism, sexism, classism, and xenophobia. Given its characteristics, the data-production dispositif is detrimental to data workers and the communities affected by machine learning systems trained on data produced under such conditions. To close this paper, we would like to make a call to dismantle the dispositif.
However, before going into the implications of our call, it is crucial to consider that we never cease to act within dispositifs and, by dismantling the data-production dispositif, we would inevitably give rise to another one. Therefore, we discuss here ways of dismantling the data-production dispositif as we know it today, that is, by changing the material conditions in data work and making its normalized discourses explicit. Substantial efforts in research and industry have been directed towards investigating and mitigating worker bias in crowdsourcing. Many of these initiatives portray data workers as bias-carrying hazards whose subjectivities need to be constrained to prevent them "contaminating" data. This widespread discourse within the data-production dispositif gives place to narrow instructions and work interfaces and the impossibility of questioning tasks. Workers are required to "think like a machine" to be successful in the job. Moreover, data workers are often kept in the dark about requesters' plans and the machine learning models that they help train. Such conditions lead to workers' alienation as they are kept apart from the rest of the ML production chain. Researchers have often referred to data workers as data labelers and content moderators, practicing ghost work [30] that remains "invisible" [70] . However, as Raval [69] accurately argues, it is worth asking invisible for whom and, most importantly, "what does this seeing/knowing-hence generating empathetic affect among Global North users-provide in terms of meaningful paths to action for Global South subjects (workers and others)?" Breaking with the alienation of data workers means much more than rendering them visible. It rather requires making the rest of the machine learning supply chain visible to them. It means providing information and education on technical and language matter that could help workers understand how their valuable labor fuels a multi-billion dollar industry. This also concerns questions of labor organization and unionizing: For instance, the recently-created Alphabet Workers Union has taken steps in this direction by including contractors -many of them outsourced data workers. To help counter their alienation, researchers and industry practitioners need to regard data workers as tech workers as much as we do when we think of engineers. Why would requesters want to educate data workers and disclose technical or commercial information to them? As mentioned above, the design of the tasks that we encountered failed to acknowledge and rely on the unique ethical and societal understanding of workers to improve the annotations and, with them, models. We found that the BPO model generates a stronger employment relationship with workers compared to platforms, notably Workerhub and Tasksource, which translates into higher engagement with the tasks at hand. Furthermore, BPO workers interviewed by us said they wished they knew more about the requesters' organizations and products because this would help them understand their work and perform better. In this sense, expanding instructions to include contextual information about the task, its field of application, and examples that show its relevance for systems and users could improve data workers' motivation and satisfaction, and help them understand the value of their labor within ML supply chains. One of the most pressing ethical and humanitarian concerns surrounding outsourced data work is the workers' quality of life. 
The data-production dispositif is designed to access a large and cheap labor pool and profit from workers' precarious working conditions. It is not a coincidence that, in Latin America, the platforms we encountered were established primarily in Venezuela, a country mired in a deep socio-political crisis exacerbated by the COVID-19 pandemic, and that the BPO company in Argentina recruited its workers from low-income neighborhoods. While the arrival of these platforms and BPOs has allowed many workers to circumvent the limits of their local labor markets, the system of economic dependency and exploitation that they reproduce hinders efforts for sustainable development that include access to decent work and economic growth [82]. Labor is an often overlooked aspect in discussions of ethical and sustainable artificial intelligence [66]. We argue that we cannot truly create fair and equitable machine learning systems if they depend on exploitative labor conditions in data work. Why would requesters want to improve labor conditions in outsourced facilities? All ML practitioners interviewed for this study had experience outsourcing data-related tasks with both crowdsourcing platforms and BPOs. They all agreed that platforms are cheaper than BPOs, but the latter offer higher quality. As argued by our interview partners, BPO teams remain more or less unchanged throughout the production project, which results in better quality. Moreover, direct communication with project managers allows for iterations and the incorporation of feedback. Several ML practitioners also reported preferring not to outsource data-related tasks, especially in cases where a unique "feel for the data" [61], which can only be achieved with time and experience, was required. The evidence pointing to a negative correlation between cheap labor and the quality of data [49] described by the ML practitioners that we interviewed could be a strong argument for requesters to take measures and fight precarious work in outsourced facilities. Improving labor conditions might be a less expensive (and, perhaps, more effective) approach than investing in "debiasing" datasets after production.

Our findings show that the widespread use of "protected categories" for human classification is bound to the cultural contexts and local jurisdictions that define what counts as a protected group. Moreover, even tasks that do not involve classifying humans, such as identifying objects in a road, can potentially have fatal consequences for individuals or groups, as in the case of the Tasksource requester who did not include labels for humans sleeping or lying on the streets. Making the rationale behind task instructions explicit can be difficult if categories are implicitly considered commonplace for requesters, as they might not even notice the normativity behind instructed taxonomies. Moreover, data workers that are subject to surveillance and control and who risk being banned from tasks are less likely to question instructions. De-centering the development of taxonomies from an "a priori" (i.e., classifying exclusively based on personal experience) and data-based (i.e., classifying solely based on quantitative data) classification to one that derives from the context and experiences of those who may be affected by it could be a fruitful approach to this issue [17]. Data workers often perceive errors in task instructions or interfaces that remain unnoticed by the requesters.
Even if this feedback could be valuable for requesters, the data-production dispositif is designed to silence workers' voices. We argue that the approach observed in Clickrating, where feedback from workers was encouraged, could be constructive here. However, expanding and implementing such an approach would require a general shift of perspective: from considering workers' subjectivities a danger to data towards considering workers as assets in the quest for producing high-quality datasets. Fostering workers' agency instead of surveillance and opening up channels for feedback could allow workers to become co-producers of datasets instead of mere reproduction tools. Why would requesters want to be questioned in their logic? While taxonomies respond to the commercial necessities of requesters, they also need to be built with equity and inclusivity in mind. This is not only an ethical issue, but it can quickly become a commercial one. Public scrutiny can have fatal consequences for a machine learning product that is perceived to be discriminatory or harmful [2, 32, 45] . Furthermore, instruction documents are living documents. We have observed how requesters update them by withdrawing the tasks, reinstating the instructions, and seeking data work again, a time-consuming and costly process. Thus, requesters could benefit from considering instructions as the product of exchanges with the different stakeholders contributing to data production and deployment. Data workers could play a key role in interrogating and improving tasks and, therefore, datasets and ML systems. Our findings are bound to the platforms, companies, individuals, and geographical contexts covered by our study and our positionality as researchers, which has undoubtedly oriented but probably also limited our observations and interpretations. Because of the qualitative nature of our investigation, we have striven for inter-subject comprehensibility [20] instead of objectivity, which means making sure that our interpretations are plausible for both authors and the contexts observed. Furthermore, the use of multiple data sources allowed us to procure supporting evidence for observed phenomena. In addition, the use of expert interviews allowed us to confirm and discuss several of our initial interpretations. This paper only covers some aspects of the data-production dispositif. This is because no dispositif works in isolation but is always entangled with other discourse, action, and materialization networks. To explicate the totality of the data-production dispositif would mean to analyze its relationship with, among many others, the scientific dispositif, the economic dispositif, and more specifically, the academic and the tech-industry dispositifs. Critical aspects of these relationships have been reported in these pages, but covering them all in one paper would be unfeasible. The fact that our analysis is bound to remain "incomplete" could be seen as a limitation. However, we consider it an opportunity for future research to expand our findings and interrogate ways of working with data that today seem commonplace. We think that a profound exploration into the tech-industry dispositif and its relationship with the data-production dispositif could be especially fruitful. 
To explore how data for machine learning is produced through labor outsourced to Venezuela and Argentina, we have turned to the Foucauldian notion of dispositif and applied an adapted version of the dispositif analysis method outlined, among others, by Siegfried Jäger [39, 41]. Our investigation comprised the analysis of task instructions, interviews with data workers, managers, and requesters, as well as observations at crowdsourcing platforms and a business process outsourcing company. What we have called the data-production dispositif comprises discourses, work practices, and materializations that are (re)produced in and through ML data work. Our findings have shown that requesters use task instructions to impose predefined forms of interpreting data. The context of poverty and dependence in Latin America leaves workers with no other option but to obey. In view of these findings, we propose three ways of counteracting the data-production dispositif and its effects: making worldviews encoded in task instructions explicit, thinking of workers as assets, and empowering them to produce better data. While the potentially harmful effects of algorithmic biases continue to be widely discussed, it is also essential to address how power imbalances and imposed classification principles in data creation contribute to the (re)production of inequalities by machine learning. The empowerment of workers and the decommodification of their labor away from market dependency, as well as the detailed documentation of outsourced processes of data creation, remain essential steps to allow spaces of reflection, deliberation, and audit that could potentially contribute to addressing some of the social questions surrounding machine learning technologies.

REFERENCES
Venezuela reports 2020 inflation of 3,000 percent
Machine Bias. ProPublica
Bias decreases in proportion to the number of annotators
The lonely raccoon at the ball: designing for intimacy, sociability, and selfhood
On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? Conference on Fairness, Accountability, and Transparency (FAccT '21)
Data work in healthcare: An Introduction
Identifying Mislabeled Training Data
Mehr als nur diskursive Praxis? - Konzeptionelle Grundlagen und methodische Aspekte der Dispositivanalyse
On the Methodology of Dispositive Analysis
Digital labor studies go global: Toward a digital decolonial turn
The Platformisation of Labor and Society
How annotation styles influence content and preferences
Atlas of AI. Power, Politics, and the Planetary Costs of Artificial Intelligence
Bourdieu and Foucault on power and modernity
AI in the Wild. Sustainability in the Age of Artificial Intelligence
Log Out! The Platform Economy and Worker Resistance
2020. Data feminism
CrowdCO-OP: Sharing Risks and Rewards in Crowdsourcing
A Design Perspective on Data
Uwe Flick. 2007. Qualitative Sozialforschung: Eine Einführung
Orders of discourse
The Archaeology of Knowledge: And the Discourse on Language
The Subject and Power
What Is Critique?
Power/knowledge: selected interviews and other writings
Clarity is a Worthwhile Quality: On the Role of Task Clarity in Microtask Crowdsourcing
Garbage in, Garbage out? Do Machine Learning Application Papers in Social Computing Report Where Human-Labeled Training Data Comes From
Are We Modeling the Task or the Annotator? An Investigation of Annotator Bias in Natural Language Understanding Datasets
Measuring Social Biases of Crowd Workers using Counterfactual Queries
Ghost Work: How to Stop Silicon Valley from Building a New Global Underclass
The Academic Dispositif: Towards a Context-Centred Discourse Analysis
let's stop AI ethics-washing and actually do something
A Data-Driven Analysis of Workers' Earnings on Amazon Mechanical Turk
Stories of the Smartphone in everyday discourse: conflict, tension & instability. Digital Anthropology. Berg. 196-213 pages
Understanding and Mitigating Worker Biases in the Crowdsourced Collection of Subjective Judgments
The cultural work of microwork
Turkopticon: interrupting worker invisibility in amazon mechanical turk
Analysing discourses and dispositives: A Foucauldian approach to theory and methodology
Little history of CAPTCHA
Deutungskämpfe: Theorie und Praxis Kritischer Diskursanalyse
How HCI talks about sexuality: discursive strategies, blind spots, and opportunities for future research
The Role of Unemployment in the Rise in Alternative Work Arrangements
Biased Priorities, Biased Outcomes: Three Recommendations for Ethics-oriented Data Annotation Practices
Microsoft Funds Facial Recognition Technology Secretly Tested on Palestinians
Turn to the Self in Human-Computer Interaction: Care of the Self in Negotiating the Human-Technology Relationship
The Dispositif: A Concept for Information and Communication Sciences
The relationship between motivation, monetary compensation, and data quality among US- and India-based workers on Mechanical Turk
Data Work in Education: Enacting and Negotiating Care and Control in Teachers' Use of Data-Driven Classroom Surveillance Technology
The Movement Problem, the Car and Future Mobility Regimes: Automobility as Dispositif and Mode of Regulation
Being a turker
Between Subjectivity and Imposition: Power Dynamics in Data Annotation for Computer Vision
Documenting Computer Vision Datasets: An Invitation to Reflexive Data Practices
Who Does the Work of Data? Interactions
The Brazilian Workers in Amazon Mechanical Turk: Dreams and realities of ghost workers
How Data Science Workers Work with Data: Discovery, Capture, Curation, Design, Creation
Designing Ground Truth and the Social Life of Labels
Lifting the curtain: Strategic visibility of human labour in AI-as-a-Service
Post-Foucauldian Discourse and Dispositif Analysis in the Post-Socialist Field of Research: Methodological Remarks
Data Vision: Learning to See Through Algorithmic Abstraction
Trust in Data Science: Collaboration, Translation, and Accountability in Corporate Data Science Projects
The Future of Work Is Here: Toward a Comprehensive Approach to Artificial Intelligence and Labour
Embedded Reproduction in Platform Data Work. Information
We Haven't Gone Paperless Yet: Why the Printing Press Can Help Us Understand Data and AI
Algorithmized but not Atomized? How Digital Platforms Engender New Forms of Worker Solidarity in Jakarta
Foucault's dispositive: The perspicacity of dispositive analytics in organizational research
Interrupting invisibility in a global world
Behind the Screen: Content Moderation in the Shadows of Social Media
Who are the crowdworkers?: shifting demographics in mechanical turk
We Are Dynamo: Overcoming Stalling and Friction in Collective Action for Crowd Workers
All Equation, No Human: The Myopia of AI Models
"Everyone wants to do the model work, not the data work": Data Cascades in High-Stakes AI
Do Datasets Have Politics? Disciplinary Values in Computer Vision Dataset Development
Primas salariales sectoriales en Argentina. Ministerio de Desarrollo Productivo de la Nación. Centro de Estudios para la Producción XXI
Data Work in a Knowledge-Broker Organisation: How Cross-Organisational Data Maintenance Shapes Human Data Interactions
Critical Experience: Evaluating (with) Autistic Children and Technologies
When is Machine Learning Data Good?: Valuing in Public Health Datafication
Towards an AI-powered Future that Works for Vocational Workers
The trainer, the verifier, the imitator: Three ways in which human platform workers support artificial intelligence
United Nations Measures to combat contemporary forms of racism, racial discrimination, xenophobia and related intolerance
Bayesian Bias Mitigation for Crowdsourcing
Born Political: A Dispositive Analysis of Google and Copyright
Security as Dispositif: Michel Foucault in the Field of Security. Foucault Studies
Networked but Commodified: The (Dis)Embeddedness of Digital Labour in the Gig Economy
Workers of the Internet unite? Online freelancer organisation among remote gig economy workers in six Asian and African countries
The Gig Economy: A Critical Introduction
The Age of Surveillance Capitalism: The Fight for a Human Future at the New Frontier of

ACKNOWLEDGMENTS
Funded by the German Federal Ministry of Education and Research (BMBF) - Nr 16DII113, the International Development Research Centre of Canada, and the Schwartz Reisman Institute for Technology and Society. We thank Tianling Yang, Marc Pohl, Alex Taylor, Alessandro Delfanti, Paula Núñez de Villavicencio, Paola Tubaro, Antonio Casilli, and the anonymous reviewers. Special thanks to the data workers who shared their experiences with us. This work would not have been possible without them.