Advocates of open data promise big benefits. Some documented successes have come in the form of increased savings to government and the creation of useful software applications that help residents navigate their cities and government services better, especially in the West. But open data advocates expect much more. The push towards open data is also premised on its ability to make governments more transparent and accountable to their people, on its ability to empower citizens vis-à-vis governments. According to Gavin Newsom, the current Mayor of San Francisco who has embraced open data, "It will change the way citizens and governments interact but perhaps most importantly, it is going to change the way elected officials and civil servants deliver programs, services, and promises... I can't wait until it challenges and infuriates the bureaucracy" [1].
Can such benefits be expected in the Indian context? I have collected a great deal of data in India, thanks to a project that I started and run called Transparent Chennai, which collects and creates maps and data about important civic problems, especially those affecting the poor [2]. My experience with collecting government data at the city level suggests that accessing these benefits may require that open data policies take into account some of the peculiarities of the Indian context, and the ways in which the Indian government stores and uses data.
What we found in Chennai was that existing policies to make data more available to the public have failed to overcome systemic local obstacles to openness. Despite the presence of the Right to Information Act (RTI), seekers still had to navigate a complex bureaucratic system to access the information they required, one which continued to actively resist giving out information to the public. Government knowledge in Chennai was also much more fragmentary than I expected. Our research found that information about city services tended to be incomplete, for ultimately rational reasons that were not always immediately apparent. As a result, advocates of open data policies run the risk of glorifying government data that is frequently inaccurate or incomplete, often in ways that are biased against the poor. Such fragmentary data also often does not provide the kind of knowledge that is required for holding government accountable for better performance, undercutting the goals of open data policies.
Two narratives about collecting data in India will help to illustrate some of these peculiarities. These accounts are taken from my own experiences, as well as from a data collection diary maintained by an intern for Transparent Chennai, Meryl Sebastian, who was tasked with gathering data for the project. They describe our efforts to collect data on two issues -- public toilets and bus routes [3].
The city has a vast network of public transportation. It has a Metropolitan Transport Corporation (MTC) that manages approximately 700 bus routes, two existing train lines managed by the Southern Railways (a suburban train line and a Mass Rapid Transit System), and another rail line under construction that is managed by the Metro Rail Corporation. However, at the time that Transparent Chennai started looking at the issue in late 2009, there was no publicly available map of bus routes in the city, or of the two existing train systems, or of the new planned Metro rail [4].
We decided to focus first on creating an easily searchable map of bus routes in the city, but getting detailed information on routes for this map from the MTC proved to be nearly impossible. Meryl started by calling the office of the Managing Director of the MTC repeatedly, but the phone rang endlessly with no answer. She finally managed to speak to his assistant who told her to speak to Mr. Sampath, a senior officer in the Operations department. Mr. Sampath picked up his phone only on the third try, and despite having been referred to him by the Managing Director's office, he suggested that Meryl first send a letter directly to the Managing Director asking for the information and getting approval for him to release it.
Since letters sent by mail had not yielded results in the past, Meryl decided to visit the office with a letter in hand introducing herself and our project, and asking for the information we required. The guard outside the MTC office examined her letter, and sent her to the Public Relations Officer, who told her that the MTC did not have any maps or shapefiles [5] for bus routes, and that all the information we needed could be found on the MTC website. This was not actually true. The website only identified the major bus depots that buses passed through on each route. The bus depots were usually at least a couple of kilometers apart, with at least ten stops between them, about which no information was available on the website. The MTC's website was also designed so poorly that unless you knew the exact name of the bus depots nearest to your starting point and destination, you could not find the routes. Moreover, none of the information was available on a map. In other words, the site was almost entirely useless for anyone who was not already very familiar with the bus system and with Chennai. Meryl pressed the Public Relations Officer further for detailed route information. He shook his head again -- the information was too extensive to be given out, he argued, but he nevertheless referred her to Mr. Sampath, the senior officer in Operations whom she had already spoken to earlier on the phone.
Meryl reached Mr. Sampath's office just as he was leaving. He looked over her letter, and asked his assistant to give her access to the information. The assistant then took Meryl to the office of a senior official in the Traffic Planning and Development department, a man named R. Kumar. He told Meryl that there were more than 650 routes in the city, and that it was not possible to give her detailed information on all of them. He also told her that the MTC itself did not have up-to-date route maps or digital lists of stops for all the routes. When she pressed him further for any detailed information about routes, he conceded that he could give her detailed route information for five to ten routes, provided she first got approval in writing from Mr. Sampath for releasing this information. Satisfied, Meryl returned to our office to prepare a letter for Mr. Sampath to sign. A few days later, she called Mr. Sampath's office again to schedule a meeting to give him our request letter. He told her that they would all be busy with Board Meetings until the Monday of the following week, and asked her to come to the office on Tuesday with her letter. On Tuesday, when she called again to schedule an appointment, he said that there was a function happening that the entire office would be attending, and asked her to come the next afternoon and to meet Mr. Kumar in the Traffic Planning department rather than coming to his own office.
The next day, Meryl went to meet Mr. Kumar, who told her it would not be possible to give out route details for even the 10 routes he had promised. He claimed the MTC itself did not have this information for any of the routes, which contradicted what he had said in their previous meeting. According to him, the last time that the MTC had updated official records of detailed route information was in 1972. The reason they had put only the depot information on the website was that bus depots remained the same even though the routes that the buses took to get from depot to depot changed quite frequently.
When Meryl insisted that he give her at least some of the information, Mr. Kumar directed a member of his staff, Mr. Murugavel, to give Meryl a list of routes, and detailed information on stop names for a few of them. Mr. Murugavel, while he was putting together the list of routes, told Meryl that information about bus routes and specific stop locations was written up in registers at the MTC, and occasionally updated on computers when people who had the skills to do that were available. However, bus routes changed frequently, and they were easier to update in the registers than on the computers, so the information on the computers lagged far behind. According to Mr. Murugavel, bus routes and bus stop locations were highly contentious issues, and there were frequent debates about where bus stops were needed. Publicly committing to routes would make responding to pressures from residents and powerful people like local politicians for changes in routes much more difficult, and so the MTC was reluctant to make information about all stops available to the public.
He gave her the list of routes, listed the names of all the stops for 22 major routes, and asked her to come back if she needed anything more. Although the MTC's own database of routes and stops was not well organized, Mr. Murugavel was highly conversant with the details of routes and stops. Indeed, he had compiled the list of stop names for these 22 routes entirely from memory. He also told us which routes were the most used and would give us full geographical coverage of the city, all without consulting a map, a register, or a computer.
Although some surveys like the National Family Health Survey showed that there were much higher levels of access to sanitation in Chennai than in other cities, low-income women workers whom we spoke with said they needed more well-maintained public toilets in the city, not just at their houses, but also at their workplaces, such as market areas and informal industry clusters. The Transparent Chennai team wanted to find out more about toilets -- how many there were, where they were located, how they were planned, and how they were funded -- so that the public toilet infrastructure could be better directed towards meeting workers' needs at home and at the workplace. However, answering these questions proved to be far more difficult than we originally imagined.
We first decided to get an accurate count of public toilets in the city. One afternoon, I called the Chennai Corporation and asked for the department that took care of public toilets. After many long holds, phones being hung up, and failed attempts to transfer my call to the correct department, someone finally connected me to the Buildings Department, which managed all Corporation-owned structures. The man on the other end of the phone chuckled when he heard that I was interested in public toilets, and then told me that although the Buildings Department was responsible for the construction and maintenance of public toilet structures in the city, it maintained no central register of toilets at the Chennai Corporation's main office. To get information about the number and locations of toilets, he told me, I would have to approach each of the Zonal offices individually.
At the time, there were ten Zonal offices in Chennai [6], and I asked Meryl to visit each office to get the total number and locations of all the public toilets. The process we followed was the same for each Zone, but the offices responded with varying levels of cooperation. For one Zone, Meryl left our office armed with a letter of introduction specifying the information she required and a vague address taken from the Corporation website, and searched for the zonal office with an increasingly irritable auto-rickshaw driver. When she finally arrived at the office, neither the Assistant Commissioner nor the Executive Engineer was available, so the personal assistant to the Assistant Commissioner sent her to the Letters department. There, she was asked to make a photocopy of her request letter. The original was kept with them, and the copy was given to her, both stamped with the date of her visit, and she was asked to come back after two days. Two days later, the Executive Engineer was there, and like many of the other officers we interacted with on this issue, he seemed both confused and amused by her interest in toilets. He chided her for coming in the afternoon, because the work would have been completed more quickly in the morning, but immediately put two engineers to the task of preparing a list for her. After another hour of waiting, she had a hand-written list of toilets and toilet addresses in her hand, and she returned to the office triumphantly to type it up.
Other zones were not so easy. In one Zone in northern Chennai, Meryl met with the Assistant Commissioner. He told her that the list of public toilets was with the Executive Engineer who was away on a trip to Delhi, and asked Meryl to come back the next week. When she returned to the Zonal office, both the Assistant Commissioner and Executive Engineer were in meetings. She waited for two hours until the Executive Engineer found time to meet with her. Although the Assistant Commissioner had personally directed the request letter to the Engineer, the Engineer was not sure whether the list of toilets could be given out to a member of the public without explicit approval from the Corporation Commissioner, the senior-most bureaucrat in the city. Meryl told him that she had collected the same data from seven other zones without any problem. The Engineer nodded, and asked her whether she had already obtained the data from Zone 9. She said yes, and he then called the Assistant Commissioner of Zone 9 to ask whether it would be prudent to give out the information. Finally satisfied that a list of public toilets was safe to give out to a member of the public, he instructed the Engineering Department to prepare the list for her. After waiting another 45 minutes, she left the office with a list of the 31 toilets in the Zone and their addresses.
In this way, zone by zone, with multiple visits, many letters of introduction, and much careful coaxing, Meryl slowly put together a list of toilets and their addresses in the city. Only one zone provided her with a map of local infrastructure [7]; the rest gave her lists of toilets and addresses. Her list showed that there were 572 toilets in the city, which seemed to us like a vast under-provisioning of toilets. After all, the 2001 Census counted only 670,000 toilets in the city for 827,000 houses, so there was clearly a need for public sanitation of some sort. However, before we went public with the data, we decided to file a petition under the Right to Information (RTI) Act for the same information so that if we were challenged on the data, we would have a paper trail showing where we had obtained it. The RTI petition also required follow-up with each individual zone and took much longer than the stipulated 30-day waiting period. When the results finally came in, we were shocked. The total number of toilets jumped from 572 to 714. Every single zone reported a different number of toilets under the RTI than they had reported voluntarily to Meryl. We were particularly befuddled because some reported more toilets than they had listed for us originally, but some actually reported fewer, meaning that engineers had seemingly given us addresses for toilets that did not exist.
We found even more inconsistencies in the data when we checked it on the ground. Using the addresses given to us and armed with a GPS unit, we physically mapped all the toilets in one zone -- Zone 4 -- in the northern part of the city, which had a high percentage of slums and informal workplaces. It emerged through the physical mapping that many of the toilets had incorrect addresses, and our team spent hours searching for them in the warren of streets in that zone. Some toilets included in the database had already been demolished by the city. Others were clearly in disuse, and had been for years, and really should not have been included in the listing of current toilets. From our local interviews, it appeared that zonal-level bureaucrats had good reasons for keeping the number of toilets unclear. Contracts for toilet maintenance were a source of income for many ward councilors, and lower-level bureaucrats were paid off to ensure that the contracts went to the right people. Although we do not have proof that this is what happened, people we interviewed in the field told us that non-existent toilets were being used for creating fictional maintenance contracts so that councilors could benefit from them. This also could explain why toilets that had been in disuse for years were still counted in the official register -- they made money for local councilors.
These two narratives, one about public toilets and one about bus routes, provide insights into the process of data collection from government sources, and provide some important lessons for advocates of open data.
Firstly, simply passing a law on open data may not be enough to make government data widely and easily available to the public in the way that such data has become available in some other nations. Even though Indian citizens currently have a right to access government information, members of the Transparent Chennai team still found it extremely difficult to get access to basic data on civic issues. Much of this difficulty came out of a culture of fear among lower-level bureaucrats of giving data to the public without explicit approval from their superiors. Unless systemic barriers to openness, such as this type of culture within the bureaucracy, are addressed, legislating for more openness may not be as effective in India as elsewhere.
Secondly, opening up existing government data alone will not be enough to increase government accountability and to evaluate government performance, especially on issues concerning the poor. When we were collecting information, we found large gaps and egregious errors in the government's written data. There was no centralized list of toilets, and the list of toilets from one zone misidentified toilet locations and counted non-existent and unused toilets. Bus route data omitted all the stops between bus depots. It was also true in both of these cases that officials were actually well-informed about toilets and bus routes, but that this knowledge was held informally, as personal knowledge rather than as a shared database for their department. In these cases, inaccuracies or gaps in written data were useful to government officials in different ways. In the case of toilets, it appeared that elected representatives profited from them. For buses, the lack of publicly available lists of bus routes enabled the MTC to respond more quickly to complaints from residents. As a result, focusing on existing government data was not enough to understand the state of civic services.
These gaps and inaccuracies in data are particularly acute for issues concerning the poor. Despite the amount of time and energy it took to create high quality data on the numbers and locations of toilets in the city, we did not have much more information to really evaluate government performance on providing access to sanitation than we did before, besides the total number of toilets in the city and the percentage of usable toilets in Zone 4. This is because other pieces of data against which the map or our survey could have been usefully compared were simply unavailable. Maps of places where toilets would have been needed -- slum areas [8], large market areas where street vendors operated, bus stops and bus depots, clusters of informal sector industry, waiting areas for daily laborers -- were not available either from the government or other sources. And because many of these spaces and activities, like street vending or informal industry, are considered illegal by the government, it is unlikely that the government will ever collect information on them. Demographic information by ward was available from the Census, but this was not comparable with the toilet maps because the Census used completely different ward boundaries from the Corporation's. Statistics like access to sanitation were only available for the city as a whole, and not disaggregated to the zone or the ward level. After all of our efforts, we had a lot more "data" about sanitation, but we did not necessarily have much more information to evaluate government performance on providing sanitation in the city. Our experience suggests that getting access to government data alone is not enough to hold government accountable for service provision in the Indian context, particularly for the poor. Advocates of open data who are concerned with government accountability will have to find innovative means of supplementing existing government data.
[1] Claire Cain Miller, "Local governments offer data to software tinkerers," New York Times, December 6, 2009, http://www.nytimes.com/2009/12/07/technology/internet/07cities.html
[2] Transparent Chennai's mission is to collect, process, and map data on neglected issues in the city, particularly those affecting the poor. On issues for which government data is lacking or incorrect, we work with residents to actually create their own data. Our work can be found at www.transparentchennai.com.
[3] The choice of these two issues was not accidental. When I was still working as an urban policy researcher at the think-tank in which Transparent Chennai is housed, I organized a public consultation for informal sector workers. Both public sanitation and public transport were issues that came up repeatedly at the consultation, and these were the issues that we decided to focus on at first in our work.
[4] The new Chennai Metro Rail website has a schematic map of the three rail routes.
[5] Shapefiles are a format commonly used for geospatial data.
[6] Since then, the Corporation has expanded to 15 Zones.
[7] This was a printed map, which ended up being nearly useless for geo-coding the infrastructure because the map was so unclear.
[8] We did have access to a map of unrecognized slums in the city produced by a consultant, but this map was not part of the city's official record.