Statistics holds a paramount position in planning as a scientific tool for translating relevant data into information for effective policymaking. The development plans and policies of any country are formulated on the basis of data collected at the ground level. Such data provides information for both policy formulation and policy evaluation, so that necessary interventions can be made whenever required. Surveys are conducted to collect different kinds of data from various categories of stakeholders. India too has a long history of surveys and censuses. Even in the British era, the government had a survey/census conducted at specific intervals. After independence, the government attached great importance to surveys and censuses. In the early decades of planning in our country, P C Mahalanobis developed and used statistical tools and large-scale sample survey data in various fields of planning and policy.
The quality of statistics plays a very important role in the formulation of development and investment policies. It is also crucial for improving transparency and accountability in policy planning and implementation, and thereby for better governance and management. It also helps in exercising greater control over the delivery of public services. However, it is observed that the data collected from various stakeholders is not utilised to the extent envisaged. A number of issues need to be considered: the conceptualisation of the scheme/project for which the data is collected, the modality adopted for collecting the data and, finally, the way the data is utilised. For policymakers, it is desirable that the quantum of data to be collected is optimised and the procedure simplified, so that data collection becomes less cumbersome, less time-consuming, more useful and more result-oriented.
Some of the key issues faced during data collection for different national-level surveys can be listed as follows: (1) illiteracy/lack of awareness on the part of the respondent; (2) no visible benefits for the respondent; (3) too many surveys, or successive surveys coming in a row; (4) the respondent's fear of sharing information; (5) absence of appropriate statutory laws to orient the respondent towards sharing information; (6) unavailability of trained and experienced manpower; (7) lengthy questionnaires; (8) complexity of questions, often beyond the understanding of the respondent; (9) difficulty in approaching the right person at the right time; (10) wrong information intentionally provided by the respondent to hide her/his real status (for example, a well-off person claiming to be below the poverty line (BPL), and vice versa).

National Sample Surveys

Before we take up the remedies for the above-mentioned issues, it is relevant to briefly touch upon recent experiences from the National Sample Survey Office (NSSO) surveys. The intricacies of a national-level survey are best explained through the schedules used. For instance, one of the schedules used for the Employment–Unemployment Survey has a question on the response code. The response code has five options: (i) informant: cooperative and capable; (ii) cooperative but not capable; (iii) busy; (iv) reluctant; and (v) others. Every respondent covered under the survey necessarily falls under one of these five options. The level and quality of the data/information obtained from a respondent depend upon her/his behaviour (cooperativeness) and capability. Of these options, only option (i), that is, informant: cooperative and capable, fully meets the objective of the survey. Respondents under all the other options are not able to provide information of the desired level and quality.
Further, such responses also dilute the level and quality of the overall information gathered from option (i). Thus, inferences drawn from such data would also be distorted. The data obtained through the 61st (2004–05), 66th (2009–10) and 68th (2011–12) rounds of the NSS Employment–Unemployment Survey leads to a number of important observations. In these three rounds, an average of about 80% of respondents fall in category (i): they are cooperative and capable. About 17% fall in category (ii): they are cooperative but not capable. About 2% fall in category (iii): they are busy and cannot respond to the survey. About 1% are in categories (iv) and (v): they are either reluctant or classed as "others." In other words, the responses of about 20% of the respondents, that is, about a fifth, do not lead to clarity. The data generated from respondents who are unwilling or unable to answer often yields puzzling estimates. This also has long-term repercussions for the data collected in successive rounds. Another consequence of data falling below the expected standard is that scholars, policymakers and other stakeholders raise questions about its reliability and veracity. Substitution of households is usual in large-scale surveys. Households are selected scientifically as per standardised procedures, and emphasis is laid on canvassing the schedules from the originally selected households. However, if respondents do not provide the data for various reasons, the data collector has to substitute the originally selected household, as per laid-down procedures, to cover the requisite number of households. As observed from Table 2, in the said three surveys an average of 3% of the households surveyed were substitutes, and for 20% of these substituted households (that is, every fifth one) the reason for substitution was that the informant was either busy or non-cooperative.
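The arithmetic behind these shares can be made explicit with a minimal sketch in Python. It uses only the approximate averages quoted above; the variable and function names are illustrative assumptions, and the reading that 20% of the 3% substituted households were replaced because the informant was busy or non-cooperative is one interpretation of the figures cited from Table 2.

```python
# Approximate average shares (percent) across the 61st, 66th and 68th rounds,
# as quoted in the text. Names below are illustrative, not NSSO terminology.
RESPONSE_CODE_SHARES = {
    "cooperative and capable": 80,
    "cooperative but not capable": 17,
    "busy": 2,
    "reluctant or others": 1,
}

SUBSTITUTED_PCT = 3         # percent of surveyed households that were substitutes
BUSY_OR_NONCOOP_PCT = 20    # percent of substitutions due to a busy or non-cooperative informant


def share_outside_option_one(shares):
    """Percent of respondents not coded 'cooperative and capable'."""
    return sum(pct for code, pct in shares.items() if code != "cooperative and capable")


def substituted_for_nonresponse(substituted_pct, reason_pct):
    """Percent of all surveyed households substituted for this reason (assumed reading)."""
    return substituted_pct * reason_pct / 100


print(share_outside_option_one(RESPONSE_CODE_SHARES))  # 20
print(substituted_for_nonresponse(SUBSTITUTED_PCT, BUSY_OR_NONCOOP_PCT))  # 0.6
```

Under these assumptions, about 20% of respondents fall outside option (i), while roughly 0.6% of all surveyed households were substitutes brought in because the original informant was busy or non-cooperative.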
The problem of responses falling outside the expected framework does not end even after non-responsive households are substituted. Analysing the results of large-scale national surveys with such drawbacks would certainly lead to inappropriate policies at the national level. Hence, it is imperative that sincere efforts are made to identify the reasons for these problems and their solutions.
Solving these problems requires an in-depth analysis of their underlying causes. For an objective assessment of the situation, a sector-wise analysis of respondents was made. This gives a view of the trends prevalent among different categories of respondents, as illustrated in Table 1. For instance, there is a clear divergence between rural and urban respondents: while the rural respondent is more cooperative but less capable of responding, the urban respondent is capable but busier and more reluctant. A broadly similar divergence has been observed in previous NSS rounds. There can be little disagreement that getting appropriate replies from both rural and urban respondents is a challenging task. However, there are certainly ways and means to obtain appropriate replies from respondents of both sectors. Within each sector (whether rural or urban), a clear dichotomy is observed, as illustrated in Table 3: high-income respondents in both sectors are capable but busier, more reluctant and less cooperative than middle- and low-income respondents. The issues discussed above are common to large-scale national surveys in different countries.
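A sector-wise breakdown of response codes of the kind described above could be tabulated along the following lines. This is only a sketch: the record layout, the field names `sector` and `response_code`, the integer coding 1 to 5 for options (i) to (v), and the toy records are all hypothetical and are not NSS data or NSSO file formats.

```python
from collections import Counter, defaultdict


def response_shares_by_group(records, group_key="sector", code_key="response_code"):
    """Return, for each group, the percentage of respondents under each response code."""
    counts = defaultdict(Counter)
    for rec in records:
        counts[rec[group_key]][rec[code_key]] += 1
    shares = {}
    for group, counter in counts.items():
        total = sum(counter.values())
        shares[group] = {code: 100 * n / total for code, n in counter.items()}
    return shares


# Hypothetical toy records for illustration only (NOT survey results).
# Codes: 1 = cooperative and capable, 2 = cooperative but not capable,
#        3 = busy, 4 = reluctant, 5 = others.
toy_records = [
    {"sector": "rural", "response_code": 1},
    {"sector": "rural", "response_code": 2},
    {"sector": "rural", "response_code": 1},
    {"sector": "rural", "response_code": 2},
    {"sector": "urban", "response_code": 1},
    {"sector": "urban", "response_code": 3},
    {"sector": "urban", "response_code": 1},
    {"sector": "urban", "response_code": 4},
]

shares = response_shares_by_group(toy_records)
for sector, dist in sorted(shares.items()):
    print(sector, dict(sorted(dist.items())))
```

Passing a different `group_key`, for example a hypothetical income-group field, would give the within-sector comparison in the same way.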