Saturday, May 21, 2016

Civil Services Examination - Game of Thrones or Game of Chance?

The Civil Services Examination of India is one of the most selective examinations in the world, with probably the lowest success rate of any competitive examination. In 2015, a total of 465,882 candidates appeared for the first stage of the exam (Prelims), of which 15,008 (3.2%) were selected to appear for the next stage of written examinations (Mains). Around 19% of these candidates, specifically 2,797, were called for the third and final stage, the Personality Test or Interview, of which 1,078 candidates were finally selected and recommended for appointment to one of the many Services of the Government of India, notably the IAS, IFS, IPS, IA&AS, IRS, ICAS, IDAS, IRAS, IP&TAFS etc. The overall success rate is thus a measly 0.23%.
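The stage-wise arithmetic behind these figures can be checked with a few lines of Python. This is only a sketch using the four counts quoted above; the variable names and the helper `pct` are illustrative.

```python
# Stage-wise counts for the Civil Services Examination 2015, as quoted above.
appeared_prelims = 465_882
qualified_for_mains = 15_008
called_for_interview = 2_797
recommended = 1_078


def pct(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


prelims_to_mains = pct(qualified_for_mains, appeared_prelims)
mains_to_interview = pct(called_for_interview, qualified_for_mains)
interview_to_final = pct(recommended, called_for_interview)
overall = pct(recommended, appeared_prelims)

print(f"Prelims -> Mains:     {prelims_to_mains:.1f}%")
print(f"Mains -> Interview:   {mains_to_interview:.1f}%")
print(f"Interview -> Final:   {interview_to_final:.1f}%")
print(f"Overall success rate: {overall:.2f}%")
```

Seen this way, the steepest filter by far is the Prelims; a candidate who reaches the Interview has a better than one-in-three chance of being recommended.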


I took the Civil Services Examination in 1994 and again in 1995, cleared it on both occasions and, based on my 1995 exam rank of 136, joined the Indian Audit & Accounts Service (IA&AS) in 1996. I have been happily working with the Audit Department since then.
In those days, the marks secured by the successful candidates were not available in the public domain. Recently, when I came to know that this data is available on the UPSC website, I decided to analyse it. Here is my analysis, based on the presentation that I made to the Civil Services Officer Trainees (Probationers) of the various Accounts and Finance Services undergoing training at NIFM (where I am presently on deputation as Professor).


The dataset in MS Excel and pdf format can be downloaded here. (MS Excel / PDF)


The above visualisation shows the stratification of final Rank (bin size of 85 ranks), with the relative share of the four different categories in each rank bin. The shares are fairly stable till about Rank 600, which has been taken as the cut-off rank for subsequent analysis.

The above visualisation (Histogram) shows the very narrow band in which the written scores of most candidates lie: between 40% and 45% in both Civil Services Exam 2015 and 2014. (Each bin is 1% wide.)
The Personality Test (Interview), however, shows a very different behaviour: the spread here is far wider. A very interesting feature of the interview score distribution is the "lumpiness" of the data, with prominent spikes at round number scores of 55%, 60%, 65% and 70%. Based on this, one can safely infer that a) the UPSC Interview Board is assessing the candidates on a percentage scale and then converting it into marks (out of 275), and b) some of the Interview Boards are not attempting to be very precise in their assessment, and are willing to grade candidates in round number scores (say 65% instead of 64% or 66%).
What could be seen visually in the histogram is now clear in the above tabulation of the Standard Deviations of the Written and Personality Test (PT - Interview) Scores.
The importance of each Mark in determining the final rank of the candidates can be seen in the visualisation above, which shows the number of candidates with the same Marks, from the highest score of 1063 with Rank 1 at the left to a score of around 877 with Rank 600 at the right. While there is a sufficient gap between the candidates' marks in the top 10 or so ranks, we have 4-5 candidates per Mark around the 50th Rank and 9-10 candidates per Mark around the 100th Rank, going as high as 40 candidates with the same score at around Rank 600!
Each mark matters. The Interview marks show a round number bias and appear to be awarded on a percentage scale (with each 1% equal to 2.75 marks); added to the low reliability of the Interview, this introduces an element of randomness/chance into the selection process of the candidates.
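To make the "each mark matters" point concrete, the sketch below converts the round-number interview percentages into marks out of 275, and then estimates roughly how many ranks one percentage point is worth at the crowding levels quoted above. The midpoints 4.5 and 9.5 for "4-5" and "9-10" candidates per mark are my own simplification, so treat the rank figures as rough orders of magnitude rather than exact results.

```python
INTERVIEW_MAX_MARKS = 275  # maximum Personality Test marks, as stated in the post


def percent_to_marks(percent):
    """Convert an interview score on a percentage scale into marks out of 275."""
    return INTERVIEW_MAX_MARKS * percent / 100.0


# Marks corresponding to the round-number spikes seen in the histogram.
round_scores = {p: percent_to_marks(p) for p in (55, 60, 65, 70)}

# Marks gained or lost when a board moves a candidate by one percentage point.
marks_per_point = percent_to_marks(1)

# Approximate number of candidates sharing each mark near a given rank
# (midpoints of the ranges quoted in the post; an illustrative assumption).
candidates_per_mark = {50: 4.5, 100: 9.5, 600: 40}

# Rough number of ranks spanned by one interview percentage point.
ranks_per_point = {
    rank: marks_per_point * crowd for rank, crowd in candidates_per_mark.items()
}

for percent, marks in round_scores.items():
    print(f"{percent}% -> {marks} marks")
for rank, shift in ranks_per_point.items():
    print(f"Near rank {rank}: 1 percentage point is roughly {shift:.0f} ranks")
```

Under these assumptions, a board choosing 65% rather than 64% moves a candidate near Rank 100 by about 26 places, and one near Rank 600 by about 110 places.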

The final rank of the selected candidates is based on the total of the Written and Personality Test (Interview) scores. If UPSC was looking for a certain attribute, let's call it "IAS-ness" for want of any better name, which was present in both the Written Exam and the Personality Test, then these two scores would have shown some relation (correlation) with each other. For example, the oft-quoted spurious correlation seen between ice cream sales and shark attacks occurs because both are related to a common variable: Temperature. We, however, do not see any positive correlation between the Written and Interview scores; in fact, a negative correlation is seen. This happens because a successful candidate with a poor Interview score necessarily has to have a higher Written score and vice versa; else she would not have made the cut-off.
The above definitions and examples of Reliability and Validity are taken from the excellent book by Richard Nisbett, Mindware: Tools for Smart Thinking. (This book, along with Daniel Kahneman's Thinking, Fast and Slow, is mandatory reading for all.)
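The selection effect described above can be reproduced with a small simulation. This is a sketch on synthetic data, not the UPSC dataset: the means, spreads, pool size and number selected are made-up values chosen only to illustrate the mechanism. Written and Interview scores are drawn independently, so any correlation among the selected candidates is created purely by the cut-off on the total.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(2015)  # fixed seed so the run is repeatable
N_INTERVIEWED = 3000       # hypothetical pool size
N_SELECTED = 1000          # hypothetical number finally recommended

# Independent by construction: the interview score knows nothing about the written score.
written = [rng.gauss(700, 40) for _ in range(N_INTERVIEWED)]
interview = [rng.gauss(165, 30) for _ in range(N_INTERVIEWED)]

# Select the top candidates on the TOTAL, which is how the final rank is decided.
ranked = sorted(zip(written, interview), key=lambda pair: pair[0] + pair[1], reverse=True)
selected = ranked[:N_SELECTED]

r_all = pearson(written, interview)
r_selected = pearson([w for w, _ in selected], [i for _, i in selected])

print(f"Correlation, whole pool:    {r_all:+.2f}")
print(f"Correlation, selected only: {r_selected:+.2f}")
```

The whole pool shows a correlation close to zero, while the selected group shows a clearly negative one. A negative correlation among successful candidates is therefore expected from the cut-off alone, and by itself says little about whether the two stages measure a common attribute.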
Which brings us to a very important Question:

A serendipitous natural experiment dataset was available within the Civil Services results for the years 2013, 2014 and 2015. As a fairly large number of candidates clear the exam in successive years, their Interview scores in the two years could be examined to see the Reliability of the UPSC Interview.
The scatter plot above, which shows the correlation between the Interview scores of the same set of candidates in two successive years, was the most surprising (and disturbing) finding. One can see very clearly that there is little, if any, relation between how a candidate fares in the Personality Test in two successive years. With an R-squared of around 0.1, the Interview process can be said to have limited Reliability (and, as a consequence, limited Validity).
Whatever the Personality Test tries to assess (the "IAS-ness") lack of reliability leads to a doubt on the validity of the measurement.
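For readers who want to run the same check on the downloadable dataset, here is a minimal sketch of the calculation. The function computes the R-squared of a straight-line fit between a candidate's Interview marks in two successive years, which equals the square of the Pearson correlation. The two short lists of marks are hypothetical placeholders, not values from the UPSC data.

```python
import math


def r_squared(first_year, second_year):
    """R-squared of a simple linear fit between paired scores (Pearson r, squared)."""
    n = len(first_year)
    if n != len(second_year) or n < 2:
        raise ValueError("need two equal-length lists with at least two pairs")
    mean_1 = sum(first_year) / n
    mean_2 = sum(second_year) / n
    sxy = sum((a - mean_1) * (b - mean_2) for a, b in zip(first_year, second_year))
    sxx = sum((a - mean_1) ** 2 for a in first_year)
    syy = sum((b - mean_2) ** 2 for b in second_year)
    if sxx == 0 or syy == 0:
        raise ValueError("scores in each year must not all be identical")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical interview marks (out of 275) for the same eight candidates
# in two successive years; replace with the real paired scores.
first_attempt = [151, 165, 179, 193, 140, 171, 160, 182]
second_attempt = [176, 149, 165, 171, 168, 190, 143, 160]

r2_example = r_squared(first_attempt, second_attempt)
print(f"R-squared for the placeholder data: {r2_example:.3f}")

# An R-squared of about 0.1 corresponds to a correlation of only about 0.32.
implied_r = math.sqrt(0.1)
print(f"Correlation implied by R-squared = 0.1: {implied_r:.2f}")
```

Put differently, an R-squared of around 0.1 means that a candidate's Interview score in one year accounts for only about a tenth of the variation in her score the next year.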
Daniel Kahneman speaks about the "Illusion of Validity" as he recounts his experience of evaluating candidates for officer training as part of a group of evaluators. In his words: "Our impression of each candidate's character was as direct and compelling as the color of the sky.... A single score usually came to mind and we rarely experienced doubts or formed conflicted impressions." However, the evaluators' assessments were at variance with the actual performance of the candidates in the officer-training school. As Kahneman goes on to say: "our ability to predict performance at the school was negligible. Our forecasts were better than blind guesses, but not by much".
Richard Nisbett calls it the "Interview Illusion", and says that "predictions based on the half-hour interview have been shown to correlate less than 0.10 with performance ratings of undergraduate and graduate students, as well as with performance ratings for army officers, businesspeople, medical students, Peace Corps volunteers, and every other category of people that has ever been examined".

While the Personality Test (Interview) is seen to show limited Reliability, the Written scores of the same candidates across successive years show a slightly higher consistency (greater R-squared value), but not as high as a typical measure of pure ability.

The final rank of the candidate is based on the total of her Written and Interview Scores. The higher the score, the better the rank. Since it is not the absolute but the relative score that determines your rank, a regression model was built to predict the rank in percentile terms based on the candidate's percentile on her Written score and the percentile on her  Interview score in relation to the other candidates. The model was developed for the top 600 ranks for Civil Services Exam 2014 and 2015, and is shown below:

The model shows a good fit with the data as measured by the high R-Squared value, and based on the relative weights to the two parameters, it is seen that the Personality test carries a 40% weight in determining the final Rank!
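The post does not give the model's equation, so the following is only a sketch of one way such a model could be built: an ordinary least-squares fit of the total-score percentile on the Written and Interview percentiles, with the Interview "weight" read as its share of the two coefficients. That reading of "relative weight" is my assumption, and the data are synthetic (made-up means and spreads, with no cut-off applied), so the printed share is not expected to reproduce the 40% figure.

```python
import bisect
import random


def percentile_ranks(values):
    """Percentile (0 to 100) of each value, from the count of values strictly below it."""
    ordered = sorted(values)
    n = len(values)
    return [100.0 * bisect.bisect_left(ordered, v) / (n - 1) for v in values]


def fit_two_predictors(x1, x2, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2; returns (b0, b1, b2)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12  # zero only if the predictors are perfectly collinear
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2


# Synthetic scores for 600 hypothetical candidates (illustrative values only).
rng = random.Random(600)
written = [rng.gauss(760, 30) for _ in range(600)]
interview = [rng.gauss(170, 25) for _ in range(600)]
total = [w + i for w, i in zip(written, interview)]

written_pct = percentile_ranks(written)
interview_pct = percentile_ranks(interview)
total_pct = percentile_ranks(total)

b0, b_written, b_interview = fit_two_predictors(written_pct, interview_pct, total_pct)
interview_share = b_interview / (b_written + b_interview)

print(f"Written coefficient:   {b_written:.2f}")
print(f"Interview coefficient: {b_interview:.2f}")
print(f"Interview share of the two coefficients: {interview_share:.0%}")
```

The design point the sketch illustrates is that, in percentile terms, a component's influence on the final rank depends on how widely its scores are spread among the candidates, not only on its maximum marks. That is how a 275-mark Interview can carry a weight far larger than its share of the total marks.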
Interview or the Personality Test has a high 40% impact weight, but at the same time it shows low reliability (year to year score consistency). The Written scores too can vary across years.
So what should the Civil Services aspirant do?
While the above is a strange (and funny) trend, with a lower roll number being more likely to lead to success in the exam (which way does the causal arrow point?), the candidates need to rethink their Civil Services strategy.

So, any Civil Services aspirant should be prepared for the long haul.
The median number of attempts of the candidates who finally join the service is likely to continue increasing.
And what a waste of precious years it is, with talented young Indian boys and girls slogging away to clear a highly unpredictable exam.
I hope that the expert committee constituted by the Government under the chairmanship of Sh BS Baswan to examine the various issues regarding the Civil Services Examination is able to fix some of the problems in the current system, and more importantly, prevent the colossal waste of time and effort of the youth of India.

23 comments:

  2. Very nice analysis. Specifically unpredictability of personality test with higher levels of subjectivity. Needs to be shared most. Thank you sir for taking these efforts.

  3. Was making similar visualisations of the UPSC data in tableau. Glad to see that someone already did it for me, that too with a vastly superior analysis. Kudos to you, sir

  4. Wonderful article. Nice analysis. But how to determine the real merit or mettle of a candidate. All of us can brainstorm and suggest something. The Govt seems responsive to new ideas...

  5. Wonderful analysis... Could you please try to explain the relationship between the candidate's roll number and probability of success in the exam? Isn't it very strange?! Thanks

    Replies
    1. This pattern of candidates with lower roll number being over represented in the successful candidates list has been around for at least 30 years. The Director of NIFM, Sh Harsh Kumar, who had compiled the result manually in 1978 had seen this pattern then, and had asked me to see if it is still present- and there it was.
      This is a case of self-selection, the serious candidates apply early, and hence by definition are the ones more likely to succeed. Remember, only around 50% of the candidates who fill the form actually end up taking the prelims.

    2. Thanks for the reply sir! I was wondering whether even with self selection, isn't the jump from 1st 50000 to the next too steep? Could some other factor like the order in which answer sheets are evaluated and a bias based on that be involved?! Thanks

    3. This pattern of candidates with lower roll numbers being over-represented in the list of successful candidates arises because serious candidates in Delhi fill the form on the very first day so as to get UPSC as their mains centre...

    4. That makes sense, thanks!

  6. This is a very good analysis, sir, and apt, I would say, to my own situation of variance in interview marks. I would also like to point out the fluctuation in the optional subject marks of the same candidate in successive years (the deviation at times is close to 60-70 marks, as many marks as a full-length GS paper can give).

  8. Sir you must share this analysis with UPSC chairman ! Great work ! :)

  9. Brilliant analysis. Any way this can be intimated to UPSC members?

    Replies
    1. Hi, I understand how serious this is. As a matter of fact, I got the chance to speak about the core findings in the presence of a Secretary, DOPT, a few days ago during his visit to NIFM, and to commend the initiative of the PM in discontinuing interviews for Group B, C and D posts. Unfortunately, the illusion of validity is too strong. It will be tough (impossible?) for UPSC to accept that they cannot consistently assess the IAS-ness of a candidate. However, I will be sending this report to UPSC and DOPT, and hope for the best.

  10. Lack of reliability of UPSC exam, which is bound to be so at least in near future, makes a strong case for putting the number of attempts that a candidate can take, on the upper side.

  11. CSE 2015 - My written score was 714 (almost 40 marks above the cut-off), but a score of 132 in the interview ruined my chances of selection.
    The interview was on unexpected lines, but I was relaxed and composed. Some of the questions were so vague that finding a precise answer to them is extremely difficult. I still can't figure out the relevance of many of the questions they asked.
    People with my mains score have secured ranks in the 190s, 200s and 300s, and I again have to repeat the year with nothing in hand.
    One suggestion: why can't UPSC absorb candidates who didn't make it into the final list, given that government departments are critically short of manpower?

    Replies
    1. I agree with you brother.
      People who clear mains might be fit to other central jobs too if not civil services (as the #of posts are limited).
      This is a very valid point from ur side

  13. As much as I hate to say it, I experienced a deja vu moment after reading this post. Your conclusions are bang on target. Any aspirant dreaming of cracking this exam should ready himself for the long haul and try to maximize his chances by giving as many examinations as possible (State Services etc). One more factor which might be worth considering is the way candidates get shortlisted at the prelims stage. That can stir the "UPSC pot" even further

  14. Another important statistic worth considering is the geographical location from which the candidates appeared for the tests and their distance from Delhi. That might throw up some more interesting insights. Similarly it would also be good to analyze the scores of candidates in prelims/mains examinations of candidates who completed their secondary education through CBSE/ICSE and their medium of education
