Friday, June 20, 2008

Model: The Evil Exam System

To determine one’s life with a single exam is gambling. For now, let’s look at it from the school’s point of view. Next time we'll see how it affects your life from a student's position.
----------------
Model 1: assume students can be divided into 2 groups, good (G) and bad (B), of sizes N and M respectively. Denote a student in G by g_i, and a student in B by b_j. Divide results into two levels, good (G1) and bad (B1). If g_i or b_j ends up in G1, let it be 1, otherwise 0.
Assume the indicators are independent, with P(g_i = 1) = p1 and P(b_j = 1) = p2, for i = 1, …, N and j = 1, …, M.
Then P(sum of g_i and b_j over all i and j = n) = ………
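One way to get numbers out of Model 1, as a rough sketch rather than a finished answer: if the sum above is read as the total number of students landing in G1, then under the independence assumption it is a Binomial(N, p1) count plus an independent Binomial(M, p2) count, and its distribution is the convolution of the two. The function names below are made up for illustration.

```python
from math import comb

def binom_pmf(n, p):
    """Return [P(Bin(n, p) = k) for k = 0..n]."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def total_in_g1_pmf(N, p1, M, p2):
    """PMF of the total number of students in G1.

    Assumption: all indicators g_i and b_j are independent, so the total is
    Bin(N, p1) + Bin(M, p2) and its PMF is the convolution of the two PMFs.
    """
    good = binom_pmf(N, p1)   # good students who score in G1
    bad = binom_pmf(M, p2)    # bad students who score in G1
    pmf = [0.0] * (N + M + 1)
    for a, pa in enumerate(good):
        for b, pb in enumerate(bad):
            pmf[a + b] += pa * pb
    return pmf
```

For example, `total_in_g1_pmf(30, 0.8, 70, 0.1)[n]` would be the probability that exactly n of 100 students end up in G1, for those made-up parameter values.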

In the real world, at least in China, there is a static standard G1. What really matters is the order statistics, which do not seem easy to work with.

So another point of view is that the top school chooses randomly from G1, the pool of students with good results. Then the interesting question is just the probability that the chosen students contain at least an acceptable percentage of good students.
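Here is a small sketch of that question under extra assumptions that are mine, not part of the model above: the pool G1 is known to contain `a` truly good and `b` truly bad students, and the school draws `m` of them uniformly without replacement. The number of good students drawn is then hypergeometric, and the wanted probability is a tail sum. The threshold is passed as a head count (`min_good`) rather than a percentage to avoid rounding surprises.

```python
from math import comb

def prob_enough_good(a, b, m, min_good):
    """P(at least min_good of the m chosen students are good).

    Assumptions (hypothetical setup): the pool holds a good and b bad
    students, and m students are drawn uniformly without replacement,
    so the number of good students drawn is hypergeometric.
    """
    if not 0 <= m <= a + b:
        raise ValueError("m must be between 0 and the pool size")
    total = comb(a + b, m)
    # math.comb(n, k) returns 0 when k > n, so impossible splits drop out.
    favourable = sum(
        comb(a, k) * comb(b, m - k)
        for k in range(max(min_good, 0), min(a, m) + 1)
    )
    return favourable / total
```

For instance, `prob_enough_good(80, 20, 10, 8)` asks how likely it is that at least 8 of 10 admitted students are good when 80 of the 100 students in the pool are.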
------------------
Model 2: assume students can be divided into 3 groups, G1, G2 and G3, while results have 3 levels, R1, R2 and R3. For a student g^i_k in Gi (k = 1, …, N_i), the probability of ending up in Rj is p_ij. Let X_j be the number of students in Rj; these counts are the statistics we are interested in.
----------------------
Model 3: extend the subscripts. Groups G1, G2, …, G_s; levels R1, R2, …, R_t. P(g^i_k in R_j) = p_ij, k = 1, …, N_i. X_j = #{g^i_k in R_j}.
Problem: P(X_j = n_j) = ?, and its inverse problem.

A simplification is to assume s = t, and p_ij = 0 whenever abs(i-j) > 1.
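For the forward problem, the marginal distribution of a single X_j can be computed exactly if we additionally assume that students land in their levels independently of each other (my assumption, the model does not say so). X_j is then a sum of independent 0/1 variables, with N_i of them having success probability p_ij for each group i, and its PMF can be built up one student at a time. This is only a sketch and it says nothing about the joint law of (X_1, …, X_t) or about the inverse problem.

```python
def count_pmf(group_sizes, p_col):
    """PMF of X_j, the number of students who end up in level R_j.

    group_sizes[i] is N_i and p_col[i] is p_ij for the fixed level j.
    Assumption: students land in R_j independently of one another.
    """
    pmf = [1.0]  # with no students, X_j = 0 with probability 1
    for n_i, p in zip(group_sizes, p_col):
        for _ in range(n_i):
            nxt = [0.0] * (len(pmf) + 1)
            for k, mass in enumerate(pmf):
                nxt[k] += mass * (1 - p)   # this student misses R_j
                nxt[k + 1] += mass * p     # this student lands in R_j
            pmf = nxt
    return pmf
```

With the simplification above, the column `p_col` for level j has at most three nonzero entries (groups j-1, j and j+1), so only those groups contribute to X_j.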
----------------------
Model 4: extend Model 3 to the continuous case. Assume the students form a totally ordered set, that is, they can all be lined up so that any two of them can be compared with a definite result. We can assume that everyone has a definite mean score for this exam, so we can compare their means.

Define the statistic as the number of students who fall in a certain score region. In principle everyone’s score has its own distribution, but specifying each one is too subtle a job, so we can assume a common family of distributions determined by the mean of the student’s score. Then the question is again the behavior of this statistic.

Some candidates for the distribution:
Uniform, of course just a naive guess.
Linear (triangular), whose pdf has the shape of a capital lambda.
Normal, maybe…
Double exponential (Laplace), just heard about it……
Anyway, it is also possible that the distribution is not symmetric, especially near the boundary of the score range. Or just truncate it.
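To make the continuous model a bit more concrete, here is a minimal sketch for two of the candidates, normal and double exponential. It assumes every student shares one spread parameter and differs only in the mean, which is a simplification of mine. The probability that a student falls in a region [lo, hi] is a difference of CDF values, and summing it over students gives the expected value of the statistic (the full distribution would need the same kind of counting as in Model 3).

```python
from math import erf, exp, sqrt

def normal_cdf(x, mean, sigma):
    """CDF of a normal score with the given mean and standard deviation."""
    return 0.5 * (1.0 + erf((x - mean) / (sigma * sqrt(2.0))))

def laplace_cdf(x, mean, scale):
    """CDF of a double exponential (Laplace) score with the given scale."""
    z = (x - mean) / scale
    return 0.5 * exp(z) if z < 0 else 1.0 - 0.5 * exp(-z)

def expected_count(means, cdf, lo, hi, spread):
    """Expected number of students whose score falls in [lo, hi].

    Assumption: each student's score follows the same family `cdf`
    with a common `spread`, shifted to the student's own mean.
    """
    return sum(cdf(hi, m, spread) - cdf(lo, m, spread) for m in means)
```

For example, `expected_count([55, 60, 72, 90], normal_cdf, 60, 100, 8)` would give the expected number of those four hypothetical students scoring between 60 and 100.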

As for the detailed computations, please help yourself. (And let me know your opinions or results. Have fun~)