∫ A more practical hiring secretary solution

Kernel:

An algorithm to hire a secretary that is more practical than the classical one.

The hiring secretary problem (along with many other alternative nameshttps://en.wikipedia.org/wiki/Secretary_problem) is a classic problem in probability theory and decision theory. The problem is as follows:

There are $ N $ candidates for a job, and the quality of each candidate is known only after they have been interviewed. The candidates are interviewed one by one in random order. After each interview, the decision maker must decide whether to hire the candidate or not. The decision is irrevocable and must be made immediately. The goal is to maximize the probability of hiring the best candidate.

This problem is one of the most famous problems in probability due to its simplicity and the elegant form of the optimal solution.

∂ The classical solution

The classical solution to the hiring secretary problem is to interview the first $ N/e $ candidates without hiring any of them, and then hire the first candidate who is better than all the candidates interviewed so far. This algorithm has an expected success rate of $1/e$, where $e$ is the base of the natural logarithm.

Lest repeating what's already out there in the Internet, I will not go through its derivation in this blog post. (You can find it herehttps://en.wikipedia.org/wiki/Secretary_problem.) Instead, I will discuss a more practical variant of the hiring secretary problem, and my proposed solutions to it.

There is a major flaw with the classical hiring setting. There're more than $ 1/3 $ chance that no candidate is hired; that is when the best candidate falls in the first $1/e$. What can we do to improve this?

One simple improvement is to always hire the last candidate if no better candidate is found. I call this variant the desperate hiring secretary problem. This simple improvement can be implemented easily, but it is no way the end of our conundrum. Do we really need only the best candidate? What if we are allowed to hire sub optimal candidates to improve the overall chance of meeting a good secretary?

∂ Improvement: score payoff and a stopping problem

The hiring secretary problem with score payoff (or cardinal payoff) is another variant of the hiring secretary problem that relaxes the constraint of hiring only the best candidate. In this variant, each candidate has a score, and the goal is to maximize the expected score of the hired candidate. This variant is related to a stopping problem, where the decision maker must decide when to stop the search to maximize the expected payoff.

The hiring secretary problem with cardinal payoff

There are $ N $ candidates for a job, each with a score. The goal is to maximize the expected score of the hired candidate.

The scores of the candidates appear to be randomed from an unknown distribution.
Each individual score is revealed only after the candidate has been interviewed.
The candidates are interviewed one by one in random order.
The decision is irrevocable and must be made immediately after each interview.
If the decision is made, the candidate is scored and the search stops.

To solve this problem, we need to know what's the criteria for stopping the search to get the highest expected score. It is the odd of not finding a better candidate in the rest of the trials. And we can only use all previous candidate scores, but not the future candidate scores.

∂ Calculating the odd of no better candidate

Suppose that $i$ is the number of candidates already interviewed. We can use the quantile score of the i-th candidate relative all the previous candidates to estimate the probability of finding a no better candidate in the rest of the trials as

$$ \begin{align*} \Pr(\text{no better candidate after} \, i) &= q_i^{N-i} \tag{1} \\ \end{align*} $$

where $q_i \in \left[0, 1\right] $ is the quantile score of i-th candidate.

At step $i$, we compute $ \Pr(\text{no better candidate after} \, i) $, and when it's greater than 0.5, we hire the candidate. I call this algorithm the quantile rank model.

The time complexity to perform this algorithm in a run is $ O(N \log N) $ (the logarithm part comes from quantile sort), as opposed to the $ O(N) $ time complexity of the classical solution.

This estimate using the data we collect from all the previous trials to rank the current candidate. Because of this, it may be off. We can never be assured of the true quantile until the end of the trials. To find a better estimate, we compute the total probability of finding a no better candidate given different scenarios.

$$ \begin{align*} \hat{\Pr}(\text{no better candidate after} \, i) &= \sum_{q'} \Pr(\text{no better candidate after} \, i | q') \Pr(q') \\ &= \sum_{q'} (q')^{N-i} \Pr(q') \\ \end{align*} $$

where $q'$ is the varied probability of finding a no better candidate after the hidesight when one of the candidate combination unrolls.

If suppose that after $ N $ steps we found $ h $ more candidates not better than the candiate at $i$, we can then alter the probability of finding a no better candidate as follows:

$$ \begin{align*} q'_{i, h} = \frac{qi + h}{N} \end{align*} $$

And for the chance that particular $ q'_{i, h} $ will occur, we can use the binomial distribution:

$$ \begin{align*} \Pr(q'_{i, h}) \approx \binom{N-i}{h} (q)^h (1 - q)^{N-i-h} \end{align*} $$

Now we put everything together.

$$ \begin{align*} \hat{\Pr}(\text{no better candidate after} \, i) \approx \sum_{h=0}^{N-i} \left( \frac{qi + h}{N} \right)^{N-i} \binom{N-i}{h} (q)^h (1 - q)^{N-i-h} \\ \end{align*} $$

This time, at step $i$, we compute $ \hat{\Pr}(\text{no better candidate after} \, i) $, and when it's greater than 0.5, we hire the candidate. I call this algorithm the unroll model.

Note that because we will never know the true quantile until the end of the trials, this is merely an estimation. We will need to perform experiment to justify the estimation.

To compute this estimate, we require $ O(N^2) $ time complexity, which is significantly more than the $ O(N) $ time complexity of the classical solution.

There is one nice quality of both the quantile rank model and the unroll model; i.e. both are guaranteed to be not worse than the classical solution. Because they will behave like the classical solution when the best candidate is not found during the rejection period. And in the case when the best candidate is found in such period, the algorithms will attempt to retrieve the best available candidate.

There remains one issue with these algorithms. How can we truly confide that the best candidate of the entire trial run is found when we only observe the rank of all the previous candidates? The point is we cannot. This is related to another important problem in economics.

∂ Predicting black swans

Suppose during the run, we often find better than the rest candidates so far, can we use this information to estimate the chance of finding a better candidate in the rest of the trials? This is like predicting the next black swan event in stock markets, which is impossible to do in generalNicholas, N. (2008). The black swan: the impact of the highly improbable. Journal of the Management Training Institut, 36(3), 56.. But perhaps we can do this in the limited given time like until the end of a trial run?

For one thing, we can really do this if we know the distribution of the candidate scores, which violates the conditions of the problem we set. Still I'm interested to use this insight to reshape the odds of finding a better candidate in the rest of the trials. Here allow me to propose two models:

Primitive black swan prediction: Keep the track of how many times the best so far candidate are found in the past, and use it as the chance to find a better candidate in the rest of the trials. This one does not contradict the conditions of the problem.
Oracle black swan prediction: Using oracle insight into the value range of the candidate scores, we can predict the chance to meet a better candidate in the rest of the trials.

Using black swan prediction also allow us to remove the rejection period all together. This is because at the beginning of the trials, the predicting chance of the black swan will be high. (Since the chance that we will find the best candidate at the begining is low.) We can include black swan prediction to provide a margin for the odd of finding a better candidate. This is like we try to tell the algorithm that, "Hey the best may have yet to come, and you should not be too confident about the current candidate, even if it is the best so far." The extreme statistics from the sparsity of data at the begining of the trial run will automatically be subdued by the high expectation of black swan events.

Needless to say, we need experiments to justify this idea. I'll add the two mentioned black swan models into the experiments at the end of this blog post.

∂ Literature review

The problem setting I've discussed is the same problem that a paper by Bearden, J. N. (2006)Bearden, J. N. (2006). A new secretary problem with rank-based selection and cardinal payoffs. Journal of Mathematical Psychology, 50(1), 58-59.. Their results provide the optimality under the same scheme as the classical secretary solution, only with $\sqrt{N}$ rejection period and desperate strategy. But while the classical algorithm and Bearden's one do not, my proposed methods also handle the case when the best candidate is found early within the rejection period.

Another work by Palley and KremerPalley, A. B., & Kremer, M. (2014). Sequential search and learning from rank feedback: Theory and experimental evidence. Management Science, 60(10), 2525-2542. also tried to solve the cardinal payoff problem. Their partial-information model operates under the same condition but with exception that the decision maker knows the distribution of the candidate scores, the one criteria that I only assume in my oracle black swan prediction model.

For this, I will compare the results of my proposed methods with the classical solution and Bearden's solution in the next section, but not with Palley and Kremer's model, which will be left for future work when I explore more about the effect of different distributions of the candidate scores.

∂ Result comparison

Here I will compare the results of the following algorithms:

Classical solution hiring secretary solution
Classical solution with desperation
Bearden's solution: classical with $\sqrt{N}$ rejection period and desperation
Using current quantile rank alone (1) with $ 1/e $ rejection period
My unroll method with $ 1/e $ rejection period
Quantile rank method with $ \sqrt{N} $ rejection period
My unroll with $ \sqrt{N} $ rejection period
Quantile rank method with primitive black swan prediction
My unroll with primitive black swan prediction
Quantile rank method with oracle black swan prediction
My unroll with oracle black swan prediction

For each algorithm, I ran the experiments for 100,000 trial runs with $ N $ varied from 100 to 200 against the uniformly and normally distributed score data. The reported expected score falls within range between 0 for minimum and 1 for possible maximum (the best score for each experiment may be less than 1). (Check my github for the experiment codehttps://github.com/PVirie/soft-hiring-problem/blob/main/main.py.) The result is as follows:

Algorithm	Score: Uniform data	Score: Normal dist data
Classical solution	0.608	0.582
Classical solution with desperation	0.799	0.763
Bearden solution	0.928	0.811
Quantile rank with $ \sqrt{N} $ rejection period	0.963	0.835
Unroll with $ \sqrt{N} $ rejection period	0.963	0.834
Quantile rank with $ 1/e $ rejection period	0.965	0.867
Unroll with $ 1/e $ rejection period	0.969	0.869
Quantile rank with primitive black swan prediction	0.936	0.811
Unroll with primitive black swan prediction	0.946	0.817
Quantile rank with oracle black swan prediction	0.974	0.852
Unroll with oracle black swan prediction	0.981	0.857

From the experiment, if we disregard the oracle black swan prediction for now, the unroll algorithm with $ 1/e $ rejection period is the best algorithm for both uniform and normal distribution data. In fact, the unroll algorithms are most of the time better than the pure quantile rank algorithms. Though, using the rank alone already provides a major boost in the expected score to the classical solution.

Now let's see how the black swan prediction models affect the results. As appears in the table, the primitive black swan prediction worsen the results, while using oracle black swan prediction improve the results. This shows that black swan predictions could be another important variable to consider. They may improve or worsen the results, depending on the quality of the prediction and information available.

To the best of my knowledge, the proposed algorithms in this blog yield the best results for the hiring secretary problem with cardinal payoff under all the set conditions up to the time of writing this blog post.