0274. H-Index¶

Given an array of integers citations where citations[i] is the number of citations a researcher received for their ith paper, return compute the researcher's h-index.

According to the definition of h-index on Wikipedia: A scientist has an index h if h of their n papers have at least h citations each, and the other n − h papers have no more than h citations each.

If there are several possible values for h, the maximum one is taken as the h-index.

Example 1:

Input: citations = [3,0,6,1,5]
Output: 3
Explanation: [3,0,6,1,5] means the researcher has 5 papers in total and each of them had received 3, 0, 6, 1, 5 citations respectively.
Since the researcher has 3 papers with at least 3 citations each and the remaining two with no more than 3 citations each, their h-index is 3.

Example 2:

Input: citations = [1,3,1]
Output: 1

Constraints:

n == citations.length
1 <= n <= 5000
0 <= citations[i] <= 1000

Analysis: using sort + binary search¶

Given the definition of "H-index", we can find out that h is just a measurement of the maximum amount of paper that has the citation number greater or equal to h. So in a sorted array, we just need to find the lower bound of any h that satisfies this criterion, because finding the lower bound in a sorted array is always $O(\log(n))$ .

Note

In Cpp, the upper_bound and lower_bound differ in the way of locating the first and last element of the same value with the target value. For upper_bound, it finds the first equal value, and lower_bound finds the last one.

Time: $O(n \times \log(n))$ sorting + finding
Space: $O(1)$ if we use quicksort without an auxiliary array

Code 1¶

class Solution {
public:
    int hIndex(vector<int>& a) {
        //[3,0,6,1,5] h=3: 3, 6, 5
        int n = a.size();
        sort(a.begin(), a.end());
        // try all the possible H from large to small
        for (int i = n; ~i; --i) {
            if (n - (lower_bound(a.begin(), a.end(), i) - a.begin()) >= i)
                return i;
        }
        return 0;
    }
};

Analysis: using Bucket sort¶

Since our data isn't large (the range is also small), we can use bucket sort to solve this particular problem with $O(n)$ complexity.

Since we are only interested in the h in the range from 0 to n, we can create an array with a size of n. Each element in the array represents the frequency of that amount of citation.

Note

For all the citations that are greater than n, we put them into an n-th bucket, since that is the same effect for placing the paper into its citation's bucket.

Then we start from n and keep a counter to record the current number of papers that fulfill the requirement, which is counter >= h. If this requirement is fulfilled, then we can return the value from here.

Time: $O(n)$
Space: $O(n)$

Code 2¶

class Solution {
public:
    int hIndex(vector<int>& a) {
        //[3,0,6,1,5] h=3: 3, 6, 5
        int n = a.size();
        vector<int> b(n + 1);
        for (int v : a) {
            if (v >= n) b[n] ++;
            else b[v] ++;
        }
        int counter = 0;
        for (int i = n; ~i; --i) {
            counter += b[i];
            if (counter >= i) return i;
        }
        return 0;      
    }
};

Last update: April 1, 2022