top of page

CCC2020 S3: Searching for Strings

Let S be a substring of H with length |N| To check whether S is a permutation of H, keep a count of each of the 26 letters of the alphabet. If the counts of S and N all match, then they are permutations of each other. To do this quickly, loop through the S s from left to right. This way, at most two characters are added or removed each time we shift right. This lets us know which substrings can satisfy the "permutation" requirement.

Now we will find the number of distinct substrings S that work. Let T be the set of all such S. To eliminate duplicates, we need a quick way of comparing the elements of T. This can be done quickly by using a rolling hash (used a double-hash with 31 as the base ). We insert valid hash(S) s into set T .Finally, the answer is the cardinality of T.

Time complexity: O(|H|)


#include <iostream>
#include <string>
#include <unordered_set>
#include <vector>
#include <utility>
using namespace std;
typedef long long ll;
typedef unsigned long long ull;
ll hsh[200010], pw[200010], base = 31;
int fs[26], ft[26];
unordered_set<ll> st;
bool isubstring(){
    for (int i = 0; i < 26; i++){
        if (fs[i] != ft[i]) return false;
    }
    return true;
}
ll mySubHash(int l, int r){
    return hsh[r] - hsh[l-1]*pw[r-l+1];
}
int main() {
    string s, t;
    cin >> s >> t;
    int n = s.length(), m = t.length();
    for (int i = 1; i <= n; i++){
        fs[s[i-1]-'a']++;
    }
    pw[0] = 1;
    for (int i = 1; i <= m; i++){
        hsh[i] = hsh[i-1]*base + t[i-1];
        pw[i] = pw[i-1]*base;
    }
 
    for (int i = 1; i <= m; i++){
        // Sliding window
        ft[t[i-1]-'a']++;
        if (i < n)
        	continue;
        if (i > n)
        	ft[t[i-n-1]-'a']--;
        // Rolling hash
        if (isubstring()){
        	st.insert(mySubHash(i-n+1, i));
        }
    }
    cout << st.size() << endl;
}

Recent Posts

See All

Comments


bottom of page