Scribbling

Rabin-Karp Algorithm 본문

Computer Science/Algorithms & Data Structures

Rabin-Karp Algorithm

focalpoint 2022. 11. 9. 03:41

Rabin-Karp Algorithm is an efficient algorithm to search a string pattern in a given sentence.

The core of the idea is to use hash function to compare strings.

Time complexity is O(n), where n is length of the string. However, this is true only when there is no hash collision.

Time complexity is O(n * m) if hash function is poorly selected (producing many hash collisions), where m is length of the pattern.

 

Below is the python implementation of Rabin-Karp Algorithm.

import string

def rabinKarp(sentence: string, pattern:string, p:int, mod=10**9+7) -> int:
    '''
    sentence: string
    pattern: pattern
    p: power value for hash function
    mod: mod value (default: 10**9 + 7)

    returns the first index of the matching
    '''
    n, l = len(sentence), len(pattern)
    if n < l:
        return -1

    if sentence[:l] == pattern:
        return 0

    arr = [ord(c) - ord('a') for c in sentence]

    patternHash, h = 0, 0
    for i in range(l):
        patternHash = (patternHash * p + ord(pattern[i]) - ord('a')) % mod
        h = (h * p + arr[i]) % mod

    for i in range(1, n - l + 1):
        h = (h * p - arr[i-1] * pow(p, l, mod) + arr[i+l-1]) % mod
        if h == patternHash and sentence[i:i+l] == pattern:
            return i
    return -1


'''
Example
'''
string = 'dkjfapviovjqvjaaavl'
pattern = 'iovjqvj'
h = len(set(string))
print(rabinKarp(string, pattern, h))