Implement strStr()

Question (LC.28)

Returns the index of the first occurrence of needle in haystack, or -1 if needle is not part of haystack.

Example

Input: haystack = "stack", needle = "queue"
Return: -1
Input: haystack = "abcdefgef", needle = "ef"
Return: 4

Analysis

Essentially, this question asks us to implement haystack.indexOf(needle). We always start from the brute-force approach: for each position in the haystack, check whether needle matches starting there. Always work through examples on paper or a whiteboard first. This helps you understand the problem and arrive at a clear algorithm, and the examples often turn into good test cases.

Test cases:
case 0: "", "" return 0
case 1: "dsgsfg","" return 0
case 2: "", "asdfsdf" return -1
case 3: "abcd", "fg" return -1
case 4: "abcdegef", "ef" return 6
case 5: "abcdeef", "ef" return 5 (the partial match at index 4 fails, so you have to reset the needle pointer every time)
case 6: "abcdege", "ef" return -1
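
Since the question is equivalent to haystack.indexOf(needle), the expected values above can be sanity-checked against the standard library before writing any matching logic. The following is a minimal sketch; the class name StrStrCases and its helper are hypothetical names chosen for illustration.

```java
class StrStrCases {
    // {haystack, needle, expected index}, copied from the test cases above.
    static final String[][] CASES = {
        {"", "", "0"},
        {"dsgsfg", "", "0"},
        {"", "asdfsdf", "-1"},
        {"abcd", "fg", "-1"},
        {"abcdegef", "ef", "6"},
        {"abcdeef", "ef", "5"},
        {"abcdege", "ef", "-1"},
    };

    // Returns true when String.indexOf agrees with every expected value.
    static boolean allMatchIndexOf() {
        for (String[] c : CASES) {
            if (c[0].indexOf(c[1]) != Integer.parseInt(c[2])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("all cases agree with indexOf: " + allMatchIndexOf());
    }
}
```

The same table can later be run against your own strStr implementation by swapping out the indexOf call.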

Two Pointers

for i from 0 to length(hay)-1-(length(need)-1)
    for j from 0 to length(need)-1
        if (hay[i] == need[j]) {
            i++;
        } else {
            j = 0;
            break;
        }
    end for
    if (j == length(need)-1)
        return i;
end for

return -1; // we didn't find any occurrence

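Can you see the problem with this approach? i is incremented inside the inner loop but never reset before the next iteration of the outer loop, so positions after a partial match are skipped. Instead of advancing two explicit pointers, we can "combine" them by indexing the haystack at i+j. Also note that when the inner loop runs to completion, the final j++ leaves j equal to length(need), so the success check must be j == length(need), not j == length(need)-1.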

for i from 0 to length(hay)-1-(length(need)-1)
    for j from 0 to length(need)-1
        if (hay[i+j] != need[j]) {
            break;
        }
    end for
    if (j == length(need)) // the inner loop ran to completion
        return i;
end for
return -1; // we didn't find any occurrence

Code

public int strStr(String haystack, String needle) {
    if (haystack == null || needle == null)
        return -1;
    // i ranges over every start position where needle still fits in haystack
    for (int i = 0; i <= haystack.length() - needle.length(); i++) {
        int j;
        for (j = 0; j < needle.length(); j++) {
            if (haystack.charAt(i+j) != needle.charAt(j)) {
                break;
            }
        }
        if (j == needle.length()) { // note: not length-1, because the last j++ ran before the loop exited
            return i;
        }
    }
    return -1;
}

Time Complexity

Time O(m*n), Space O(1), where n is the length of haystack and m is the length of needle.
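
To see where the m*n bound comes from, it can help to count character comparisons on an adversarial input. The sketch below takes n as the haystack length and m as the needle length; the class BruteForceCost and its methods are hypothetical helpers, not part of the solution. It counts comparisons across all alignments without returning at a match, which corresponds to the worst case where the needle never occurs.

```java
class BruteForceCost {
    // Counts character comparisons made by the brute-force scan over every alignment.
    static long countComparisons(String haystack, String needle) {
        long comparisons = 0;
        for (int i = 0; i <= haystack.length() - needle.length(); i++) {
            for (int j = 0; j < needle.length(); j++) {
                comparisons++;
                if (haystack.charAt(i + j) != needle.charAt(j)) {
                    break;
                }
            }
        }
        return comparisons;
    }

    // Builds a string consisting of count copies of c.
    static String repeat(char c, int count) {
        StringBuilder sb = new StringBuilder(count);
        for (int k = 0; k < count; k++) {
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String haystack = repeat('a', 1000);      // n = 1000
        String needle = repeat('a', 9) + "b";     // m = 10, mismatches only on its last character
        // Each of the n-m+1 = 991 alignments costs m = 10 comparisons.
        System.out.println(countComparisons(haystack, needle));
    }
}
```

With a haystack of 1000 'a' characters and a needle of nine 'a' characters followed by 'b', each of the 991 alignments fails only on the last character, giving 991 * 10 = 9910 comparisons.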


Follow Up Question

Can you do this in linear time? O(m+n)

Example

Input: "abcdeef", "ef"
Return: 5

You can't simply enumerate all possible substrings and hash them. Enumerating all substrings already takes O(n^2) time, and it needs O(n^2) space as well. So how can we speed things up?

Rabin-Karp (hash function approach)

The Rabin-Karp algorithm is a string-searching algorithm that uses hashing to find any one of a set of pattern strings in a text. For a text of length n and p patterns of combined length m, its average and best-case running time is O(n+m) in O(p) space, but its worst-case time is O(nm).

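Rabin-Karp essentially speeds up the inner loop of the brute-force approach from O(m) to O(1) on average by comparing hash values instead of characters. The trick is a rolling hash: to slide the window one position, remove the contribution of the first character and add the next one. Because different strings can share a hash, the characters are compared directly only when the hashes match.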
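
The rolling-hash idea can be sketched for the single-pattern case as follows. This is one possible implementation, not a reference one: the class name RabinKarpSketch and the constants BASE and MOD are assumptions chosen for illustration, and the hash is a simple polynomial hash taken modulo a prime.

```java
class RabinKarpSketch {
    // Assumed parameters: any reasonable base and a large prime modulus would do.
    private static final long BASE = 256;
    private static final long MOD = 1_000_000_007L;

    static int strStr(String haystack, String needle) {
        if (haystack == null || needle == null) return -1;
        int n = haystack.length(), m = needle.length();
        if (m == 0) return 0;
        if (m > n) return -1;

        // highPow = BASE^(m-1) mod MOD, the weight of the window's first character.
        long highPow = 1;
        for (int k = 1; k < m; k++) highPow = (highPow * BASE) % MOD;

        // Hash the needle and the first window of the haystack.
        long needleHash = 0, windowHash = 0;
        for (int k = 0; k < m; k++) {
            needleHash = (needleHash * BASE + needle.charAt(k)) % MOD;
            windowHash = (windowHash * BASE + haystack.charAt(k)) % MOD;
        }

        for (int i = 0; ; i++) {
            // Hashes can collide, so confirm a candidate with a direct comparison.
            if (windowHash == needleHash && haystack.regionMatches(i, needle, 0, m)) {
                return i;
            }
            if (i + m >= n) break; // no further window fits
            // Roll the hash: remove haystack[i], then append haystack[i+m].
            windowHash = (windowHash - haystack.charAt(i) * highPow % MOD + MOD) % MOD;
            windowHash = (windowHash * BASE + haystack.charAt(i + m)) % MOD;
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(strStr("abcdeef", "ef"));
    }
}
```

The direct comparison on a hash match keeps the result correct even when two different windows collide; it is also why the worst case remains O(nm) when many windows share the needle's hash.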
