Search This Blog

Thursday, January 18, 2018

Find All Anagrams in a String

Given a string s and a non-empty string p, find all the start indices of p's anagrams in s.

Strings consists of lowercase English letters only and the length of both strings s and p will not be larger than 20,100.

The order of output does not matter.

Example 1:
Input:
s: "cbaebabacd" p: "abc"

Output:
[0, 6]

Explanation:
The substring with start index = 0 is "cba", which is an anagram of "abc".
The substring with start index = 6 is "bac", which is an anagram of "abc".
Example 2:
Input:
s: "abcdabc" p: "abc"

Output:
[0, 4]

Explanation:
The substring with start index = 0 is "abc", which is an anagram of "abc".
The substring with start index = 4 is "abc", which is an anagram of "abc".

Algorithm:

The idea is to convert the string to the counter array. Take Example 2 for example:
                   a , b, c, ....., z
p: "abc" => [1, 1, 1, ....., 0]   as pCount     // means 'a' appears once, 'b' appears once

To initialize, we take the first slice of window from s: "abcdabc" which is "abc"

window: "abc" => [1, 1, 1, ...., 0] as windowCount

if pcCount == windowCount then we know at index 0 we found a match result = [0]

We apply the same idea to the rest of the sliding windows: bcd, cda, dab, abc

Solution:
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
class Solution {
    
    int countEqual(int[] window, int[] target){
        // Make sure window and target have the same length
        if( window.length != target.length )
            return -1;
        int count = 0;
        for(int i=0; i < window.length; i++){
            int diff = Math.abs(window[i] - target[i]);
            count += diff;
        }
        // If return zero, means window and target are identical
        return count;
    }
    
    int[] countChars(String str){
        int[] counter = new int[26];
        int offset = (int) 'a';
        for(int i=0; i < str.length(); i++){
            char cc = str.charAt(i);
            counter[cc-offset]++;
        }
        return counter;
    }
    
    public List<Integer> findAnagrams(String s, String p) {
        List<Integer> result = new LinkedList<Integer>();
        if( p.length() > s.length() )
            return result;
        
        int[] pCount = countChars(p);
        // char count view of the first window
        int[] windowCount = countChars(s.substring(0, p.length()));

        
        if( countEqual(pCount, windowCount) == 0)
            result.add(0);
        
        for(int i = 1; i < s.length()-p.length()+1; i++) {
            String window = s.substring(i, i+p.length());
            windowCount = countChars(window);
            if( countEqual(pCount, windowCount) == 0)
                result.add(i);
        }
        return result;
        
    }
}

Performance
Has two input with size M, N, the time complexity is O(MN)
space complexity is O(1)

Although the execution speed is quite slow, it is a good algorithm to elaborate during the interview.

No comments:

Post a Comment