Twitter | Onsite | Bad expressions search
Anonymous User
1149

This is roghly the question. You are given a list of "bad expressions" ahead of time.
You can preprocess them, or do whatever you need.
You are to implement a method that receives a tweet, and tries to find any of the bad expressions in the tweet, and to return a list of all the bad expression you found.

Exampe:
Bad exressions:
foo
food
frank
this has
is has
return
now now

Example tweet:
"this has been #TK421Tweets. We now return you to your regularly scheduled Twitter."

Result:
this has
return
is has

This was my solution. I didn't get the offer (not sure is it because of this).
I created a Trie from all the bad expressions and then did a char by char search.

Would love to hear is there a better way to do this.
Thanks.

class TrieNode {
    Map<Character, TrieNode> children = new HashMap<>();
    boolean endOfTheWord = false;
}

class T {
    TrieNode root = new TrieNode();

    public void insert(String word) {
        TrieNode current = root;
        for (char c : word.toCharArray()) {
            if (current.children.containsKey(c)) {

            } else {
                current.children.put(c, new TrieNode());
            }
            current = current.children.get(c);
        }
        current.endOfTheWord = true;
    }

    public TrieNode startsWith(String prefix) {
        TrieNode current = root;
        for (char c : prefix.toCharArray()) {
            if (!current.children.containsKey(c)) return null;
            current = current.children.get(c);
        }
        return current;
    }

    public TrieNode containsChild(TrieNode node, char c) {
        return node.children.getOrDefault(c, null);
    }
}

public class TwitterCoding {

    private List<String> nWords;
    private T trie = new T();

    public TwitterCoding(List<String> words) {
        this.nWords = nWords;
        for (String word : words) {
            trie.insert(word);
        }
    }

    public List<String> nWordsFound(String tweet) {
        List<String> results = new ArrayList<>();
        for (int i = 0; i < tweet.length(); i++) {
            char c = tweet.charAt(i);
            TrieNode trieNode = trie.startsWith("" + c);
            if (trieNode == null) continue;
            StringBuilder current = new StringBuilder();
            current.append(c);
            for (int j = i + 1; j < tweet.length(); j++) {
                char nextChar = tweet.charAt(j);
                trieNode = trie.containsChild(trieNode, nextChar);
                if (trieNode == null) break;
                current.append(nextChar);
                if (trieNode.endOfTheWord) {
                    results.add(current.toString());
                }
            }
        }
        return results;
    }


    public static void main(String[] args) {
        TwitterCoding tc = new TwitterCoding(Arrays.asList("foo",
                "food",
                "frank",
                "this has",
                "is has",
                "return",
                "now now"));
        System.out.println(tc.nWordsFound("this has been #TK421Tweets. We now return you to your regularly scheduled Twitter."));
    }
	
	This solution is O(mN) complexity, where m is the length of the longest bad expression, and N the length of the tweet.
Comments (6)