This is roghly the question. You are given a list of "bad expressions" ahead of time.
You can preprocess them, or do whatever you need.
You are to implement a method that receives a tweet, and tries to find any of the bad expressions in the tweet, and to return a list of all the bad expression you found.
Exampe:
Bad exressions:
foo
food
frank
this has
is has
return
now now
Example tweet:
"this has been #TK421Tweets. We now return you to your regularly scheduled Twitter."
Result:
this has
return
is has
This was my solution. I didn't get the offer (not sure is it because of this).
I created a Trie from all the bad expressions and then did a char by char search.
Would love to hear is there a better way to do this.
Thanks.
class TrieNode {
Map<Character, TrieNode> children = new HashMap<>();
boolean endOfTheWord = false;
}
class T {
TrieNode root = new TrieNode();
public void insert(String word) {
TrieNode current = root;
for (char c : word.toCharArray()) {
if (current.children.containsKey(c)) {
} else {
current.children.put(c, new TrieNode());
}
current = current.children.get(c);
}
current.endOfTheWord = true;
}
public TrieNode startsWith(String prefix) {
TrieNode current = root;
for (char c : prefix.toCharArray()) {
if (!current.children.containsKey(c)) return null;
current = current.children.get(c);
}
return current;
}
public TrieNode containsChild(TrieNode node, char c) {
return node.children.getOrDefault(c, null);
}
}
public class TwitterCoding {
private List<String> nWords;
private T trie = new T();
public TwitterCoding(List<String> words) {
this.nWords = nWords;
for (String word : words) {
trie.insert(word);
}
}
public List<String> nWordsFound(String tweet) {
List<String> results = new ArrayList<>();
for (int i = 0; i < tweet.length(); i++) {
char c = tweet.charAt(i);
TrieNode trieNode = trie.startsWith("" + c);
if (trieNode == null) continue;
StringBuilder current = new StringBuilder();
current.append(c);
for (int j = i + 1; j < tweet.length(); j++) {
char nextChar = tweet.charAt(j);
trieNode = trie.containsChild(trieNode, nextChar);
if (trieNode == null) break;
current.append(nextChar);
if (trieNode.endOfTheWord) {
results.add(current.toString());
}
}
}
return results;
}
public static void main(String[] args) {
TwitterCoding tc = new TwitterCoding(Arrays.asList("foo",
"food",
"frank",
"this has",
"is has",
"return",
"now now"));
System.out.println(tc.nWordsFound("this has been #TK421Tweets. We now return you to your regularly scheduled Twitter."));
}
This solution is O(mN) complexity, where m is the length of the longest bad expression, and N the length of the tweet.