Verifying an Alien Dictionary

Algorithm Guide

•

Leetcode Problem

•

Python

•

String

Easy

4/13/2021

•Leetcode link

"..."

The Problem

In an alien language, surprisingly they also use English lowercase letters, but possibly in a different order. The order of the alphabet is some permutation of lowercase letters.

Given a sequence of words written in the alien language, and the order of the alphabet, return true if and only if the given words are sorted lexicographically in this alien language.

Example 1:

Input: words = ["hello","leetcode"], order = "hlabcdefgijkmnopqrstuvwxyz"

Output: true

Explanation: As 'h' comes before 'l' in this language, then the sequence is sorted

Example 2:

Input: words = ["word","world","row"], order = "worldabcefghijkmnpqstuvxyz"

Output: false

Explanation: As 'd' comes after 'l' in this language, then words[0] > words[1], hence the sequence is unsorted.

Example 3:

Input: words = ["apple","app"], order = "abcdefghijklmnopqrstuvwxyz"

Output: false

Explanation: The first three characters "app" match, and the second string is shorter (in size.) According to lexicographical rules "apple" > "app", because 'l' > '∅', where '∅' is defined as the blank character which is less than any other character

Working Through The Problem

When working through problems, I like to break it down into parts. First understand in basic terms exactly what you are given. Then understand exactly what you need to return. After you have established this, brainstorm approaches to get from A to B. Here, I will walk through my process of doing exactly that.

Given

You are given two arguments: an array of strings and a string. The array of strings, called 'words', represents words in order in the dictionary. It can be any length and simply contains strings made up of any combination of the 26 letters in the English language. The second argument, called 'order', will always have a length of 26. It will contain every single letter in the English language exactly once.

Return

A Boolean value: either 'True' or 'False'. Whether you return 'True' or 'False' will depend on if the words in the 'words' array are ordered alphabetically treating the 'order' string as the alphabet.

The Approach

One approach would be to compare each word in 'words' to the word that follows it. As long as each word is checked and confirmed to be a word that should come before the word it comes before, the whole 'words' dictionary will be confirmed to be correctly ordered. So, we need an algorithm that just compares two words and finds out out which should come first. Then, we can run this algorithm at every word and compare it to the word after it. At each iteration, we will identify our current word as 'curr' and the following word as 'next'. If at any point, the algorithm tells us that 'next' should come before 'curr', we return 'False' since that would mean the 'words' array is not properly ordered. If we make it through the whole 'words' array without this happening, we can return 'True'. So, the idea is to iterate through 'words' and grab each 'curr' and 'next' to perform the algorithm that checks for correct ordering. Let's see what this set up will look like.

for i in range(len(words) - 1):
  curr = words[i]
  next = words[i + 1]

The Algorithm

So, how do we compare two words to see which should come first? First, you must find the first difference between the two words. For example, if the two words are 'tree', and 'branch', you can compare the first letter of the words to find out which comes first. However, if your two words are 'tree' and 'trunk', you need to look at the third letter to find out which word comes first. In coding terms, you need to iterate through the words to find the first differing letter between the words. This can be done with something like this:

# here you are iterating through the letters of the current word
# at each letter, check if it is the same letter as in the next word
# once we find our first instance of differing letters, then we can take action

for j in range((len(curr)):
  if curr[j] != next[j]:
    # do something

So, what action do we take once we find differing letters? Now reference our 'order' to see which letter should come first. This can be done easily if we put all the letters of 'order' in a dictionary. We can store each letter as a key and have it correspond to a number that representing its index in 'order'. The code below uses 'enumerate' to do exactly that.

dictionary = {}
for i, letter in enumerate(alphabet):
  dictionary[letter] = i

It will give you a dictionary like this:

# example input
order = "ngxlkthsjuoqcpavbfdermiywz"

# output after running the enumerate loop
dictionary = {
  'n': 0, 'g': 1, 
  'x': 2,'l': 3, 
  'k': 4, 't': 5, 
  'h': 6, 's': 7, 
  'j': 8, 'u': 9, 
  'o': 10, 'q': 11,
  'c': 12, 'p': 13,
  'a': 14, 'v': 15,
  'b': 16, 'f': 17,
  'd': 18, 'e': 19,
  'r': 20, 'm': 21,
  'i': 22, 'y': 23,
  'w': 24, 'z': 25
}

Using this dictionary, we can check our two letters to see which corresponds to a lower number in the dictionary. Whichever one does correspond to a lower number should belong to the word that goes first in 'words'. If this is the case, we can continue on to the next iteration with a 'break' statement. If not, we return 'False'. Let's add this in to our algorithm.

for j in range((len(curr)):
  if curr[j] != next[j]:
    if dictionary[curr[j]] > dictionary[next[j]]: return False

Now, we pretty much have all the parts. There is just one more potential case to consider: what if one full word is part of another word? (See example 3: 'app' and 'apple') In this case, 'app' should come first. We can add in an if statement to check when this case come up that looks like this:

for j in range(len(curr)):
  if j >= len(words[i + 1]): return False # here is the check added in to the code we already have
  if curr[j] != next[j]:
      if dictionary[curr[j]] > dictionary[next[j]]: return False
      break

Once we put this all together, we have an algorithm that properly solves the problem.

Solution

def alien(words, alphabet):
  dictionary = {}
  for i, letter in enumerate(order):
      dictionary[letter] = i
  for i in range(len(words) - 1):
      curr = words[i]
      next = words[i + 1]
      for j in range(len(curr)):
          if j >= len(words[i + 1]): return False
          if curr[j] != next[j]:
              if dictionary[curr[j]] > dictionary[next[j]]: return False
              break
  return True

Time Complexity

The enumerate loop is O(1) time since it will always have 26 iterations. The number of iterations in the loop through 'words' depends on the number of words. Therefore, this loop takes O(N) time where N is the number of words. So, the time complexity is O(N).

Space Complexity

The only space being used is in the dictionary created with the enumerate loop. This takes O(1) space since it will always have the same length of 26. So, the space complexity is O(1).

Back to posts