Python Beginner/Intermediate Guide to Functions, DS, & Techniques


When you look through solutions written in Python, I'm sure you've seen many interesting functions, data structures, and techniques you have never seen before. Often these can make understanding otherwise beautiful solutions really difficult, and can be quite discouraging; I know that it was for me when I got started. My goal in this discussion post is to explain many of these functions, data types, and concepts to help make it easier to understand some of the unique and nifty ways people solve LeetCode problems. I'll be structuring this post by category. Additionally, you can click on any method or datatype to go to the documentation page!

Techniques

1. List/dictionary/set… comprehension
List comprehension (and all other kinds) can be an extremely useful but confusing technique used by many programmers on this site. It is incredibly useful for making concise solutions and avoiding loops which take up lots of code. List comprehension involves constructing a list in line, often with a for-loop. For example, if we wanted to construct a list which contains every element in some list nums = [1,2,3,4,5] which has value 3 or greater, we may default to

newNums = []
for num in nums:
	if num >= 3:
		newNums.append(num)

>> newNums = [3,4,5]

However, we can do this using list comprehension in one line like so:

newNums = [num for num in nums if num >= 3]

>> newNums = [3,4,5]

As you can see, this basically combines all the parts of our previous method (the for loop, the if condition, and adding num to the array) in one line! Many Python programmers would consider this the more "pythonic" way of creating the list, but don't feel like you should always use list comprehension; often the regular method is more readable!

You can also use this technique with other data structures such as sets and dictionaries. For example, let's say we have the set example_set = {1,2,3,4,5,6} and we want to make a dictionary which will have the values from our set as keys which point to the value times 2 if the value is even, and they will point to the value times 3 if the value is odd. The typical way to do this would be:

dictionary = {}
for value in example_set:
	if value % 2 == 0: # if value is even
		dictionary[value] = value * 2
	else: # odd value
		dictionary[value] = value * 3
		
>> dictionary = {1: 3, 2: 4, 3: 9, 4: 8, 5: 15, 6: 12}

We can do this in one line with dictionary comprehension:

dictionary = {value: value*2 if value % 2 == 0 else value * 3 for value in example_set}

>> dictionary = {1: 3, 2: 4, 3: 9, 4: 8, 5: 15, 6: 12}

You can even have different if conditionals for the key and value in the dictionary! Ultimately, this is a super powerful tool but you can almost always just use the regular way of building out lists, sets, dictionaries, etc.

So the basic structure is

values = [(expression_one) if (boolean) else (expression_two) for (item) in (iterable)]

which is the same as

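values = []
for item in iterable:
	if boolean:
		values.append(expression_one)
	else:
		values.append(expression_two)

The if/else at the front always needs both parts, since it has to produce a value for every item. If you only want to keep some of the items, put a lone if at the end instead, as in [expression for item in iterable if boolean], which is the form we used for newNums above.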
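Here is a small sketch of these pieces side by side (the variable names are just made up for illustration). Note that an if used for filtering goes at the end of the comprehension, while an if/else that picks between two values goes at the front:

```python
nums = [1, 2, 3, 4, 5]

# A filtering "if" goes at the end and drops elements that fail the test.
big = [num for num in nums if num >= 3]

# A conditional expression goes at the front and needs both branches.
labels = ["even" if num % 2 == 0 else "odd" for num in nums]

# The same syntax with braces builds a set or a dictionary.
squares_set = {num * num for num in nums}
squares_dict = {num: num * num for num in nums}

print(big)           # [3, 4, 5]
print(labels)        # ['odd', 'even', 'odd', 'even', 'odd']
print(squares_dict)  # {1: 1, 2: 4, 3: 9, 4: 16, 5: 25}
```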

2. Using booleans as values
In Python, bool is a subclass of int, so there are some tricks we can use to make code more concise. One such way is taking advantage of the fact that booleans (True or False) can be used like integers (True --> 1, False --> 0). For example, let's say we wanted to see how many positions hold the same letter in two strings of the same length. Let s1 = "abcdef" and s2 = "abc123". One way we could do this is:

num_in_common = 0
for i in range(len(s1)):
    if s1[i] == s2[i]:
        num_in_common += 1

>> num_in_common = 3

We could also do this by using the boolean s1[i] == s2[i] directly!

num_in_common = 0
for i in range(len(s1)):
    num_in_common += s1[i] == s2[i]
	
>> num_in_common = 3

This method saves some space but it can be pretty confusing to read, especially in more complex examples. Nevertheless, it can be useful. By also using list comprehension, we can actually do what we wanted in just one line!

num_in_common = sum([s1[i]==s2[i] for i in range(len(s1))])

>> num_in_common = 3
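As a quick sanity check on why this works, here is a sketch that confirms booleans really behave like 0 and 1, and then counts the matches without indexing at all. It pairs the characters with zip (covered in the Functions section of this post) and passes a generator straight to sum, so no intermediate list is built:

```python
s1 = "abcdef"
s2 = "abc123"

# bool is a subclass of int, so True adds 1 and False adds 0.
print(isinstance(True, int))  # True
print(True + True)            # 2

# zip pairs up the characters, and sum counts the matching pairs.
num_in_common = sum(a == b for a, b in zip(s1, s2))
print(num_in_common)          # 3
```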

3. Lambda Functions
Sometimes, you want to quickly make a function for a small, contained task. Often this comes up when the function is passed to another function, such as one that sorts. For example, if we wanted to create a function that multiplied a number by two, we could make

def double(num):
	return num * 2

>> double(5) = 10

We could do the same thing using a lambda function:

double = lambda x : x*2

>> double(5) = 10

Basically, a lambda function is of the form lambda arguments : expression, taking in any number of arguments and returning the value of a single expression. These are good for short functions you only want to use in limited scope. They can often be less readable than functions made using def, so typically you see these used in a few specific scenarios. Let's say that we have a list of pairs of numbers, pairs = [(1,8), (3,2), (4,7), (4,4), (1,7)] , and we want to sort them by the product of the two numbers in the pair. We can do this by taking advantage of the key parameter of Python's built-in sort method. key takes in a function of one argument, which sort calls on each element to get the value it compares by. So one way to do this would be

def pair_product(pair):
	# key functions receive one element (here a whole pair) at a time
	return pair[0]*pair[1]
pairs.sort(key = pair_product)

>> pairs = [(3, 2), (1, 7), (1, 8), (4, 4), (4, 7)]

However, assuming we have no other reason to keep this function, we can use a lambda function to make the function within the sort method!

pairs.sort(key = lambda pair: pair[0]*pair[1])

>> pairs = [(3, 2), (1, 7), (1, 8), (4, 4), (4, 7)]

This can be super useful for things like sorting ListNodes by their values (since trying to sort ListNodes themselves results in an error). Other examples of functions which can take in a key argument include sorted, max, min, some [heapq](https://docs.python.org/3/library/heapq.html) functions (nlargest, nsmallest, and merge), and, from Python 3.10 on, the [bisect](https://docs.python.org/3/library/bisect.html) functions.
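To show the key argument outside of sort, here is a short sketch using the same pairs list with sorted, max, and min. The particular key functions are just examples I picked:

```python
pairs = [(1, 8), (3, 2), (4, 7), (4, 4), (1, 7)]

# sorted returns a new list and leaves pairs unchanged.
by_second = sorted(pairs, key=lambda pair: pair[1])

# max and min return the single element whose key is largest or smallest.
largest_product = max(pairs, key=lambda pair: pair[0] * pair[1])
smallest_sum = min(pairs, key=lambda pair: pair[0] + pair[1])

print(by_second)        # [(3, 2), (4, 4), (4, 7), (1, 7), (1, 8)]
print(largest_product)  # (4, 7)
print(smallest_sum)     # (3, 2)
```

Python's sort is stable, which is why (4, 7) stays ahead of (1, 7) when both have the same key.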

Functions

1. enumerate
This is one of the most common functions you'll see people using in solutions which you might not understand yet. enumerate makes iterating through an iterable (such as a list, dictionary, or string) easier by giving access to both the index and value of each element.

Let's say we want to iterate through some list nums = [5,4,3,2]. There are two typical ways of doing this:

for i in range(len(nums)):
	print(i)
	
--> 0, 1, 2, 3

for num in nums:
	print(num)
	
--> 5, 4, 3, 2

The first method gives us the indices of each element in nums, which allows us to access the elements by nums[i]. The second method just gets us all of the numbers in sequence. If you just care about the indices of each element, then the first for loop is perfect, and if you just care about the numbers, then the second for loop is all you need. However, what if you want both? Well, you could do this by using the first for loop and then accessing nums[i] whenever you need the number (or you could just set num = nums[i] right under the for loop). There's nothing wrong with this, but enumerate makes it much more concise by yielding tuples containing the index of an element and its value, each of which looks like (index, value).

list(enumerate(nums)) = [(0, 5), (1, 4), (2, 3), (3, 2)]

Commonly you see it referenced like this:

for i, num in enumerate(nums):
	print((i, num))
	
--> (0, 5), (1, 4), (2, 3), (3, 2)

This allows you to access the index i and the number num really easily! You don't have to keep calling nums[i] every time you want to use the number! This provides a marginal speed advantage but is mainly useful for keeping the code simple.
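Two common uses, sketched below with made-up variable names: building a value-to-index lookup, and using enumerate's optional start argument when you want the count to begin somewhere other than 0:

```python
nums = [5, 4, 3, 2]

# Record the index at which each value appears.
index_of = {}
for i, num in enumerate(nums):
    index_of[num] = i
print(index_of)  # {5: 0, 4: 1, 3: 2, 2: 3}

# The optional start argument changes the first index (here, count from 1).
numbered = list(enumerate(nums, start=1))
print(numbered)  # [(1, 5), (2, 4), (3, 3), (4, 2)]
```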

2. zip
zip is another function you see often for iterating through multiple iterables. Let's say we want to iterate over two lists, nums1 = [1,2,3] and nums2 = [2,2,3], and see which of the elements at the same indices are equal. One way we could do this is

for i in range(len(nums1)):
	print(nums1[i] == nums2[i])
	
--> False, True, True

There is nothing wrong with this, but needing to access nums1[i] and nums2[i] is a little clunky and would be much worse if we wanted to compare tons of different lists together. Instead, we can use zip, which takes in multiple iterables and returns tuples with an item from each iterable.

list(zip(nums1, nums2)) = [(1, 2), (2, 2), (3, 3)]

So we could use zip by doing

for (num1, num2) in zip(nums1, nums2):
	print(num1 == num2)

--> False, True, True
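Two details worth knowing, shown in the sketch below (nums3 is an extra list I made up for the example): zip accepts more than two iterables, and it stops as soon as the shortest one runs out. Zipping a list with a shifted copy of itself is also a handy way to look at neighbors:

```python
nums1 = [1, 2, 3]
nums2 = [2, 2, 3]
nums3 = [1, 2, 3, 4]

# Three iterables at once; zip stops at the shortest one, so the 4 is ignored.
triples = list(zip(nums1, nums2, nums3))
print(triples)  # [(1, 2, 1), (2, 2, 2), (3, 3, 3)]

# Zipping a list with itself shifted by one gives adjacent pairs.
adjacent = list(zip(nums3, nums3[1:]))
print(adjacent)  # [(1, 2), (2, 3), (3, 4)]
```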

3. all
all is useful when we need to make sure that a bunch of conditions are met. For example, if we want to check that all the numbers in an array, nums, are less than a certain value, we could go through step by step with a function like this:

for num in nums:
	if num >= value:
		return False
return True

we can use all to create the same functionality in one line!

return all(num < value for num in nums)

4. any
As its name suggests, the any function is useful if we just want to check if any condition is met out of an iterable. Using a similar example as before, let's say we only want to check if at least one number in nums is greater than or equal to our value. Then instead of using

for num in nums:
	if num >= value:
		return True
return False

we can use

return any(num >= value for num in nums)
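Here is a runnable sketch of both functions on the same data (the values are made up). It also shows the edge case that trips people up: on an empty iterable, all returns True and any returns False:

```python
nums = [1, 5, 9]
value = 10

all_below = all(num < value for num in nums)
any_reach = any(num >= value for num in nums)
print(all_below)  # True: every number is below 10
print(any_reach)  # False: no number reaches 10

# Edge case: with nothing to check, all is True and any is False.
print(all([]))  # True
print(any([]))  # False
```

Both functions stop early: all returns as soon as it sees a False, and any returns as soon as it sees a True.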

5. bin
If you want to convert an integer to binary, you can do it with the built-in bin method. In the words of the documentation, this method will "Convert an integer number to a binary string prefixed with '0b'," so be careful to take those first two characters into account. You can convert back by using int(binaryString, 2), which will convert binaryString to an integer assuming it was in base 2.
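A short sketch of the round trip, including two ways of dealing with the '0b' prefix and the common trick of counting set bits:

```python
num = 10

binary_string = bin(num)
print(binary_string)          # 0b1010
print(binary_string[2:])      # 1010 (prefix sliced off)
print(format(num, "b"))       # 1010 (no prefix to begin with)

print(int("1010", 2))         # 10
print(int(binary_string, 2))  # 10 (the 0b prefix is accepted for base 2)

set_bits = bin(num).count("1")
print(set_bits)               # 2
```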

6. map
map(function, iterable) will apply a function to each element of an iterable and yield all the results.

For example:

def addTen(num):
    return num + 10
	
list(map(addTen, [1,2,3,4])) = [11, 12, 13, 14]

If you have a function which takes in multiple parameters then you can use multiple iterables, which will be passed as arguments to the function in the order they are passed to map:

def raiseSmallToBig(x,y):
    return min(x,y)**max(x,y)
list(map(raiseSmallToBig, [2,3,4], [3,1,5])) = [8, 1, 1024] # from [2**3, 1**3, 4**5]

7. reduce
reduce(function, iterable) will apply a function to the elements in an iterable cumulatively left to right, ultimately arriving at a final value. In Python 3 it has to be imported from the functools module. For example:

from functools import reduce

def add(x,y):
	return x+y
reduce(add, [1, 2, 3, 4, 5])

calculates ((((1+2)+3)+4)+5), which is 15.
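reduce also accepts an optional third argument, a starting value, which is what you get back if the iterable is empty. A minimal sketch (the product example is my own):

```python
from functools import reduce

def add(x, y):
    return x + y

total = reduce(add, [1, 2, 3, 4, 5])
print(total)  # 15

# The optional third argument is the starting value; it is also what
# comes back if the iterable is empty.
product = reduce(lambda x, y: x * y, [1, 2, 3, 4], 1)
empty_total = reduce(add, [], 0)
print(product)      # 24
print(empty_total)  # 0
```

Without the starting value, calling reduce on an empty iterable raises a TypeError.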

8. filter
filter(function, iterable) will return all the values of an iterable such that function is True. This is the same as (item for item in iterable if function(item)) if function is not None (if it is None then it is the same as (item for item in iterable if item)). For example:

def lessThanTen(x):
    return x < 10
list(filter(lessThanTen, [1,10,100,2,50,5])) = [1, 2, 5]

9. reversed
As its name suggests, the reversed(seq) method returns an iterator which goes through a sequence in reverse.
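For example, a small sketch showing that reversed gives back an iterator (so you wrap it in list to see the values) and leaves the original alone:

```python
nums = [5, 4, 3, 2]

# reversed gives an iterator, so wrap it in list() to see the values.
backwards = list(reversed(nums))
print(backwards)   # [2, 3, 4, 5]
print(nums)        # [5, 4, 3, 2] (the original is untouched)

# Slicing with a step of -1 builds a reversed copy instead.
print(nums[::-1])  # [2, 3, 4, 5]

# Walking the indices from the back to the front.
indices = [i for i in reversed(range(len(nums)))]
print(indices)     # [3, 2, 1, 0]
```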

Data Structures

1. dictionary
A dictionary is a data structure which is indexed by keys rather than by integer positions. This blurb from the documentation helps explain them:

"It is best to think of a dictionary as a set of key: value pairs, with the requirement that the keys are unique (within one dictionary). A pair of braces creates an empty dictionary: {}. Placing a comma-separated list of key:value pairs within the braces adds initial key:value pairs to the dictionary; this is also the way dictionaries are written on output."

Dictionaries are great for a few things and are super important to understand. Firstly, they use a hashing function for all the keys, so you can figure out whether or not a key is in a dictionary in O(1) time on average (while if you wanted to figure out whether or not something was in a list you would need to search through every element, taking O(N) time). Secondly, they help us to map things efficiently in key: value pairs, which is extremely useful in tons of problems. I strongly encourage you to take some time to read a little bit of the documentation and to search up some more concise explanations of dictionaries, such as this article.

One example of what a dictionary looks like would be:

thisdict = {
  "brand": "Ford",
  "model": "Mustang",
  "year": 1964
}
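The operations you'll use most often, sketched on the same dictionary (the "color" key is just something I added for the example):

```python
thisdict = {"brand": "Ford", "model": "Mustang", "year": 1964}

print(thisdict["brand"])                 # Ford
print("model" in thisdict)               # True (checks keys, not values)
print(thisdict.get("color"))             # None instead of a KeyError
print(thisdict.get("color", "unknown"))  # unknown

thisdict["year"] = 2020     # overwrite an existing key
thisdict["color"] = "red"   # add a new key

# items() gives (key, value) pairs, in insertion order.
for key, value in thisdict.items():
    print(key, value)
```

Looking up a missing key with square brackets raises a KeyError, which is why get is handy when you are not sure the key is there.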

2. counter
Sometimes we care about the number of times an element appears in an iterable. When we do, Counter (from the collections module) is one of the best ways of figuring this out. It goes through the iterable and figures out how many times each element appears, storing the element as a key and the number of times it appears as a value in a dictionary. For example,

Counter([4,4,5,5,5,6]) = Counter({5: 3, 4: 2, 6: 1})

There are tons of useful things we can do with Counter such as getting the most common elements, getting the total number of counts, and using mathematical operations with other counters to add or subtract elements. I encourage you to read some of the documentation! Since it is a dictionary subclass we can also use all the typical dictionary methods and attributes as well.
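A few of those operations, as a minimal sketch (the anagram check at the end is a typical use I added for illustration):

```python
from collections import Counter

counts = Counter([4, 4, 5, 5, 5, 6])

print(counts[5])              # 3
print(counts[7])              # 0 (missing keys count as zero, no KeyError)
print(counts.most_common(2))  # [(5, 3), (4, 2)]
print(sum(counts.values()))   # 6

# Subtraction keeps only the elements whose count stays positive.
remaining = counts - Counter([4, 5, 5])
print(sorted(remaining.items()))  # [(4, 1), (5, 1), (6, 1)]

# Two strings are anagrams exactly when their letter counts match.
is_anagram = Counter("listen") == Counter("silent")
print(is_anagram)  # True
```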

3. set
A set is an unordered data structure in python which does not allow for duplicates and also uses a hashing function for quick lookups (figuring out whether or not something is present in the set in just O(1) time). These are incredibly useful for quick lookup (ex. very useful in Two Sum) and for ignoring or figuring out duplicates. You add to a set using .add(element) and you remove using .remove(element). I encourage you to read the documentation to see the many other useful things you can do with sets!
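Here is a sketch of the "have I seen this before?" pattern that makes sets so useful. has_pair_with_sum is a hypothetical helper written for this example (a simplified Two Sum that only answers yes or no):

```python
def has_pair_with_sum(nums, target):
    # Hypothetical helper: True if two numbers in nums add up to target.
    seen = set()
    for num in nums:
        if target - num in seen:  # O(1) lookup on average
            return True
        seen.add(num)
    return False

print(has_pair_with_sum([2, 7, 11, 15], 9))   # True (2 + 7)
print(has_pair_with_sum([2, 7, 11, 15], 10))  # False

# Duplicates collapse when a list is turned into a set.
unique = set([1, 1, 2, 3, 3])
print(len(unique))  # 3

# remove raises KeyError for a missing element; discard does not.
unique.discard(99)
unique.remove(1)
print(sorted(unique))  # [2, 3]
```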

4. deque
If you want quick pop and append operations at both ends of a list (queue functionality), then the standard library's collections.deque data structure can help! It turns an iterable into a double-ended queue, meaning you can efficiently add or remove at the front or the end. Lists offer this functionality with pop(0) or insert(0,value) to access the front, but that takes O(N) time since every other element has to shift over, while a deque's popleft and appendleft take O(1) time. Therefore, this is the best way to work with elements at the front and the end of a list if you don't care about what's in between. Good for implementing a FIFO structure.
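A minimal sketch of the four end operations, followed by the FIFO loop shape you'll see in things like breadth-first search (the task names are made up):

```python
from collections import deque

queue = deque([1, 2, 3])
queue.append(4)       # add on the right end
queue.appendleft(0)   # add on the left end
print(queue)          # deque([0, 1, 2, 3, 4])

first = queue.popleft()  # remove from the left end
last = queue.pop()       # remove from the right end
print(first, last)       # 0 4
print(list(queue))       # [1, 2, 3]

# FIFO processing: items come out in the order they went in.
order = []
tasks = deque(["a", "b", "c"])
while tasks:
    order.append(tasks.popleft())
print(order)  # ['a', 'b', 'c']
```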

Packages

1. bisect
If you ever want to quickly use binary search, the bisect module is super helpful. Here is how it is described:

"This module provides support for maintaining a list in sorted order without having to sort the list after each insertion. For long lists of items with expensive comparison operations, this can be an improvement over the more common approach. The module is called bisect because it uses a basic bisection algorithm to do its work."

It has methods for both searching and inserting, and you can look through the documentation for the specific method names.
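To give a feel for it, here is a short sketch of the main functions on a sorted list. contains is a hypothetical helper I wrote to show how a membership test can be built from bisect_left:

```python
import bisect

nums = [1, 3, 3, 5, 8]  # must already be sorted

# Index where 3 could be inserted: before or after the existing 3s.
print(bisect.bisect_left(nums, 3))   # 1
print(bisect.bisect_right(nums, 3))  # 3

# For a value that is absent, both give the same insertion point.
print(bisect.bisect_left(nums, 4))   # 3

# insort inserts while keeping the list sorted.
bisect.insort(nums, 4)
print(nums)  # [1, 3, 3, 4, 5, 8]

def contains(sorted_nums, target):
    # Hypothetical helper: binary-search membership test.
    i = bisect.bisect_left(sorted_nums, target)
    return i < len(sorted_nums) and sorted_nums[i] == target

print(contains(nums, 5))  # True
print(contains(nums, 6))  # False
```

The search itself is O(log N), but insort still has to shift elements to make room, so the insertion is O(N).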

2. heapq
heapq provides support for heap queue / priority queue in python. This is great for constantly accessing the smallest item in an iterable using heappop and for maintaining the heap structure with heappush. These are useful when you want to deal with small elements before other elements, hence the name priority queue (some elements get prioritized).

Two super useful methods in this package are heapq.nlargest(n, iterable, key=None) and heapq.nsmallest(n, iterable, key=None), which will return the n largest or smallest elements in an iterable based on some key function (like the sorted method). This is quick for small values of n and can be useful in lots of problems. As with anything else, you'll get the best understanding of this package by reading the documentation and/or any tutorials or posts about it.
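Here is a sketch of the basic operations (all the data is made up). One thing to keep in mind is that heapq only provides a min-heap; a common workaround when you want the largest element first is to push negated values:

```python
import heapq

heap = []
for num in [5, 1, 8, 3]:
    heapq.heappush(heap, num)

print(heap[0])                  # 1 (the smallest item is always at index 0)
smallest = heapq.heappop(heap)  # removes and returns 1
print(smallest, heap[0])        # 1 3

# heapify turns an existing list into a heap in place.
nums = [9, 4, 7, 2]
heapq.heapify(nums)
print(nums[0])                  # 2

# Max-heap behavior via negation: push -num, negate again on the way out.
max_heap = [-num for num in [5, 1, 8, 3]]
heapq.heapify(max_heap)
largest = -heapq.heappop(max_heap)
print(largest)                  # 8

words = ["pear", "fig", "banana", "kiwi"]
top_two = heapq.nlargest(2, [5, 1, 8, 3])
shortest_two = heapq.nsmallest(2, words, key=len)
print(top_two)       # [8, 5]
print(shortest_two)  # ['fig', 'pear']
```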

Miscellaneous

1. If (not) ___
Sometimes you'll see if not ___ or if ___ in some confusing contexts such as checking whether or not a list is empty. One way of checking if some list nums is empty would be to check if len(nums) == 0. However, we can also get the same result by checking if not nums. An empty list evaluates to False in a boolean context, so not nums is True exactly when nums is empty. Other values with which we can do the same are numbers (if 0 will evaluate to False while if 10 will evaluate to True), ListNodes/TreeNodes (if None evaluates to False while any valid node will evaluate to True), dictionaries/sets and strings (empty ones evaluate to False), and many other objects.
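The sketch below prints the truthiness of a handful of values and shows one pitfall: since 0 is also falsy, if not x cannot tell 0 apart from None. first_or_default is a hypothetical helper used only for illustration:

```python
candidates = [[], [0], "", "a", 0, 10, {}, set(), None]
truthiness = [bool(value) for value in candidates]
print(truthiness)
# [False, True, False, True, False, True, False, False, False]

def first_or_default(nums, default):
    # "not nums" is True for an empty list (and also for None).
    if not nums:
        return default
    return nums[0]

print(first_or_default([], -1))      # -1
print(first_or_default([7, 8], -1))  # 7

# Careful: 0 is falsy too, so compare against None when 0 is a valid value.
count = 0
print(count is not None)  # True
print(bool(count))        # False
```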

2. Multiplying strings/lists
This is a quick tip, but you can multiply strings and lists by integers! You'll actually see this pretty often, especially with lists. For example, if you wanted to make a list with ten zeroes, you could write

zeroes = []
for _ in range(10):
	zeroes.append(0)

>> zeroes = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

But we could also do this by writing

zeroes = [0] * 10

>> zeroes = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

Similarly, a string with 10 zeroes could be constructed with

zeroes = ""
for _ in range(10):
	zeroes += "0"

>> zeroes = "0000000000"

Again, we could do the same thing with

zeroes = "0"*10

>> zeroes = "0000000000"
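One gotcha worth a quick sketch: multiplying a list that contains another list copies the reference, not the inner list, so every "row" ends up being the same object. When building a 2D grid, use a comprehension for the outer list instead:

```python
# Every row is the SAME list object, so one write shows up in all rows.
shared = [[0] * 3] * 2
shared[0][0] = 1
print(shared)  # [[1, 0, 0], [1, 0, 0]]

# A comprehension builds a fresh inner list for each row.
grid = [[0] * 3 for _ in range(2)]
grid[0][0] = 1
print(grid)    # [[1, 0, 0], [0, 0, 0]]
```

[0] * 3 on its own is safe because integers are immutable; the problem only appears when the repeated element is itself mutable.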

3. In-line If/Else statements
Sometimes you'll see weird return statements that look like

return thing1 if expression else thing2

These are pretty messy and honestly not that readable, but it's good to know that you can commonly use if else statements within a single line in Python. I wouldn't recommend using this too much since it makes the code tougher to follow, and you can almost always just use a regular multi-line if/else statement instead.
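For comparison, here is the same logic written both ways. The two functions are hypothetical examples and behave identically:

```python
def sign_inline(num):
    # One-line form: value_if_true if condition else value_if_false
    return "negative" if num < 0 else "non-negative"

def sign_block(num):
    # The regular multi-line form of the same logic.
    if num < 0:
        return "negative"
    else:
        return "non-negative"

print(sign_inline(-3), sign_block(-3))  # negative negative
print(sign_inline(4), sign_block(4))    # non-negative non-negative
```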

4. @cache, @lru_cache
If you ever are solving a problem recursively, these decorators from the functools module are extremely useful memoization techniques (memoization = storing past results). They use a dictionary lookup to store previous results of a function so that you don't make duplicate recursive calls. The example they give in the documentation is super helpful for understanding how this is useful since it can save tons of time for repeated calls to a function which uses recursion. This is especially useful for functions such as calculating the Fibonacci sequence, which would normally make on the order of 2^N recursive calls but can be reduced to on the order of N calls by using a cache.

Often in an interview setting I would recommend you actually implement the memoization yourself by using a dictionary or list to store results to a function to show your understanding.
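Both approaches are sketched below on Fibonacci. fib uses the decorator (lru_cache with maxsize=None; @cache is the shorter spelling available from Python 3.9), and fib_manual is a hand-written version of the same idea with a dictionary, which is the kind of thing you could write in an interview:

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # unbounded cache of past results
def fib(n):
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

def fib_manual(n, memo=None):
    # Hand-rolled memoization: memo maps n to its Fibonacci number.
    if memo is None:
        memo = {}
    if n < 2:
        return n
    if n not in memo:
        memo[n] = fib_manual(n - 1, memo) + fib_manual(n - 2, memo)
    return memo[n]

print(fib(30))         # 832040
print(fib_manual(30))  # 832040

# Only 31 distinct inputs (0 through 30) were ever computed.
print(fib.cache_info().currsize)  # 31
```

One thing to note is that the arguments of a cached function have to be hashable, so lists need to be converted to tuples first.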

5. float('inf'), math.inf
If you ever want to use infinity (often as a maximum or a minimum value to later be replaced), you can use float('inf') or math.inf for positive infinity and float('-inf') or -math.inf for negative infinity.
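The usual pattern, as a small sketch with made-up numbers: start the running minimum at positive infinity and the running maximum at negative infinity, so the first real value always replaces them:

```python
import math

nums = [7, 3, 9]

smallest = float('inf')   # anything is smaller than this
largest = float('-inf')   # anything is larger than this
for num in nums:
    smallest = min(smallest, num)
    largest = max(largest, num)

print(smallest, largest)         # 3 9
print(math.inf == float('inf'))  # True
print(math.inf > 10**100)        # True, even against huge integers
```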

Thanks for reading!

If you have any questions, comments, or concerns, please comment below and I will try to get to you as soon as I can! If you want me to include any more content, let me know and I'll try my best to add it.
This is my first general discussion post, so any feedback is greatly appreciated! If this was helpful I hope you'll consider upvoting :)
