8 Python Tricks Used by Experienced Programmers
Here are eight elegant Python tricks that you may not have seen yet. Apply them in your Python code to make it more concise and more productive!
1. Sorting objects by multiple keys
Suppose we want to sort the following list of dictionaries:
people = [
{ 'name': 'John', "age": 64 },
{ 'name': 'Janet', "age": 34 },
{ 'name': 'Ed', "age": 24 },
{ 'name': 'Sara', "age": 64 },
{ 'name': 'John', "age": 32 },
{ 'name': 'Jane', "age": 34 },
{ 'name': 'John', "age": 99 },
]
But we do not just want to sort them by name or by age; we want to sort them by both fields. In SQL, the query would look like this:
SELECT * FROM people ORDER BY name, age
There is actually a very simple solution to this problem, thanks to Python's guarantee that its sort functions are stable. This means that items that compare equal retain their original relative order.
To achieve sorting by name and age, we can do this:
import operator
people.sort(key=operator.itemgetter('age'))
people.sort(key=operator.itemgetter('name'))
Notice how I reversed the order: first sort by age, then by name. With operator.itemgetter() we get the age and name fields from each dictionary in the list.
This gives us the result we wanted:
[
{'name': 'Ed', 'age': 24},
{'name': 'Jane', 'age': 34},
{'name': 'Janet','age': 34},
{'name': 'John', 'age': 32},
{'name': 'John', 'age': 64},
{'name': 'John', 'age': 99},
{'name': 'Sara', 'age': 64}
]
Names are sorted first, and age is used only when the names match. Thus, all Johns are ordered by age.
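As a small sketch of an alternative to the two-pass approach: operator.itemgetter() also accepts several keys and then returns a tuple, so the same ordering can be produced in a single pass. The variable names two_pass and one_pass are only illustrative.

```python
from operator import itemgetter

people = [
    {"name": "John", "age": 64},
    {"name": "Janet", "age": 34},
    {"name": "Ed", "age": 24},
    {"name": "Sara", "age": 64},
    {"name": "John", "age": 32},
    {"name": "Jane", "age": 34},
    {"name": "John", "age": 99},
]

# Two stable passes: least significant key first, most significant key last.
two_pass = sorted(sorted(people, key=itemgetter("age")), key=itemgetter("name"))

# One pass: itemgetter with two keys returns a (name, age) tuple,
# and tuples compare element by element.
one_pass = sorted(people, key=itemgetter("name", "age"))

print(one_pass == two_pass)
# True
```

The two-pass version is still useful when the keys need different directions, for example name ascending and age descending, because each pass can take its own reverse argument.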
Source of inspiration: a question on Stack Overflow.
2. List comprehensions
List comprehensions can replace the ugly loops used to populate a list. The basic syntax of a list comprehension:
[ expression for item in list if conditional ]
A very simple example to populate a list with a sequence of numbers:
mylist = [i for i in range(10)]
print(mylist)
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
And since you can use an expression, you can also do some math:
squares = [x**2 for x in range(10)]
print(squares)
# [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
Or even call an external function:
def some_function(a):
    return (a + 5) / 2
my_formula = [some_function(i) for i in range(10)]
print(my_formula)
# [2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0]
And finally, you can use the “if” to filter the list. In this case, we keep only the values that are divisible by 2:
filtered = [i for i in range(20) if i%2==0]
print(filtered)
# [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
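To make the comparison with an ordinary loop concrete, here is a minimal sketch of the same filter written both ways. The names filtered_loop and filtered_comp are only illustrative.

```python
# Loop version: create an empty list, then append matching items one by one.
filtered_loop = []
for i in range(20):
    if i % 2 == 0:
        filtered_loop.append(i)

# Comprehension version: the same expression, loop, and condition in one line.
filtered_comp = [i for i in range(20) if i % 2 == 0]

print(filtered_loop == filtered_comp)
# True
```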
3. Check the memory usage of your objects
Using sys.getsizeof() you can check the memory usage of an object:
import sys
mylist = range(0, 10000)
print(sys.getsizeof(mylist))
# 48
Wow … wait … why is this huge list weighing just 48 bytes?
This is because range() returns a range object that only behaves like a list. A range is much less memory intensive than an actual list of numbers, because it does not store the numbers themselves.
You can see this for yourself by using a list comprehension to create an actual list of numbers from the same range:
import sys
myreallist = [x for x in range(0, 10000)]
print(sys.getsizeof(myreallist))
# 87632
So, by playing with sys.getsizeof(), you can learn more about Python and your memory usage.
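A short sketch of what this means in practice, assuming CPython: the size of a range does not depend on how many numbers it covers, while a list grows with its length. Keep in mind that sys.getsizeof() is shallow: it reports the size of the container object only, not of the objects it refers to.

```python
import sys

small_range = range(10)
big_range = range(10_000_000)

# A range keeps only its start, stop, and step, so its size does not
# grow with the number of values it represents (CPython behavior).
print(sys.getsizeof(small_range) == sys.getsizeof(big_range))
# True

# A real list holds one reference per element, so it grows with its length.
small_list = list(range(10))
big_list = list(range(10_000))
print(sys.getsizeof(small_list) < sys.getsizeof(big_list))
# True
```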
4. Data classes
Starting with version 3.7, Python offers data classes. There are several advantages over regular classes or other alternatives, such as returning multiple values or dictionaries:
- a data class requires minimal code
- you can compare data classes because there is __eq__
- you can easily print a data class for debugging because there is __repr__
- data classes require type hints, which reduces the chance of errors
Here is an example of a data class in operation:
from dataclasses import dataclass

@dataclass
class Card:
    rank: str
    suit: str

card = Card("Q", "hearts")
print(card == card)
# True
print(card.rank)
# Q
print(card)
# Card(rank='Q', suit='hearts')
A detailed guide can be found here.
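Comparing an object with itself is True even for a regular class, so the following sketch may show the generated __eq__ more clearly: two separate instances with the same field values compare equal. The instance names are only illustrative.

```python
from dataclasses import dataclass


@dataclass
class Card:
    rank: str
    suit: str


first = Card("Q", "hearts")
second = Card("Q", "hearts")
other = Card("K", "spades")

print(first is second)  # False: two distinct objects
print(first == second)  # True: the generated __eq__ compares field values
print(first == other)   # False: the fields differ
print(repr(other))      # Card(rank='K', suit='spades')
```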
5. The attrs package
Instead of data classes you can use attrs. There are two reasons to choose attrs:
- You are using a version of Python older than 3.7
- You want more features
The attrs package supports all major versions of Python, including CPython 2.7 and PyPy. Some of the extras that attrs offers compared to regular data classes are validators and converters. Let’s look at some example code:
from attr import attrs, attrib

@attrs
class Person(object):
    name = attrib(default="John")
    surname = attrib(default="Doe")
    age = attrib(init=False)

p = Person()
print(p)
p = Person('Bill', 'Gates')
p.age = 60
print(p)
# Output:
# Person(name='John', surname='Doe', age=NOTHING)
# Person(name='Bill', surname='Gates', age=60)
The authors of attrs actually worked on the PEP that introduced data classes. Data classes are intentionally kept simpler (easier to understand), while attrs offers the full set of features you might need!
Additional examples can be found on the attrs examples page.
6. Combining dictionaries (Python 3.5+)
Starting with Python 3.5, it is easier to combine dictionaries:
dict1 = { 'a': 1, 'b': 2 }
dict2 = { 'b': 3, 'c': 4 }
merged = { **dict1, **dict2 }
print(merged)
# {'a': 1, 'b': 3, 'c': 4}
If the dictionaries share keys, the values from the first dictionary are overwritten by those from the second.
In Python 3.9, combining dictionaries becomes even cleaner. The above merge in Python 3.9 can be rewritten as:
merged = dict1 | dict2
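A small sketch of the precedence rule, using only the unpacking syntax so that it also runs on versions before 3.9: the dictionary that is unpacked last wins for shared keys, and neither input is modified. The name reversed_merge is only illustrative.

```python
dict1 = {"a": 1, "b": 2}
dict2 = {"b": 3, "c": 4}

# dict2 is unpacked last, so its value for "b" wins.
merged = {**dict1, **dict2}
print(merged)
# {'a': 1, 'b': 3, 'c': 4}

# Swapping the order lets dict1 win for "b" instead.
reversed_merge = {**dict2, **dict1}
print(reversed_merge)
# {'b': 2, 'c': 4, 'a': 1}

# Merging builds a new dictionary; the inputs stay untouched.
print(dict1, dict2)
# {'a': 1, 'b': 2} {'b': 3, 'c': 4}
```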
7. Finding the most common value
To find the most common value in a list or string:
test = [1, 2, 3, 4, 2, 2, 3, 1, 4, 4, 4]
print(max(set(test), key = test.count))
# 4
Do you understand why this works? Try to figure this out yourself before reading on.
You did try, right? I will tell you anyway:
- max() returns the highest value in an iterable. Its key argument takes a single-argument function that customizes the ordering, in this case test.count. The function is applied to each element of the iterable.
- test.count is a built-in list method. It takes an argument and counts the number of occurrences of that argument. Thus, test.count(1) returns 2, and test.count(4) returns 4.
- set(test) returns all unique values from test, so {1, 2, 3, 4}.
So, in this single line of code, we take all the unique values of test, which is {1, 2, 3, 4}. Then max applies test.count to each of them and returns the value with the highest count.
And no – I did not invent this one-liner.
Update: a number of commenters have rightly noted that there is a much more efficient way to do this:
from collections import Counter
print(Counter(test).most_common(1))
# [(4, 4)]
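Here is a brief sketch that puts the two approaches side by side. The one-liner calls test.count once per unique value, so it rescans the list each time, while Counter tallies everything in a single pass. most_common(1) returns a list holding one (value, count) tuple. The string "banana" is just an illustrative input.

```python
from collections import Counter

test = [1, 2, 3, 4, 2, 2, 3, 1, 4, 4, 4]

# One-liner: scans the whole list once for every unique value.
one_liner = max(set(test), key=test.count)

# Counter: counts all values in one pass over the list.
value, count = Counter(test).most_common(1)[0]

print(value, count)
# 4 4
print(one_liner == value)
# True

# Counter also works on strings, counting characters.
print(Counter("banana").most_common(1))
# [('a', 3)]
```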
8. Return multiple values
Functions in Python can return more than one value without an explicit dictionary, list, or class. The values are packed into a tuple, which the caller can unpack. It works like this:
def get_user(id):
    # fetch user from database
    # ....
    return name, birthdate

name, birthdate = get_user(4)
This is fine for a limited number of return values. But anything beyond 3 values should be put into a (data) class.
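As a rough sketch of that advice, the function below returns a data class instead of a long tuple. The User class, its fields, and the values are hypothetical stand-ins for whatever a real database lookup would return.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class User:
    # Hypothetical fields, chosen only for illustration.
    id: int
    name: str
    birthdate: date
    email: str


def get_user(user_id: int) -> User:
    # Stand-in for a database lookup; the values are made up.
    return User(
        id=user_id,
        name="Ada",
        birthdate=date(1990, 1, 1),
        email="ada@example.com",
    )


user = get_user(4)
print(user.name)
# Ada
print(user.birthdate.year)
# 1990
```

The caller reads fields by name rather than by position, so adding a field later does not break existing unpacking code.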