Python Snippets
Check if a pathlib Path is absolute or relative⚑
file_path.is_absolute()
Remove a file using pathlib Path⚑
To remove a file using Python’s pathlib
library, you can use the unlink()
method on a Path
object. Here’s how it works:
from pathlib import Path
file_path = Path("path/to/your/file.txt")
# Check if the file exists before trying to delete it
if file_path.exists():
file_path.unlink()
This unlink()
method will delete the file at file_path
. If you want to handle any errors that may occur (like if the file doesn’t exist), you can wrap it in a try-except
block:
try:
file_path.unlink()
except FileNotFoundError:
print("The file does not exist.")
except PermissionError:
print("Permission denied.")
Send a linux desktop notification⚑
Using notify2
⚑
To show a Linux desktop notification from a Python script, you can use the notify2
library (although it's last commit was done on 2017. This library provides an easy way to send desktop notifications on Linux.
Alternatively, you can use the subprocess
module to call the notify-send
command-line utility directly. This is a more straightforward method but requires notify-send
to be installed.
import subprocess
def send_notification(title: str, message: str = "", urgency: str = "normal") -> None:
"""Send a desktop notification using notify-send.
Args:
title (str): The title of the notification.
message (str): The message body of the notification. Defaults to an empty string.
urgency (str): The urgency level of the notification. Can be 'low', 'normal', or 'critical'. Defaults to 'normal'.
"""
subprocess.run(["notify-send", "-u", urgency, title, message])
Compare file and directories⚑
The filecmp module defines functions to compare files and directories, with various optional time/correctness trade-offs. For comparing files, see also the difflib module.
from filecmp import dircmp
def print_diff_files(dcmp):
for name in dcmp.diff_files:
print("diff_file %s found in %s and %s" % (name, dcmp.left,
dcmp.right))
for sub_dcmp in dcmp.subdirs.values():
print_diff_files(sub_dcmp)
dcmp = dircmp('dir1', 'dir2')
print_diff_files(dcmp)
Use Path of pathlib write_text in append mode⚑
It's not supported you need to open
it:
with my_path.open("a") as f:
f.write("...")
Suppress ANN401 for dynamically typed *args and **kwargs⚑
Use object
instead:
def function(*args: object, **kwargs: object) -> None:
One liner conditional⚑
To write an if-then-else statement in Python so that it fits on one line you can use:
fruit = 'Apple'
isApple = True if fruit == 'Apple' else False
Get package data relative path⚑
If you want to reference files from the foo/package1/resources folder you would want to use the file variable of the module. Inside foo/package1/init.py:
from os import path
resources_dir = path.join(path.dirname(__file__), 'resources')
Kill a process by it's PID⚑
import os
import signal
os.kill(pid, signal.SIGTERM) #or signal.SIGKILL
Convert the parameter of an API get request to a valid field⚑
For example if the argument has /
:
from urllib.parse import quote
quote("value/with/slashes")
Will return value%2Fwith%2Fslashes
Get the type hints of an object⚑
from typing import get_type_hints
Student(NamedTuple):
name: Annotated[str, 'some marker']
get_type_hints(Student) == {'name': str}
get_type_hints(Student, include_extras=False) == {'name': str}
get_type_hints(Student, include_extras=True) == {
'name': Annotated[str, 'some marker']
}
````
# [Type hints of a python module](https://stackoverflow.com/questions/53780735/what-is-the-type-hint-for-a-any-python-module)
```python
from types import ModuleType
import os
assert isinstance(os, ModuleType)
Get all the classes of a python module⚑
def _load_classes_from_directory(self, directory):
classes = []
for file_name in os.listdir(directory):
if file_name.endswith(".py") and file_name != "__init__.py":
module_name = os.path.splitext(file_name)[0]
module_path = os.path.join(directory, file_name)
# Import the module dynamically
spec = spec_from_file_location(module_name, module_path)
if spec is None or spec.loader is None:
raise ValueError(
f"Error loading the spec of {module_name} from {module_path}"
)
module = module_from_spec(spec)
spec.loader.exec_module(module)
# Retrieve all classes from the module
module_classes = inspect.getmembers(module, inspect.isclass)
classes.extend(module_classes)
Import files from other directories⚑
Add the directory where you have your function to sys.path
import sys
sys.path.append("**Put here the directory where you have the file with your function**")
from file import function
Investigate a class attributes⚑
Expire the cache of the lru_cache⚑
The lru_cache
decorator caches forever, a way to prevent it is by adding one more parameter to your expensive function: ttl_hash=None
. This new parameter is so-called "time sensitive hash", its the only purpose is to affect lru_cache. For example:
from functools import lru_cache
import time
@lru_cache()
def my_expensive_function(a, b, ttl_hash=None):
del ttl_hash # to emphasize we don't use it and to shut pylint up
return a + b # horrible CPU load...
def get_ttl_hash(seconds=3600):
"""Return the same value withing `seconds` time period"""
return round(time.time() / seconds)
# somewhere in your code...
res = my_expensive_function(2, 2, ttl_hash=get_ttl_hash())
# cache will be updated once in an hour
Fix variable is unbound pyright error⚑
You may receive these warnings if you set variables inside if or try/except blocks such as the next one:
def x():
y = True
if y:
a = 1
print(a) # "a" is possibly unbound
The easy fix is to set a = None
outside those blocks
def x():
a = None
y = True
if y:
a = 1
print(a) # "a" is possibly unbound
Get unique items between two lists⚑
If you want all items from the second list that do not appear in the first list you can write:
x = [1,2,3,4]
f = [1,11,22,33,44,3,4]
result = set(f) - set(x)
Pad number with zeros⚑
number = 1
print(f"{number:02d}")
Get the modified time of a file with Pathlib⚑
file_ = Path('/to/some/file')
file_.stat().st_mtime
You can also access:
- Created time: with
st_ctime
- Accessed time: with
st_atime
They are timestamps, so if you want to compare it with a datetime object use the timestamp
method:
assert datetime.now().timestamp - file_.stat().st_mtime < 60
Show the date in the logging module traces⚑
To display the date and time of an event, you would place %(asctime)s
in your format string:
import logging
logging.basicConfig(format='%(asctime)s %(message)s')
logging.warning('is when this event was logged.')
Remove html url characters⚑
To transform an URL string into normal string, for example replacing %20
with space use:
>>> from urllib.parse import unquote
>>> print(unquote("%CE%B1%CE%BB%20"))
αλ
Read file with Pathlib⚑
file_ = Path('/to/some/file')
file_.read_text()
Get changed time of a file⚑
import os
os.path.getmtime(path)
Sort the returned paths of glob⚑
glob
order is arbitrary, but you can sort them yourself.
If you want sorted by name:
sorted(glob.glob('*.png'))
sorted by modification time:
import os
sorted(glob.glob('*.png'), key=os.path.getmtime)
sorted by size:
import os
sorted(glob.glob('*.png'), key=os.path.getsize)
Copy files from a python package⚑
pkgdir = sys.modules['<mypkg>'].__path__[0]
fullpath = os.path.join(pkgdir, <myfile>)
shutil.copy(fullpath, os.getcwd())
Substract two paths⚑
It can also framed to how to get the relative path between two absolute paths:
>>> from pathlib import Path
>>> p = Path('/home/lyz/')
>>> h = Path('/home/')
>>> p.relative_to(h)
PosixPath('lyz')
Move a file⚑
Use one of the following
import os
import shutil
os.rename("path/to/current/file.foo", "path/to/new/destination/for/file.foo")
os.replace("path/to/current/file.foo", "path/to/new/destination/for/file.foo")
shutil.move("path/to/current/file.foo", "path/to/new/destination/for/file.foo")
Print an exception⚑
Using the logging module⚑
Logging an exception can be done with the module-level function logging.exception()
like so:
import logging
try:
1 / 0
except BaseException:
logging.exception("An exception was thrown!")
ERROR:root:An exception was thrown!
Traceback (most recent call last):
File ".../Desktop/test.py", line 4, in <module>
1/0
ZeroDivisionError: division by zero
Notes
-
The function
logging.exception()
should only be called from an exception handler. -
The logging module should not be used inside a logging handler to avoid a
RecursionError
.
It's also possible to log the exception with another log level but still show the exception details by using the keyword argument exc_info=True
, like so:
logging.critical("An exception was thrown!", exc_info=True)
logging.error("An exception was thrown!", exc_info=True)
logging.warning("An exception was thrown!", exc_info=True)
logging.info("An exception was thrown!", exc_info=True)
logging.debug("An exception was thrown!", exc_info=True)
# or the general form
logging.log(level, "An exception was thrown!", exc_info=True)
With the traceback module⚑
Get the error string⚑
import traceback
def cause_error():
return 1 / 0 # This will raise a ZeroDivisionError
try:
cause_error()
except Exception as error:
# Capture the exception traceback as a string
error_message = "".join(traceback.format_exception(None, error, error.__traceback__))
print("An error occurred:\n", error_message)
Print directly the error⚑
The traceback
module provides methods for formatting and printing exceptions and their tracebacks, e.g. this would print exception like the default handler does:
import traceback
try:
1 / 0
except Exception:
traceback.print_exc()
Traceback (most recent call last):
File "C:\scripts\divide_by_zero.py", line 4, in <module>
1/0
ZeroDivisionError: division by zero
Get common elements of two lists⚑
>>> a = ['a', 'b']
>>> b = ['c', 'd', 'b']
>>> set(a) & set(b)
{'b'}
Get the difference of two lists⚑
If we want to substract the elements of one list from the other you can use:
for x in b:
if x in a:
a.remove(x)
Recursively find files⚑
Using pathlib.Path.rglob
⚑
from pathlib import Path
for path in Path("src").rglob("*.c"):
print(path.name)
If you don't want to use pathlib
, use can use glob.glob('**/*.c')
, but don't forget to pass in the recursive keyword parameter and it will use inordinate amount of time on large directories.
os.walk⚑
For older Python versions, use os.walk
to recursively walk a directory and fnmatch.filter
to match against a simple expression:
import fnmatch
import os
matches = []
for root, dirnames, filenames in os.walk("src"):
for filename in fnmatch.filter(filenames, "*.c"):
matches.append(os.path.join(root, filename))
Pad a string with spaces⚑
>>> name = 'John'
>>> name.ljust(15)
'John '
Get hostname of the machine⚑
Any of the next three options:
import os
os.uname()[1]
import platform
platform.node()
import socket
socket.gethostname()
Pathlib make parent directories if they don't exist⚑
pathlib.Path("/tmp/sub1/sub2").mkdir(parents=True, exist_ok=True)
From the docs:
-
If
parents
istrue
, any missing parents of this path are created as needed; they are created with the default permissions without taking mode into account (mimicking the POSIXmkdir -p
command). -
If
parents
isfalse
(the default), a missing parent raisesFileNotFoundError
. -
If
exist_ok
isfalse
(the default),FileExistsError
is raised if the target directory already exists. -
If
exist_ok
istrue
,FileExistsError
exceptions will be ignored (same behavior as the POSIXmkdir -p
command), but only if the last path component is not an existing non-directory file.
Pathlib touch a file⚑
Create a file at this given path.
pathlib.Path("/tmp/file.txt").touch(exist_ok=True)
If the file already exists, the function succeeds if exist_ok
is true
(and its modification time is updated to the current time), otherwise FileExistsError
is raised.
If the parent directory doesn't exist you need to create it first.
global_conf_path = xdg_home / "autoimport" / "config.toml"
global_conf_path.parent.mkdir(parents=True)
global_conf_path.touch(exist_ok=True)
Pad integer with zeros⚑
>>> length = 1
>>> print(f'length = {length:03}')
length = 001
Print datetime with a defined format⚑
now = datetime.now()
today.strftime("We are the %d, %b %Y")
Where the datetime format is a string built from these directives.
Print string with asciiart⚑
pip install pyfiglet
from pyfiglet import figlet_format
print(figlet_format("09 : 30"))
If you want to change the default width of 80 caracteres use:
from pyfiglet import Figlet
f = Figlet(font="standard", width=100)
print(f.renderText("aaaaaaaaaaaaaaaaa"))
Print specific time format⚑
datetime.datetime.now().strftime("%Y-%m-%dT%H:%M:%S")
Code | Meaning Example |
---|---|
%a | Weekday as locale’s abbreviated name. Mon |
%A | Weekday as locale’s full name. Monday |
%w | Weekday as a decimal number, where 0 is Sunday and 6 is Saturday. 1 |
%d | Day of the month as a zero-padded decimal number. 30 |
%-d | Day of the month as a decimal number. (Platform specific) 30 |
%b | Month as locale’s abbreviated name. Sep |
%B | Month as locale’s full name. September |
%m | Month as a zero-padded decimal number. 09 |
%-m | Month as a decimal number. (Platform specific) 9 |
%y | Year without century as a zero-padded decimal number. 13 |
%Y | Year with century as a decimal number. 2013 |
%H | Hour (24-hour clock) as a zero-padded decimal number. 07 |
%-H | Hour (24-hour clock) as a decimal number. (Platform specific) 7 |
%I | Hour (12-hour clock) as a zero-padded decimal number. 07 |
%-I | Hour (12-hour clock) as a decimal number. (Platform specific) 7 |
%p | Locale’s equivalent of either AM or PM. AM |
%M | Minute as a zero-padded decimal number. 06 |
%-M | Minute as a decimal number. (Platform specific) 6 |
%S | Second as a zero-padded decimal number. 05 |
%-S | Second as a decimal number. (Platform specific) 5 |
%f | Microsecond as a decimal number, zero-padded on the left. 000000 |
%z | UTC offset in the form +HHMM or -HHMM (empty string if the object is naive). |
%Z | Time zone name (empty string if the object is naive). |
%j | Day of the year as a zero-padded decimal number. 273 |
%-j | Day of the year as a decimal number. (Platform specific) 273 |
%U | Week number of the year (Sunday as the first day of the week) as a zero padded decimal number. All days in a new year preceding the first Sunday are considered to be in week 0. 39 |
%W | Week number of the year (Monday as the first day of the week) as a decimal number. All days in a new year preceding the first Monday are considered to be in week 0. |
%c | Locale’s appropriate date and time representation. Mon Sep 30 07:06:05 2013 |
%x | Locale’s appropriate date representation. 09/30/13 |
%X | Locale’s appropriate time representation. 07:06:05 |
%% | A literal '%' character. % |
Get an instance of an Enum by value⚑
If you want to initialize a pydantic model with an Enum
but all you have is the value of the Enum
then you need to create a method to get the correct Enum. Otherwise mypy
will complain that the type of the assignation is str
and not Enum
.
So if the model is the next one:
class ServiceStatus(BaseModel):
"""Model the docker status of a service."""
name: str
environment: Environment
You can't do ServiceStatus(name='test', environment='production')
. you need to add the get_by_value
method to the Enum
class:
class Environment(str, Enum):
"""Set the possible environments."""
STAGING = "staging"
PRODUCTION = "production"
@classmethod
def get_by_value(cls, value: str) -> Enum:
"""Return the Enum element that meets a value"""
return [member for member in cls if member.value == value][0]
Now you can do:
ServiceStatus(name="test", environment=Environment.get_by_value("production"))
Fix R1728: Consider using a generator⚑
Removing []
inside calls that can use containers or generators should be considered for performance reasons since a generator will have an upfront cost to pay. The performance will be better if you are working with long lists or sets.
Problematic code:
list([0 for y in list(range(10))]) # [consider-using-generator]
tuple([0 for y in list(range(10))]) # [consider-using-generator]
sum([y**2 for y in list(range(10))]) # [consider-using-generator]
max([y**2 for y in list(range(10))]) # [consider-using-generator]
min([y**2 for y in list(range(10))]) # [consider-using-generator]
Correct code:
list(0 for y in list(range(10)))
tuple(0 for y in list(range(10)))
sum(y**2 for y in list(range(10)))
max(y**2 for y in list(range(10)))
min(y**2 for y in list(range(10)))
Fix W1510: Using subprocess.run without explicitly set check is not recommended⚑
The run
call in the example will succeed whether the command is successful or not. This is a problem because we silently ignore errors.
import subprocess
def example():
proc = subprocess.run("ls")
return proc.stdout
When we pass check=True
, the behavior changes towards raising an exception when the return code of the command is non-zero.
Convert bytes to string⚑
byte_var.decode("utf-8")
Use pipes with subprocess⚑
To use pipes with subprocess you need to use the flag shell=True
which is a bad idea. Instead you should use two processes and link them together in python:
ps = subprocess.Popen(("ps", "-A"), stdout=subprocess.PIPE)
output = subprocess.check_output(("grep", "process_name"), stdin=ps.stdout)
ps.wait()
Pass input to the stdin of a subprocess⚑
import subprocess
p = subprocess.run(["myapp"], input="data_to_write", text=True)
Copy and paste from clipboard⚑
You can use many libraries to do it, but if you don't want to add any other dependencies you can use subprocess run
.
To copy from the selection
clipboard, assuming you've got xclip
installed, you could do:
subprocess.run(
["xclip", "-selection", "clipboard", "-i"],
input="text to be copied",
text=True,
check=True,
)
To paste it:
subprocess.check_output(["xclip", "-o", "-selection", "clipboard"]).decode("utf-8")
Create random number⚑
import random
a = random.randint(1, 10)
Check if local port is available or in use⚑
Create a temporary socket and then try to bind to the port to see if it's available. Close the socket after validating that the port is available.
def port_in_use(port):
"""Test if a local port is used."""
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
with suppress(OSError):
sock.bind(("0.0.0.0", port))
return True
sock.close()
return False
Initialize a dataclass with kwargs⚑
If you care about accessing attributes by name, or if you can't distinguish between known and unknown arguments during initialisation, then your last resort without rewriting __init__
(which pretty much defeats the purpose of using dataclasses in the first place) is writing a @classmethod
:
from dataclasses import dataclass
from inspect import signature
@dataclass
class Container:
user_id: int
body: str
@classmethod
def from_kwargs(cls, **kwargs):
# fetch the constructor's signature
cls_fields = {field for field in signature(cls).parameters}
# split the kwargs into native ones and new ones
native_args, new_args = {}, {}
for key, value in kwargs.items():
if key in cls_fields:
native_args[key] = value
else:
new_args[key] = value
# use the native ones to create the class ...
ret = cls(**native_args)
# ... and add the new ones by hand
for new_key, new_value in new_args.items():
setattr(ret, new_key, new_value)
return ret
Usage:
params = {"user_id": 1, "body": "foo", "bar": "baz", "amount": 10}
Container(**params) # still doesn't work, raises a TypeError
c = Container.from_kwargs(**params)
print(c.bar) # prints: 'baz'
Replace a substring of a string⚑
txt = "I like bananas"
x = txt.replace("bananas", "apples")
Parse an RFC2822 date⚑
Interesting to test the accepted format of RSS dates.
>>> from email.utils import parsedate_to_datetime
>>> datestr = 'Sun, 09 Mar 1997 13:45:00 -0500'
>>> parsedate_to_datetime(datestr)
datetime.datetime(1997, 3, 9, 13, 45, tzinfo=datetime.timezone(datetime.timedelta(-1, 68400)))
Convert a datetime to RFC2822⚑
Interesting as it's the accepted format of RSS dates.
>>> import datetime
>>> from email import utils
>>> nowdt = datetime.datetime.now()
>>> utils.format_datetime(nowdt)
'Tue, 10 Feb 2020 10:06:53 -0000'
Encode url⚑
import urllib.parse
from pydantic import AnyHttpUrl
def _normalize_url(url: str) -> AnyHttpUrl:
"""Encode url to make it compatible with AnyHttpUrl."""
return typing.cast(
AnyHttpUrl,
urllib.parse.quote(url, ":/"),
)
The :/
is needed when you try to parse urls that have the protocol, otherwise https://www.
gets transformed into https%3A//www.
.
Fix SIM113 Use enumerate⚑
Use enumerate
to get a running number over an iterable.
# Bad
idx = 0
for el in iterable:
...
idx += 1
# Good
for idx, el in enumerate(iterable):
...
Define a property of a class⚑
If you're using Python 3.9 or above you can directly use the decorators:
class G:
@classmethod
@property
def __doc__(cls):
return f"A doc for {cls.__name__!r}"
If you're not, you can define the decorator classproperty
:
# N801: class name 'classproperty' should use CapWords convention, but it's a decorator.
# C0103: Class name "classproperty" doesn't conform to PascalCase naming style but it's
# a decorator.
class classproperty: # noqa: N801, C0103
"""Define a class property.
From Python 3.9 you can directly use the decorators directly.
class G:
@classmethod
@property
def __doc__(cls):
return f'A doc for {cls.__name__!r}'
"""
def __init__(self, function: Callable[..., Any]) -> None:
"""Initialize the decorator."""
self.function = function
# ANN401: Any not allowed in typings, but I don't know how to narrow the hints in
# this case.
def __get__(self, owner_self: Any, owner_cls: Any) -> Any: # noqa: ANN401
"""Return the desired value."""
return self.function(owner_self)
But you'll run into the W0143: Comparing against a callable, did you omit the parenthesis? (comparison-with-callable)
mypy error when using it to compare the result of the property with anything, as it doesn't detect it's a property instead of a method.
How to close a subprocess process⚑
subprocess.terminate()
How to extend a dictionary⚑
a.update(b)
How to Find Duplicates in a List in Python⚑
numbers = [1, 2, 3, 2, 5, 3, 3, 5, 6, 3, 4, 5, 7]
duplicates = [number for number in numbers if numbers.count(number) > 1]
unique_duplicates = list(set(duplicates))
# Returns: [2, 3, 5]
If you want to count the number of occurrences of each duplicate, you can use:
from collections import Counter
numbers = [1, 2, 3, 2, 5, 3, 3, 5, 6, 3, 4, 5, 7]
counts = dict(Counter(numbers))
duplicates = {key: value for key, value in counts.items() if value > 1}
# Returns: {2: 2, 3: 4, 5: 3}
To remove the duplicates use a combination of list
and set
:
unique = list(set(numbers))
# Returns: [1, 2, 3, 4, 5, 6, 7]
How to decompress a gz file⚑
import gzip
import shutil
with gzip.open("file.txt.gz", "rb") as f_in:
with open("file.txt", "wb") as f_out:
shutil.copyfileobj(f_in, f_out)
How to compress/decompress a tar file⚑
def compress(tar_file, members):
"""
Adds files (`members`) to a tar_file and compress it
"""
tar = tarfile.open(tar_file, mode="w:gz")
for member in members:
tar.add(member)
tar.close()
def decompress(tar_file, path, members=None):
"""
Extracts `tar_file` and puts the `members` to `path`.
If members is None, all members on `tar_file` will be extracted.
"""
tar = tarfile.open(tar_file, mode="r:gz")
if members is None:
members = tar.getmembers()
for member in members:
tar.extract(member, path=path)
tar.close()
Parse XML file with beautifulsoup⚑
You need both beautifulsoup4
and lxml
:
bs = BeautifulSoup(requests.get(url), "lxml")
Get a traceback from an exception⚑
import traceback
# `e` is an exception object that you get from somewhere
traceback_str = "".join(traceback.format_tb(e.__traceback__))
Change the logging level of a library⚑
For example to change the logging level of the library sh
use:
sh_logger = logging.getLogger("sh")
sh_logger.setLevel(logging.WARN)
Get all subdirectories of a directory⚑
[x[0] for x in os.walk(directory)]
Move a file⚑
import os
os.rename("path/to/current/file.foo", "path/to/new/destination/for/file.foo")
IPv4 regular expression⚑
regex = re.compile(
r"(?<![-\.\d])(?:0{0,2}?[0-9]\.|1\d?\d?\.|2[0-5]?[0-5]?\.){3}"
r'(?:0{0,2}?[0-9]|1\d?\d?|2[0-5]?[0-5]?)(?![\.\d])"^[0-9]{1,3}*$'
)
Remove the elements of a list from another⚑
>>> set([1,2,6,8]) - set([2,3,5,8])
set([1, 6])
Note, however, that sets do not preserve the order of elements, and cause any duplicated elements to be removed. The elements also need to be hashable. If these restrictions are tolerable, this may often be the simplest and highest performance option.
Copy a directory⚑
import shutil
shutil.copytree("bar", "foo")
Copy a file⚑
import shutil
shutil.copyfile(src_file, dest_file)
Capture the stdout of a function⚑
import io
from contextlib import redirect_stdout
f = io.StringIO()
with redirect_stdout(f):
do_something(my_object)
out = f.getvalue()
Make temporal directory⚑
import tempfile
dirpath = tempfile.mkdtemp()
Change the working directory of a test⚑
The following function-level fixture will change to the test case directory, run the test (yield
), then change back to the calling directory to avoid side-effects.
@pytest.fixture(name="change_test_dir")
def change_test_dir_(request: SubRequest) -> Any:
os.chdir(request.fspath.dirname)
yield
os.chdir(request.config.invocation_dir)
request
is a built-in pytest fixturefspath
is theLocalPath
to the test module being executeddirname
is the directory of the test modulerequest.config.invocationdir
is the folder from which pytest was executedrequest.config.rootdir
is the pytest root, doesn't change based on where you run pytest. Not used here, but could be useful.
Any processes that are kicked off by the test will use the test case folder as their working directory and copy their logs, outputs, etc. there, regardless of where the test suite was executed.
Remove a substring from the end of a string⚑
On Python 3.9 and newer you can use the removeprefix
and removesuffix
methods to remove an entire substring from either side of the string:
url = "abcdc.com"
url.removesuffix(".com") # Returns 'abcdc'
url.removeprefix("abcdc.") # Returns 'com'
On Python 3.8 and older you can use endswith
and slicing:
url = "abcdc.com"
if url.endswith(".com"):
url = url[:-4]
Or a regular expression:
import re
url = "abcdc.com"
url = re.sub("\.com$", "", url)
Make a flat list of lists with a list comprehension⚑
There is no nice way to do it :(. The best I've found is:
t = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
flat_list = [item for sublist in t for item in sublist]
Replace all characters of a string with another character⚑
mystring = "_" * len(mystring)
Locate element in list⚑
a = ["a", "b"]
index = a.index("b")
Transpose a list of lists⚑
>>> l=[[1,2,3],[4,5,6],[7,8,9]]
>>> [list(i) for i in zip(*l)]
... [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
Check the type of a list of strings⚑
def _is_list_of_lists(data: Any) -> bool:
"""Check if data is a list of strings."""
if data and isinstance(data, list):
return all(isinstance(elem, list) for elem in data)
else:
return False
Install default directories and files for a command line program⚑
I've been trying for a long time to configure setup.py
to run the required steps to configure the required directories and files when doing pip install
without success.
Finally, I decided that the program itself should create the data once the FileNotFoundError
exception is found. That way, you don't penalize the load time because if the file or directory exists, that code is not run.
Check if a dictionary is a subset of another⚑
If you have two dictionaries big = {'a': 1, 'b': 2, 'c':3}
and small = {'c': 3, 'a': 1}
, and want to check whether small
is a subset of big
, use the next snippet:
>>> small.items() <= big.items()
True
As the code is not very common or intuitive, I'd add a comment to explain what you're doing.
When to use isinstance
and when to use type
⚑
isinstance
takes into account inheritance, while type
doesn't. So if we have the next code:
class Shape:
pass
class Rectangle(Shape):
def __init__(self, length, width):
self.length = length
self.width = width
self.area = length * width
def get_area(self):
return self.length * self.width
class Square(Rectangle):
def __init__(self, length):
Rectangle.__init__(self, length, length)
And we want to check if an object a = Square(5)
is of type Rectangle
, we could not use isinstance
because it'll return True
as it's a subclass of Rectangle
:
>>> isinstance(a, Rectangle)
True
Instead, use a comparison with type
:
>>> type(a) == Rectangle
False
Find a static file of a python module⚑
Useful when you want to initialize a configuration file of a cli program when it's not present.
Imagine you have a setup.py
with the next contents:
setup(
name="pynbox",
packages=find_packages("src"),
package_dir={"": "src"},
package_data={"pynbox": ["py.typed", "assets/config.yaml"]},
Then you could import the data with:
import pkg_resources
file_path = (pkg_resources.resource_filename("pynbox", "assets/config.yaml"),)
Delete a file⚑
import os
os.remove("demofile.txt")
Measure elapsed time between lines of code⚑
import time
start = time.time()
print("hello")
end = time.time()
print(end - start)
Create combination of elements in groups of two⚑
Using the combinations function in Python's itertools module:
>>> list(itertools.combinations('ABC', 2))
[('A', 'B'), ('A', 'C'), ('B', 'C')]
If you want the permutations use itertools.permutations
.
Convert html to readable plaintext⚑
pip install html2text
import html2text
html = open("foobar.html").read()
print(html2text.html2text(html))
Parse a datetime ⚑
Parse a datetime from an epoch⚑
>>> import datetime
>>> datetime.datetime.fromtimestamp(1347517370).strftime('%c')
'2012-09-13 02:22:50'
Parse a datetime from a string⚑
from dateutil import parser
parser.parse("Aug 28 1999 12:00AM") # datetime.datetime(1999, 8, 28, 0, 0)
If you're using ambiguous date such as 03/05/24
and want the 3
to be the day and not the month use parser.parse("03/05/24", dayfirst=True)
.
If you don't want to use dateutil
use datetime
datetime.datetime.strptime("2013-W26", "%Y-W%W-%w")
Where the datetime format is a string built from the next directives:
Directive | Meaning | Example |
---|---|---|
%a | Abbreviated weekday name. | Sun, Mon, ... |
%A | Full weekday name. | Sunday, Monday, ... |
%w | Weekday as a decimal number. | 0, 1, ..., 6 |
%d | Day of the month as a zero-padded decimal. | 01, 02, ..., 31 |
%-d | Day of the month as a decimal number. | 1, 2, ..., 30 |
%b | Abbreviated month name. | Jan, Feb, ..., Dec |
%B | Full month name. | January, February, ... |
%m | Month as a zero-padded decimal number. | 01, 02, ..., 12 |
%-m | Month as a decimal number. | 1, 2, ..., 12 |
%y | Year without century as a zero-padded decimal number. | 00, 01, ..., 99 |
%-y | Year without century as a decimal number. | 0, 1, ..., 99 |
%Y | Year with century as a decimal number. | 2013, 2019 etc. |
%H | Hour (24-hour clock) as a zero-padded decimal number. | 00, 01, ..., 23 |
%-H | Hour (24-hour clock) as a decimal number. | 0, 1, ..., 23 |
%I | Hour (12-hour clock) as a zero-padded decimal number. | 01, 02, ..., 12 |
%-I | Hour (12-hour clock) as a decimal number. | 1, 2, ... 12 |
%p | Locale’s AM or PM. | AM, PM |
%M | Minute as a zero-padded decimal number. | 00, 01, ..., 59 |
%-M | Minute as a decimal number. | 0, 1, ..., 59 |
%S | Second as a zero-padded decimal number. | 00, 01, ..., 59 |
%-S | Second as a decimal number. | 0, 1, ..., 59 |
%f | Microsecond as a decimal number, zero-padded on the left. | 000000 - 999999 |
%z | UTC offset in the form +HHMM or -HHMM. | |
%Z | Time zone name. | |
%j | Day of the year as a zero-padded decimal number. | 001, 002, ..., 366 |
%-j | Day of the year as a decimal number. | 1, 2, ..., 366 |
%U | Week number of the year (Sunday as the first day of the week). | 00, 01, ..., 53 |
%W | Week number of the year (Monday as the first day of the week). | 00, 01, ..., 53 |
%c | Locale’s appropriate date and time representation. | Mon Sep 30 07:06:05 2013 |
%x | Locale’s appropriate date representation. | 09/30/13 |
%X | Locale’s appropriate time representation. | 07:06:05 |
%% | A literal '%' character. | % |
Install a python dependency from a git repository⚑
With pip you can:
pip install git+git://github.com/path/to/repository@master
If you want to hard code it in your setup.py
, you need to:
install_requires = [
"some-pkg @ git+ssh://git@github.com/someorgname/pkg-repo-name@v1.1#egg=some-pkg",
]
But Pypi won't allow you to upload the package, as it will give you an error:
HTTPError: 400 Bad Request from https://test.pypi.org/legacy/
Invalid value for requires_dist. Error: Can't have direct dependency: 'deepdiff @ git+git://github.com/lyz-code/deepdiff@master'
It looks like this is a conscious decision on the PyPI side. Basically, they don't want pip to reach out to URLs outside their site when installing from PyPI.
An ugly patch is to install the dependencies in a PostInstall
custom script in the setup.py
of your program:
from setuptools.command.install import install
from subprocess import getoutput
# ignore: cannot subclass install, has type Any. And what would you do?
class PostInstall(install): # type: ignore
"""Install direct dependency.
Pypi doesn't allow uploading packages with direct dependencies, so we need to
install them manually.
"""
def run(self) -> None:
"""Install dependencies."""
install.run(self)
print(getoutput("pip install git+git://github.com/lyz-code/deepdiff@master"))
setup(cmdclass={"install": PostInstall})
Warning: It may not work! Last time I used this solution, when I added the library on a setup.py
the direct dependencies weren't installed :S
Check or test directories and files⚑
def test_dir(directory):
from os.path import exists
from os import makedirs
if not exists(directory):
makedirs(directory)
def test_file(filepath, mode):
"""Check if a file exist and is accessible."""
def check_mode(os_mode, mode):
if os.path.isfile(filepath) and os.access(filepath, os_mode):
return
else:
raise IOError("Can't access the file with mode " + mode)
if mode is "r":
check_mode(os.R_OK, mode)
elif mode is "w":
check_mode(os.W_OK, mode)
elif mode is "a":
check_mode(os.R_OK, mode)
check_mode(os.W_OK, mode)
Remove the extension of a file⚑
os.path.splitext("/path/to/some/file.txt")[0]
Iterate over the files of a directory⚑
import os
directory = "/path/to/directory"
for entry in os.scandir(directory):
if (entry.path.endswith(".jpg") or entry.path.endswith(".png")) and entry.is_file():
print(entry.path)
Create directory⚑
With pathlib
:
if not dir_path.is_dir():
dir_path.mkdir(parents=True, exist_ok=True)
With os
:
if not os.path.exists(directory):
os.makedirs(directory)
Touch a file⚑
from pathlib import Path
Path("path/to/file.txt").touch()
Get the first day of next month⚑
current = datetime.datetime(mydate.year, mydate.month, 1)
next_month = datetime.datetime(
mydate.year + int(mydate.month / 12), ((mydate.month % 12) + 1), 1
)
Get the week number of a datetime⚑
datetime.datetime
has a isocalendar()
method, which returns a tuple containing the calendar week:
>>> import datetime
>>> datetime.datetime(2010, 6, 16).isocalendar()[1]
24
datetime.date.isocalendar()
is an instance-method returning a tuple containing year, weeknumber and weekday in respective order for the given date instance.
Get the monday of a week number⚑
A week number is not enough to generate a date; you need a day of the week as well. Add a default:
import datetime
d = "2013-W26"
r = datetime.datetime.strptime(d + "-1", "%Y-W%W-%w")
The -1
and -%w
pattern tells the parser to pick the Monday in that week.
Get the month name from a number⚑
import calendar
>> calendar.month_name[3]
'March'
Get ordinal from number⚑
def int_to_ordinal(number: int) -> str:
"""Convert an integer into its ordinal representation.
make_ordinal(0) => '0th'
make_ordinal(3) => '3rd'
make_ordinal(122) => '122nd'
make_ordinal(213) => '213th'
Args:
number: Number to convert
Returns:
ordinal representation of the number
"""
suffix = ["th", "st", "nd", "rd", "th"][min(number % 10, 4)]
if 11 <= (number % 100) <= 13:
suffix = "th"
return f"{number}{suffix}"
Group or sort a list of dictionaries or objects by a specific key⚑
Python lists have a built-in list.sort()
method that modifies the list in-place. There is also a sorted()
built-in function that builds a new sorted list from an iterable.
Sorting basics⚑
A simple ascending sort is very easy: just call the sorted()
function. It returns a new sorted list:
>>> sorted([5, 2, 3, 1, 4])
[1, 2, 3, 4, 5]
Key functions⚑
Both list.sort()
and sorted()
have a key
parameter to specify a function (or other callable) to be called on each list element prior to making comparisons.
For example, here’s a case-insensitive string comparison:
>>> sorted("This is a test string from Andrew".split(), key=str.lower)
['a', 'Andrew', 'from', 'is', 'string', 'test', 'This']
The value of the key
parameter should be a function (or other callable) that takes a single argument and returns a key to use for sorting purposes. This technique is fast because the key function is called exactly once for each input record.
A common pattern is to sort complex objects using some of the object’s indices as keys. For example:
>>> from operator import itemgetter
>>> student_tuples = [
('john', 'A', 15),
('jane', 'B', 12),
('dave', 'B', 10),
]
>>> sorted(student_tuples, key=itemgetter(2)) # sort by age
[('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]
The same technique works for objects with named attributes. For example:
>>> from operator import attrgetter
>>> class Student:
def __init__(self, name, grade, age):
self.name = name
self.grade = grade
self.age = age
def __repr__(self):
return repr((self.name, self.grade, self.age))
>>> student_objects = [
Student('john', 'A', 15),
Student('jane', 'B', 12),
Student('dave', 'B', 10),
]
>>> sorted(student_objects, key=attrgetter('age')) # sort by age
[('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]
The operator module functions allow multiple levels of sorting. For example, to sort by grade then by age:
>>> sorted(student_tuples, key=itemgetter(1,2))
[('john', 'A', 15), ('dave', 'B', 10), ('jane', 'B', 12)]
>>> sorted(student_objects, key=attrgetter('grade', 'age'))
[('john', 'A', 15), ('dave', 'B', 10), ('jane', 'B', 12)]
Sorts stability and complex sorts⚑
Sorts are guaranteed to be stable. That means that when multiple records have the same key, their original order is preserved.
>>> data = [('red', 1), ('blue', 1), ('red', 2), ('blue', 2)]
>>> sorted(data, key=itemgetter(0))
[('blue', 1), ('blue', 2), ('red', 1), ('red', 2)]
Notice how the two records for blue retain their original order so that ('blue', 1)
is guaranteed to precede ('blue', 2)
.
This wonderful property lets you build complex sorts in a series of sorting steps. For example, to sort the student data by descending grade and then ascending age, do the age sort first and then sort again using grade:
>>> s = sorted(student_objects, key=attrgetter('age')) # sort on secondary key
>>> sorted(s, key=attrgetter('grade'), reverse=True) # now sort on primary key, descending
[('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]
This can be abstracted out into a wrapper function that can take a list and tuples of field and order to sort them on multiple passes.
>>> def multisort(xs, specs):
for key, reverse in reversed(specs):
xs.sort(key=attrgetter(key), reverse=reverse)
return xs
>>> multisort(list(student_objects), (('grade', True), ('age', False)))
[('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]
Get the attribute of an attribute⚑
To sort the list in place:
ut.sort(key=lambda x: x.count, reverse=True)
To return a new list, use the sorted()
built-in function:
newlist = sorted(ut, key=lambda x: x.body.id_, reverse=True)
Iterate over an instance object's data attributes in Python⚑
@dataclass(frozen=True)
class Search:
center: str
distance: str
se = Search("a", "b")
for key, value in se.__dict__.items():
print(key, value)
Generate ssh key⚑
pip install cryptography
from os import chmod
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric import rsa
from cryptography.hazmat.backends import default_backend as crypto_default_backend
private_key = rsa.generate_private_key(
backend=crypto_default_backend(), public_exponent=65537, key_size=4096
)
pem = private_key.private_bytes(
encoding=serialization.Encoding.PEM,
format=serialization.PrivateFormat.TraditionalOpenSSL,
encryption_algorithm=serialization.NoEncryption(),
)
with open("/tmp/private.key", "wb") as content_file:
chmod("/tmp/private.key", 0600)
content_file.write(pem)
public_key = (
private_key.public_key().public_bytes(
encoding=serialization.Encoding.OpenSSH,
format=serialization.PublicFormat.OpenSSH,
)
+ b" user@email.org"
)
with open("/tmp/public.key", "wb") as content_file:
content_file.write(public_key)
Make multiline code look clean⚑
If you need variables that contain multiline strings inside functions or methods you need to remove the indentation
def test():
# end first line with \ to avoid the empty line!
s = """\
hello
world
"""
Which is inconvenient as it breaks some editor source code folding and it's ugly for the eye.
The solution is to use textwrap.dedent()
import textwrap
def test():
# end first line with \ to avoid the empty line!
s = """\
hello
world
"""
print(repr(s)) # prints ' hello\n world\n '
print(repr(textwrap.dedent(s))) # prints 'hello\n world\n'
If you forget to add the trailing \
character of s = '''\
or use s = '''hello
, you're going to have a bad time with black.
Play a sound⚑
pip install playsound
from playsound import playsound
playsound("path/to/file.wav")
Deep copy a dictionary⚑
import copy
d = {...}
d2 = copy.deepcopy(d)
Find the root directory of a package⚑
pyprojroot
finds the root working directory for your project as a pathlib
object. You can now use the here function to pass in a relative path from the project root directory (no matter what working directory you are in the project), and you will get a full path to the specified file.
Installation⚑
pip install pyprojroot
Usage⚑
from pyprojroot import here
here()
Check if an object has an attribute⚑
if hasattr(a, "property"):
a.property
Check if a loop ends completely⚑
for
loops can take an else
block which is not run if the loop has ended with a break
statement.
for i in [1, 2, 3]:
print(i)
if i == 3:
break
else:
print("for loop was not broken")
Merge two lists⚑
z = x + y
Merge two dictionaries⚑
z = {**x, **y}
Create user defined exceptions⚑
Programs may name their own exceptions by creating a new exception class. Exceptions should typically be derived from the Exception
class, either directly or indirectly.
Exception classes are meant to be kept simple, only offering a number of attributes that allow information about the error to be extracted by handlers for the exception. When creating a module that can raise several distinct errors, a common practice is to create a base class for exceptions defined by that module, and subclass that to create specific exception classes for different error conditions:
class Error(Exception):
"""Base class for exceptions in this module."""
class ConceptNotFoundError(Error):
"""Transactions with unmatched concept."""
def __init__(self, message: str, transactions: List[Transaction]) -> None:
"""Initialize the exception."""
self.message = message
self.transactions = transactions
super().__init__(self.message)
Most exceptions are defined with names that end in “Error”, similar to the naming of the standard exceptions.
Import a module or it's objects from within a python program⚑
import importlib
module = importlib.import_module("os")
module_class = module.getcwd
relative_module = importlib.import_module(".model", package="mypackage")
class_to_extract = "MyModel"
extracted_class = geattr(relative_module, class_to_extract)
The first argument specifies what module to import in absolute or relative terms (e.g. either pkg.mod
or ..mod
). If the name is specified in relative terms, then the package argument must be set to the name of the package which is to act as the anchor for resolving the package name (e.g. import_module('..mod', 'pkg.subpkg')
will import pkg.mod
).
Get system's timezone and use it in datetime⚑
To obtain timezone information in the form of a datetime.tzinfo
object, use dateutil.tz.tzlocal()
:
from dateutil import tz
myTimeZone = tz.tzlocal()
This object can be used in the tz
parameter of datetime.datetime.now()
:
from datetime import datetime
from dateutil import tz
localisedDatetime = datetime.now(tz=tz.tzlocal())
Capitalize a sentence⚑
To change the caps of the first letter of the first word of a sentence use:
>> sentence = "add funny Emojis"
>> sentence[0].upper() + sentence[1:]
Add funny Emojis
The .capitalize
method transforms the rest of words to lowercase. The .title
transforms all sentence words to capitalize.
Get the last monday datetime⚑
import datetime
today = datetime.date.today()
last_monday = today - datetime.timedelta(days=today.weekday())
Issues⚑
- Pypi won't allow you to upload packages with direct dependencies: update the section above.