August of 2024
Life Management⚑
Time management⚑
Org Mode⚑
-
New: Change the default org-todo-keywords.
orig = '''* NEW_TODO_STATE First entry * NEW_DONE_STATE Second entry''' doc = loads(orig, environment={ 'org-todo-keywords': "NEW_TODO_STATE | NEW_DONE_STATE" })
Life Chores Management⚑
Grocy⚑
-
New: Add API and python library docs.
There is no active python library, although it existed pygrocy
Knowledge Management⚑
-
New: Use ebops to create anki cards.
- Ask the AI to generate Anki cards based on the content.
- Save those anki cards in an orgmode (
anki.org
) document - Use
ebops add-anki-notes
to automatically add them to Anki
Anki⚑
Aleph⚑
-
New: API Usage.
The Aleph web interface is powered by a Flask HTTP API. Aleph supports an extensive API for searching documents and entities. It can also be used to retrieve raw metadata, source documents and other useful details. Aleph's API tries to follow a pragmatic approach based on the following principles:
- All API calls are prefixed with an API version; this version is /api/2/.
- Responses and requests are both encoded as JSON. Requests should have the Content-Type and Accept headers set to application/json.
- The application uses Representational State Transfer (REST) principles where convenient, but also has some procedural API calls.
- The API allows API Authorization via an API key or JSON Web Tokens.
Authentication and authorization
By default, any Aleph search will return only public documents in responses to API requests.
If you want to access documents which are not marked public, you will need to sign into the tool. This can be done through the use on an API key. The API key for any account can be found by clicking on the "Profile" menu item in the navigation menu.
The API key must be sent on all queries using the Authorization HTTP header:
Authorization: ApiKey 363af1e2b03b41c6b3adc604956e2f66
Alternatively, the API key can also be sent as a query parameter under the api_key key.
Similarly, a JWT can be sent in the Authorization header, after it has been returned by the login and/or OAuth processes. Aleph does not use session cookies or any other type of stateful API.
-
New: Crossreferencing mentions with entities.
Mentions are names of people or companies that Aleph automatically extracts from files you upload. Aleph includes mentions when cross-referencing a collection, but only in one direction.
Consider the following example:
- "Collection A" contains a file. The file mentions "John Doe".
- "Collection B" contains a Person entity named "John Doe".
If you cross-reference “Collection A”, Aleph includes the mention of “John Doe” in the cross-referencing and will find a match for it in “Collection B”.
However, if you cross-reference “Collection B”, Aleph doesn't consider mentions when trying to find a match for the Person entity.
As long as you only want to compare the mentions in one specific collection against entities (but not mentions) in another collection, Aleph’s cross-ref should be able to do that. If you want to compare entities in a specific collection against other entities and mentions in other collections, you will have to do that yourself.
If you have a limited number of collection, one option might be to fetch all mentions and automatically create entities for each mention using the API.
To fetch a list of mentions for a collection you can use the
/api/2/entities?filter:collection_id=137&filter:schemata=Mention
API request.
Coding⚑
Languages⚑
Configure Docker to host the application⚑
-
New: Minify the images.
dive and slim are two cli tools you can use to optimise the size of your dockers.
Python Snippets⚑
-
New: Send a linux desktop notification.
To show a Linux desktop notification from a Python script, you can use the
notify2
library (although it's last commit was done on 2017. This library provides an easy way to send desktop notifications on Linux.Alternatively, you can use the
subprocess
module to call thenotify-send
command-line utility directly. This is a more straightforward method but requiresnotify-send
to be installed.import subprocess def send_notification(title: str, message: str = "", urgency: str = "normal") -> None: """Send a desktop notification using notify-send. Args: title (str): The title of the notification. message (str): The message body of the notification. Defaults to an empty string. urgency (str): The urgency level of the notification. Can be 'low', 'normal', or 'critical'. Defaults to 'normal'. """ subprocess.run(["notify-send", "-u", urgency, title, message])
-
import traceback def cause_error(): return 1 / 0 # This will raise a ZeroDivisionError try: cause_error() except Exception as error: # Capture the exception traceback as a string error_message = "".join(traceback.format_exception(None, error, error.__traceback__)) print("An error occurred:\n", error_message)
questionary⚑
-
New: Unit testing questionary code.
Testing
questionary
code can be challenging because it involves interactive prompts that expect user input. However, there are ways to automate the testing process. You can use libraries likepexpect
,pytest
, andpytest-mock
to simulate user input and test the behavior of your code.Here’s how you can approach testing
questionary
code usingpytest-mock
to mockquestionary
functionsYou can mock
questionary
functions likequestionary.select().ask()
to simulate user choices without actual user interaction.Testing a single
questionary.text
promptLet's assume you have a function that asks the user for their name:
import questionary def ask_name() -> str: name = questionary.text("What's your name?").ask() return name
You can test this function by mocking the
questionary.text
prompt to simulate the user's input.import pytest from your_module import ask_name def test_ask_name(mocker): # Mock the text function to simulate user input mock_text = mocker.patch('questionary.text') # Define the response for the prompt mock_text.return_value.ask.return_value = "Alice" result = ask_name() assert result == "Alice"
Test a function that has many questions
Here’s an example of how to test a function that contains two
questionary.text
prompts usingpytest-mock
.Let's assume you have a function that asks for the first and last names of a user:
import questionary def ask_full_name() -> dict: first_name = questionary.text("What's your first name?").ask() last_name = questionary.text("What's your last name?").ask() return {"first_name": first_name, "last_name": last_name}
You can mock both
questionary.text
calls to simulate user input for both the first and last names:import pytest from your_module import ask_full_name def test_ask_full_name(mocker): # Mock the text function for the first name prompt mock_text_first = mocker.patch('questionary.text') # Define the response for the first name prompt mock_text_first.side_effect = ["Alice", "Smith"] result = ask_full_name() assert result == {"first_name": "Alice", "last_name": "Smith"}
Coding tools⚑
Coding with AI⚑
-
New: Add new prompts for developers.
- trigger: :polish form: | Polish the next code [[code]] with the next conditions: - Use type hints on all functions and methods - Add or update the docstring using google style on all functions and methods form_fields: code: multiline: true - trigger: :commit form: | Act as an expert developer. Create a message commit with the next conditions: - follow semantic versioning - create a semantic version comment per change - include all comments in a raw code block so that it's easy to copy for the following diff [[text]] form_fields: text: multiline: true
-
Correction: Update the ai prompts.
```yaml matches: - trigger: :function form: | Create a function with: - type hints - docstrings for all classes, functions and methods - docstring using google style with line length less than 89 characters - adding logging traces using the log variable log = logging.getLogger(name) - Use fstrings instead of %s - If you need to open or write a file always set the encoding to utf8 - If possible add an example in the docstring - Just give the code, don't explain anything
Called [[name]] that: [[text]] form_fields: text: multiline: true
-
trigger: :class form: | Create a class with:
- type hints
- docstring using google style with line length less than 89 characters
- use docstrings on the class and each methods
- adding logging traces using the log variable log = logging.getLogger(name)
- Use fstrings instead of %s
- If you need to open or write a file always set the encoding to utf8
- If possible add an example in the docstring
- Just give the code, don't explain anything
Called [[name]] that: [[text]] form_fields: text: - trigger: :class form: | ... - Use paragraphs to separate the AAA blocks and don't add comments like # Arrange or # Act or # Act/Assert or # Assert. So the test will only have black lines between sections - In the Act section if the function to test returns a value always name that variable result. If the function to test doesn't return any value append an # act comment at the end of the line. - If the test uses a pytest.raises there is no need to add the # act comment - Don't use mocks - Use fstrings instead of %s - Gather all tests over the same function on a common class - If you need to open or write a file always set the encoding to utf8 - Just give the code, don't explain anything
form_fields: text: - trigger: :polish form: | ... - Add or update the docstring using google style on all classes, functions and methods - Wrap the docstring lines so they are smaller than 89 characters - All docstrings must start in the same line as the """ - Add logging traces using the log variable log = logging.getLogger(name) - Use f-strings instead of %s - Just give the code, don't explain anything form_fields: code: multiline: true - trigger: :text form: | Polish the next text by:
- Summarising each section without losing relevant data
- Tweak the markdown format
- Improve the wording
[[text]] form_fields: text: multiline: true
-
trigger: :readme form: | Create the README.md taking into account:
- Use GPLv3 for the license
- Add Lyz as the author
- Add an installation section
- Add an usage section
of: [[text]]
form_fields: text: multiline: true ``` feat(aleph#Get all documents of a collection): Get all documents of a collection
list_aleph_collection_documents.py
is a Python script designed to interact with an API to retrieve and analyze documents from specified collections. It offers a command-line interface (CLI) to list and check documents within a specified collection.Features
- Retrieve documents from a specified collection.
- Analyze document processing statuses and warn if any are not marked as successful.
- Return a list of filenames from the retrieved documents.
- Supports verbose output for detailed logging.
- Environment variable support for API key management.
Installation
To install the required dependencies, use
pip
:pip install typer requests
Ensure you have Python 3.6 or higher installed.
Create the file
list_aleph_collection_documents.py
with the next contents:import logging import requests from typing import List, Dict, Any, Optional import logging import typer from typing import List, Dict, Any log = logging.getLogger(__name__) app = typer.Typer() @app.command() def get_documents( collection_name: str = typer.Argument(...), api_key: Optional[str] = typer.Option(None, envvar="API_KEY"), base_url: str = typer.Option("https://your.aleph.org"), verbose: bool = typer.Option( False, "--verbose", "-v", help="Enable verbose output" ), ): """CLI command to retrieve documents from a specified collection.""" if verbose: logging.basicConfig(level=logging.DEBUG) log.debug("Verbose mode enabled.") else: logging.basicConfig(level=logging.INFO) if api_key is None: log.error( "Please specify your api key either through the --api-key argument " "or through the API_KEY environment variable" ) raise typer.Exit(code=1) try: documents = list_collection_documents(api_key, base_url, collection_name) filenames = check_documents(documents) if filenames: print("\n".join(filenames)) else: log.warning("No documents found.") except Exception as e: log.error(f"Failed to retrieve documents: {e}") raise typer.Exit(code=1) def list_collection_documents( api_key: str, base_url: str, collection_name: str ) -> List[Dict[str, Any]]: """ Retrieve documents from a specified collection using pagination. Args: api_key (str): The API key for authentication. base_url (str): The base URL of the API. collection_name (str): The name of the collection to retrieve documents from. Returns: List[Dict[str, Any]]: A list of documents from the specified collection. Example: >>> docs = list_collection_documents("your_api_key", "https://api.example.com", "my_collection") >>> print(len(docs)) 1000 """ headers = { "Authorization": f"ApiKey {api_key}", "Accept": "application/json", "Content-Type": "application/json", } collections_url = f"{base_url}/api/2/collections" documents_url = f"{base_url}/api/2/entities" log.debug(f"Requesting collections list from {collections_url}") collections = [] params = {"limit": 300} while True: response = requests.get(collections_url, headers=headers, params=params) response.raise_for_status() data = response.json() collections.extend(data["results"]) log.debug( f"Fetched {len(data['results'])} collections, " f"page {data['page']} of {data['pages']}" ) if not data["next"]: break params["offset"] = params.get("offset", 0) + data["limit"] collection_id = next( (c["id"] for c in collections if c["label"] == collection_name), None ) if not collection_id: log.error(f"Collection {collection_name} not found.") return [] log.info(f"Found collection '{collection_name}' with ID {collection_id}") documents = [] params = { "q": "", "filter:collection_id": collection_id, "filter:schemata": "Document", "limit": 300, } while True: log.debug(f"Requesting documents from collection {collection_id}") response = requests.get(documents_url, headers=headers, params=params) response.raise_for_status() data = response.json() documents.extend(data["results"]) log.info( f"Fetched {len(data['results'])} documents, " f"page {data['page']} of {data['pages']}" ) if not data["next"]: break params["offset"] = params.get("offset", 0) + data["limit"] log.info(f"Retrieved {len(documents)} documents from collection {collection_name}") return documents def check_documents(documents: List[Dict[str, Any]]) -> List[str]: """Analyze the processing status of documents and return a list of filenames. Args: documents (List[Dict[str, Any]]): A list of documents in JSON format. Returns: List[str]: A list of filenames from documents with a successful processing status. Raises: None, but logs warnings if a document's processing status is not 'success'. Example: >>> docs = [{"properties": {"processingStatus": ["success"], "fileName": ["file1.txt"]}}, >>> {"properties": {"processingStatus": ["failed"], "fileName": ["file2.txt"]}}] >>> filenames = check_documents(docs) >>> print(filenames) ['file1.txt'] """ filenames = [] for doc in documents: status = doc.get("properties", {}).get("processingStatus")[0] filename = doc.get("properties", {}).get("fileName")[0] if status != "success": log.warning( f"Document with filename {filename} has processing status: {status}" ) if filename: filenames.append(filename) log.debug(f"Collected filenames: {filenames}") return filenames if __name__ == "__main__": app()
Get your API key
By default, any Aleph search will return only public documents in responses to API requests.
If you want to access documents which are not marked public, you will need to sign into the tool. This can be done through the use on an API key. The API key for any account can be found by clicking on the "Settings" menu item in the navigation menu.
Usage
You can run the script directly from the command line. Below are examples of usage:
Retrieve and list documents from a collection:
python list_aleph_collection_documents.py --api-key "your-api-key" 'Name of your collection'
Using an Environment Variable for the API Key
This is better from a security perspective.
export API_KEY=your_api_key python list_aleph_collection_documents.py 'Name of your collection'
Enabling Verbose Logging
To enable detailed debug logs, use the
--verbose
or-v
flag:python list_aleph_collection_documents.py -v 'Name of your collection'
Getting help
python list_aleph_collection_documents.py --help
-
Espanso⚑
-
New: Desktop application to add words easily.
Going into the espanso config files to add words is cumbersome, to make things easier you can use the
espansadd
Python script.I'm going to assume that you have the following prerequisites:
- A Linux distribution with i3 window manager installed.
- Python 3 installed.
- Espanso installed and configured.
ruyaml
andtkinter
Python libraries installed.notify-send
installed.- Basic knowledge of editing configuration files in i3.
Installation
Create a new Python script named
espansadd.py
with the following content:import tkinter as tk from tkinter import simpledialog import traceback import subprocess import os import sys from ruyaml import YAML from ruyaml.scanner import ScannerError file_path = os.path.expanduser("~/.config/espanso/match/typofixer_overwrite.yml") def append_to_yaml(file_path: str, trigger: str, replace: str) -> None: """Appends a new entry to the YAML file. Args:ath file_path (str): The file to append the new entry. trigger (str): The trigger string to be added. replace (str): The replacement string to be added. """ # Define the new snippet new_entry = { "trigger": trigger, "replace": replace, "propagate_case": True, "word": True, } # Load existing data or initialize an empty list try: with open(os.path.expanduser(file_path), "r") as f: try: data = YAML().load(f) except ScannerError as e: send_notification( f"Error parsing yaml of configuration file {file_path}", f"{e.problem_mark}: {e.problem}", "critical", ) sys.exit(1) except FileNotFoundError: send_notification( f"Error opening the espanso file {file_path}", urgency="critical" ) sys.exit(1) data["matches"].append(new_entry) # Write the updated data back to the file with open(os.path.expanduser(file_path), "w+") as f: yaml = YAML() yaml.default_flow_style = False yaml.dump(data, f) def send_notification(title: str, message: str = "", urgency: str = "normal") -> None: """Send a desktop notification using notify-send. Args: title (str): The title of the notification. message (str): The message body of the notification. Defaults to an empty string. urgency (str): The urgency level of the notification. Can be 'low', 'normal', or 'critical'. Defaults to 'normal'. """ subprocess.run(["notify-send", "-u", urgency, title, message]) def main() -> None: """Main function to prompt user for input and append to the YAML file.""" # Create the main Tkinter window (it won't be shown) window = tk.Tk() window.withdraw() # Hide the main window # Prompt the user for input trigger = simpledialog.askstring("Espanso add input", "Enter trigger:") replace = simpledialog.askstring("Espanso add input", "Enter replace:") # Check if both inputs were provided try: if trigger and replace: append_to_yaml(file_path, trigger, replace) send_notification("Espanso snippet added successfully") else: send_notification( "Both trigger and replace are required", urgency="critical" ) except Exception as error: error_message = "".join( traceback.format_exception(None, error, error.__traceback__) ) send_notification( "There was an unknown error adding the espanso entry", error_message, urgency="critical", ) if __name__ == "__main__": main()
Ensure the script has executable permissions. Run the following command:
chmod +x espansadd.py
To make the
espansadd
script easily accessible, we can configure a key binding in i3 to run the script. Open your i3 configuration file, typically located at~/.config/i3/config
or~/.i3/config
, and add the following lines:bindsym $mod+Shift+e exec --no-startup-id /path/to/your/espansadd.py
Replace
/path/to/your/espansadd.py
with the actual path to your script.If you also want the popup windows to be in floating mode add
for_window [title="Espanso add input"] floating enable
After editing the configuration file, reload i3 to apply the changes. You can do this by pressing
Mod
+Shift
+R
(whereMod
is typically theSuper
orWindows
key) or by running the following command:i3-msg reload
Usage
Now that everything is set up, you can use the
espansadd
script by pressingMod
+Shift
+E
. This will open a dialog where you can enter the trigger and replacement text for the new Espanso snippet. After entering the information and pressing Enter, a notification will appear confirming the snippet has been added, or showing an error message if something went wrong.
OCR⚑
Camelot⚑
-
New: Introduce Camelot.
Camelot is a Python library that can help you extract tables from PDFs
import camelot tables = camelot.read_pdf('foo.pdf') tables <TableList n=1> tables.export('foo.csv', f='csv', compress=True) # json, excel, html, markdown, sqlite tables[0] <Table shape=(7, 7)> tables[0].parsing_report { 'accuracy': 99.02, 'whitespace': 12.24, 'order': 1, 'page': 1 } tables[0].to_csv('foo.csv') # to_json, to_excel, to_html, to_markdown, to_sqlite tables[0].df # get a pandas DataFrame!
To install Camelot from PyPI using pip, please include the extra cv requirement as shown:
$ pip install "camelot-py[base]"
It requires Ghostscript to be able to use the
lattice
mode. Which is better than usingtabular-py
that requires java to be installed.Usage
To detect line segments, Lattice needs the lines that make the table to be in the foreground. To process background lines, you can pass process_background=True.
tables = camelot.read_pdf('background_lines.pdf', process_background=True)
tables[1].df
References - Docs
DevOps⚑
Continuous Integration⚑
Bandit⚑
-
New: Solving warning B603: subprocess_without_shell_equals_true.
The
B603: subprocess_without_shell_equals_true
issue in Bandit is raised when thesubprocess
module is used without settingshell=True
. Bandit flags this because usingshell=True
can be a security risk if the command includes user-supplied input, as it opens the door to shell injection attacks.To fix it:
- Avoid
shell=True
if possible: Instead, pass the command and its arguments as a list tosubprocess.Popen
(orsubprocess.run
,subprocess.call
, etc.). This way, the command is executed directly without invoking the shell, reducing the risk of injection attacks.
Here's an example:
import subprocess # Instead of this: # subprocess.Popen("ls -l", shell=True) # Do this: subprocess.Popen(["ls", "-l"])
- When you must use
shell=True
: - If you absolutely need to useshell=True
(e.g., because you are running a complex shell command or using shell features like wildcards), ensure that the command is either hardcoded or sanitized to avoid security risks.
Example with
shell=True
:import subprocess # Command is hardcoded and safe command = "ls -l | grep py" subprocess.Popen(command, shell=True)
If the command includes user input, sanitize the input carefully:
import subprocess user_input = "some_directory" command = f"ls -l {subprocess.list2cmdline([user_input])}" subprocess.Popen(command, shell=True)
Note: Even with precautions, using
shell=True
is risky with user input, so avoid it if possible.- Explicitly tell bandit you have considered the risk: If you have reviewed the code and are confident that the code is safe in your particular case, you can mark the line with a
# nosec
comment to tell Bandit to ignore the issue:
import subprocess command = "ls -l | grep py" subprocess.Popen(command, shell=True) # nosec
- Avoid
Operating Systems⚑
Linux⚑
Linux Snippets⚑
-
Correction: Docker prune without removing the manual networks.
If you run the command
docker system prune
in conjunction with watchtower and manually defined networks you may run into the issue that the docker system prune acts just when the dockers are stopped and thus removing the networks, which will prevent the dockers to start. In those cases you can either make sure that docker system prune is never run when watchtower is doing the updates or you can split the command into the next script:date echo "Pruning the containers" docker container prune -f --filter "label!=prune=false" echo "Pruning the images" docker image prune -f --filter "label!=prune=false" echo "Pruning the volumes" docker volume prune -f
Wireguard⚑
-
New: Check the status of the tunnel.
One method is to do ping between VPN IP addresses or run command
wg show`` from the server or from the client. Below you can see
wg show`` command output where VPN is not up.$: wg show interface: wg0 public key: qZ7+xNeXCjKdRNM33Diohj2Y/KSOXwvFfgTS1LRx+EE= private key: (hidden) listening port: 45703 peer: mhLzGkqD1JujPjEfZ6gkbusf3sfFzy+1KXBwVNBRBHs= endpoint: 3.133.147.235:51820 allowed ips: 10.100.100.1/32 transfer: 0 B received, 592 B sent persistent keepalive: every 21 seconds
The below output from the
wg show
command indicates the VPN link is up. See the line withlast handshake time
$: wg show interface: wg0 public key: qZ7+xNeXCjKdRNM33Diohj2Y/KSOXwvFfgTS1LRx+EE= private key: (hidden) listening port: 49785 peer: 6lf4SymMbY+WboI4jEsM+P9DhogzebSULrkFowDTt0M= endpoint: 3.133.147.235:51820 allowed ips: 10.100.100.1/32 latest handshake: 14 seconds ago transfer: 732 B received, 820 B sent persistent keepalive: every 21 seconds