28th August 2025
Militancy⚑
Collaborating tools⚑
Aleph⚑
- New: Add warning for new potential users.
WARNING: Check out the investigative journalism article before using Aleph
- New: Compare investigative journalism tools.
After reviewing Aleph Pro, Open Aleph, DARC and Datashare I feel that the investigative reporting software environment, as of June 2025, is very brittle and in a deep crisis that will reach a breaking point in October 2025, when OCCRP will switch from Aleph to Aleph Pro.
Given this scenario, I think the best thing to do, if you already have an Aleph instance, is to keep using it until things stabilise. As you won't have software updates from October 2025 onwards, I suggest that from then on you protect the service behind a VPN and/or an SSL client certificate.
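For the SSL client certificate option, a minimal sketch with `openssl` (all the file names and certificate subjects here are illustrative placeholders; your reverse proxy then has to be configured to verify clients against `client-ca.crt`):

```shell
# Create a private CA (illustrative names; adjust key sizes and lifetimes).
openssl req -x509 -newkey rsa:2048 -nodes -days 3650 \
  -subj "/CN=aleph-client-ca" \
  -keyout client-ca.key -out client-ca.crt

# Create a key and a certificate signing request for one user.
openssl req -newkey rsa:2048 -nodes \
  -subj "/CN=aleph-user" \
  -keyout client.key -out client.csr

# Sign the user certificate with the private CA.
openssl x509 -req -in client.csr -CA client-ca.crt -CAkey client-ca.key \
  -CAcreateserial -days 365 -out client.crt
```

The resulting `client.crt`/`client.key` pair is what each user would install in their browser or HTTP client.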
I also feel it's a great moment to make a strategic decision on how you want to use an investigative reporting platform. Some key questions are:
- How much do you care about your data being leaked or lost by third parties?
- How much do you trust OCCRP, DARC or ICIJ?
- Do you want to switch from a self-hosted platform to an externally managed one? Will they be able to give a better service?
- If you want to stay on a self-hosted solution, shall you migrate to Datashare instead of Open Aleph?
- How dependent are you on open source software? How fragile are the teams that support that software? Can you help change that fragility?
- Shall you use AI in your investigative processes? If so, where and when?
I hope the analysis below may help shed some light on some of these questions. The only one that is not addressed is the AI one, as it's a more political, philosophical question that would make the document even longer.
Analysis of the present
Development dependent on the US government
The two main programs are developed by non-profits, Aleph by OCCRP and Datashare by ICIJ, which received part of their funding from the US government. This funding was lost after Naranjitler's administration's funding cuts:
- OCCRP lost 38% of their operational funds this year: "As a result, we had to lay off 40 people --- one fifth of our staff --- and have temporarily reduced some of the salaries of others. But there is more. OCCRP has also been funding a number of organizations across Europe, in some of the most difficult countries. Eighty percent of those sub-grants that we provide to other newsrooms have been cut as well."
- ICIJ lost 8.6% of their operational funds this year, with no apparent effects on the software or the staff.
OCCRP decided to close the source code of Aleph, triggering the team split
With OCCRP's decision to close the source of Aleph, an important part of their team (the co-leader of the research and data team, the Chief Data Editor and a developer) decided to leave the project and found DARC.
Although they look to be on good terms (they're collaborating), it could be OCCRP trying to save face on this dark turn.
These programs have very few developers and no community behind them
- Aleph looks to be currently developed by 2 developers, and its development has stagnated since the main developer (`pudo`) stopped working on it and moved to Open Sanctions 4 years ago. 3 key members of their team moved on to Open Aleph after the split. We can only guess whether this is the same team that is developing Aleph Pro. If it's not, then they are developers we know nothing about and can't audit.
- Open Aleph looks to be developed by 4 people, 3 of whom were part of the Aleph development team until the break-up. The other one created a company to host Aleph instances 4 years ago.
- Datashare seems to be developed by 6 developers.
In all projects, pull requests from the community have been very scarce.
The community support is not that great
My experience requesting features and proposing fixes with Aleph before the split is that they answer well on their Slack, but are slow with the issues and pull requests that fall outside their planned roadmap, even when they are bugs. I've been running a script on each ingest to fix a UI bug for a year already; I tried to work with them on solving it without success.
I don't have experience with Datashare, but they do answer and fix the issues that people open.
Analysis of the available software
Aleph Pro
The analysis is based on their announcement and their FAQ.
Pros
- As long as you have less than 1TB of data and are a nonprofit, it will, for now, cost you way less than hosting your own solution
- OCCRP is behind the project
Cons
- They seem to have an unstable, small and broken development team
- They only offer 1TB of data, which is enough for small to medium projects but doesn't give much room to grow
- They lost my trust

    There are several reasons that make me hesitant to trust them:

    - They don't want to publish their source code
    - They decided that the path to solve a complicated financial situation is to close their source code
    - They advocated in the past (and even now!) that being open source was a cornerstone of the project, and yet they close their code.
    - They hid that 52% of their funding came from the US government.

    With the following consequences:

    - I would personally not give them my data or host their software.
    - I wouldn't be surprised if in the future they retract their promises, such as offering Aleph Pro for free forever for nonprofit journalism organizations.
- You lose sovereignty over your data

    Whether you upload your data to their servers or host a closed-source program on yours, you have no way of knowing what they are doing with your data. Given their economic situation, doing business with the data could be an option.

    It could also be potentially difficult to extract your data in the future.
- You lose sovereignty over your service

    If they host the service, you depend on them for any operation such as maintenance, upgrades, and keeping the service up.

    You'll also no longer be able to change the software to apply patches to problems, and will depend on them for their implementation and application.

    You'll no longer have any easy way to know what the program does. This is critical from a security point of view, as introduced backdoors would go unnoticed. It's also worrying because we could not audit how they implement the AI. It is known that AI solutions tend to be biased and may thwart the investigative process.

    Finally, if they decide to shut down or change the conditions, you're stuck.
- It looks like they are selling smoke

    Their development activity has dropped in recent years, they have a weakened team, and yet they are promising a complete rewrite to create brand new software, in an announcement filled with buzzwords such as AI and without any solid evidence.

    I feel that the whole announcement is written to urge people to buy their product and to save face. It's not written for the community or their users; it's for those that can give them money.
- They promise significant performance upgrades and lower infrastructure costs while incorporating the latest developments in data analysis and AI

    Depending on how they materialise the new data-analysis and AI features, it will mean anywhere from a small to a great increase in infrastructure costs. Hosting these processes is very resource intensive and expensive.

    The decrease in infra costs may come from:

    - Hosting many Aleph instances under the same infrastructure is more efficient than each organisation having their own.
    - They might migrate the code to a more efficient language like Rust or Go.

    So even though Aleph Pro will require more resources, as all the instances are going to be hosted at OCCRP it may be cheaper overall.

    I'm not sure how they want to implement the AI; I see two potential places:

    - To improve the ingest process.
    - To use an LLM (like ChatGPT) to query the data.

    Both features are very, very, very expensive resource-wise. The only way to offer those features while lowering the infra costs is to outsource the AI services. If they do this, it will mean that your data will be processed by that third party, with all the awful consequences that entails.
- They are selling existing features as new, or features that are part of other open source projects

    Such as:

    - Rebuilt ingest pipeline: they recently released it in the latest versions of Aleph
    - Modular design: the source is already modular (although it can always be improved)
    - Enhanced data models for better linking, filtering, and insights: their model is based on followthemoney, which is open source.
- They are ditching part of their user base

    They only support self-hosted solutions for enterprise license clients. This leaves out small organisations and privacy-minded individuals. And even this solution is said to be maintained "in partnership with OCCRP".
- The new version's benefits may not be worth the costs

    They say that Aleph Pro will deliver "a faster, smarter, and more flexible platform, combining a modern technical foundation with user-centered design and major performance gains". But if you don't make heavy use of the service you may not need some of these improvements, although they would surely be nice to have.
- It could be unstable for a while

    A complete platform rewrite is usually good in the long run, but these kinds of migrations tend to have an unstable period where some of the functionality might be missing.
- You need to make the decision blindly

    Even though they say they will offer a beta if you request it (I'm not sure of this), you need to make the decision before doing the switch. You may not even like the new software.
Datashare
Pros
- ICIJ, a more reliable non profit, is behind it.
- Has the biggest and most stable development team
- Is the most active project
- Better community support
- You can host it
- It\'s open source
Cons
- If you have an Aleph instance there is no documented way to migrate to Datashare. And there is still no easy way to do the migration, as they don't yet use the `followthemoney` data schema.
- They won't host an instance for you.
Open Aleph
Pros
- You can host it
- It\'s open source
- The hosted solution will probably cost you way less than hosting your own (although they don't show prices)
- The people behind it have proven their ethical values
- I know one of their developers. She is a fantastic person who is very involved in putting technology at the service of society, and has been active at the CCC.
- They are actively reaching out to give support with the migration
Cons
- A new small organisation is behind the project
- A small development team with little recent activity. Since their creation (3 months ago) their development pace has been slow (the contributors page doesn't even load). It could be because they are still setting up the new organisation and doing the fork.
- It may not have all the features Aleph has. They started the fork in November 2024 and are 137 commits ahead and 510 behind. But those could be squashed commits.
- Their community forum doesn't have much activity
- The remote hosted solution has the same problems as Aleph Pro in terms of data and service sovereignty, although I trust DARC more than OCCRP.
Life navigation⚑
Time navigation⚑
Org Mode⚑
- New: Exporting to pdf or markdown.

To pdf

If you want to convert it with pandoc you first need to install the dependencies:

```bash
sudo apt install texlive-xetex
```

Then you can do:

```bash
pandoc input.org -o output.pdf --pdf-engine=xelatex -V geometry:margin=1in -V fontsize=11pt -V colorlinks=true
```

To markdown

```bash
pandoc input.org -o output.md
```
Content Management⚑
News Management⚑
- New: Add forensic architecture.

- Forensic Architecture: "Forensic Architecture (FA) is a research agency based at Goldsmiths, University of London. Our mandate is to develop, employ, and disseminate new techniques, methods, and concepts for investigating state and corporate violence. Our team includes architects, software developers, filmmakers, investigative journalists, scientists, and lawyers."
Torrent management⚑
qBittorrent⚑
Technology⚑
Coding⚑
Bash snippets⚑
- New: Add context switches column meanings.
- UID: The real user identification number of the task being monitored.
- USER: The name of the real user owning the task being monitored.
- PID: The identification number of the task being monitored.
- cswch/s: Total number of voluntary context switches the task made per second. A voluntary context switch occurs when a task blocks because it requires a resource that is unavailable.
- nvcswch/s: Total number of involuntary context switches the task made per second. An involuntary context switch takes place when a task executes for the duration of its time slice and is then forced to relinquish the processor.
- Command: The command name of the task.
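These columns match the output of `pidstat -w` (an assumption based on the column names). As a quick sketch, the same counters can also be read for any process straight from procfs:

```shell
# Read the voluntary/involuntary context switch counters of the current
# shell from /proc/<pid>/status (fields: voluntary_ctxt_switches and
# nonvoluntary_ctxt_switches).
grep ctxt_switches "/proc/$$/status"
```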
Configure Docker to host the application⚑
- New: Do a copy of a list of docker images in your private registry.

```bash
#!/bin/bash
set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
IMAGE_LIST_FILE="${1:-${SCRIPT_DIR}/bitnami-images.txt}"
TARGET_REGISTRY="${2:-}"

if [[ -z "$TARGET_REGISTRY" ]]; then
  echo "Usage: $0 <image_list_file> <target_registry>"
  echo "Example: $0 bitnami-images.txt your.docker.registry.org"
  exit 1
fi

if [[ ! -f "$IMAGE_LIST_FILE" ]]; then
  echo "Error: Image list file '$IMAGE_LIST_FILE' not found"
  exit 1
fi

log() {
  echo "[$(date '+%Y-%m-%d %H:%M:%S')] $*"
}

extract_image_name_and_tag() {
  local full_image="$1"
  # Remove registry prefix to get org/repo:tag
  # Examples:
  #   docker.io/bitnami/discourse:3.4.7 -> bitnami/discourse:3.4.7
  #   registry-1.docker.io/bitnami/os-shell:11-debian-11-r95 -> bitnami/os-shell:11-debian-11-r95
  if [[ "$full_image" =~ ^[^/]*\.[^/]+/ ]]; then
    # Contains registry with dot - remove everything up to first /
    echo "${full_image#*/}"
  else
    # No registry prefix
    echo "$full_image"
  fi
}

pull_and_push_multiarch() {
  local source_image="$1"
  local target_registry="$2"
  local image_name_with_tag
  image_name_with_tag=$(extract_image_name_and_tag "$source_image")
  local target_image="${target_registry}/${image_name_with_tag}"

  log "Processing: $source_image -> $target_image"

  local pushed_images=()
  local architectures=("linux/amd64" "linux/arm64")
  local arch_suffixes=("amd64" "arm64")

  # Try to pull and push each architecture
  for i in "${!architectures[@]}"; do
    local platform="${architectures[$i]}"
    local arch_suffix="${arch_suffixes[$i]}"

    log "Attempting to pull ${platform} image: $source_image"
    if sudo docker pull --platform "$platform" "$source_image" 2>/dev/null; then
      log "Successfully pulled ${platform} image"
      # Tag with architecture-specific tag for manifest creation
      local arch_specific_tag="${target_image}-${arch_suffix}"
      sudo docker tag "$source_image" "$arch_specific_tag"
      log "Pushing ${platform} image as ${arch_specific_tag}"
      if sudo docker push "$arch_specific_tag"; then
        log "Successfully pushed ${platform} image"
        pushed_images+=("$arch_specific_tag")
      else
        log "Failed to push ${platform} image"
        sudo docker rmi "$arch_specific_tag" 2>/dev/null || true
      fi
    else
      log "⚠️ ${platform} image not available for $source_image - skipping"
    fi
  done

  if [[ ${#pushed_images[@]} -eq 0 ]]; then
    log "❌ No images were successfully pushed for $source_image"
    return 1
  fi

  # Create the main tag with proper multi-arch manifest
  if [[ ${#pushed_images[@]} -gt 1 ]]; then
    log "Creating multi-arch manifest for $target_image"
    # Remove any existing manifest (in case of retry)
    sudo docker manifest rm "$target_image" 2>/dev/null || true
    if sudo docker manifest create "$target_image" "${pushed_images[@]}"; then
      # Annotate each architecture in the manifest
      for i in "${!pushed_images[@]}"; do
        local arch_tag="${pushed_images[$i]}"
        local arch="${arch_suffixes[$i]}"
        sudo docker manifest annotate "$target_image" "$arch_tag" --arch "$arch" --os linux
      done
      log "Pushing multi-arch manifest to $target_image"
      if sudo docker manifest push "$target_image"; then
        log "✅ Successfully pushed multi-arch image: $target_image"
      else
        log "❌ Failed to push manifest for $target_image"
        return 1
      fi
    else
      log "❌ Failed to create manifest for $target_image"
      return 1
    fi
  else
    # Only one architecture - tag and push directly
    log "Single architecture available, pushing as $target_image"
    sudo docker tag "${pushed_images[0]}" "$target_image"
    if sudo docker push "$target_image"; then
      log "✅ Successfully pushed single-arch image: $target_image"
    else
      log "❌ Failed to push $target_image"
      return 1
    fi
  fi

  # Clean up local images to save space
  sudo docker rmi "$source_image" "${pushed_images[@]}" 2>/dev/null || true
  if [[ ${#pushed_images[@]} -eq 1 ]]; then
    sudo docker rmi "$target_image" 2>/dev/null || true
  fi
  return 0
}

main() {
  log "Starting multi-architecture image pull and push"
  log "Source list: $IMAGE_LIST_FILE"
  log "Target registry: $TARGET_REGISTRY"

  # Enable experimental CLI features for manifest commands
  export DOCKER_CLI_EXPERIMENTAL=enabled

  local total_images
  total_images=$(wc -l <"$IMAGE_LIST_FILE")
  local processed_images=0
  local successful_images=0
  local failed_images=()

  while IFS= read -r image_line; do
    # Skip empty lines and comments
    [[ -z "$image_line" || "$image_line" =~ ^[[:space:]]*# ]] && continue
    # Remove leading/trailing whitespace
    image_line=$(echo "$image_line" | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')
    [[ -z "$image_line" ]] && continue

    processed_images=$((processed_images + 1))
    log "[$processed_images/$total_images] Processing: $image_line"

    if pull_and_push_multiarch "$image_line" "$TARGET_REGISTRY"; then
      successful_images=$((successful_images + 1))
      log "✓ Success: $image_line"
    else
      failed_images+=("$image_line")
      log "✗ Failed: $image_line"
    fi

    log "Progress: $processed_images/$total_images completed"
    echo "----------------------------------------"
  done <"$IMAGE_LIST_FILE"

  log "Final Summary:"
  log "Total images processed: $processed_images"
  log "Successful: $successful_images"
  log "Failed: ${#failed_images[@]}"

  if [[ ${#failed_images[@]} -gt 0 ]]; then
    log "Failed images:"
    printf '  %s\n' "${failed_images[@]}"
    exit 1
  fi
  log "🎉 All images processed successfully!"
}

main "$@"
```
- New: Migrate away from bitnami.

Bitnami is changing their pull policy, making it unfeasible to keep using their images (more info here: 1, 2, 3), so there is a need to migrate to other image providers.
Which alternative to use

The migration can be done to the officially maintained images (although this has some disadvantages) or to any of the common docker image builders:

- https://github.com/home-operations/containers/
- https://github.com/linuxserver
- https://github.com/11notes

There is an effort to build a fork of the bitnami images, but it doesn't have much inertia yet.

Regarding the helm chart alternatives, a quick look showed this one.
Infrastructure archeology

First you need to know which images you are using. To do that you can:

- Clone all git repositories of a series of organisations and do a local grep
- Search all the container images in use in kubernetes that match a desired string
- Recursively pull a copy of all helm charts used by an argocd repository and then do a grep.
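For the "local grep" step, a toy demonstration (the `repos` directory and the `bitnami/` search string are assumptions; point the grep at wherever you cloned the repositories):

```shell
# Fake a cloned repository containing a bitnami image reference.
mkdir -p repos/app
echo 'image: docker.io/bitnami/postgresql:16.1.0' > repos/app/values.yaml

# The actual search: recursively grep all yaml files for bitnami images.
grep -rn --include='*.y*ml' 'bitnami/' repos
```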
Create a local copy of the images
It's wise to make a copy of the used images in your local registry so you can still pull them once bitnami no longer lets you.
To do that you can save the used images in a `bitnami-images.txt` file and run the script from the "Do a copy of a list of docker images in your private registry" section above.
Replace a bitnami image with a local one

If you for some reason need to pull `bitnami/discourse:3.4.7` and get an error, you need to pull it instead from `{your_registry}/bitnami/discourse:3.4.7`. If you want to pull a specific architecture you can append it at the end (`{your_registry}/bitnami/discourse:3.4.7-amd64`).

If you need to do the changes in an argocd managed kubernetes application, search the `values.yaml` or `values-{environment}.yaml` files for the `image:` string. If it's not defined you may need to look at the helm chart definition. To do that open the `Chart.yaml` file to find the chart and the version used. For example:

```yaml
---
apiVersion: v2
name: discourse
version: 1.0.0
dependencies:
  - name: discourse
    version: 12.6.2
    repository: https://charts.bitnami.com/bitnami
```
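Once you locate the right key, a bitnami-style `image:` block in the values files usually looks like the sketch below (the registry is a placeholder and the exact keys depend on each chart):

```yaml
discourse:
  image:
    registry: your.docker.registry.org
    repository: bitnami/discourse
    tag: 3.4.7
```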
You can pull a local copy of the chart with:
- If the chart is using an `oci` url:

  ```bash
  helm pull oci://registry-1.docker.io/bitnamicharts/postgresql --version 8.10.X --untar -d postgres8
  ```

- If it's using an `https` url:

  ```bash
  helm pull cost-analyzer --repo https://kubecost.github.io/cost-analyzer/ --version 2.7.2
  ```

And inspect the `values.yaml` file and all the templates until you find which key value you need to add.

- New: Cannot invoke "jdk.internal.platform.CgroupInfo.getMountPoint()" because "anyController" is null.
It's caused by Docker not being able to access the cgroups, which can happen when Docker uses the legacy cgroups v1 while the Linux kernel (>6.12) uses v2.
The best way to fix it is to upgrade Docker so that it uses v2. If you can't, you need to force the system to use v1. To do that:
- Edit `/etc/default/grub` to add the configuration:

  ```
  GRUB_CMDLINE_LINUX="systemd.unified_cgroup_hierarchy=0"
  ```

- Then update GRUB:

  ```bash
  sudo update-grub
  ```
- Reboot
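Before and after the reboot you can check which cgroup version the system is actually using (a quick sketch: `cgroup2fs` means the unified v2 hierarchy, `tmpfs` means the legacy v1 one):

```shell
# Print the filesystem type mounted at /sys/fs/cgroup.
stat -fc %T /sys/fs/cgroup
```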
- New: Clone all git repositories of a series of organisations.

It assumes you have `tea` configured to interact with the desired gitea instance.

```bash
#!/bin/bash
set -e

ORGANIZATIONS=("ansible-playbooks" "ansible-roles")

clone_org_repos() {
  local page=1
  local has_more=true

  while [ "$has_more" = true ]; do
    echo "Fetching page $page..."
    local csv_output
    csv_output=$(tea repo ls --output csv --page "$page" 2>/dev/null || true)

    if [ -z "$csv_output" ] || [ "$csv_output" = '"owner","name","type","ssh"' ] || [ "$(echo "$csv_output" | wc -l)" -lt 3 ]; then
      echo "No more repositories found on page $page"
      has_more=false
      break
    fi

    local repo_count=0
    while IFS=',' read -r owner name type ssh_url; do
      # Skip the CSV header
      if [ "$owner" = '"owner"' ]; then
        continue
      fi
      owner=$(echo "$owner" | sed 's/"//g')
      name=$(echo "$name" | sed 's/"//g')
      ssh_url=$(echo "$ssh_url" | sed 's/"//g')

      if [[ -n "$name" ]] && [[ -n "$ssh_url" ]] && [[ "${ORGANIZATIONS[*]}" =~ $owner ]]; then
        echo "Cloning repository: $name"
        if [ ! -d "$owner/$name" ]; then
          git clone "$ssh_url" "$owner/$name" || {
            echo "Failed to clone $name, skipping..."
            continue
          }
        else
          echo "Repository $name already exists, skipping..."
        fi
        repo_count=$((repo_count + 1))
      fi
    done <<<"$csv_output"

    ((page++))
  done

  echo "Finished processing repositories"
  echo
}

main() {
  echo "Starting repository cloning process..."
  echo "Target organizations: ${ORGANIZATIONS[*]}"
  echo

  if ! command -v tea &>/dev/null; then
    echo "Error: 'tea' command not found. Please install gitea tea CLI."
    exit 1
  fi
  if ! command -v git &>/dev/null; then
    echo "Error: 'git' command not found. Please install git."
    exit 1
  fi

  for org in "${ORGANIZATIONS[@]}"; do
    if [ ! -d "$org" ]; then
      mkdir "$org"
    fi
  done

  clone_org_repos

  echo "Repository cloning process completed!"
  echo "Check the following directories:"
  for org in "${ORGANIZATIONS[@]}"; do
    if [ -d "$org" ]; then
      echo "  - $org/ ($(find "$org" -maxdepth 1 -type d | wc -l) repositories)"
    fi
  done
}

main "$@"
```
Python Snippets⚑
- New: Remove a directory with content.

```python
import shutil
from pathlib import Path

shutil.rmtree(Path('/path/to/directory'))
```
- New: Find files when you only want the files and directories of the first level.

```python
from pathlib import Path

path = Path("/your/directory")
for item in path.iterdir():
    if item.is_file():
        print(f"File: {item.name}")
    elif item.is_dir():
        print(f"Directory: {item.name}")
```
Helm⚑

- New: Download a chart.

If the chart is using an `oci` url:

```bash
helm pull oci://registry-1.docker.io/bitnamicharts/postgresql --version 8.10.X --untar -d postgres8
```

If it's using an `https` url:

```bash
helm pull cost-analyzer --repo https://kubecost.github.io/cost-analyzer/ --version 2.7.2
```
DevSecOps⚑
ArgoCD⚑
-
New: Recursively pull a copy of all helm charts used by an argocd repository.
Including the dependencies of the dependencies.
```python
import argparse
import logging
import subprocess
import sys
from pathlib import Path
from typing import Dict, List, Set

try:
    import yaml
except ImportError:
    sys.exit("PyYAML module not found. Install with: pip install PyYAML")


class HelmChartPuller:
    def __init__(self):
        self.pulled_charts: Set[str] = set()
        self.setup_logging()

    def setup_logging(self):
        logging.basicConfig(
            level=logging.INFO,
            format="[%(asctime)s] %(levelname)s: %(message)s",
            datefmt="%Y-%m-%d %H:%M:%S",
        )
        self.logger = logging.getLogger(__name__)

    def parse_chart_yaml(self, chart_file: Path) -> Dict:
        """Parse a Chart.yaml file and return its contents."""
        try:
            with open(chart_file, "r", encoding="utf-8") as f:
                return yaml.safe_load(f) or {}
        except Exception as e:
            self.logger.error(f"Failed to parse {chart_file}: {e}")
            return {}

    def get_dependencies(self, chart_data: Dict) -> List[Dict]:
        """Extract dependencies from chart data."""
        return chart_data.get("dependencies", [])

    def is_chart_pulled(self, name: str, version: str) -> bool:
        """Check if a chart has already been pulled."""
        return f"{name}-{version}" in self.pulled_charts

    def mark_chart_pulled(self, name: str, version: str):
        """Mark a chart as pulled to avoid duplicates."""
        self.pulled_charts.add(f"{name}-{version}")

    def pull_chart(self, name: str, version: str, repository: str) -> bool:
        """Pull a Helm chart using the appropriate method (OCI or traditional)."""
        if self.is_chart_pulled(name, version):
            self.logger.info(f"Chart {name}-{version} already pulled, skipping")
            return True

        self.logger.info(f"Pulling chart: {name} version {version} from {repository}")
        try:
            if repository.startswith("oci://"):
                oci_url = f"{repository}/{name}"
                cmd = ["helm", "pull", oci_url, "--version", version, "--untar"]
            else:
                cmd = [
                    "helm",
                    "pull",
                    name,
                    "--repo",
                    repository,
                    "--version",
                    version,
                    "--untar",
                ]

            subprocess.run(cmd, capture_output=True, text=True, check=True)
            self.logger.info(f"Successfully pulled chart: {name}-{version}")
            self.mark_chart_pulled(name, version)
            return True
        except subprocess.CalledProcessError as e:
            self.logger.error(f"Failed to pull chart {name}-{version}: {e.stderr}")
            return False

    def process_chart_dependencies(self, chart_file: Path):
        """Process dependencies from a Chart.yaml file recursively."""
        self.logger.info(f"Processing dependencies from: {chart_file}")
        chart_data = self.parse_chart_yaml(chart_file)
        if not chart_data:
            return

        dependencies = self.get_dependencies(chart_data)
        if not dependencies:
            self.logger.info(f"No dependencies found in {chart_file}")
            return

        for dep in dependencies:
            name = dep.get("name", "")
            version = dep.get("version", "")
            repository = dep.get("repository", "")

            if not all([name, version, repository]):
                self.logger.warning(f"Incomplete dependency in {chart_file}: {dep}")
                continue

            if self.pull_chart(name, version, repository):
                pulled_chart_dir = Path.cwd() / name
                if pulled_chart_dir.is_dir():
                    dep_chart_file = pulled_chart_dir / "Chart.yaml"
                    if dep_chart_file.is_file():
                        self.logger.info(
                            f"Found Chart.yaml in pulled dependency: {dep_chart_file}"
                        )
                        self.process_chart_dependencies(dep_chart_file)

    def find_chart_files(self, search_dir: Path) -> List[Path]:
        """Find all Chart.yaml files in the given directory."""
        self.logger.info(f"Searching for Chart.yaml files in: {search_dir}")
        return list(search_dir.rglob("Chart.yaml"))

    def check_dependencies(self):
        """Check that the helm binary is available."""
        try:
            subprocess.run(["helm", "version"], capture_output=True, check=True)
        except (subprocess.CalledProcessError, FileNotFoundError):
            self.logger.error("helm command not found. Please install Helm.")
            sys.exit(1)

    def run(self, target_dir: str):
        """Main execution method."""
        self.check_dependencies()
        target_path = Path(target_dir)
        if not target_path.is_dir():
            self.logger.error(f"Directory '{target_dir}' does not exist")
            sys.exit(1)

        self.logger.info(f"Starting to process Helm charts in: {target_path}")
        self.logger.info(f"Charts will be pulled to current directory: {Path.cwd()}")

        chart_files = self.find_chart_files(target_path)
        if not chart_files:
            self.logger.info("No Chart.yaml files found")
            return

        for chart_file in chart_files:
            self.logger.info(f"Found Chart.yaml: {chart_file}")
            self.process_chart_dependencies(chart_file)

        self.logger.info(
            f"Completed processing. Total unique charts pulled: {len(self.pulled_charts)}"
        )
        if self.pulled_charts:
            self.logger.info(f"Pulled charts: {', '.join(sorted(self.pulled_charts))}")


def main():
    parser = argparse.ArgumentParser(
        description="Recursively pull Helm charts and their dependencies from Chart.yaml files"
    )
    parser.add_argument("directory", help="Directory to search for Chart.yaml files")
    args = parser.parse_args()

    puller = HelmChartPuller()
    puller.run(args.directory)


if __name__ == "__main__":
    main()
```
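The OCI vs. classic-repository branch in `pull_chart` is the part most worth unit testing, and it can be isolated as a small pure helper. A sketch (`build_pull_cmd` is my name for it, not part of the script above):

```python
from typing import List


def build_pull_cmd(name: str, version: str, repository: str) -> List[str]:
    """Return the helm CLI invocation for a chart dependency.

    OCI registries (oci://...) take the full chart URL, while classic
    chart repositories pass the bare chart name plus --repo.
    """
    if repository.startswith("oci://"):
        return ["helm", "pull", f"{repository}/{name}", "--version", version, "--untar"]
    return ["helm", "pull", name, "--repo", repository, "--version", version, "--untar"]
```

Being a pure function, it can be tested without a network or a helm binary.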
-
New: Cordon all the arm64 nodes.

```bash
kubectl get nodes -l kubernetes.io/arch=arm64 -o jsonpath='{.items[*].metadata.name}' | xargs kubectl cordon
```
-
New: Search all the container images in use that match a desired string.
```bash
#!/bin/bash
set -e

log() {
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] $*" >&2
}

usage() {
    echo "Usage: $0"
    echo "Describes all pods in all namespaces and greps for images containing 'bitnami'"
    exit 1
}

check_dependencies() {
    if ! command -v kubectl >/dev/null 2>&1; then
        log "Error: kubectl command not found"
        exit 1
    fi

    # Test kubectl connectivity
    if ! kubectl cluster-info >/dev/null 2>&1; then
        log "Error: Cannot connect to Kubernetes cluster"
        exit 1
    fi
}

find_bitnami_images() {
    log "Getting all pods from all namespaces..."

    # Get all pods from all namespaces and describe them
    kubectl get pods --all-namespaces -o wide --no-headers |
        while read -r namespace name ready status restarts age ip node nominated readiness; do
            log "Describing pod: $namespace/$name"

            # Describe the pod and grep for bitnami images
            description=$(kubectl describe pod "$name" -n "$namespace" 2>/dev/null)

            # Look for image lines containing bitnami
            bitnami_images=$(echo "$description" | grep -i "image:" | grep -i "bitnami" || true)

            if [[ -n "$bitnami_images" ]]; then
                echo "========================================="
                echo "Pod: $namespace/$name"
                echo "Status: $status"
                echo "Bitnami Images Found:"
                echo "$bitnami_images"
                echo "========================================="
                echo
            fi
        done
}

main() {
    if [[ $# -ne 0 ]]; then
        usage
    fi

    check_dependencies
    log "Starting search for Bitnami images in all pods across all namespaces"
    find_bitnami_images
    log "Search completed"
}

main "$@"
```
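If you'd rather post-process the `kubectl describe` output in Python, the grep pipeline boils down to a small pure function. A sketch (the function name is mine):

```python
from typing import List


def find_matching_images(describe_output: str, needle: str = "bitnami") -> List[str]:
    """Return the 'Image:' lines whose value contains needle, case-insensitively."""
    matches = []
    for line in describe_output.splitlines():
        stripped = line.strip()
        if stripped.lower().startswith("image:") and needle.lower() in stripped.lower():
            matches.append(stripped)
    return matches
```

Changing `needle` lets you reuse it for any registry or image substring, which is the generalisation this snippet's title promises.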
-
New: Force the removal of a node from the cluster.
To force the removal of a node from a Kubernetes cluster, you have several options depending on your situation.

To prevent new pods from being scheduled while you prepare:

```bash
kubectl cordon <node-name>
```

1. Graceful Node Removal (Recommended)

First, try the standard approach:

```bash
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data
kubectl delete node <node-name>
```

2. Force Removal When Node is Unresponsive

If the node is unresponsive or the graceful removal fails:

```bash
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data --force --grace-period=0
kubectl delete node <node-name>
```

3. Immediate Forced Removal

For emergency situations where you need immediate removal:

```bash
kubectl delete node <node-name> --force --grace-period=0
```

Common Drain Options

- `--ignore-daemonsets`: Ignores DaemonSet pods (they'll be recreated anyway)
- `--delete-emptydir-data`: Deletes pods using emptyDir volumes
- `--force`: Forces deletion of pods not managed by controllers
- `--grace-period=0`: Immediately kills pods without waiting
- `--timeout=300s`: Sets the timeout for the drain operation
-
New: Raise alert when value is empty.
Using vector(0)

One way to solve it is to use the `vector(0)` operator with the `or on() vector(0)` operation:

```logql
(count_over_time({filename="/var/log/mail.log"} |= `Mail is sent` [24h]) or on() vector(0)) < 1
```
Using unless

If you're doing an aggregation over a label this approach won't work, because it will add a new time series with value 0. In those cases use a broader search that includes other logs and the `unless` operator:

```logql
(
  sum by(hostname) (count_over_time({job="systemd-journal"} [1h]))
  unless
  sum by(hostname) (count_over_time({service_name="watchtower"} [1d]))
) > 0
```
This will return a value > 0 for any hostname that has systemd-journal logs but no watchtower logs in the past day, which is perfect for alerting conditions.
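To turn the first query into an actual alert it can be wired into a Loki ruler rules file. A sketch; the group name, alert name, and labels below are illustrative:

```yaml
groups:
  - name: mail
    rules:
      - alert: MailNotSent
        expr: |
          (count_over_time({filename="/var/log/mail.log"} |= `Mail is sent` [24h]) or on() vector(0)) < 1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: No mail has been sent in the last 24 hours
```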
-
New: Upgrade postgres.
Dump your database

Dump your existing database with a command similar to:

```bash
docker compose exec postgresql pg_dump -U authentik -d authentik -cC > upgrade_backup_12.sql
```

Before continuing, ensure the SQL dump file `upgrade_backup_12.sql` includes all your database content.

Stop your application stack

Stop all services with:

```bash
docker compose down
```

Backup your existing database

Move the directory where your data is to a new one:

```bash
mv /path/to/database /path/to/v12-backup
```

Modify your docker-compose.yml file

Update the PostgreSQL service image from `docker.io/library/postgres:12-alpine` to `docker.io/library/postgres:17-alpine`. Add `network_mode: none` and comment out any `network` directive to prevent connections being established to the database during the upgrade.

Recreate the database container

Pull the new images and re-create the PostgreSQL container:

```bash
docker compose pull && docker compose up --force-recreate -d postgresql
```

Apply your backup to the new database:

```bash
cat upgrade_backup_12.sql | docker compose exec -T postgresql psql -U authentik
```

Remove the `network_mode: none` setting that you added to the Compose file in the previous step.

Bring the service up

Start the service again with:

```bash
docker compose up
```

and check that everything is working as expected.
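Before removing the old data directory, you can run a quick heuristic check on the dump file. This helper is a sketch (the name and the threshold are mine, not part of the original procedure); it relies on the fact that plain-format `pg_dump` output ends with a `-- PostgreSQL database dump complete` marker:

```python
from pathlib import Path


def dump_looks_complete(dump_path: str, expected_tables: int = 1) -> bool:
    """Heuristic check that a plain-format pg_dump SQL file is not truncated.

    A complete dump ends with a completion marker comment, and a -cC dump
    contains a CREATE TABLE statement for every table.
    """
    text = Path(dump_path).read_text(encoding="utf-8", errors="replace")
    has_marker = "PostgreSQL database dump complete" in text
    tables = text.count("CREATE TABLE ")
    return has_marker and tables >= expected_tables
```

It's only a sanity check, not proof of correctness; keep the `v12-backup` directory around until the new instance has been verified.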
OpenZFS⚑
-
New: Monitor the ZFS RAM usage.
```yaml
- alert: HostOutOfMemory
  # If we don't add node_zfs_arc_size, the ARC is counted as used space,
  # triggering the alert as a false positive
  expr: (node_memory_MemAvailable_bytes + node_zfs_arc_size) / node_memory_MemTotal_bytes * 100 < 10
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: Host out of memory (instance {{ $labels.instance }})
    message: "Node memory is filling up (< 10% left)\n VALUE = {{ $value }}"
```

-

New: Unattended upgrades.
unattended-upgrades runs daily at a random time
How to tell when unattended upgrades will run today:
The random time is set by a cron job (/etc/cron.daily/apt.compat), and you can read the random time for today by asking systemd:
```
$ systemctl list-timers apt-daily.timer
NEXT                         LEFT     LAST                         PASSED      UNIT            ACTIVATES
Tue 2017-07-11 01:53:29 CDT  13h left Mon 2017-07-10 11:22:40 CDT  1h 9min ago apt-daily.timer apt-daily.service
```
In this case, you can see that it ran 1 hour and 9 minutes ago.
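If the random window is inconvenient, you can pin the timer to a fixed time with a systemd drop-in; the path and the time below are illustrative:

```ini
# /etc/systemd/system/apt-daily.timer.d/override.conf
[Timer]
# Clear the packaged schedule, then run at a fixed time with no random delay
OnCalendar=
OnCalendar=*-*-* 03:00
RandomizedDelaySec=0
```

Apply it with `sudo systemctl daemon-reload` and confirm the new schedule with `systemctl list-timers apt-daily.timer`.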
How to tell if unattended upgrades are still running:
One easy way is to check the timestamp files for the various apt components:
```
$ ls -l /var/lib/apt/periodic/
total 0
-rw-r--r-- 1 root root 0 Jul 10 11:24 unattended-upgrades-stamp
-rw-r--r-- 1 root root 0 Jul 10 11:23 update-stamp
-rw-r--r-- 1 root root 0 Jul 10 11:24 update-success-stamp
-rw-r--r-- 1 root root 0 Jul 10 11:24 upgrade-stamp
```
Putting the data together, you can see that the timer started apt at 11:22. It ran an update which completed at 11:23, then an upgrade which completed at 11:24. Finally, you can see that apt considered the upgrade to be a success (no error or other failure).
Obviously, if you see a recent timer without a corresponding completion timestamp, then you might want to check ps to see if apt is still running.
How to tell which step apt is running right now
One easy way is to check the logfile.
```
$ less /var/log/unattended-upgrades/unattended-upgrades.log
2017-07-10 11:23:00,348 INFO Initial blacklisted packages:
2017-07-10 11:23:00,349 INFO Initial whitelisted packages:
2017-07-10 11:23:00,349 INFO Starting unattended upgrades script
2017-07-10 11:23:00,349 INFO Allowed origins are: ['o=Ubuntu,a=zesty-security', 'o=Ubuntu,a=zesty-updates']
2017-07-10 11:23:10,485 INFO Packages that will be upgraded: apport apport-gtk libpoppler-glib8 libpoppler-qt5-1 libpoppler64 poppler-utils python3-apport python3-problem-report
2017-07-10 11:23:10,485 INFO Writing dpkg log to '/var/log/unattended-upgrades/unattended-upgrades-dpkg.log'
2017-07-10 11:24:20,419 INFO All upgrades installed
```
Here you can see the normal daily process, including the 'started' and 'completed' lines, and the list of packages that were about to be upgraded.
If the list of packages is not logged yet, then apt can be safely interrupted. Once the list of packages is logged, DO NOT interrupt apt.
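That rule of thumb can be encoded in a tiny guard; `safe_to_interrupt` is a hypothetical helper that greps the log shown above:

```shell
# Succeeds (exit 0) while apt can still be safely interrupted, i.e. the
# "Packages that will be upgraded:" line has not been logged yet.
safe_to_interrupt() {
    local log="${1:-/var/log/unattended-upgrades/unattended-upgrades.log}"
    ! grep -q "Packages that will be upgraded:" "$log"
}
```

For example, `safe_to_interrupt && sudo systemctl stop unattended-upgrades` only stops the service while it is still safe to do so.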
**Check the number of packages that need an upgrade**

```bash
apt list --upgradeable
```

**Manually run the unattended upgrades**

```bash
unattended-upgrade -d
```
Node Exporter⚑
-
New: Monitor host requires a reboot.
Node exporter doesn't expose this metric natively, but you can monitor reboot requirements using its textfile collector. Here's how to set it up:
Create the monitoring script
First, create a script (e.g. `reboot-required-check.sh`) that checks for the reboot-required file and outputs metrics in Prometheus format:

```bash
#!/bin/bash
TEXTFILE_DIR="/var/lib/node_exporter/textfile_collector"
METRIC_FILE="$TEXTFILE_DIR/reboot_required.prom"

mkdir -p "$TEXTFILE_DIR"

# Debian based systems create this file when a reboot is needed
if [ -f /var/run/reboot-required ]; then
    REBOOT_REQUIRED=1
else
    REBOOT_REQUIRED=0
fi

cat > "$METRIC_FILE" << EOF
node_reboot_required $REBOOT_REQUIRED
EOF

chmod 644 "$METRIC_FILE"
```

- Make the script executable and place it in the right location:

```bash
sudo cp reboot-required-check.sh /usr/local/bin/
sudo chmod +x /usr/local/bin/reboot-required-check.sh
```
- Ensure the textfile collector directory exists:
```bash
sudo mkdir -p /var/lib/node_exporter/textfile_collector
sudo chown node_exporter:node_exporter /var/lib/node_exporter/textfile_collector
```
- Create a systemd service to run the script periodically:
```ini
[Unit]
Description=Check if system requires reboot
After=network.target

[Service]
Type=oneshot
ExecStart=/usr/local/bin/reboot-required-check.sh
User=node_exporter
Group=node_exporter

[Install]
WantedBy=multi-user.target
```
- Create a systemd timer to run it regularly:
```ini
[Unit]
Description=Check if system requires reboot every 5 minutes
Requires=reboot-check.service

[Timer]
OnBootSec=1min
OnUnitActiveSec=5min

[Install]
WantedBy=timers.target
```
- Install and enable the systemd units:
```bash
sudo cp reboot-check.service /etc/systemd/system/
sudo cp reboot-check.timer /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable reboot-check.timer
sudo systemctl start reboot-check.timer
```
- Configure node exporter to use the textfile collector: make sure your node exporter is started with the `--collector.textfile.directory` flag:

```bash
node_exporter --collector.textfile.directory=/var/lib/node_exporter/textfile_collector
```
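If node_exporter runs under systemd, the flag belongs in its unit file. A sketch (the paths and user are assumptions; adjust to your install):

```ini
# /etc/systemd/system/node_exporter.service
[Unit]
Description=Prometheus Node Exporter
After=network.target

[Service]
User=node_exporter
ExecStart=/usr/local/bin/node_exporter \
    --collector.textfile.directory=/var/lib/node_exporter/textfile_collector
Restart=on-failure

[Install]
WantedBy=multi-user.target
```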
Prometheus alerting rule
You can create an alerting rule in Prometheus to notify when a reboot is required:
```yaml
groups:
  - name: system.rules
    rules:
      - alert: SystemRebootRequired
        expr: node_reboot_required == 1
        for: 0m
        labels:
          severity: warning
        annotations:
          summary: "System {{ $labels.instance }} requires reboot"
          description: "System {{ $labels.instance }} requires a reboot to finish applying updates"
```
Testing
You can test the setup by:
- Run the script manually:
```bash
sudo /usr/local/bin/reboot-required-check.sh
cat /var/lib/node_exporter/textfile_collector/reboot_required.prom
```
- Check if the timer is working:
```bash
sudo systemctl status reboot-check.timer
sudo journalctl -u reboot-check.service
```
- Verify metrics are being collected: visit `http://your-server:9100/metrics` and search for `node_reboot_required`.
The metric will show `node_reboot_required 1` when a reboot is required and `node_reboot_required 0` when it's not. On Debian based systems, `/var/run/reboot-required.pkgs` lists the packages that triggered the reboot requirement, so the script can be extended to expose that information as an additional metric.
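A possible extension of the check script that reads `/var/run/reboot-required.pkgs`; this is a sketch, and the metric name `node_reboot_required_pkgs_count` is one I made up:

```shell
# Emit a gauge with the number of distinct packages requiring the reboot.
# /var/run/reboot-required.pkgs lists one package per line on Debian/Ubuntu.
pkgs_metric() {
    local pkgs_file="${1:-/var/run/reboot-required.pkgs}"
    local count=0
    if [ -f "$pkgs_file" ]; then
        count=$(sort -u "$pkgs_file" | wc -l)
        count=$((count))  # normalise any whitespace padding from wc
    fi
    echo "node_reboot_required_pkgs_count $count"
}
```

Appending its output to the same `.prom` file makes the package count available to the alerting rule as well.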
Operating Systems⚑
Linux Snippets⚑
-
New: Resize a partition of an EC2 instance.
If it's the first partition of the first NVMe disk and the filesystem is ext4:

```bash
growpart /dev/nvme0n1 1
resize2fs /dev/nvme0n1p1
```

(For an XFS root filesystem, run `xfs_growfs /` instead of `resize2fs`.)
Mobile Keyboards⚑
-
New: Introduce mobile keyboards comparison.
Finding the right mobile keyboard that balances functionality, privacy, and usability can be challenging. This guide explores the best open-source and privacy-focused keyboard options available for Android devices.
Quick Recommendations
- For Gboard users transitioning: HeliBoard
- For advanced features and AI: FUTO Keyboard
- For unique input method: Thumb-Key
- For future consideration: FlorisBoard (when stable)
FUTO Keyboard ⭐ Recommended
FUTO represents the cutting edge of privacy-focused keyboard technology, incorporating AI features while maintaining offline functionality.
What Makes FUTO Special
FUTO stands out with transformer-based predictions using llama.cpp and integrated voice input powered by whisper.cpp. Unlike other keyboards that require proprietary libraries, FUTO includes swipe/glide typing by default.
The keyboard is currently in pre-alpha, so expect some bugs and missing features. However, the privacy-preserving approach and innovative AI integration make it worth trying.
Key Features
Smart Text Prediction
- Uses pre-trained transformer models for intelligent autocorrect
- Personal language model that learns from your typing (locally only)
- Currently optimized for English, with other languages in development
- Spanish support is still limited
Privacy-First Design
- All AI processing happens on-device
- Your data never leaves your phone
- FUTO doesn't view or store any typing data
- Internet access only for updates and crash reporting (planned to be removed)
Customization Options
- Multilingual typing support
- Custom keyboard layouts
- Swipe typing works well out of the box
Current Limitations
- Pre-alpha software with occasional bugs
- Limited language support beyond English
- Uses a custom "Source First" license (not traditional open source)
- Screen movement issues when using swipe typing
Licensing Concerns
FUTO uses a custom license rather than traditional open source licenses like GPL. While the source code is available, the licensing terms are more restrictive than typical open source projects. The team promises to adopt proper open source licensing eventually, but this transition hasn't happened yet.
Resources
Not there yet
- FUTO Voice has a weird bug, at least in Spanish: it sometimes appends phrases like "Subscribe!", "Chau", or "Thanks for watching my video!" to the end of the transcription. This is kind of annoying and scary.
(¬º-°)¬
HeliBoard - The Reliable Choice
HeliBoard serves as an excellent middle ground, especially for users transitioning from Gboard.
Why Choose HeliBoard
- Active development: Fork of OpenBoard with regular updates
- No network access: Completely offline operation
- User-friendly: Much simpler than AnySoftKeyboard
- Gboard-like experience: Familiar interface for Google Keyboard users
Trade-offs
The main limitation is glide typing, which requires a closed-source library. This compromises the fully open source nature but provides the swipe functionality many users expect.
Resources
Thumb-Key - The Innovative Alternative
For users willing to try something completely different, Thumb-Key offers a unique approach to mobile typing.
The Thumb-Key Concept
Instead of traditional QWERTY, Thumb-Key uses a 3x3 grid layout with swipe gestures for less common letters. This design prioritizes:
- Large, predictable key positions
- Muscle memory development
- Eyes staying on the text area
- Fast typing speeds once mastered
Best For
- Users open to learning new input methods
- Those who prefer larger touch targets
- Privacy enthusiasts who want to avoid predictive text entirely
- People who find traditional keyboards cramped
The keyboard is highly configurable and focuses on accuracy through key positioning rather than AI predictions.
Resources
FlorisBoard - Future Potential
FlorisBoard shows promise but isn't ready for daily use yet.
Current Status
- Early beta development
- Planned integration with GrapheneOS
- Missing key features like suggestions and glide typing
- Limited documentation available
Worth Watching
While not currently recommended for primary use, FlorisBoard could become a strong contender once it reaches stability.
Resources
Alternative Approaches
Unexpected Keyboard
A minimalist keyboard with a unique layout approach.
Resources
Using Proprietary Keyboards with Restrictions
On privacy-focused ROMs like GrapheneOS and DivestOS, you can use proprietary keyboards while blocking internet access. However, this approach has limitations due to inter-process communication between apps.
Note: This method isn't foolproof, as apps can still potentially communicate through IPC mechanisms.
My Current Setup
After testing various options:
- Primary choice: FUTO Keyboard with swipe enabled
- Backup plan: Try FUTO voice input for longer texts when privacy features improve
- Alternative: Thumb-Key if FUTO doesn't work out
The main issue encountered is screen movement during swipe typing, which may be device-specific.
References and Further Reading
Filosofía⚑
-
New: Relevant new episodes about time and other things.
- Punzadas Sonoras: Mirar atrás: un gesto íntimo: Breaks with the linear conception of time, applied, among other things, to relationships. They also discuss Leonor Cervantes' article Ya no te gusto como antes, which gives a really nice perspective on relationships.
- Punzadas Sonoras: Artesano y artista: desnaturalizar la distinción: A super interesting episode for analysing power dynamics in the world of work, the concept of mingei, thinking about "the art of programming", ...
- Punzadas Sonoras: El ritmo del habitar con Blanca Lacasa: Philosophising about the home, especially about the kitchen as both cage and kingdom. Also about the relationship between mothers and daughters.
Arts⚑
Calistenia⚑
-
New: Bulgarian split squat (sentadilla búlgara).
Languages⚑
Galego⚑
-
New: Discovery of Os arquivos da meiga.
- Os arquivos da meiga: A forum with content in Galician