Daniel Moerner

A Curious Moon in Podman or Docker

I recently started working through A Curious Moon, a wonderfully clever data science “mystery” in PostgreSQL. The book sets up Postgres on bare metal, but I wanted to run it in a container using Podman, a largely Docker-compatible container engine. One interesting suggestion in the book is to use a Makefile to organize the ETL steps. Getting that to work with Podman on Fedora took a little massaging, which I want to share here.

Here is my Compose file:

services:
  cassini-pg:
    image: postgres:17
    restart: unless-stopped
    container_name: cassini-pg
    environment:
      - POSTGRES_USER=cassini
      - POSTGRES_PASSWORD=super_secret_password
      - UID=1000
      - GID=1000
    volumes:
      - ./curious_data:/curious_data:z
      - ./scripts:/scripts:z

On a system with SELinux in enforcing mode, like Fedora, you’ll need to add :z to the end of each volume mount. This tells Podman to relabel the host directory with a shared SELinux label so that the container is allowed to access it. Otherwise, you will receive a permissions error when trying to access the directory from inside the container.
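
To see whether the relabeling took effect, you can inspect the SELinux context of the mounted directories from the host. This is a minimal sketch written as an extra target for the Makefile shown below; the target name labels is hypothetical and not part of my original setup, and it assumes make is run from the directory that contains curious_data and scripts.

```makefile
# Hypothetical helper target (not in the original Makefile):
# print the SELinux context of the two bind-mounted host directories.
labels:
	ls -dZ curious_data scripts
```

After the container has been started once with :z, I would expect the type field of the context to be container_file_t, but treat that as an assumption to verify on your own system.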

And here is my Makefile:

CONTAINER=cassini-pg
USER=cassini
DB=enceladus
LOCALPATH=${CURDIR}/
DOCKERPATH=/
SCRIPTS=scripts
CSV='/curious_data/data/master_plan.csv'
MASTER=$(SCRIPTS)/import.sql
NORMALIZE=$(SCRIPTS)/normalize.sql
BUILD=$(SCRIPTS)/build.sql

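# None of these targets create a file of the same name, so declare them phony.
.PHONY: all master import normalize clean createdb psql

# Run the assembled build script in the container, then open an interactive session.
all: normalize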
	podman exec -it $(CONTAINER) psql $(DB) -U $(USER) -f $(DOCKERPATH)$(BUILD) && podman exec -it $(CONTAINER) psql $(DB) -U $(USER)

master:
	@cat $(LOCALPATH)$(MASTER) >> $(LOCALPATH)$(BUILD)

import: master
	@echo "COPY import.master_plan FROM $(CSV) WITH DELIMITER ',' HEADER CSV;" >> $(LOCALPATH)$(BUILD)

normalize: import
	@cat $(LOCALPATH)$(NORMALIZE) >> $(LOCALPATH)$(BUILD)

clean:
	@rm -rf $(LOCALPATH)$(BUILD)

createdb:
	podman exec -it $(CONTAINER) createdb $(DB) -U $(USER)

psql:
	podman exec -it $(CONTAINER) psql $(DB) -U $(USER)

A few things to note here. First, when using containers (at least at the start) you will often want to completely tear down and recreate a container. This means being prepared to re-run createdb. I wanted to make this a prerequisite of all and psql, but createdb fails when the database already exists, and Postgres does not have an easy-to-use command like CREATE DATABASE IF NOT EXISTS. So instead, the error messages on a new container should point you to the need to re-run the createdb target.
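
One possible workaround, which I have not adopted in the Makefile above, is to ask the pg_database catalog whether the database exists and only call createdb when it does not. The sketch below reuses the CONTAINER, USER, and DB variables from my Makefile and assumes the default postgres maintenance database is present in the container.

```makefile
# Hypothetical replacement for the createdb target: create $(DB) only if it is missing.
# psql -tAc prints bare query output, so grep sees "1" when the database already exists.
createdb:
	@podman exec $(CONTAINER) psql -U $(USER) -d postgres -tAc \
		"SELECT 1 FROM pg_database WHERE datname = '$(DB)'" | grep -q 1 \
		|| podman exec $(CONTAINER) createdb $(DB) -U $(USER)
```

Because this version succeeds whether or not the database exists, it could in principle be listed as a prerequisite of both all and psql.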

Second, the default make target should leave us in an interactive psql session. However, just passing a script to the container drops us into what appears to be a psql session (you can type in it), but it is not usable (“Enter” does not execute a command). Therefore, the all target first loads the build script and then starts a separate interactive psql session.

Third, the paths are a bit messy because we need to use a mixture of paths outside of the container and paths within the container. This could be simplified by using absolute paths and mounting each host directory at the same absolute path inside the container. I did not pursue this, however, because hard-coding absolute paths in the volumes seems fragile and exposes more information than is necessary to the container.
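
Another way to reduce the path juggling, sketched here as an untested alternative, is to feed the build script to psql on standard input, so that the script itself is only ever referenced by its host path. The target name load is hypothetical. Note that -t is omitted because stdin is a file rather than a terminal, and that the CSV path in the COPY statement must still be a path inside the container, since COPY reads the file on the server side.

```makefile
# Hypothetical alternative to "psql -f": stream the host-side build script
# into psql inside the container over stdin (-i keeps stdin open).
load: normalize
	podman exec -i $(CONTAINER) psql $(DB) -U $(USER) < $(LOCALPATH)$(BUILD)
```

With this approach the scripts directory would not need to be mounted into the container at all, although the curious_data mount is still required for the import.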

I look forward to sharing more of my journey through this book here on my blog. One more thing: The link to the data included in the book itself seems to be dead, but the data is still available here: https://archive.redfour.io/