Installation
The Metaweave pipeline is a Python package that ingests submissions into PostgreSQL. It runs as a single CLI on a schedule.
Prerequisites
- Python 3.12+
- A Google Cloud SQL Postgres instance (and a service account JSON with access)
- An Azure AD app registration with `Mail.Read` and `Mail.ReadWrite` permissions on the shared Outlook mailbox
- Network access to:
  - graph.microsoft.com
  - login.microsoftonline.com
  - sqladmin.googleapis.com (for Cloud SQL Connector)
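Before installing, it can help to confirm that the machine can actually reach those hosts. The following is a minimal sketch, not part of Metaweave: the helper name `check_hosts` is hypothetical, and it assumes all three services are reached over HTTPS on port 443.

```python
import socket

# Hosts from the prerequisites list; port 443 (HTTPS) is an assumption.
REQUIRED_HOSTS = [
    "graph.microsoft.com",
    "login.microsoftonline.com",
    "sqladmin.googleapis.com",
]


def check_hosts(hosts, port=443, timeout=5.0, connect=socket.create_connection):
    """Return a dict mapping each host to True if a TCP connection succeeded."""
    results = {}
    for host in hosts:
        try:
            # create_connection resolves the name and opens a TCP socket.
            conn = connect((host, port), timeout)
            conn.close()
            results[host] = True
        except OSError:
            # Covers DNS failures, refused connections, and timeouts.
            results[host] = False
    return results
```

Calling `check_hosts(REQUIRED_HOSTS)` returns one boolean per host. A `False` entry usually points at a firewall or proxy rule rather than a problem with the pipeline itself. The `connect` parameter exists only so the check can be exercised without network access.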
Set up the venv
    cd metaweave/pipeline
    python -m venv .venv
    source .venv/bin/activate
    pip install -e ".[dev]"

Editable install means changes to `src/` are picked up without re-installing.
Dependencies
Pulled in from pyproject.toml:
| Package | Why |
|---|---|
| pycryptodome | AES-128-CBC decryption |
| sqlalchemy ≥2.0 | ORM + declarative models |
| cloud-sql-python-connector[pg8000] | GCP Cloud SQL Connector |
| pg8000 | Pure-Python PostgreSQL driver |
| psycopg2-binary | Fallback PostgreSQL driver |
| alembic | Schema migrations (included, not yet used) |
| msal | Microsoft identity library (OAuth2 client credentials) |
| requests | HTTP client for Microsoft Graph |
| python-dotenv | .env loading |
Dev deps: pytest, pytest-cov.
Configure environment variables
Create a `.env` file in the `pipeline/` directory (or set the variables in your shell). Required variables:
    # Azure AD app registration
    AZURE_TENANT_ID=...
    AZURE_CLIENT_ID=...
    AZURE_CLIENT_SECRET=...
    OUTLOOK_USER_EMAIL=metaweave-forms@yourcompany.com

    # Cloud SQL
    GOOGLE_SERVICE_ACCOUNT_BASE64=eyJ0eXBlIjoi...  # base64 of the service account JSON
    CLOUD_SQL_INSTANCE_CONNECTION_NAME=project:region:instance
    POSTGRES_USER=...
    POSTGRES_PASSWORD=...
    POSTGRES_DB=...

    # Encryption (defaults to `mw7k2x9p4q8n3v5h` if unset)
    MW_AES_KEY=mw7k2x9p4q8n3v5h

See Configuration for the full list of variables.
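A quick pre-flight check can catch a missing variable or a badly encoded service account before the first run. This is a sketch only: the names `REQUIRED_VARS`, `missing_vars`, and `decode_service_account` are hypothetical, and the pipeline's own configuration loading may work differently. It treats `MW_AES_KEY` as optional because it has a default.

```python
import base64
import json
import os

# Variable names copied from the .env example; MW_AES_KEY is omitted
# because the docs say it falls back to a default when unset.
REQUIRED_VARS = [
    "AZURE_TENANT_ID",
    "AZURE_CLIENT_ID",
    "AZURE_CLIENT_SECRET",
    "OUTLOOK_USER_EMAIL",
    "GOOGLE_SERVICE_ACCOUNT_BASE64",
    "CLOUD_SQL_INSTANCE_CONNECTION_NAME",
    "POSTGRES_USER",
    "POSTGRES_PASSWORD",
    "POSTGRES_DB",
]


def missing_vars(env=os.environ):
    """Return the required variable names that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


def decode_service_account(encoded):
    """Decode the base64 string back into the service account JSON dict."""
    return json.loads(base64.b64decode(encoded))
```

If `missing_vars()` returns an empty list and `decode_service_account(os.environ["GOOGLE_SERVICE_ACCOUNT_BASE64"])` returns a dict, the values are at least well-formed. Neither check proves the credentials are valid.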
Create the database tables
First-time setup:
    python -m src.main --create-tables

This calls `Base.metadata.create_all(engine)` and exits. It is idempotent: re-running on an existing schema is a no-op.
The 17 tables it creates:
    metaweave_vessel                     metaweave_bunker_delivery
    metaweave_fuel_type                  metaweave_bunker_biofuel
    metaweave_voyage                     metaweave_sof_activity
    metaweave_report                     metaweave_report_cargo
    metaweave_report_event               metaweave_month_end_bunker_report
    metaweave_event_fuel_consumption     metaweave_berthing_details
    metaweave_report_bunker_rob          metaweave_report_delay
    metaweave_report_upcoming_port
    metaweave_report_fowe_period
    metaweave_report_scrubber_breakdown

See Data model for the relationships.
Verify the install
Run a single email through end-to-end without touching Outlook:
    python -m src.main --file /path/to/sample-email.txt

`--file` reads the body from disk, runs Parse → Map → Write, and exits. Useful for testing without consuming a real mailbox message.
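To make the relationship between the two flags concrete, here is a rough sketch of how an entry point could dispatch on them. It is not the actual `src.main`: the function names and mode labels are hypothetical. Giving `--create-tables` precedence when both flags are passed is also an assumption.

```python
import argparse
from pathlib import Path


def build_parser():
    """Build a parser exposing the two flags described in this guide."""
    parser = argparse.ArgumentParser(prog="python -m src.main")
    parser.add_argument(
        "--create-tables",
        action="store_true",
        help="create the schema and exit",
    )
    parser.add_argument(
        "--file",
        type=Path,
        default=None,
        help="process one email body from disk instead of the mailbox",
    )
    return parser


def choose_mode(argv):
    """Map command-line arguments to one of three run modes."""
    args = build_parser().parse_args(argv)
    if args.create_tables:
        return "create-tables"
    if args.file is not None:
        # Single-file mode never contacts Outlook.
        return "single-file"
    return "mailbox"
```

With no flags the sketch falls through to the scheduled mailbox run. `--file` short-circuits that path, which is why it is safe for verification.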
Run the tests
    pytest -v

Tests use SQLite in-memory with `PRAGMA foreign_keys=ON`, so no PostgreSQL is needed.
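The pragma matters because SQLite does not enforce foreign key constraints unless it is switched on for each connection. The sketch below shows the effect with the standard-library `sqlite3` module. The `vessel` and `voyage` tables are toy stand-ins rather than the real Metaweave schema, and the project's own test fixtures may set the pragma differently.

```python
import sqlite3


def make_test_connection():
    """Open an in-memory SQLite database with foreign keys enforced."""
    conn = sqlite3.connect(":memory:")
    # Must be set per connection; it is not stored in the database.
    conn.execute("PRAGMA foreign_keys=ON")
    return conn


def orphan_insert_is_rejected(conn):
    """Return True if inserting a child row with no parent raises an error."""
    conn.execute("CREATE TABLE vessel (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE voyage ("
        "id INTEGER PRIMARY KEY, "
        "vessel_id INTEGER NOT NULL REFERENCES vessel(id))"
    )
    try:
        # No vessel with id 999 exists, so this violates the constraint.
        conn.execute("INSERT INTO voyage (id, vessel_id) VALUES (1, 999)")
    except sqlite3.IntegrityError:
        return True
    return False
```

`orphan_insert_is_rejected(make_test_connection())` returns `True`. On most SQLite builds the same call against a plain `sqlite3.connect(":memory:")` connection returns `False`, which is the behaviour the pragma guards against.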
    pytest --cov=src tests/  # with coverage

See also
- Running — operational invocation patterns
- Configuration — every env var
- ETL stages — what happens internally