Getting Started with Dunwich

Installation

  • Download the correct binaries for your server from Downloads
  • (Optional) Wrap those binaries in a container. The most common approach is to build a Docker image with the binaries copied inside
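
As a minimal sketch of the optional container step, the script below writes a Dockerfile that copies a local bin/ directory into an image and then builds it. The base image, the /opt/dunwich install path, and the dunwich:local tag are assumptions chosen for illustration, not names Dunwich prescribes.

```shell
#!/bin/bash

OCI=${OCI:-podman}   # set OCI=docker to use Docker instead

# Write a Dockerfile into the given build directory.
# Assumes the downloaded binaries sit in <dir>/bin;
# /opt/dunwich is a hypothetical install path.
write_dockerfile() {
  local dir=$1
  cat > "${dir}/Dockerfile" <<'EOF'
FROM docker.io/library/debian:bookworm-slim
COPY bin/ /opt/dunwich/bin/
WORKDIR /opt/dunwich
EOF
}

# Build the image; "dunwich:local" is a hypothetical tag.
build_image() {
  local dir=$1
  write_dockerfile "${dir}"
  "${OCI}" build -t dunwich:local "${dir}"
}
```

Run build_image . from the directory that contains bin/ to produce the image.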

Launching Registry

Registry is the first service to launch. It’s the central point of the system:

  • Handles schemas
  • Handles locking mechanism
  • Handles coordination between workers
  • Handles metrics

(And yes, you can launch a cluster of these for high availability, but that comes at a later stage.)

Initial config:

registry:
  bind: localhost:4567
  advertise: localhost:4567
  database: data/registry.db

First confusing part:

  • bind is the address on which the registry is directly reachable
  • advertise is the address the registries use to reach each other when you run a cluster of them

In the simplest terms, bind can be a public IP and advertise a private network IP: you reach the registry on one, and the registries talk to each other on the other.

Database: Registry uses SQLite to persist data. The only requirement is a path where the database file is stored. (Data is shared between nodes using gossip.)

Sink config:

To load schemas automatically, Registry needs to be able to reach your sink.

A simple example with Postgres:

sink:
  type: postgres
  host: 127.0.0.1
  port: 5432
  database: warehouse
  username: admin
  password: secret123

And the good part: one config can be reused by the other apps:

registry:
  bind: localhost:4567
  advertise: localhost:4567
  database: data/registry.db

sink:
  type: postgres
  host: 127.0.0.1
  port: 5432
  database: warehouse
  username: admin
  password: secret123

Now if we simply launch ./bin/admin with the same config, it will be fully able to sync with Registry and fetch new schemas from the destination.
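
Because one file feeds every binary, it can help to generate it once rather than edit copies by hand. The sketch below renders the config above from environment variables so the sink password does not have to be committed alongside it. The variable names (REGISTRY_BIND, SINK_HOST, SINK_PASSWORD, and so on) and the 127.0.0.1 default are assumptions, and the substitution is done by the shell, not by Dunwich.

```shell
#!/bin/bash

# Print the shared config to stdout. The shell fills in the values;
# the variable names below are hypothetical, not Dunwich settings.
render_config() {
  if [ -z "${SINK_PASSWORD:-}" ]; then
    echo "SINK_PASSWORD is not set" >&2
    return 1
  fi
  cat <<EOF
registry:
  bind: ${REGISTRY_BIND:-localhost:4567}
  advertise: ${REGISTRY_ADVERTISE:-localhost:4567}
  database: data/registry.db

sink:
  type: postgres
  host: ${SINK_HOST:-127.0.0.1}
  port: ${SINK_PORT:-5432}
  database: ${SINK_DATABASE:-warehouse}
  username: ${SINK_USERNAME:-admin}
  password: ${SINK_PASSWORD}
EOF
}
```

For example, SINK_PASSWORD=secret123 render_config > config.yaml writes a file (config.yaml is a hypothetical name) that can be passed to each binary. Values containing YAML special characters would need quoting, which this sketch does not handle.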

Launching Admin

As mentioned above, you can simply run the admin binary and pass it the same config. In admin you can:

  • See all schemas
  • Refresh schemas (update existing ones and automatically add new tables)
  • Flag columns (to hide, hash, encrypt, or apply other protections automatically)

Admin doesn’t need to be running all the time. You can shut it down after the initial configuration.

Launching API

For the API to work (and later the worker), NATS with JetStream must be running.

For quick local testing, you can run:

#!/bin/bash

OCI=podman

${OCI} run --rm -d --name nats -p 4222:4222 -p 8222:8222 docker.io/library/nats:latest -js

(Change OCI to docker if you’re using it.) This bash script launches a local NATS server with JetStream enabled, which covers all testing needs.

We also need to edit our config slightly by adding two more sections:
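
Since the container starts in the background (-d), the next service may be launched before NATS accepts connections. A minimal sketch of a readiness check follows, assuming bash (it relies on bash's /dev/tcp redirection) and the client port 4222 published above; the helper name wait_for_port is hypothetical.

```shell
#!/bin/bash

# Poll host:port until a TCP connection succeeds or the attempts run out.
# Usage: wait_for_port HOST PORT [ATTEMPTS]   (one attempt per second)
wait_for_port() {
  local host=$1 port=$2 attempts=${3:-30}
  local i=0
  while [ "$i" -lt "$attempts" ]; do
    # Opening /dev/tcp/HOST/PORT in a subshell tests the connection
    # and closes it again when the subshell exits.
    if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "timed out waiting for ${host}:${port}" >&2
  return 1
}
```

Calling wait_for_port localhost 4222 before starting the api binary avoids a race on startup. The same helper can check that Postgres is listening on port 5432.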

service:
  bind: localhost:3105

nats:
  url: nats://localhost:4222
  stream: "local"

Now we can simply launch the api binary and pass the full config as an argument.

Full config:

registry:
  bind: localhost:4567
  advertise: localhost:4567
  database: data/registry.db

sink:
  type: postgres
  host: 127.0.0.1
  port: 5432
  database: warehouse
  username: admin
  password: secret123

service:
  bind: localhost:3105

nats:
  url: nats://localhost:4222
  stream: "local"

Launching Worker

The config already covers all of our needs, so the worker can simply be launched with the current one.
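
Putting the sections together, Registry starts first, then the API and the worker once NATS is up. The sketch below only illustrates that order. The registry, api, and worker paths under ./bin, the config.yaml filename, and passing the config file as the sole argument are all assumptions modeled on ./bin/admin, so check each binary for its real invocation.

```shell
#!/bin/bash

CONFIG=${CONFIG:-config.yaml}   # hypothetical filename for the full config

# Start one service in the background,
# or just print the command when DRY_RUN=1.
start() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@" &
  fi
}

# Registry first (it coordinates the workers), then API, then worker.
# Binary paths and argument style are assumptions, not documented commands.
start_all() {
  start ./bin/registry "${CONFIG}"
  start ./bin/api "${CONFIG}"
  start ./bin/worker "${CONFIG}"
}
```

Running DRY_RUN=1 start_all prints the three commands without executing them, which is a quick way to review the order before a real launch.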

What’s next?

The API is ready to accept data and the worker is ready to push it into the data warehouse. The next step is integrating your app with the API.