Getting Started with Dunwich
Installation
- Download the correct binaries for your server from Downloads
- (Optional) Wrap those binaries in a container. The most common approach is to build a Docker image with the binaries copied inside
Launching Registry
Registry is the first service you want to launch. It’s the central point of the system:
- Handles schemas
- Handles locking mechanism
- Handles coordination between workers
- Handles metrics
(And yes, it’s possible to launch a cluster of these for high availability, but that comes at a later stage.)
Initial config:
registry:
  bind: localhost:4567
  advertise: localhost:4567
  database: data/registry.db
First confusing part:
- bind is the address on which the registry is directly reachable
- advertise is the address used when you have a cluster of registries
In the simplest terms, bind can be a public IP and advertise a private network IP. You reach the registry on the first; registries talk to each other on the second.
Database: Registry uses SQLite to persist data. The only requirement is to give it a path where the database file should be stored. (Data is shared between different nodes using gossip.)
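The database path in the example is relative (data/registry.db), so its parent directory has to exist in whatever directory Registry is started from. Whether Registry creates that directory on its own is not stated here; the sketch below assumes it does not. The helper name ensure_db_dir is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: create the parent directory of the SQLite file if it
# is missing. It does not create the database file itself.
ensure_db_dir() {
  local db_path="$1"
  mkdir -p "$(dirname "${db_path}")"
}
```

For the config above, this would be called as ensure_db_dir data/registry.db before starting Registry.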
Sink config:
To load schemas automatically, Registry needs to be able to reach your sink.
A simple example with Postgres:
sink:
  type: postgres
  host: 127.0.0.1
  port: 5432
  database: warehouse
  username: admin
  password: secret123
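Since Registry has to reach the sink, it can help to confirm from the Registry host that Postgres is accepting connections before starting anything. A minimal sketch using pg_isready from the PostgreSQL client tools; the wrapper name check_sink is hypothetical, and the values passed in are the ones from the sink section.

```shell
#!/bin/bash
# Hypothetical wrapper around pg_isready. Exits with 0 when the server
# is accepting connections, non-zero otherwise.
check_sink() {
  local host="$1" port="$2" database="$3" username="$4"
  pg_isready -h "${host}" -p "${port}" -d "${database}" -U "${username}"
}
```

With a local Postgres on the loopback address, the call would look like check_sink 127.0.0.1 5432 warehouse admin.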
The good part is that one config can be reused by the other apps:
registry:
  bind: localhost:4567
  advertise: localhost:4567
  database: data/registry.db
sink:
  type: postgres
  host: 127.0.0.1
  port: 5432
  database: warehouse
  username: admin
  password: secret123
Now, if we launch ./bin/admin with the same config, it will be fully capable of syncing with Registry and fetching new schemas from the destination.
Launching Admin
As mentioned above, you can simply run the admin binary and pass it the same config. In admin you can:
- See all schemas
- Refresh schemas (update existing ones and automatically add new tables)
- Flag columns (so that hiding, hashing, encryption, or other protections are applied automatically)
Admin doesn’t need to run all the time. You can shut it down after the initial configuration.
Launching API
For the API to work (and later the worker), we need NATS with JetStream running.
For quick local testing, you can run:
#!/bin/bash
OCI=podman
${OCI} run --rm -d --name nats -p 4222:4222 -p 8222:8222 docker.io/library/nats:latest -js
(Change OCI to docker if you’re using it.) This bash script launches a local NATS server with JetStream enabled, which covers all testing needs.
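The container starts in the background, so NATS may not be accepting connections the moment the script returns. A minimal sketch of a wait loop on the client port 4222, using bash's built-in /dev/tcp redirection (so it requires bash, not plain sh). The function name wait_for_port is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: poll a TCP port once per second until it accepts a
# connection or the timeout (in seconds, default 10) is reached.
wait_for_port() {
  local host="$1" port="$2" timeout="${3:-10}"
  local waited=0
  # The subshell opens the connection and closes it again on exit.
  while ! (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; do
    if [ "${waited}" -ge "${timeout}" ]; then
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}
```

For the setup above, wait_for_port localhost 4222 15 returns success as soon as NATS is listening and fails after roughly 15 seconds otherwise.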
And we need to slightly edit our config by adding two more sections:
service:
  bind: localhost:3105
nats:
  url: nats://localhost:4222
  stream: "local"
Now we can simply launch the api binary and pass it the full config as an argument.
Full config:
registry:
  bind: localhost:4567
  advertise: localhost:4567
  database: data/registry.db
sink:
  type: postgres
  host: 127.0.0.1
  port: 5432
  database: warehouse
  username: admin
  password: secret123
service:
  bind: localhost:3105
nats:
  url: nats://localhost:4222
  stream: "local"
Launching Worker
The config already covers everything the worker needs, so it can simply be launched with the current one.
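Putting the launch order together, a local start-up could be scripted roughly as follows. This is a sketch under assumptions: the binary names registry, api and worker under ./bin, and the config file being passed as a single positional argument, are guesses based on ./bin/admin and "pass full config as an argument" above. Check the actual binaries before relying on it.

```shell
#!/bin/bash
# Hypothetical launcher. Binary names and the positional config argument
# are assumptions, not confirmed by the guide.
BIN_DIR="${BIN_DIR:-./bin}"
START_DELAY="${START_DELAY:-1}"

start_services() {
  local config="$1"
  local svc
  # Registry goes first because the other services depend on it.
  for svc in registry api worker; do
    "${BIN_DIR}/${svc}" "${config}" &
    echo "started ${svc} (pid $!)"
    sleep "${START_DELAY}"  # crude pause so each service can come up
  done
}
```

Calling start_services config.yaml starts all three in the background with the same file; admin is left out because it does not need to keep running.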
What’s next?
The API is ready to accept data, and the worker is ready to push it into the data warehouse. The next step is integrating with the API from your app.