Configuration Reference

Complete reference for configuring ALPCRUN.CH services.

Configuration Methods

ALPCRUN.CH services can be configured in three ways. When the same setting appears in more than one place, the higher-priority source wins:

  1. Command-line flags (highest priority)
  2. Environment variables (middle priority)
  3. Configuration file (lowest priority)
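
The precedence order above can be sketched in a few lines of Go. This is only an illustration, not the services' actual loader: `firstSet` is a hypothetical helper, and it assumes an empty string means "not set".

```go
package main

import "fmt"

// firstSet returns the first non-empty value, checking sources in
// priority order: command-line flag, then environment variable, then
// configuration file, and finally the built-in default.
func firstSet(flagVal, envVal, fileVal, def string) string {
	for _, v := range []string{flagVal, envVal, fileVal} {
		if v != "" {
			return v
		}
	}
	return def
}

func main() {
	// Flag unset, environment says "debug", file says "warn": environment wins.
	fmt.Println(firstSet("", "debug", "warn", "info"))
	// Nothing set anywhere: the default applies.
	fmt.Println(firstSet("", "", "", "info"))
}
```

A real loader would need a separate "was this set?" signal for options where the empty string is a valid value, such as `--logfile`.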

Queue Manager

Command-Line Flags

./queue-manager [OPTIONS]
Flag           Type     Default      Description
--appid        string   ""           Application identifier
--loglevel     string   "info"       Log level: trace, debug, info, warn, error
--logfile      string   ""           Log file path (empty for stdout)
--pretty       bool     false        Pretty-print logs (slower)
--addSource    bool     false        Add source file:line to logs
--bufferSize   int      1000         Queue buffer size
--queueType    string   "priority"   Queue type: priority, regular, lockfree, speed
--port         int      1337         gRPC server port
--metricsPort  int      8080         Prometheus metrics port
--tls          bool     false        Enable TLS
--tlsCert      string   ""           TLS certificate path
--tlsKey       string   ""           TLS private key path
--pprof        bool     false        Enable pprof profiling
--pprofPort    int      6060         pprof HTTP port
--version      bool     false        Print version and exit

Environment Variables

Prefix: ALPCRUNCH_

# Example
export ALPCRUNCH_APPID="my-app"
export ALPCRUNCH_LOGLEVEL="debug"
export ALPCRUNCH_BUFFERSIZE=2000
export ALPCRUNCH_TLS=true
export ALPCRUNCH_TLSCERT="/certs/server.crt"
export ALPCRUNCH_TLSKEY="/certs/server.key"

# Required (note: this variable does not use the ALPCRUNCH_ prefix)
export METALCORE_API_KEY="your-secret-key"
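
Judging from the examples above, each prefixed variable appears to be the flag name upper-cased with `ALPCRUNCH_` in front. The following Go sketch encodes that inferred rule; `envName` is a hypothetical helper, and the mapping is an assumption drawn from the examples rather than a documented guarantee.

```go
package main

import (
	"fmt"
	"strings"
)

// envName derives the environment variable for a flag, assuming the
// rule is "ALPCRUNCH_" followed by the upper-cased flag name.
func envName(flag string) string {
	return "ALPCRUNCH_" + strings.ToUpper(strings.TrimLeft(flag, "-"))
}

func main() {
	for _, f := range []string{"--appid", "--bufferSize", "--tlsCert"} {
		fmt.Printf("%s -> %s\n", f, envName(f))
	}
}
```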

Configuration File

YAML format:

# queue-manager.yaml
appid: "my-app"
loglevel: "info"
bufferSize: 1000
queueType: "priority"

port: 1337
metricsPort: 8080

tls: true
tlsCert: "/certs/server.crt"
tlsKey: "/certs/server.key"

pprof: false
pprofPort: 6060

Load with:

./queue-manager --config queue-manager.yaml

Queue Types

priority (default): Heap-based priority queue

  • Best for mixed-priority workloads
  • O(log n) operations
  • Recommended for most use cases

regular: Channel-based FIFO queue

  • Best for single-priority workloads
  • O(1) operations
  • Lower overhead

lockfree: CAS-based lock-free queue

  • Best for high contention
  • Lower latency variance
  • Experimental

speed: Unbuffered queue for testing

  • Not for production
  • Used for benchmarking
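
To make the difference between the priority and regular types concrete, here is a minimal Go sketch that contrasts a heap-based queue with a channel-based FIFO. It is not the queue manager's implementation: the `task` type, the "higher number is served first" ordering, and the helper names are all assumptions for illustration.

```go
package main

import (
	"container/heap"
	"fmt"
)

// task is a hypothetical work item; a higher priority is served first.
type task struct {
	name     string
	priority int
}

// taskHeap implements heap.Interface as a max-heap on priority.
type taskHeap []task

func (h taskHeap) Len() int           { return len(h) }
func (h taskHeap) Less(i, j int) bool { return h[i].priority > h[j].priority }
func (h taskHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *taskHeap) Push(x any)        { *h = append(*h, x.(task)) }
func (h *taskHeap) Pop() any {
	old := *h
	n := len(old)
	t := old[n-1]
	*h = old[:n-1]
	return t
}

// drainPriority returns task names in the order a heap-based queue serves them.
func drainPriority(tasks []task) []string {
	h := &taskHeap{}
	for _, t := range tasks {
		heap.Push(h, t) // O(log n) per insert
	}
	var out []string
	for h.Len() > 0 {
		out = append(out, heap.Pop(h).(task).name) // O(log n) per removal
	}
	return out
}

// drainFIFO returns task names in the order a buffered channel serves them.
func drainFIFO(tasks []task) []string {
	ch := make(chan task, len(tasks)) // capacity is analogous to a buffer size
	for _, t := range tasks {
		ch <- t // O(1) per send
	}
	close(ch)
	var out []string
	for t := range ch {
		out = append(out, t.name)
	}
	return out
}

func main() {
	tasks := []task{{"low", 1}, {"high", 9}, {"mid", 5}}
	fmt.Println(drainPriority(tasks)) // [high mid low]
	fmt.Println(drainFIFO(tasks))     // [low high mid]
}
```

The heap pays a logarithmic cost per operation to reorder tasks, while the channel preserves arrival order at constant cost, which matches the trade-off described above.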

Central Cache

Command-Line Flags

./central-cache [OPTIONS]
Flag               Type       Default   Description
--loglevel         string     "info"    Log level
--logfile          string     ""        Log file path
--pretty           bool       false     Pretty-print logs
--addSource        bool       false     Add source to logs
--port             int        2337      gRPC server port
--metricsPort      int        8081      Metrics port
--maxTTL           duration   "600s"    Maximum allowed TTL
--cleanupInterval  duration   "1s"      Cleanup frequency
--tls              bool       false     Enable TLS
--tlsCert          string     ""        TLS certificate
--tlsKey           string     ""        TLS key
--pprof            bool       false     Enable pprof
--pprofPort        int        6061      pprof port
--version          bool       false     Print version
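
The interaction between `--maxTTL` and `--cleanupInterval` might look roughly like the Go sketch below. It assumes that a TTL above the maximum is clamped (the real service may reject it instead) and that a background loop calls a sweep function once per cleanup interval; `entry`, `clampTTL`, and `sweep` are hypothetical names.

```go
package main

import (
	"fmt"
	"time"
)

// entry is a hypothetical cache record with an absolute expiry time.
type entry struct {
	value   string
	expires time.Time
}

// clampTTL caps a requested TTL at maxTTL. Clamping (rather than
// rejecting) oversized TTLs is an assumption of this sketch.
func clampTTL(requested, maxTTL time.Duration) time.Duration {
	if requested > maxTTL {
		return maxTTL
	}
	return requested
}

// sweep deletes every entry that has expired as of now and reports how
// many were removed. A cleanup loop would call it once per interval.
func sweep(store map[string]entry, now time.Time) int {
	removed := 0
	for k, e := range store {
		if !e.expires.After(now) {
			delete(store, k)
			removed++
		}
	}
	return removed
}

func main() {
	maxTTL := 600 * time.Second
	now := time.Now()
	store := map[string]entry{
		"a": {"fresh", now.Add(clampTTL(time.Hour, maxTTL))},
		"b": {"stale", now.Add(-time.Second)},
	}
	fmt.Println(clampTTL(time.Hour, maxTTL))   // 10m0s
	fmt.Println(sweep(store, now), len(store)) // 1 1
}
```

A shorter cleanup interval frees expired entries sooner at the cost of scanning the store more often.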

Environment Variables

Prefix: ALPCRUNCH_

export ALPCRUNCH_LOGLEVEL="info"
export ALPCRUNCH_MAXTTL="600s"
export ALPCRUNCH_CLEANUPINTERVAL="1s"
export METALCORE_API_KEY="your-secret-key"

Configuration File

# central-cache.yaml
loglevel: "info"

port: 2337
metricsPort: 8081

maxTTL: "600s"
cleanupInterval: "1s"

tls: true
tlsCert: "/certs/server.crt"
tlsKey: "/certs/server.key"

Node Cache

Command-Line Flags

./node-cache [OPTIONS]
Flag                Type     Default                Description
--loglevel          string   "info"                 Log level
--logfile           string   ""                     Log file path
--pretty            bool     false                  Pretty-print logs
--addSource         bool     false                  Add source to logs
--port              int      3337                   gRPC server port
--metricsPort       int      8082                   Metrics port
--centralCacheAddr  string   "central-cache:2337"   Central cache address
--cacheSize         string   "1GB"                  Local cache size
--tls               bool     false                  Enable TLS
--tlsCert           string   ""                     TLS certificate (server)
--tlsKey            string   ""                     TLS key (server)
--caCert            string   ""                     CA cert (for central cache)
--pprof             bool     false                  Enable pprof
--pprofPort         int      6062                   pprof port
--version           bool     false                  Print version

Environment Variables

Prefix: ALPCRUNCH_

export ALPCRUNCH_LOGLEVEL="info"
export ALPCRUNCH_CENTRALCACHEADDR="central-cache:2337"
export ALPCRUNCH_CACHESIZE="1GB"
export METALCORE_API_KEY="your-secret-key"

Configuration File

# node-cache.yaml
loglevel: "info"

port: 3337
metricsPort: 8082

centralCacheAddr: "central-cache:2337"
cacheSize: "1GB"

tls: true
tlsCert: "/certs/server.crt"
tlsKey: "/certs/server.key"
caCert: "/certs/ca.crt"

Client Configuration

Environment Variables

# Required
export METALCORE_API_KEY="your-secret-key"

# Optional
export METALCORE_QUEUE_MANAGER_ADDR="queue-manager:1337"
export METALCORE_CACHE_ADDR="central-cache:2337"
export METALCORE_CA_CERT="/certs/ca.crt"

Connection Configuration

import (
    "log"
    "os"
    "github.com/limelabs/metalcore-neo/pkg/grpccon"
)

// Queue manager connection (the final argument enables TLS)
queueConn, queueClient, err := grpccon.ConnectToQueue(
    os.Getenv("METALCORE_QUEUE_MANAGER_ADDR"),
    os.Getenv("METALCORE_CA_CERT"),
    true)
if err != nil {
    log.Fatalf("connect to queue manager: %v", err)
}

Worker Configuration

Environment Variables

# Required
export METALCORE_API_KEY="your-secret-key"
export METALCORE_WORKER_NODE="$(hostname)"
export METALCORE_WORKER_POD="worker-pod-123"

# Optional
export METALCORE_QUEUE_MANAGER_ADDR="queue-manager:1337"
export METALCORE_CACHE_ADDR="localhost:3337"
export METALCORE_CA_CERT="/certs/ca.crt"
export METALCORE_NUM_THREADS="8"
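
A worker could validate these variables at startup along the following lines. This is a sketch under stated assumptions, not the worker's actual code: `workerConfig` and `loadWorkerConfig` are hypothetical, and falling back to the CPU count when `METALCORE_NUM_THREADS` is unset is an illustrative choice, since the real default is not documented here.

```go
package main

import (
	"fmt"
	"runtime"
	"strconv"
)

// workerConfig mirrors the environment variables listed above.
type workerConfig struct {
	APIKey, Node, Pod            string
	QueueAddr, CacheAddr, CACert string
	NumThreads                   int
}

// loadWorkerConfig reads settings through getenv so it can be exercised
// without touching the process environment. Pass os.Getenv in real use.
func loadWorkerConfig(getenv func(string) string) (workerConfig, error) {
	// Required variables: fail fast if any is missing.
	for _, name := range []string{
		"METALCORE_API_KEY", "METALCORE_WORKER_NODE", "METALCORE_WORKER_POD",
	} {
		if getenv(name) == "" {
			return workerConfig{}, fmt.Errorf("%s is required", name)
		}
	}
	cfg := workerConfig{
		APIKey:     getenv("METALCORE_API_KEY"),
		Node:       getenv("METALCORE_WORKER_NODE"),
		Pod:        getenv("METALCORE_WORKER_POD"),
		QueueAddr:  getenv("METALCORE_QUEUE_MANAGER_ADDR"),
		CacheAddr:  getenv("METALCORE_CACHE_ADDR"),
		CACert:     getenv("METALCORE_CA_CERT"),
		NumThreads: runtime.NumCPU(), // hypothetical default
	}
	if s := getenv("METALCORE_NUM_THREADS"); s != "" {
		n, err := strconv.Atoi(s)
		if err != nil || n < 1 {
			return workerConfig{}, fmt.Errorf("METALCORE_NUM_THREADS: invalid value %q", s)
		}
		cfg.NumThreads = n
	}
	return cfg, nil
}

func main() {
	// An in-memory environment keeps the example self-contained.
	env := map[string]string{
		"METALCORE_API_KEY":     "your-secret-key",
		"METALCORE_WORKER_NODE": "node-1",
		"METALCORE_WORKER_POD":  "worker-pod-123",
		"METALCORE_NUM_THREADS": "8",
	}
	cfg, err := loadWorkerConfig(func(k string) string { return env[k] })
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Println(cfg.Node, cfg.Pod, cfg.NumThreads) // node-1 worker-pod-123 8
}
```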

Kubernetes Configuration

In Kubernetes, populate METALCORE_WORKER_NODE and METALCORE_WORKER_POD automatically via the Downward API:

env:
- name: METALCORE_WORKER_NODE
  valueFrom:
    fieldRef:
      fieldPath: spec.nodeName
- name: METALCORE_WORKER_POD
  valueFrom:
    fieldRef:
      fieldPath: metadata.name

Logging Configuration

Log Levels

  • trace: Very verbose, all operations
  • debug: Detailed debugging info
  • info: General informational messages (default)
  • warn: Warnings and potential issues
  • error: Error conditions only
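
The level names above, together with the `--pretty` and `--addSource` flags, map naturally onto Go's standard `log/slog` package. The sketch below assumes an slog-style setup, which this reference does not confirm; `parseLevel`, `newLogger`, and the custom trace level of -8 are illustrative, and slog's text and source output differ in shape from the services' own formats.

```go
package main

import (
	"log/slog"
	"os"
	"strings"
)

// levelTrace is a custom level below Debug; slog has no built-in trace.
const levelTrace = slog.Level(-8)

// parseLevel maps a log level string to an slog level, defaulting to info.
func parseLevel(s string) slog.Level {
	switch strings.ToLower(s) {
	case "trace":
		return levelTrace
	case "debug":
		return slog.LevelDebug
	case "warn":
		return slog.LevelWarn
	case "error":
		return slog.LevelError
	default:
		return slog.LevelInfo
	}
}

// newLogger builds a JSON logger, or a human-readable text logger when
// pretty is true. addSource attaches the caller's file and line.
func newLogger(level string, pretty, addSource bool) *slog.Logger {
	opts := &slog.HandlerOptions{Level: parseLevel(level), AddSource: addSource}
	if pretty {
		return slog.New(slog.NewTextHandler(os.Stdout, opts))
	}
	return slog.New(slog.NewJSONHandler(os.Stdout, opts))
}

func main() {
	logger := newLogger("info", false, false)
	logger.Debug("dropped: below the info threshold")
	logger.Info("Session created", "session_id", "01HZ...")
}
```

Only the second call produces output, because debug messages fall below the configured info threshold.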

Log Format

Standard (default):

{"time":"2025-01-15T10:30:00Z","level":"INFO","msg":"Session created","session_id":"01HZ..."}

Pretty (--pretty):

2025-01-15 10:30:00 INFO  Session created session_id=01HZ...

With Source (--addSource):

{"time":"2025-01-15T10:30:00Z","level":"INFO","source":"queue.go:123","msg":"Session created"}

Log Destinations

Stdout (default):

./queue-manager

File:

./queue-manager --logfile /var/log/queue-manager.log

Systemd journal:

# Logs automatically captured by systemd
systemctl start queue-manager
journalctl -u queue-manager -f

TLS Configuration

Certificate Requirements

Server Certificate (queue-manager, caches):

  • Valid for service hostname
  • Subject Alternative Names (SANs) for Kubernetes DNS
  • PEM-encoded

Client Certificate (optional, for mTLS):

  • Issued by same CA
  • Used for mutual authentication

Generating Certificates

Using openssl:

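# Generate CA
openssl genrsa -out ca.key 4096
openssl req -new -x509 -days 365 -key ca.key -out ca.crt \
  -subj "/CN=ALPCRUN.CH CA"

# Generate server key and certificate signing request
openssl genrsa -out server.key 4096
openssl req -new -key server.key -out server.csr \
  -subj "/CN=queue-manager.alpcrun.svc.cluster.local"

# Sign with CA and add the SAN; Go-based TLS clients ignore the CN
# and verify the hostname against SANs only.
# (The <(...) process substitution requires bash.)
openssl x509 -req -in server.csr -CA ca.crt -CAkey ca.key \
  -CAcreateserial -out server.crt -days 365 \
  -extfile <(printf "subjectAltName=DNS:queue-manager.alpcrun.svc.cluster.local")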

Configuration Example

Queue Manager (server):

./queue-manager \
  --tls \
  --tlsCert /certs/server.crt \
  --tlsKey /certs/server.key

Client:

export METALCORE_CA_CERT=/certs/ca.crt
./client  # Uses CA cert for verification
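
On the client side, trusting the CA typically means loading the PEM file into a certificate pool. The Go sketch below shows that step with the standard library only; `tlsConfigFromCA` is a hypothetical helper, and how the ALPCRUN.CH client actually consumes `METALCORE_CA_CERT` internally is not shown in this reference.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
	"os"
)

// tlsConfigFromCA builds a client TLS config that trusts only the
// certificates found in the given PEM bundle.
func tlsConfigFromCA(caPEM []byte) (*tls.Config, error) {
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caPEM) {
		return nil, errors.New("no valid certificates found in CA bundle")
	}
	return &tls.Config{RootCAs: pool, MinVersion: tls.VersionTLS12}, nil
}

func main() {
	path := os.Getenv("METALCORE_CA_CERT")
	if path == "" {
		fmt.Println("METALCORE_CA_CERT not set; nothing to load")
		return
	}
	caPEM, err := os.ReadFile(path)
	if err != nil {
		fmt.Println("read CA cert:", err)
		return
	}
	if _, err := tlsConfigFromCA(caPEM); err != nil {
		fmt.Println("bad CA cert:", err)
		return
	}
	fmt.Println("CA cert loaded from", path)
}
```

Checking the return value of `AppendCertsFromPEM` matters: a file that is not valid PEM is otherwise silently ignored, and every handshake then fails with an unknown-authority error.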

Prometheus Metrics

Exposed Metrics

Queue Manager (:8080/metrics):

# Connections
alpcrun_clients_connected
alpcrun_workers_connected

# Sessions
alpcrun_sessions_active
alpcrun_sessions_created_total
alpcrun_sessions_closed_total

# Queue depths
alpcrun_queue_task_depth{session_id}
alpcrun_queue_result_depth{session_id}
alpcrun_queue_deadletter_depth{session_id}

# Throughput
alpcrun_tasks_submitted_total{session_id}
alpcrun_results_collected_total{session_id}

# gRPC
grpc_server_handled_total{grpc_method,grpc_code}
grpc_server_handling_seconds{grpc_method}

Central Cache (:8081/metrics):

# Operations
alpcrun_cache_operations_total{operation}
alpcrun_cache_entries

# Performance
alpcrun_cache_operation_duration_seconds{operation}

Node Cache (:8082/metrics):

# Cache performance
alpcrun_cache_hits_total
alpcrun_cache_misses_total
alpcrun_cache_size_bytes

# gRPC to central cache
grpc_client_handled_total{grpc_method,grpc_code}
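
As an example of putting these metrics to use, the Go sketch below derives a node cache hit ratio from the text that a `/metrics` endpoint returns. It assumes the two counters are exposed without labels, as listed above; `metricValue`, `hitRatio`, and the sample numbers are hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// metricValue scans Prometheus text exposition output for an unlabelled
// sample with the given name and returns its value.
func metricValue(exposition, name string) (float64, bool) {
	sc := bufio.NewScanner(strings.NewReader(exposition))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) >= 2 && fields[0] == name {
			v, err := strconv.ParseFloat(fields[1], 64)
			return v, err == nil
		}
	}
	return 0, false
}

// hitRatio returns hits / (hits + misses), or 0 when there were no lookups.
func hitRatio(exposition string) float64 {
	hits, _ := metricValue(exposition, "alpcrun_cache_hits_total")
	misses, _ := metricValue(exposition, "alpcrun_cache_misses_total")
	if hits+misses == 0 {
		return 0
	}
	return hits / (hits + misses)
}

func main() {
	// Made-up sample in Prometheus text format.
	sample := "# TYPE alpcrun_cache_hits_total counter\n" +
		"alpcrun_cache_hits_total 900\n" +
		"alpcrun_cache_misses_total 100\n"
	fmt.Printf("hit ratio: %.2f\n", hitRatio(sample)) // hit ratio: 0.90
}
```

In a running deployment the same ratio is usually computed in Prometheus itself from the rates of the two counters, rather than by parsing the endpoint by hand.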

Prometheus Configuration

# prometheus.yml
scrape_configs:
  - job_name: 'queue-manager'
    static_configs:
      - targets: ['queue-manager:8080']

  - job_name: 'central-cache'
    static_configs:
      - targets: ['central-cache:8081']

  - job_name: 'node-cache'
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      - source_labels: [__meta_kubernetes_pod_label_app]
        action: keep
        regex: node-cache
      - source_labels: [__meta_kubernetes_pod_ip]
        target_label: __address__
        replacement: ${1}:8082

Profiling Configuration

Enable pprof

./queue-manager --pprof --pprofPort 6060

Access Profiles

CPU Profile:

go tool pprof "http://localhost:6060/debug/pprof/profile?seconds=30"

Heap Profile:

go tool pprof http://localhost:6060/debug/pprof/heap

Goroutines:

curl "http://localhost:6060/debug/pprof/goroutine?debug=2"
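
For context, a Go service can expose these endpoints on a dedicated port by registering the standard `net/http/pprof` handlers on their own mux. The sketch below is one way to do it and is not taken from the ALPCRUN.CH source; `newPprofMux` and `routeFor` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/pprof"
)

// newPprofMux registers the standard pprof handlers on a dedicated mux
// so they are served only on the profiling port.
func newPprofMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/debug/pprof/", pprof.Index) // also serves heap, goroutine, ...
	mux.HandleFunc("/debug/pprof/cmdline", pprof.Cmdline)
	mux.HandleFunc("/debug/pprof/profile", pprof.Profile)
	mux.HandleFunc("/debug/pprof/symbol", pprof.Symbol)
	mux.HandleFunc("/debug/pprof/trace", pprof.Trace)
	return mux
}

// routeFor reports which registered pattern would serve a request path.
func routeFor(mux *http.ServeMux, path string) string {
	req, err := http.NewRequest(http.MethodGet, "http://localhost"+path, nil)
	if err != nil {
		return ""
	}
	_, pattern := mux.Handler(req)
	return pattern
}

func main() {
	mux := newPprofMux()
	fmt.Println(routeFor(mux, "/debug/pprof/heap"))    // /debug/pprof/
	fmt.Println(routeFor(mux, "/debug/pprof/profile")) // /debug/pprof/profile
	// A service would then serve the mux on the pprof port, for example:
	//   go http.ListenAndServe("localhost:6060", mux)
}
```

Keeping pprof on a separate mux and port means the profiling endpoints never share a listener with the gRPC or metrics traffic, so they can be firewalled independently.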

Docker Compose Configuration

Complete example:

# docker-compose.yml
version: '3.8'

services:
  queue-manager:
    image: alpcrun/queue-manager:latest
    command:
      - --appid=docker-app
      - --loglevel=info
      - --bufferSize=1000
    ports:
      - "1337:1337"
      - "8080:8080"
    environment:
      - METALCORE_API_KEY=${METALCORE_API_KEY}
    volumes:
      - ./certs:/certs:ro

  central-cache:
    image: alpcrun/central-cache:latest
    command:
      - --maxTTL=600s
      - --cleanupInterval=1s
    ports:
      - "2337:2337"
      - "8081:8081"
    environment:
      - METALCORE_API_KEY=${METALCORE_API_KEY}

  node-cache:
    image: alpcrun/node-cache:latest
    command:
      - --centralCacheAddr=central-cache:2337
      - --cacheSize=1GB
    ports:
      - "3337:3337"
      - "8082:8082"
    environment:
      - METALCORE_API_KEY=${METALCORE_API_KEY}
    depends_on:
      - central-cache

  worker:
    image: alpcrun/demo-worker:latest
    environment:
      - METALCORE_API_KEY=${METALCORE_API_KEY}
      - METALCORE_WORKER_NODE=${HOSTNAME}
      - METALCORE_WORKER_POD=docker-worker-${HOSTNAME}
      - METALCORE_QUEUE_MANAGER_ADDR=queue-manager:1337
      - METALCORE_CACHE_ADDR=node-cache:3337
    depends_on:
      - queue-manager
      - node-cache
    deploy:
      replicas: 4

Kubernetes Configuration

See deployment examples in the repository:

  • Helm charts: deploy/helm/
  • Raw manifests: deploy/k8s/

Key Kubernetes Settings

Queue Manager Deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: queue-manager
spec:
  replicas: 1  # Can be increased for HA
  template:
    spec:
      containers:
      - name: queue-manager
        image: alpcrun/queue-manager:latest
        args:
          - --appid=production
          - --loglevel=info
          - --tls
          - --tlsCert=/certs/tls.crt
          - --tlsKey=/certs/tls.key
        resources:
          requests:
            cpu: "2"
            memory: "4Gi"
          limits:
            cpu: "4"
            memory: "8Gi"

Node Cache DaemonSet:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-cache
spec:
  template:
    spec:
      containers:
      - name: node-cache
        image: alpcrun/node-cache:latest
        args:
          - --centralCacheAddr=central-cache:2337
          - --cacheSize=2GB
        resources:
          requests:
            cpu: "500m"
            memory: "2Gi"

Best Practices

  1. Security:
  2. Always use TLS in production
  3. Rotate API keys regularly
  4. Use Kubernetes secrets for sensitive data

  5. Performance:

  6. Adjust buffer sizes based on workload
  7. Use priority queues for mixed workloads
  8. Deploy node cache on all compute nodes

  9. Reliability:

  10. Enable structured logging
  11. Monitor metrics continuously
  12. Set appropriate resource limits

  13. Debugging:

  14. Use debug log level sparingly (performance impact)
  15. Enable pprof only when needed
  16. Add source info for development only

Next Steps