incremental-sync.ts and api/cw/sync.ts imported getBoss() from workert.ts.
When workert.ts (the entry point) dynamically imported incremental-sync.ts,
it triggered a circular module re-evaluation that hung indefinitely.
Extract the PgBoss singleton and getBoss() factory to a new boss-instance.ts
module that neither has top-level async side-effects nor imports from
workert.ts. All consumers (workert.ts, index.ts, incremental-sync.ts,
cw/sync.ts) now import from boss-instance.ts instead.
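
A minimal sketch of the lazy-singleton shape described above. FakeBoss is a
hypothetical stand-in for the PgBoss client, and the default connection string
is made up for illustration; the real boss-instance.ts would import the class
from pg-boss and export getBoss(). The point is that nothing runs at import
time, so evaluating the module cannot block or re-enter the entry point.

```typescript
// boss-instance.ts (sketch): no top-level await, no import from the entry point.

// Hypothetical stand-in for the PgBoss client (assumption, not the real class).
class FakeBoss {
  connectionString: string;
  started = false;

  constructor(connectionString: string) {
    this.connectionString = connectionString;
  }

  async start(): Promise<void> {
    this.started = true;
  }
}

// Module-level cache; stays undefined until the first getBoss() call.
let instance: FakeBoss | undefined;

// Creates the client on first use and returns the same object afterwards.
// In the real module this function would be exported.
function getBoss(connectionString = "postgres://localhost/example"): FakeBoss {
  if (!instance) {
    instance = new FakeBoss(connectionString);
  }
  return instance;
}
```

Because construction is deferred to the first call, consumers can import the
module in any order without triggering side effects.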
- Add statement_timeout=30000ms to PgBoss connection URL to prevent
SQL queries from hanging indefinitely
- Set connectionTimeoutMillis to 15s in the PgBoss config to bound connection attempts
- Wrap boss.start() in 30s Promise.race timeout with process.exit(1)
on failure to ensure container restarts instead of hanging silently
- Add debug logging around PgBoss startup to diagnose connection issues
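
The Promise.race timeout can be sketched roughly as follows. withTimeout and
its label parameter are hypothetical names chosen for illustration, not
necessarily what the worker uses.

```typescript
// Rejects if `work` has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;

  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms}ms`)), ms);
  });

  // Whichever promise settles first wins. The timer is always cleared so a
  // successful start-up does not leave a pending timeout behind.
  return Promise.race([work, timeout]).finally(() => {
    if (timer !== undefined) clearTimeout(timer);
  });
}

// Usage sketch (assumes a `boss` object with a start() method):
//   withTimeout(boss.start(), 30_000, "boss.start()").catch((err) => {
//     console.error(err);
//     process.exit(1); // let the container restart instead of hanging
//   });
```

Note that the race only stops waiting; it does not cancel the underlying
start() call, which is why exiting the process on failure matters.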
The CW MSSQL and API Postgres addresses are internal to the cluster and
unreachable from GitHub-hosted runners, so the sync must run inside k8s.
- Add dalpuri-sync Docker stage to api/Dockerfile: installs deps,
generates both Prisma clients, and runs dalpuri/src/sync.ts
- Add dalpuri/kubernetes/sync-job.yaml: mounts api-env-secret (which
already contains CW_DATABASE_URL) and maps DATABASE_URL -> API_DATABASE_URL
- build-api job now also pushes optima-dalpuri-sync:TAG image
- sync-cw-to-api CI job replaced with kubectl apply/wait pattern,
needs [build-api, build-worker], blocks deploy-api and deploy-worker
The socket retrieved from ensureManagerSocketReady() was never passed to
enqueueDalpuriFullSync(), so inside createWorkerJob the socket.emit('requestId')
call crashed with 'TypeError: undefined is not an object (evaluating A.emit)'.
This caused every full sync job to fail immediately, leaving the DB empty.
The 5s incremental sync interval then flooded the queue with 4700+ jobs that
all failed too since there was no data.
Also manually cleared the backlog of 4720 failed/pending incremental jobs and
2 failed full sync jobs from the production queue.
Two bugs in the catch-up migration that only manifest with real production data:
1. Company (4520 rows): uid was added as TEXT NOT NULL DEFAULT '', so every
existing row got uid='' and creating the PRIMARY KEY failed with
'could not create unique index, Key (uid)=() is duplicated'.
Fix: add uid as nullable, UPDATE uid = id (copies the existing CUID text
PK into uid), then SET NOT NULL, then swap PK. Also populate the new
integer id column from cw_CompanyId (which is fully populated in prod).
2. UnifiSite (180 rows): the old approach simply dropped the text companyId and
added an integer column filled with NULLs, destroying all company relationships.
Fix: add companyId_int, UPDATE via JOIN on Company.uid (= old Company.id
text), drop old text column, rename integer column.
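
The backfill logic of both fixes can be modelled in memory as below. This is
only an illustration of the data flow, not the migration SQL; the interfaces
and function names are hypothetical, and only the column names come from the
description above.

```typescript
// Rows as they looked before the migration: text (CUID) primary keys.
interface OldCompany { id: string; cw_CompanyId: number }
interface OldUnifiSite { id: string; companyId: string }

// Rows after the migration: integer id, old text id preserved as uid.
interface NewCompany { id: number; uid: string }
interface NewUnifiSite { id: string; companyId: number | null }

// Fix 1: copy the old text id into uid, take the new integer id from cw_CompanyId.
function migrateCompanies(rows: OldCompany[]): NewCompany[] {
  const migrated = rows.map((row) => ({ id: row.cw_CompanyId, uid: row.id }));
  // Mirrors the PK swap: it can only succeed if every uid is distinct,
  // which is exactly what DEFAULT '' violated.
  const uids = new Set(migrated.map((c) => c.uid));
  if (uids.size !== migrated.length) throw new Error("duplicate uid");
  return migrated;
}

// Fix 2: resolve each site's old text companyId through Company.uid.
function migrateSites(sites: OldUnifiSite[], companies: NewCompany[]): NewUnifiSite[] {
  const idByUid = new Map(companies.map((c) => [c.uid, c.id] as const));
  return sites.map((s) => ({ id: s.id, companyId: idByUid.get(s.companyId) ?? null }));
}
```

In this model a site whose old companyId matches no company ends up with a
null companyId rather than losing the row.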
Also fix the P3009 handler in migrate-entrypoint.sh: Prisma may emit ANSI
color codes even without a TTY, wrapping backticks in escape sequences and
breaking the regex match. Fix: strip ANSI codes with sed before extracting
the migration name. Also replace the rigid format regex with a plain grep
for the backtick-delimited content.
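
The same strip-then-extract idea, sketched in TypeScript rather than the
sed/grep used in migrate-entrypoint.sh. extractFailedMigration is a
hypothetical name, and the pattern only covers SGR color/style sequences,
which is an assumption about what Prisma emits.

```typescript
// Matches SGR color/style sequences such as "\x1b[1m" or "\x1b[0;31m".
const ANSI_SGR = /\x1b\[[0-9;]*m/g;

// Returns the first backtick-delimited token in the output, or null if none.
function extractFailedMigration(output: string): string | null {
  // Strip escape sequences first so they cannot sit between or inside backticks.
  const plain = output.replace(ANSI_SGR, "");
  const match = plain.match(/`([^`]+)`/);
  return match ? match[1] : null;
}
```

Stripping before matching means the extraction works whether the escape
sequences wrap the backticks or appear inside them.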
Production DB manually unblocked (migrate resolve --rolled-back) so the
next deploy will cleanly apply the corrected migration.
Under set -e, POSIX sh exits the script on the assignment line when the
command substitution exits non-zero -- before the subsequent echo ever runs.
    DEPLOY_OUTPUT=$(cmd 2>&1)   # <- script exits here if cmd fails
    EXIT_CODE=$?
    echo "$DEPLOY_OUTPUT"       # <- never reached
Fix: use the || idiom. set -e is ignored for every command in an AND-OR
list except the last, so the failing assignment no longer aborts the script,
and $? still holds the real exit code:
    EXIT_CODE=0
    DEPLOY_OUTPUT=$(cmd 2>&1) || EXIT_CODE=$?
    echo "$DEPLOY_OUTPUT"       # <- always runs
Applied the same fix to the resolve call.
All schema changes that were applied via 'prisma db push' over the past
several months were never captured in migration files. When the postgres
pod restarted just before the migration job ran, the database was rebuilt
from the 15 existing migrations -- creating an old schema that was missing
~20 tables and significant structural changes to User, Opportunity,
CatalogItem, and Company.
This migration bridges the gap idempotently:
- New enums: PhoneType, FaxType, BillingMethod, BillingType, GenderType,
USState, Country, OpportunityInterest
- User: add firstName/lastName/title/active/hidden/cwMemberId/updatedBy;
drop emailVerified/name; make userId nullable
- CatalogItem: TEXT id → INTEGER id + TEXT uid PK; restructure FK columns
- Company: TEXT id → INTEGER id + TEXT uid PK; drop old CW columns; add
dateDeleted/deleteFlag/phone/taxExempt/taxId/website/enteredById
- Opportunity: TEXT id → INTEGER id + TEXT uid PK; drop ~25 flat CW
columns; add typeId/statusId/contactId/siteId/locationId/departmentId/
closedById/primarySalesRepId/secondarySalesRepId/eneteredBy/updatedBy/
oppNarrative/taxCodeId/interest; drop cwDateEntered
- UnifiSite: companyId TEXT → INTEGER
- 20+ new tables: CorporateLocation, InternalDepartment, CompanyAddress,
Contact, CatalogItemType, CatalogCategory, CatalogSubcategory,
CatalogManufacturer, Warehouse, WarehouseBin, ProductInventory,
MinimumStockByWarehouse, ProductData, ServiceTicket, ServiceTicketNote,
ServiceTicketType, ServiceTicketBoard, ServiceTicketLocation,
ServiceTicketSource, ServiceTicketImpact, ServiceTicketPriority,
ServiceTicketServerity, ServiceTicketFinalData, OpportunityType,
OpportunityStatus, ScheduleStatus, ScheduleType, ScheduleSpan,
Schedule, TaxCode
Verified: all 16 migrations apply cleanly on a fresh DB and produce zero
schema drift (prisma migrate diff outputs '-- This is an empty migration.')
Fixes P2022 ColumnNotFound errors on login and all model queries.