Wow, thanks for pointing this out. Well explained, I really appreciate it.
But why do we need Event Numbering if Kafka guarantees ordering within a partition and also provides (incremental) offset?
I have implemented the outbox pattern using Debezium and Kafka - but can't find much on concrete implementation of the inbox pattern - I want to ack the receipt from Kafka and write to the target DB in one transaction. Any pointers to patterns there (dotnetcore).
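For anyone searching for the same thing, the core of an inbox handler can be sketched like this (a minimal sketch, assuming sqlite3 purely for illustration; table and column names are made up, and the Kafka offset would be committed only after the DB transaction succeeds):

```python
import sqlite3

# Sketch of an "inbox" consumer. Redelivery is possible because the Kafka
# offset is committed only AFTER this transaction succeeds, so the inbox
# table's primary key is what makes the handler idempotent.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE inbox (message_id TEXT PRIMARY KEY);
    CREATE TABLE orders (order_id TEXT PRIMARY KEY, amount INTEGER);
""")

def handle(message_id: str, order_id: str, amount: int) -> bool:
    """Apply the message and record its receipt in ONE transaction.
    Returns False if the message was already processed (duplicate)."""
    try:
        with conn:  # BEGIN ... COMMIT; rolls back on exception
            conn.execute("INSERT INTO inbox (message_id) VALUES (?)",
                         (message_id,))
            conn.execute("INSERT INTO orders (order_id, amount) VALUES (?, ?)",
                         (order_id, amount))
        return True
    except sqlite3.IntegrityError:
        # Inbox primary-key violation: duplicate delivery, safely ignored.
        return False

handle("m-1", "o-1", 100)   # first delivery: applied
handle("m-1", "o-1", 100)   # redelivery: rolled back, skipped
```

The same shape carries over to dotnetcore with a transaction scope around both inserts; the essential point is that the "already seen" marker and the business write share one transaction.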
Careful with EventNumber! A common assumption is that an RDBMS auto-incrementing sequence can be used to implement a "tape" that consumers can tail. Without additional locking, this results in a bug that is hard to track down! Consider the following two concurrent transactions:
A> BEGIN;
B> BEGIN;
A> INSERT INTO ...; -- we get EventNumber = 1 from the auto-incrementing sequence
B> INSERT INTO ...; -- we get EventNumber = 2 from the auto-incrementing sequence
B> COMMIT;
-- this is a moment in time that we'll discuss below
A> COMMIT;
For a brief moment - marked above - a consumer reading the table will see only the record with EventNumber = 2 (because it reads only committed rows) and not the one with EventNumber = 1. Sequence order is therefore not the same as commit visibility order! If the consumer now updates its LastProcessedEventNumber to 2 and uses that in a subsequent query (WHERE EventNumber > 2), it will never see the record with EventNumber = 1, because that record only becomes visible after the consumer's read. From the consumer's point of view, event number 1 will be missing forever (or until you reset LastProcessedEventNumber).
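The interleaving above can be reproduced without a database at all (a pure-Python simulation; the names are illustrative): the sequence hands out numbers in allocation order, but rows become visible in commit order, which is all the consumer ever observes.

```python
from itertools import count

sequence = count(1)   # stands in for the auto-incrementing sequence
committed = []        # rows visible to the consumer (committed only)

def poll(last_seen: int) -> list:
    """Consumer query: SELECT ... WHERE EventNumber > last_seen."""
    return sorted(n for n in committed if n > last_seen)

a = next(sequence)          # A> INSERT ... -> EventNumber = 1
b = next(sequence)          # B> INSERT ... -> EventNumber = 2
committed.append(b)         # B> COMMIT

last_seen = max(poll(0))    # consumer sees only [2], advances to 2
committed.append(a)         # A> COMMIT: event 1 becomes visible too late

print(poll(last_seen))      # -> []  (event 1 is skipped forever)
```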
It may seem like SERIALIZABLE could help here, but that is a trap - sequences are completely unaffected by transactions, even at the highest isolation level. The only thing that works is explicit locking to ensure that, formally speaking, for each pair of events A and B, if A.EventNumber < B.EventNumber, then A becomes visible no later than B. This can be enforced by taking a lock before generating the EventNumber and releasing it only on transaction commit/rollback, so the lock essentially spans the entire transaction. That lock is also the major throughput limiter of SQL offset-based solutions. For this reason, I've been leaning towards set-based (INSERT/DELETE) Outbox implementations lately.
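The locking fix, continuing the same simulation (a sketch; in Postgres the equivalent would be something like an advisory lock taken before nextval() and released at transaction end): one mutex is held from EventNumber allocation until commit, so sequence order and visibility order can no longer diverge.

```python
import threading
from itertools import count

sequence = count(1)
committed = []                    # rows visible to consumers
outbox_lock = threading.Lock()    # held for the WHOLE "transaction"

def append_event(do_work) -> int:
    """Allocate an EventNumber and commit while holding the lock, so no
    later-numbered event can become visible before an earlier one."""
    with outbox_lock:
        n = next(sequence)        # EventNumber allocated under the lock
        do_work()                 # business writes happen here
        committed.append(n)       # COMMIT: row becomes visible
    return n

append_event(lambda: None)  # event 1 is visible before 2 is even allocated
append_event(lambda: None)  # event 2
```

This is exactly why the lock caps throughput: every producer transaction serializes on it for its full duration.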
I guess the bottom line is: implementing pub/sub using RDBMS is not as easy as it seems. Sincerely, a fellow implementor of messaging and Event Sourcing infrastructure.