Concurrency Control
Handling simultaneous updates across integrated systems
Concurrency control addresses the challenge of managing simultaneous updates to data, which becomes more complex when integrating multiple systems with different update mechanisms.
What is Concurrency Control?
Concurrency control consists of techniques that coordinate simultaneous operations on shared data to maintain consistency. In integration scenarios, this means:
- Preventing lost updates when multiple systems modify the same data
- Detecting when changes conflict with each other
- Providing mechanisms to resolve conflicts when they occur
- Ensuring data remains consistent across integrated systems
The Concurrency Problem
In integrated environments, concurrent modifications happen when:
- Different users update the same record in different systems
- Automated processes modify data that's being edited by a user
- Batch processes and real-time updates occur simultaneously
- Multiple integration flows update related data
Without proper concurrency control, these scenarios can lead to:
- Lost Updates: One writer's changes are silently overwritten by another's
- Inconsistent Reads: A system reads data while another update is only partially applied
- Data Corruption: Related records fall out of agreement with each other
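
To make the lost-update case concrete, here is a minimal sketch. The in-memory record and the two "systems" are hypothetical stand-ins for real applications; nothing here reflects a specific product's API.

```python
# Hypothetical shared record; in practice this would live in a database
# or a remote system.
record = {"name": "Acme Corp", "phone": "555-0100", "email": "info@acme.example"}

# Two systems each read the full record at about the same time.
copy_a = dict(record)  # e.g. a user editing in system A
copy_b = dict(record)  # e.g. an automated process in system B

# Each changes a different field on its own copy.
copy_a["phone"] = "555-0199"
copy_b["email"] = "sales@acme.example"

# Both write the whole record back with no concurrency check.
record = copy_a
record = copy_b  # silently overwrites A's phone change

print(record["phone"])  # prints 555-0100: A's update is lost
```

Neither system made a mistake on its own; the loss comes from writing back a stale copy without checking whether the record changed in the meantime.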
Concurrency Control Strategies
Optimistic Concurrency Control
Based on the assumption that conflicts are rare and can be detected after they occur:
- Version Numbers: Increment a counter with each update
- Timestamps: Track when records were last modified
- Hash Values: Compare checksums of record content
- Conditional Updates: Only update if data hasn't changed
Workshop Note
Both FileMaker and QuickBooks Online (QBO) use version numbers for concurrency control. FileMaker's version number is accessible using Get(RecordModificationCount), while QBO uses "SyncToken". Both the FileMaker Data API and the QBO API will reject updates if the version numbers don't match the expected values. We'll be using version numbers in our workshop examples.
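
The version-number approach can be sketched as a conditional update. This is an illustration only: `RecordStore` and `VersionConflict` are hypothetical names for an in-memory store, not the FileMaker Data API or the QBO API.

```python
class VersionConflict(Exception):
    """Raised when the caller's version no longer matches the stored version."""


class RecordStore:
    """Hypothetical in-memory store keeping a version counter per record."""

    def __init__(self):
        self._rows = {}  # record_id -> (version, data)

    def create(self, record_id, data):
        self._rows[record_id] = (0, dict(data))

    def read(self, record_id):
        version, data = self._rows[record_id]
        return version, dict(data)  # return a copy so callers cannot mutate the store

    def update(self, record_id, expected_version, changes):
        current_version, data = self._rows[record_id]
        # Conditional update: refuse the write if someone else updated first.
        if current_version != expected_version:
            raise VersionConflict(
                f"expected version {expected_version}, found {current_version}"
            )
        data.update(changes)
        self._rows[record_id] = (current_version + 1, data)
        return current_version + 1


store = RecordStore()
store.create("cust-1", {"phone": "555-0100"})

# Two clients read the same version.
version_a, _ = store.read("cust-1")
version_b, _ = store.read("cust-1")

store.update("cust-1", version_a, {"phone": "555-0199"})  # succeeds, version becomes 1
try:
    store.update("cust-1", version_b, {"phone": "555-0142"})  # stale version
    conflict_detected = False
except VersionConflict:
    conflict_detected = True
```

In a real system the version check and the write have to happen atomically on the server side; this single-threaded sketch does not need to address that.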
Pessimistic Concurrency Control
Based on the assumption that conflicts are likely and should be prevented before they occur:
- Record Locking: Exclusively lock records during edits
- Serialization: Force operations to occur in sequence
- Semaphores: Limit access to a resource with a counter or flag
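
A minimal sketch of record locking with a timeout, assuming all writers run as threads in one process. `RecordLocks` and `try_edit` are hypothetical names; locking across separate systems would have to be done in whichever system owns the record.

```python
import threading


class RecordLocks:
    """Hypothetical per-record lock table for a single process."""

    def __init__(self):
        self._locks = {}
        self._guard = threading.Lock()  # protects the lock table itself

    def _lock_for(self, record_id):
        with self._guard:
            return self._locks.setdefault(record_id, threading.Lock())

    def try_edit(self, record_id, edit, timeout=1.0):
        """Run edit() while holding the record's lock.

        Returns False if the lock could not be acquired within the timeout.
        """
        lock = self._lock_for(record_id)
        if not lock.acquire(timeout=timeout):
            return False
        try:
            edit()
            return True
        finally:
            lock.release()  # always release, even if edit() raises


locks = RecordLocks()
balance = {"value": 0}


def add_ten():
    balance["value"] += 10


threads = [
    threading.Thread(target=locks.try_edit, args=("acct-1", add_ten))
    for _ in range(5)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The lock serializes the five edits, and the timeout means a caller gives up with `False` instead of waiting forever on a lock that is never released.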
Conflict Resolution Approaches
When conflicts are detected, there are several resolution strategies:
- Last-Writer-Wins: The most recent update is accepted
- First-Writer-Wins: The earliest update is accepted
- Field-Level Merge: Combine non-conflicting field changes
- Manual Resolution: Present differences to a user for resolution
- Business Rule Resolution: Apply domain-specific rules to resolve conflicts
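
Field-level merge can be sketched as a three-way comparison against the last version both sides agreed on. The function name `merge_fields` and the sample data are hypothetical. The sketch assumes flat records that all carry the same fields.

```python
def merge_fields(base, ours, theirs):
    """Three-way merge of flat records.

    base   -- the last version both sides agreed on
    ours   -- our edited copy
    theirs -- the other system's edited copy

    Returns (merged, conflicts). A conflicting field keeps its base value in
    `merged` and is reported in `conflicts` as (our_value, their_value).
    """
    merged = dict(base)
    conflicts = {}
    for field in base:
        b, o, t = base[field], ours[field], theirs[field]
        if o == t:
            merged[field] = o       # both agree (or neither changed it)
        elif o == b:
            merged[field] = t       # only the other side changed it
        elif t == b:
            merged[field] = o       # only we changed it
        else:
            conflicts[field] = (o, t)  # both changed it differently
    return merged, conflicts


base = {"phone": "555-0100", "email": "info@acme.example", "city": "Austin"}
ours = {"phone": "555-0199", "email": "info@acme.example", "city": "Dallas"}
theirs = {"phone": "555-0100", "email": "sales@acme.example", "city": "Houston"}

merged, conflicts = merge_fields(base, ours, theirs)
```

Here the phone and email changes merge cleanly, while `city` is reported as a conflict and left for manual or business-rule resolution.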
Integration-Specific Concurrency Challenges
Eventual Consistency
When using asynchronous integration:
- Systems may temporarily show different states
- Updates arrive at different times at different systems
- The integration might need to compensate for late-arriving updates
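
One way to compensate for late-arriving updates is to discard any update that is not newer than what the receiving system already holds. This sketch assumes the source system stamps each update with an increasing version number; the `version` field and `apply_update` function are hypothetical.

```python
def apply_update(local, incoming):
    """Apply an incoming update only if it is newer than the local copy.

    Returns True if applied, False if the update was stale or a duplicate.
    """
    if incoming["version"] <= local["version"]:
        return False
    local.update(incoming)
    return True


local = {"id": "cust-1", "version": 3, "phone": "555-0199"}

late = {"id": "cust-1", "version": 2, "phone": "555-0100"}   # arrives out of order
newer = {"id": "cust-1", "version": 4, "phone": "555-0142"}

results = [apply_update(local, late), apply_update(local, newer)]
```

The late update is ignored rather than rolling the record back to an older state, so the systems still converge once all messages have arrived.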
Distributed Transactions
For operations spanning multiple systems:
- No built-in transaction mechanism across system boundaries
- Need for compensating transactions to recover from partial failures
- Coordination of commit operations across different platforms
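
Compensating transactions can be sketched as a list of steps, each paired with an undo action. The two dictionaries stand in for separate systems, and every function name below is hypothetical.

```python
def run_with_compensation(steps):
    """Run (action, compensation) pairs in order.

    If an action raises, run the compensations of the steps that already
    completed, in reverse order, and return False.
    """
    completed = []
    try:
        for action, compensate in steps:
            action()
            completed.append(compensate)
    except Exception:
        for compensate in reversed(completed):
            compensate()
        return False
    return True


# Hypothetical stand-ins for two integrated systems.
system_a, system_b = {}, {}


def create_in_a():
    system_a["inv-1"] = {"total": 100}


def delete_from_a():
    system_a.pop("inv-1", None)


def create_in_b():
    raise RuntimeError("system B unavailable")  # simulated partial failure


def delete_from_b():
    system_b.pop("inv-1", None)


ok = run_with_compensation([
    (create_in_a, delete_from_a),
    (create_in_b, delete_from_b),
])
```

The invoice created in system A is removed again when system B fails. A compensation can itself fail, so a production version would likely also need retries and logging.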
Time Synchronization
For timestamp-based concurrency control:
- Systems may have clock differences
- Time zones can complicate timestamp comparison
- Need for consistent time reference or logical clocks
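
A logical clock orders events without relying on wall-clock time. The sketch below is a basic Lamport clock; the class name and the two example systems are illustrative, and nothing here implies that a particular product provides this.

```python
class LamportClock:
    """Minimal Lamport logical clock."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Advance the clock for a local event or an outgoing message."""
        self.time += 1
        return self.time

    def receive(self, remote_time):
        """Advance past the timestamp carried by an incoming message."""
        self.time = max(self.time, remote_time) + 1
        return self.time


a, b = LamportClock(), LamportClock()

t1 = a.tick()        # local event on A
t2 = a.tick()        # A sends an update stamped t2
t3 = b.receive(t2)   # B receives it and moves past t2
t4 = b.tick()        # B sends a reply stamped t4
t5 = a.receive(t4)   # A receives the reply
```

Each timestamp is larger than the one for any event that caused it, so updates can be ordered even when the systems' wall clocks disagree.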
Best Practices
- Design for concurrency: Plan concurrency control from the beginning
- Use an optimistic approach for low-contention data: When conflicts are rare
- Use a pessimistic approach for high-contention data: When conflicts are common
- Implement timeouts: Don't let locks persist indefinitely
- Keep lock scope narrow: Lock at the most granular level practical
- Document resolution rules: Make clear how conflicts should be resolved
- Log conflicts: Maintain records of detected conflicts and resolutions
- Use application-level concurrency control: Don't rely solely on database locks