Solving Spring Batch ExecutionContext Storage Error
Complete Guide to Fix "Data too long for column 'SERIALIZED_CONTEXT'" Error
Introduction
Today, I want to share my experience resolving an unexpected error that occurred in a batch module using Spring Batch and Quartz. The SERIALIZED_CONTEXT column size limitation error is a common issue when operating batch systems, but it can be challenging to identify the exact cause and solution.
The Problem - Sudden Batch Failure
During a routine nightly batch execution, the job suddenly failed with the following error:
org.springframework.batch.core.JobExecutionException: Flow execution ended unexpectedly
at org.springframework.batch.core.job.flow.FlowJob.doExecute(FlowJob.java:136)
...
Caused by: org.springframework.batch.core.step.StepExecutionException: Flow execution ended unexpectedly
at org.springframework.batch.core.step.job.JobStep.doExecute(JobStep.java:131)
...
Caused by: org.springframework.jdbc.BadSqlGrammarException: PreparedStatementCallback; bad SQL grammar [UPDATE BATCH_JOB_EXECUTION_CONTEXT SET SERIALIZED_CONTEXT = ? WHERE JOB_EXECUTION_ID = ?]
...
Caused by: java.sql.SQLSyntaxErrorException: Data too long for column 'SERIALIZED_CONTEXT' at row 1
Key Error Message: Data too long for column 'SERIALIZED_CONTEXT' at row 1
Initially, I was puzzled: “Data too long? What kind of data is the batch storing so much of?”
Root Cause Analysis - Understanding ExecutionContext
What is Spring Batch ExecutionContext?
Spring Batch uses ExecutionContext to manage the execution state of Jobs and Steps:
// ExecutionContext usage example
@Component
public class SampleTasklet implements Tasklet {

    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) {
        ExecutionContext context = chunkContext.getStepContext()
                .getStepExecution()
                .getJobExecution()
                .getExecutionContext();

        // Store data during execution
        context.put("processedCount", 1000);
        context.put("lastProcessedId", 12345L);

        return RepeatStatus.FINISHED;
    }
}
ExecutionContext Functions:
- Data sharing between Jobs and Steps
- State restoration when restarting (see the sketch below)
- Progress tracking and checkpoint management
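To make the restart behavior concrete, here is a minimal sketch of a Tasklet that resumes from a persisted checkpoint. The key name lastProcessedId and the record-processing step are illustrative, not part of the Spring Batch API:

import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.item.ExecutionContext;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.stereotype.Component;

@Component
public class ResumableTasklet implements Tasklet {

    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) {
        ExecutionContext context = chunkContext.getStepContext()
                .getStepExecution()
                .getExecutionContext();

        // On a restart, Spring Batch restores this context from the database,
        // so the checkpoint written by the failed run is available again.
        long startId = context.getLong("lastProcessedId", 0L);

        // ... process only records with id > startId (illustrative) ...
        long lastId = startId; // would advance as records are processed

        context.putLong("lastProcessedId", lastId);
        return RepeatStatus.FINISHED;
    }
}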
Database Storage Structure
Spring Batch serializes ExecutionContext and stores it in the following tables:
-- Job-level context
CREATE TABLE BATCH_JOB_EXECUTION_CONTEXT (
    JOB_EXECUTION_ID BIGINT NOT NULL,
    SHORT_CONTEXT VARCHAR(2500) NOT NULL,
    SERIALIZED_CONTEXT TEXT, -- This is the problem!
    PRIMARY KEY (JOB_EXECUTION_ID)
);

-- Step-level context
CREATE TABLE BATCH_STEP_EXECUTION_CONTEXT (
    STEP_EXECUTION_ID BIGINT NOT NULL,
    SHORT_CONTEXT VARCHAR(2500) NOT NULL,
    SERIALIZED_CONTEXT TEXT, -- This too!
    PRIMARY KEY (STEP_EXECUTION_ID)
);
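One detail worth knowing: Spring Batch stores the serialized context in SHORT_CONTEXT when it fits and spills the full payload into SERIALIZED_CONTEXT only when it is longer than that column. If you build the JobRepository by hand, the threshold can be kept in sync with a customized schema. A sketch, assuming a manually configured JobRepositoryFactoryBean rather than Boot auto-configuration:

import javax.sql.DataSource;
import org.springframework.batch.core.repository.support.JobRepositoryFactoryBean;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.transaction.PlatformTransactionManager;

@Configuration
public class BatchRepositoryConfig {

    @Bean
    public JobRepositoryFactoryBean jobRepository(DataSource dataSource,
            PlatformTransactionManager transactionManager) {
        JobRepositoryFactoryBean factory = new JobRepositoryFactoryBean();
        factory.setDataSource(dataSource);
        factory.setTransactionManager(transactionManager);
        // Contexts that serialize to at most this many characters stay in
        // SHORT_CONTEXT; longer ones spill over into SERIALIZED_CONTEXT.
        factory.setMaxVarCharLength(2500);
        return factory;
    }
}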
Identifying the Root Cause
Upon examining the ExecutionContext at the time of the error:
// Problematic code (conceptual example)
public class ProblematicProcessor implements ItemProcessor<InputData, OutputData> {

    @Override
    public OutputData process(InputData item) throws Exception {
        ExecutionContext context = getExecutionContext(); // helper assumed for illustration

        // Storing large lists in ExecutionContext (root cause!)
        List<DetailData> processedDetails = processDetailData(item);
        context.put("detailDataList", processedDetails); // Tens of thousands of records

        // Also storing complex objects
        ComplexBusinessObject businessObj = createBusinessObject(item);
        context.put("businessObject", businessObj); // Large object

        return transform(item);
    }
}
Issues:
- Storing massive list data in ExecutionContext
- Complex objects causing size explosion when serialized
- Exceeding the default TEXT type size limit (MySQL: 65,535 bytes); see the sizing sketch below
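To see how quickly Java serialization inflates a single context entry, you can measure one directly. A self-contained sketch; DetailData here is a stand-in for your own domain type:

import java.io.ByteArrayOutputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class SerializedSizeCheck {

    // Stand-in for a real domain object kept in the context
    static class DetailData implements Serializable {
        final long id;
        final String description;
        DetailData(long id, String description) {
            this.id = id;
            this.description = description;
        }
    }

    public static void main(String[] args) throws Exception {
        List<DetailData> details = new ArrayList<>();
        for (long i = 0; i < 10_000; i++) {
            details.add(new DetailData(i, "detail row " + i));
        }

        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(baos)) {
            oos.writeObject(details);
        }

        // Easily hundreds of kilobytes; far past TEXT's 65,535-byte limit
        System.out.println("Serialized size: " + baos.size() + " bytes");
    }
}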
Checking Actual Size
-- Check current stored context size
SELECT
    JOB_EXECUTION_ID,
    LENGTH(SERIALIZED_CONTEXT) AS context_size,
    CHAR_LENGTH(SERIALIZED_CONTEXT) AS context_length
FROM BATCH_JOB_EXECUTION_CONTEXT
ORDER BY context_size DESC;

-- Result: some records exceed 65,000 bytes
Solution 1: Database Schema Modification (Recommended)
Changing Column Type
The most reliable solution is to change the SERIALIZED_CONTEXT column to a larger type:
-- Modify Job execution context table
ALTER TABLE BATCH_JOB_EXECUTION_CONTEXT
MODIFY COLUMN SERIALIZED_CONTEXT LONGTEXT;
-- Modify Step execution context table
ALTER TABLE BATCH_STEP_EXECUTION_CONTEXT
MODIFY COLUMN SERIALIZED_CONTEXT LONGTEXT;
LONGTEXT Advantages:
- Can store up to 4GB
- Resolves ExecutionContext size limitations
- Ensures future scalability
Post-Change Verification
-- Verify column type change
DESCRIBE BATCH_JOB_EXECUTION_CONTEXT;
DESCRIBE BATCH_STEP_EXECUTION_CONTEXT;
-- Test batch execution after change
Solution 2: ExecutionContext Usage Optimization
Improving Problematic Code
// After: store only minimal state in the ExecutionContext
public class ImprovedProcessor implements ItemProcessor<InputData, OutputData> {

    @Override
    public OutputData process(InputData item) throws Exception {
        ExecutionContext context = getExecutionContext(); // helper as in the conceptual example above

        // Improvement: store only essential, small pieces of state
        context.put("lastProcessedId", item.getId());
        context.put("processedCount", getProcessedCount());

        // Keep large data out of the context; hand it to a separate store
        List<DetailData> processedDetails = processDetailData(item);
        cacheService.store("detailData_" + item.getId(), processedDetails);

        return transform(item);
    }
}
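The cacheService above is not a Spring Batch API; it stands for whatever external store your team already has (Redis, a staging table, a file). A hypothetical minimal interface, just to make the dependency explicit:

// Hypothetical abstraction over an external store; substitute your own infrastructure.
public interface CacheService {

    // Persist a bulky intermediate result under a key
    void store(String key, Object value);

    // Retrieve it from a later step
    <T> T load(String key, Class<T> type);

    // Clean up once the job no longer needs it
    void evict(String key);
}

A later step can then load the details by the same key, and a cleanup step evicts them when the job completes.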
ExecutionContext Usage Guidelines
// Good: Simple primitive types and small objects
context.put("currentPage", pageNumber);
context.put("lastProcessedTimestamp", LocalDateTime.now().toString());
context.put("errorCount", errorCount);
// Bad: Large collections or complex objects
context.put("allProcessedData", largeList); // ❌
context.put("complexBusinessObject", heavyObject); // ❌
context.put("temporaryCache", cacheMap); // ❌
Solution 3: Implementing Context Cleanup Tasklet
Adding Cleanup Tasks
@Component
public class ContextCleanupTasklet implements Tasklet {

    // Keys that should survive the cleanup
    private static final Set<String> ESSENTIAL_KEYS =
            new HashSet<>(Arrays.asList("finalStatus", "processedCount"));

    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) {
        ExecutionContext jobContext = chunkContext.getStepContext()
                .getStepExecution()
                .getJobExecution()
                .getExecutionContext();

        // Remove known large entries explicitly
        jobContext.remove("temporaryLargeData");
        jobContext.remove("cacheData");

        // Drop everything else that is not essential, entry by entry,
        // so the context's dirty tracking stays consistent
        List<String> keysToRemove = new ArrayList<>();
        for (Map.Entry<String, Object> entry : jobContext.entrySet()) {
            if (!ESSENTIAL_KEYS.contains(entry.getKey())) {
                keysToRemove.add(entry.getKey());
            }
        }
        keysToRemove.forEach(jobContext::remove);

        // Record the final status alongside the surviving essentials
        jobContext.putString("finalStatus", "COMPLETED");
        return RepeatStatus.FINISHED;
    }
}
Adding Cleanup Step to Job Configuration
@Bean
public Job optimizedJob() {
    return jobBuilderFactory.get("optimizedJob")
            .start(dataProcessingStep())
            .next(contextCleanupStep()) // Add cleanup task
            .build();
}

@Bean
public Step contextCleanupStep() {
    return stepBuilderFactory.get("contextCleanupStep")
            .tasklet(contextCleanupTasklet)
            .build();
}
Solution 4: Monitoring and Prevention
ExecutionContext Size Monitoring
@Component
public class ExecutionContextMonitor implements StepExecutionListener {

    @Override
    public void beforeStep(StepExecution stepExecution) {
        // no-op
    }

    @Override
    public ExitStatus afterStep(StepExecution stepExecution) {
        ExecutionContext context = stepExecution
                .getJobExecution()
                .getExecutionContext();

        // Estimate context size
        int estimatedSize = estimateContextSize(context);
        if (estimatedSize > 50000) { // 50KB threshold
            log.warn("ExecutionContext size is large: {} bytes. Job: {}",
                    estimatedSize,
                    stepExecution.getJobExecution().getJobInstance().getJobName());
        }
        return stepExecution.getExitStatus();
    }

    private int estimateContextSize(ExecutionContext context) {
        try {
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            ObjectOutputStream oos = new ObjectOutputStream(baos);
            oos.writeObject(context);
            return baos.size();
        } catch (Exception e) {
            return 0; // best effort: non-serializable entries make the size unknown
        }
    }
}
Building an Alert System
@Component
public class BatchErrorNotifier implements JobExecutionListener {

    @Autowired
    private AlertService alertService; // assumed notification channel

    @Override
    public void beforeJob(JobExecution jobExecution) { /* no-op */ }

    @Override
    public void afterJob(JobExecution jobExecution) {
        // Failures are collected on the JobExecution rather than published as events
        boolean contextOverflow = jobExecution.getAllFailureExceptions().stream()
                .map(Throwable::getMessage)
                .anyMatch(msg -> msg != null && msg.contains("SERIALIZED_CONTEXT"));
        if (contextOverflow) {
            alertService.sendAlert("ExecutionContext size limit exceeded in batch job. "
                    + "Consider optimizing context usage or increasing column size.");
        }
    }
}
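Neither class does anything until it is registered with the job. A sketch of the wiring, reusing the builder style from the earlier configuration; dataProcessingTasklet is assumed to be your main work tasklet:

@Bean
public Job monitoredJob() {
    return jobBuilderFactory.get("monitoredJob")
            .listener(batchErrorNotifier)       // job level: alert on failure
            .start(dataProcessingStep())
            .next(contextCleanupStep())
            .build();
}

@Bean
public Step dataProcessingStep() {
    return stepBuilderFactory.get("dataProcessingStep")
            .tasklet(dataProcessingTasklet)     // assumed main work tasklet
            .listener(executionContextMonitor)  // step level: size check after the step
            .build();
}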
Final Recommended Solution
Based on experience, the most effective combination is:
Step 1: Immediate Application (Schema Modification)
-- Apply one table at a time to minimize lock time and downtime
ALTER TABLE BATCH_JOB_EXECUTION_CONTEXT
MODIFY COLUMN SERIALIZED_CONTEXT LONGTEXT;
ALTER TABLE BATCH_STEP_EXECUTION_CONTEXT
MODIFY COLUMN SERIALIZED_CONTEXT LONGTEXT;
Step 2: Gradual Improvement (Code Optimization)
// Establish ExecutionContext usage principles
public class ExecutionContextBestPractices {

    // Principle 1: Store only essential information
    public void storeEssentialOnly(ExecutionContext context) {
        context.put("checkpoint", lastProcessedId);
        context.put("count", processedCount);
        // Store detailed data separately (cache, staging table, etc.)
    }

    // Principle 2: Regular cleanup
    public void cleanupPeriodically(ExecutionContext context) {
        if (shouldCleanup()) {
            context.remove("temporaryData");
        }
    }

    // Principle 3: Size limit checks
    // (estimateSize and MAX_CONTEXT_SIZE are illustrative helpers)
    public void validateContextSize(ExecutionContext context) {
        if (estimateSize(context) > MAX_CONTEXT_SIZE) {
            throw new IllegalStateException("ExecutionContext too large");
        }
    }
}
Best Practices for Prevention
- Keep ExecutionContext Light: Store only state information and checkpoints
- Manage Large Data Separately: Use a cache or temporary tables (see the sketch after this list)
- Regular Monitoring: Track context size
- Validation in Test Environment: Pre-test with large datasets
- Ensure Schema Flexibility: Use LONGTEXT for scalability
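To make "Manage Large Data Separately" concrete, here is a minimal sketch that parks bulky intermediate rows in a dedicated staging table instead of the context. The BATCH_DETAIL_STAGING table and its columns are hypothetical; create whatever shape fits your data:

import java.util.List;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.stereotype.Component;

@Component
public class DetailDataStagingStore {

    private final JdbcTemplate jdbcTemplate;

    public DetailDataStagingStore(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    // Persist bulky intermediate data outside the ExecutionContext
    public void save(long jobExecutionId, long detailId, String payload) {
        jdbcTemplate.update(
                "INSERT INTO BATCH_DETAIL_STAGING (JOB_EXECUTION_ID, DETAIL_ID, PAYLOAD) VALUES (?, ?, ?)",
                jobExecutionId, detailId, payload);
    }

    // A later step looks the rows up by job execution id
    public List<String> loadPayloads(long jobExecutionId) {
        return jdbcTemplate.queryForList(
                "SELECT PAYLOAD FROM BATCH_DETAIL_STAGING WHERE JOB_EXECUTION_ID = ?",
                String.class, jobExecutionId);
    }
}

The ExecutionContext then only needs to hold the job execution id and a row count, which serializes to a few bytes.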
Performance Impact Considerations
Before and After Schema Change
| Aspect | TEXT Column | LONGTEXT Column |
| --- | --- | --- |
| Max Storage | 65,535 bytes | 4GB |
| Memory Usage | Lower | Slightly higher |
| Query Performance | Faster | Negligible difference |
| Backup Size | Smaller | Larger |
ExecutionContext Optimization Results
Before optimization:
- Average context size: 45KB
- Serialization time: 150ms
- Memory usage: High

After optimization:
- Average context size: 2KB
- Serialization time: 5ms
- Memory usage: Low
Common Pitfalls to Avoid
1. Storing Entire Result Sets
// ❌ Don't do this
context.put("allResults", resultList); // Can be thousands of records
// ✅ Do this instead
context.put("resultCount", resultList.size());
context.put("lastProcessedId", getLastId(resultList));
2. Caching Complex Objects
// ❌ Avoid
context.put("userCache", userObjectMap); // Memory-heavy objects
// ✅ Better approach
context.put("cacheKeys", userIdList); // Store keys only
3. Temporary Data Accumulation
// ❌ Problems
public void processChunk(List<Item> items) {
    for (Item item : items) {
        // One context entry per item accumulates without cleanup
        context.put("temp_" + item.getId(), processingResult);
    }
}

// ✅ Solution
public void processChunk(List<Item> items) {
    Map<String, Object> batchResults = new HashMap<>();
    for (Item item : items) {
        batchResults.put(String.valueOf(item.getId()), process(item));
    }
    // Store a small summary only, not the raw results
    context.put("batchSummary", createSummary(batchResults));
}
Conclusion
The Spring Batch ExecutionContext size limitation is a problem that naturally surfaces as batch systems grow in data volume. The key points are:
- Understanding Root Cause: Grasp ExecutionContext’s role and storage mechanism
- Immediate Resolution: Fix current issues with schema modification
- Long-term Optimization: Fundamental resolution through code improvement
- Continuous Monitoring: Establish observation system to prevent recurrence
Through this experience, I gained a deeper understanding of Spring Batch’s internal workings and will approach ExecutionContext usage more carefully in the future. I hope this experience helps other developers facing similar issues.
Additional Resources
For teams experiencing similar issues, consider:
- Monitoring Tools: Implement ExecutionContext size tracking
- Code Review Guidelines: Include ExecutionContext usage in review checklists
- Testing Strategy: Include large-scale data testing in CI/CD pipelines
- Documentation: Maintain clear guidelines for ExecutionContext best practices
Remember: Prevention is always better than cure when it comes to batch processing systems.