Complexity scoring and risk analysis for system migrations. Quantify effort, surface hidden risks, and generate phased migration plans.
The Migration Assessor is the strategic planning layer that sits above the Schema Mapper and Payload Translator. While those services handle the mechanics of mapping and transforming data, the Migration Assessor evaluates the full picture: how complex is the migration, what risks are hiding beneath the surface, and what does a realistic phased plan look like? It produces a weighted complexity score, a categorized risk inventory, a multi-phase migration plan, and a calibrated effort estimate.
Eight weighted factors combine into a single 0-100 complexity score.
| Factor | Weight | Scoring Criteria |
|---|---|---|
| Schema size | 0.15 | Total number of fields, types, and nested structures across source and target schemas. |
| Structural divergence | 0.20 | Degree of difference in hierarchy, grouping, and nesting between source and target. |
| Type mismatches | 0.15 | Count and severity of fields that require type coercion or lossy conversion. |
| Enum complexity | 0.10 | Unmapped enum values, many-to-one mappings, and missing default handling. |
| Nesting depth delta | 0.10 | Difference in maximum nesting depth requiring flatten or expand transformations. |
| Unmappable field ratio | 0.15 | Percentage of source fields with no viable target counterpart. |
| Data volume | 0.05 | Expected record count and payload sizes affecting migration window and throughput. |
| Domain complexity | 0.10 | Business rule density, cross-entity dependencies, and regulatory constraints. |
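The weights in the table sum to 1.0, so one natural reading is a weighted sum of per-factor scores. The sketch below assumes each factor is first scored on a 0-100 scale and that the result is rounded to an integer; the page does not state either detail, and the key names and the function `complexity_score` are hypothetical.

```python
# Weights copied from the factor table; they sum to 1.0.
WEIGHTS = {
    "schema_size": 0.15,
    "structural_divergence": 0.20,
    "type_mismatches": 0.15,
    "enum_complexity": 0.10,
    "nesting_depth_delta": 0.10,
    "unmappable_field_ratio": 0.15,
    "data_volume": 0.05,
    "domain_complexity": 0.10,
}


def complexity_score(factor_scores: dict[str, float]) -> int:
    """Combine per-factor scores (assumed to be 0-100 each) into one 0-100 score."""
    missing = WEIGHTS.keys() - factor_scores.keys()
    if missing:
        raise ValueError(f"missing factor scores: {sorted(missing)}")
    total = 0.0
    for name, weight in WEIGHTS.items():
        score = factor_scores[name]
        if not 0 <= score <= 100:
            raise ValueError(f"{name} must be between 0 and 100, got {score}")
        total += weight * score
    # Rounding to an integer is an assumption of this sketch.
    return round(total)


# Hypothetical per-factor scores, chosen only for illustration.
example_scores = {
    "schema_size": 40,
    "structural_divergence": 80,
    "type_mismatches": 60,
    "enum_complexity": 50,
    "nesting_depth_delta": 30,
    "unmappable_field_ratio": 60,
    "data_volume": 20,
    "domain_complexity": 90,
}
print(complexity_score(example_scores))
```

Because structural divergence carries the largest weight (0.20) and data volume the smallest (0.05), a structurally awkward migration of a small dataset scores higher under this reading than a large but structurally trivial one.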
Risks are categorized across four dimensions to ensure nothing is overlooked.
Data loss risks: unmapped fields that will be dropped, precision loss from type narrowing (e.g., float64 to float32), and truncation from length constraint differences.
Semantic risks: fields that share a name but carry different business meanings, divergent enum semantics, and context-dependent value interpretation across systems.
Volume and throughput risks: large record counts that exceed migration window constraints, payload sizes that challenge network throughput, and batch processing bottlenecks.
Regulatory risks: PII fields requiring special handling, PHI data subject to HIPAA constraints, cross-border data residency rules, and audit trail requirements.
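As a rough illustration of how two of these dimensions could be detected, the sketch below flags unmapped source fields as data-loss risks and fields whose names hint at PII as regulatory risks. The `Risk` class, the `find_risks` function, and the name hints are all hypothetical; real PII detection would need far more than substring matching.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical name hints for sensitive fields (an assumption of this sketch).
PII_NAME_HINTS = ("ssn", "passport", "email")


@dataclass(frozen=True)
class Risk:
    category: str  # "data_loss" or "regulatory" in this sketch
    field: str
    detail: str


def find_risks(mapping: dict[str, Optional[str]]) -> list[Risk]:
    """Flag risks from a source-to-target field mapping.

    A value of None means the source field has no target counterpart.
    """
    risks = []
    for source_field, target_field in mapping.items():
        if target_field is None:
            risks.append(
                Risk("data_loss", source_field, "No target field; values will be dropped.")
            )
        if any(hint in source_field.lower() for hint in PII_NAME_HINTS):
            risks.append(
                Risk("regulatory", source_field, "Name suggests PII; needs special handling.")
            )
    return risks


mapping = {"patient_id": "id", "dob": "date_of_birth", "ssn": None}
for risk in find_risks(mapping):
    print(risk.category, risk.field)
```

A single field can raise risks in more than one dimension, which is why the sketch returns a flat list rather than one risk per field.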
A four-phase approach that moves from high-confidence automation through human review and custom work to final validation.
Phase 1, auto-mapping: 60-70% of fields typically fall here. These have a mapping confidence above 0.8 and can be migrated automatically with no human review. This phase includes exact name matches, directly compatible types, and well-known aliases.
Phase 2, review: fields with mapping confidence between 0.5 and 0.8. The system proposes a mapping but flags it for human verification. These typically involve semantic near-matches, partial type overlaps, or ambiguous field names.
Phase 3, custom development: unmappable fields that require bespoke transformation logic, data enrichment from external sources, or entirely new field derivations. Each item includes a complexity estimate and suggested approach.
Phase 4, validation: post-migration verification, including record count reconciliation, checksum validation, referential integrity checks, and business rule assertion testing across the migrated dataset.
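The confidence thresholds can be expressed as a small routing function. This is a minimal sketch, not the assessor's actual logic: it assumes that a confidence of exactly 0.8 goes to review, and that fields with no candidate mapping or a confidence below 0.5 go to custom development. The phase keys match the example response on this page; the function names are hypothetical.

```python
from typing import Optional

AUTO_THRESHOLD = 0.8    # above this: migrate automatically
REVIEW_THRESHOLD = 0.5  # from this up to AUTO_THRESHOLD: human review


def assign_phase(confidence: Optional[float]) -> str:
    """Route one field to a phase; None means no candidate mapping was found."""
    if confidence is None or confidence < REVIEW_THRESHOLD:
        # Assumption: low-confidence fields are treated like unmappable ones.
        return "phase_3_custom"
    if confidence > AUTO_THRESHOLD:
        return "phase_1_auto"
    return "phase_2_review"


def build_plan(confidences: dict[str, Optional[float]]) -> dict[str, list[str]]:
    """Group source fields by the phase their mapping confidence puts them in."""
    plan: dict[str, list[str]] = {
        "phase_1_auto": [],
        "phase_2_review": [],
        "phase_3_custom": [],
    }
    for field, confidence in confidences.items():
        plan[assign_phase(confidence)].append(field)
    return plan


# Hypothetical confidence values, chosen only for illustration.
print(build_plan({"dob": 0.93, "patient_id": 0.64, "full_name": None}))
```

Phase 4 is absent from the routing because validation applies to the migrated dataset as a whole rather than to individual fields.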
Submit source and target schemas to receive a full migration assessment.
{
  "source_schema": {
    "format": "json_schema",
    "content": {
      "type": "object",
      "properties": {
        "patient_id": { "type": "integer" },
        "full_name": { "type": "string" },
        "dob": { "type": "string" },
        "ssn": { "type": "string" },
        "diagnosis_codes": {
          "type": "array",
          "items": { "type": "string" }
        }
      }
    }
  },
  "target_schema": {
    "format": "json_schema",
    "content": {
      "type": "object",
      "properties": {
        "id": { "type": "string" },
        "first_name": { "type": "string" },
        "last_name": { "type": "string" },
        "date_of_birth": { "type": "string" },
        "icd10_codes": {
          "type": "array",
          "items": { "type": "string" }
        }
      }
    }
  },
  "options": {
    "depth": "comprehensive",
    "estimated_record_count": 250000
  }
}
Example response:
{
  "complexity_score": 62,
  "risk_level": "high",
  "risks": [
    {
      "category": "data_loss",
      "field": "ssn",
      "detail": "No target field found. PII data will be dropped.",
      "severity": "critical"
    },
    {
      "category": "semantic",
      "field": "full_name",
      "detail": "Single field must split into first_name + last_name.",
      "severity": "medium"
    },
    {
      "category": "regulatory",
      "field": "ssn",
      "detail": "PII field requires audit trail for deletion.",
      "severity": "high"
    }
  ],
  "migration_plan": {
    "phase_1_auto": ["dob -> date_of_birth", "diagnosis_codes -> icd10_codes"],
    "phase_2_review": ["patient_id -> id (type coercion)"],
    "phase_3_custom": ["full_name -> first_name + last_name (split)"],
    "phase_4_validate": ["record_count_check", "pii_audit"]
  },
  "effort_estimate": {
    "engineering_days": 8.5,
    "confidence": 0.72,
    "breakdown": {
      "auto_mapping": "0.5 days",
      "review_mapping": "1 day",
      "custom_dev": "3 days",
      "validation": "2 days",
      "buffer": "2 days"
    }
  }
}
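A client reading this response will often want the worst risk per field, since one field (here `ssn`) can appear under several categories. The sketch below uses only field names shown in the example response. The severity ordering, including the `low` level, is an assumption, and `worst_severity_by_field` is a hypothetical helper.

```python
import json

# Assumed ordering from least to most severe; "low" does not appear on this page.
SEVERITY_ORDER = ["low", "medium", "high", "critical"]


def worst_severity_by_field(risks: list[dict]) -> dict[str, str]:
    """Return the most severe risk level reported for each field."""
    worst: dict[str, str] = {}
    for risk in risks:
        field, severity = risk["field"], risk["severity"]
        current = worst.get(field)
        if current is None or SEVERITY_ORDER.index(severity) > SEVERITY_ORDER.index(current):
            worst[field] = severity
    return worst


# The risk entries from the example response, trimmed to the fields used here.
response = json.loads("""
{
  "risks": [
    {"category": "data_loss", "field": "ssn", "severity": "critical"},
    {"category": "semantic", "field": "full_name", "severity": "medium"},
    {"category": "regulatory", "field": "ssn", "severity": "high"}
  ]
}
""")

print(worst_severity_by_field(response["risks"]))
```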
Choose the level of analysis that fits your planning stage.
The lightest level runs a structural comparison only. It returns the complexity score and top-level risks within seconds and is best for initial triage and feasibility checks.
The middle level adds semantic analysis to the structural comparison and includes the migration plan and effort estimate. It is the default for most assessments and balances depth with speed.
The deepest level runs the full analysis, with regulatory scanning, consumer impact modeling, and detailed per-field migration playbooks. It is recommended for production migrations.
The assessor produces a calibrated effort estimate expressed in engineering days. The estimate is broken down by migration phase (auto-mapping, review, custom development, and validation) plus a risk-adjusted buffer. A confidence score (0 to 1) indicates how reliable the estimate is, based on the completeness of the input schemas and the proportion of unambiguous mappings.
{
  "effort_estimate": {
    "engineering_days": 8.5,
    "confidence": 0.72,
    "breakdown": {
      "auto_mapping": "0.5 days",
      "review_mapping": "1 day",
      "custom_dev": "3 days",
      "validation": "2 days",
      "buffer": "2 days"
    },
    "notes": [
      "Custom dev estimate elevated due to name-splitting logic",
      "Buffer accounts for PII audit and regulatory review"
    ]
  }
}
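In the example, the breakdown entries add up to the headline figure: 0.5 + 1 + 3 + 2 + 2 = 8.5 engineering days. A consumer could verify this with a small check like the sketch below. It assumes breakdown values are always strings of the form `"<number> day"` or `"<number> days"`, as in the example; `parse_days` and `breakdown_total` are hypothetical helpers.

```python
import math


def parse_days(text: str) -> float:
    """Parse a duration such as '0.5 days' or '1 day' into a number of days."""
    value, unit = text.split()
    if unit not in ("day", "days"):
        raise ValueError(f"unexpected unit in {text!r}")
    return float(value)


def breakdown_total(estimate: dict) -> float:
    """Sum every entry of the breakdown, buffer included."""
    return sum(parse_days(item) for item in estimate["breakdown"].values())


# Values copied from the example effort estimate.
estimate = {
    "engineering_days": 8.5,
    "confidence": 0.72,
    "breakdown": {
        "auto_mapping": "0.5 days",
        "review_mapping": "1 day",
        "custom_dev": "3 days",
        "validation": "2 days",
        "buffer": "2 days",
    },
}

total = breakdown_total(estimate)
# Compare with a tolerance because the values are floats.
print(math.isclose(total, estimate["engineering_days"]))
```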