Complete Guide to Data Format Conversion: JSON, YAML, XML, TOML
Modern applications often need to work with data in multiple formats. Whether you're migrating configurations, integrating with APIs, or working across different ecosystems, understanding how to convert between JSON, YAML, XML, and TOML is essential. This comprehensive guide covers syntax, use cases, and conversion best practices.
Overview of Data Formats
JSON (JavaScript Object Notation)
Characteristics:
- Simple, lightweight syntax
- Native JavaScript support
- Widely used in APIs
- Strict formatting rules
Best for:
- REST API responses
- Configuration files in JavaScript projects
- Data interchange between systems
- NoSQL database storage
{
  "name": "John Doe",
  "age": 30,
  "active": true,
  "tags": ["developer", "designer"],
  "address": {
    "city": "San Francisco",
    "zip": "94102"
  }
}
YAML (YAML Ain't Markup Language)
Characteristics:
- Human-readable, whitespace-sensitive
- Supports comments
- More concise than JSON
- Superset of JSON (as of YAML 1.2)
Best for:
- Configuration files (Docker, Kubernetes, CI/CD)
- Data serialization
- Complex nested structures
- Human-edited files
name: John Doe
age: 30
active: true
tags:
  - developer
  - designer
address:
  city: San Francisco
  zip: "94102"
# Comments are supported
XML (eXtensible Markup Language)
Characteristics:
- Verbose, tag-based syntax
- Strong schema validation
- Supports attributes
- Industry standard for many domains
Best for:
- Enterprise systems (SOAP APIs)
- Document markup
- Configuration files (Maven, Spring)
- Data with metadata
<?xml version="1.0" encoding="UTF-8"?>
<person id="123">
  <name>John Doe</name>
  <age>30</age>
  <active>true</active>
  <tags>
    <tag>developer</tag>
    <tag>designer</tag>
  </tags>
  <address>
    <city>San Francisco</city>
    <zip>94102</zip>
  </address>
</person>
TOML (Tom's Obvious, Minimal Language)
Characteristics:
- Designed for configuration files
- Clear, minimal syntax
- Supports comments
- Strong typing
Best for:
- Configuration files (Rust Cargo, Python Poetry)
- Simple, flat structures
- Human-maintained configs
- Version control friendly
name = "John Doe"
age = 30
active = true
tags = ["developer", "designer"]
[address]
city = "San Francisco"
zip = "94102"
# Comments are supported
Format Comparison
| Feature | JSON | YAML | XML | TOML |
|---|---|---|---|---|
| Human-readable | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐ | ⭐⭐⭐⭐ |
| Compact | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐ | ⭐⭐⭐⭐ |
| Comments | ❌ | ✅ | ✅ | ✅ |
| Schema validation | Limited | Limited | ✅ | Limited |
| Language support | ✅✅✅ | ✅✅ | ✅✅✅ | ✅ |
| Parsing speed | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐ | ⭐⭐⭐⭐ |
Common Conversion Scenarios
JSON ↔ YAML
JSON to YAML:
// JavaScript
const yaml = require('js-yaml');
const json = {
  name: "API Config",
  version: "1.0",
  endpoints: [
    { path: "/users", method: "GET" },
    { path: "/posts", method: "POST" }
  ]
};
const yamlString = yaml.dump(json);
console.log(yamlString);
Output:
name: API Config
version: '1.0'
endpoints:
  - path: /users
    method: GET
  - path: /posts
    method: POST
YAML to JSON:
const yaml = require('js-yaml');
const yamlString = `
name: API Config
version: "1.0"
endpoints:
  - path: /users
    method: GET
`;
const json = yaml.load(yamlString);
console.log(JSON.stringify(json, null, 2));
JSON ↔ XML
JSON to XML:
const js2xml = require('js2xmlparser');
const data = {
  user: {
    name: "John Doe",
    email: "john@example.com",
    roles: {
      role: ["admin", "editor"]
    }
  }
};
const xml = js2xml.parse("root", data);
console.log(xml);
Output:
<?xml version="1.0" encoding="UTF-8"?>
<root>
  <user>
    <name>John Doe</name>
    <email>john@example.com</email>
    <roles>
      <role>admin</role>
      <role>editor</role>
    </roles>
  </user>
</root>
XML to JSON:
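const xml2js = require('xml2js');
const xmlString = `
<root>
  <user>
    <name>John Doe</name>
    <email>john@example.com</email>
  </user>
</root>
`;
xml2js.parseString(xmlString, (err, result) => {
  // Malformed XML is reported through err rather than thrown
  if (err) {
    console.error('Invalid XML:', err.message);
    return;
  }
  console.log(JSON.stringify(result, null, 2));
});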
JSON ↔ TOML
JSON to TOML:
const toml = require('@iarna/toml');
const data = {
  package: {
    name: "my-app",
    version: "1.0.0"
  },
  dependencies: {
    "express": "^4.18.0",
    "lodash": "^4.17.21"
  }
};
const tomlString = toml.stringify(data);
console.log(tomlString);
Output:
[package]
name = "my-app"
version = "1.0.0"
[dependencies]
express = "^4.18.0"
lodash = "^4.17.21"
Handling Complex Data Structures
Nested Objects
// JSON
{
  "database": {
    "host": "localhost",
    "port": 5432,
    "credentials": {
      "username": "admin",
      "password": "secret"
    }
  }
}
// YAML (more readable)
database:
  host: localhost
  port: 5432
  credentials:
    username: admin
    password: secret
// TOML (dotted keys)
[database]
host = "localhost"
port = 5432
[database.credentials]
username = "admin"
password = "secret"
Arrays and Lists
// JSON
{
  "servers": [
    { "name": "web1", "ip": "192.168.1.1" },
    { "name": "web2", "ip": "192.168.1.2" }
  ]
}
// YAML (cleaner)
servers:
  - name: web1
    ip: 192.168.1.1
  - name: web2
    ip: 192.168.1.2
// TOML (table arrays)
[[servers]]
name = "web1"
ip = "192.168.1.1"
[[servers]]
name = "web2"
ip = "192.168.1.2"
Mixed Types
# YAML supports multiple types naturally
config:
  string: "Hello"
  number: 42
  float: 3.14
  boolean: true
  null_value: null
  date: 2025-01-14
  list: [1, 2, 3]
  multiline: |
    This is a
    multiline string
Conversion Challenges and Solutions
Challenge 1: Comments Don't Survive
Problem:
# Important: Don't change this value
max_connections: 100
When converted to JSON, comments are lost:
{
  "max_connections": 100
}
Solution:
- Add comments as special keys with a _comment prefix
- Maintain separate documentation
- Use description fields in schema
{
  "_comment": "Important: Don't change this value",
  "max_connections": 100
}
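One side effect of the `_comment` convention is that consumers see the annotations as ordinary data. The following is a minimal sketch, assuming the `_comment` prefix shown above, of stripping those keys before the configuration is used. The helper name `stripCommentKeys` and the sample `pool` settings are made up for illustration.

```javascript
// Recursively drop keys that start with "_comment" so that
// annotation keys never reach code that expects real settings.
function stripCommentKeys(value) {
  if (Array.isArray(value)) {
    return value.map(stripCommentKeys);
  }
  if (value !== null && typeof value === 'object') {
    const cleaned = {};
    for (const [key, child] of Object.entries(value)) {
      if (!key.startsWith('_comment')) {
        cleaned[key] = stripCommentKeys(child);
      }
    }
    return cleaned;
  }
  return value; // primitives pass through unchanged
}

// Hypothetical annotated config
const annotated = {
  _comment: "Important: Don't change this value",
  max_connections: 100,
  pool: { _comment_timeout: 'seconds', timeout: 30 }
};

console.log(JSON.stringify(stripCommentKeys(annotated)));
// {"max_connections":100,"pool":{"timeout":30}}
```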
Challenge 2: XML Attributes vs Elements
Problem: XML attributes and elements represent data differently:
<user id="123" status="active">
  <name>John Doe</name>
</user>
Solution: Use a convention for attributes in JSON:
{
  "user": {
    "@id": "123",
    "@status": "active",
    "name": "John Doe"
  }
}
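The `@` prefix convention can be applied mechanically in both directions. Here is a rough sketch, not tied to any particular XML library: `elementToJson` and `splitAttributes` are hypothetical helpers that take already-parsed attribute and child maps as plain objects.

```javascript
// Build a JSON-friendly object from an element's parts.
// Attributes get an "@" prefix so they cannot collide with child elements.
function elementToJson(tagName, attributes, children) {
  const body = {};
  for (const [name, value] of Object.entries(attributes)) {
    body['@' + name] = value;
  }
  for (const [name, value] of Object.entries(children)) {
    body[name] = value;
  }
  return { [tagName]: body };
}

// Reverse direction: split "@"-prefixed keys back out as attributes.
function splitAttributes(body) {
  const attributes = {};
  const children = {};
  for (const [key, value] of Object.entries(body)) {
    if (key.startsWith('@')) {
      attributes[key.slice(1)] = value;
    } else {
      children[key] = value;
    }
  }
  return { attributes, children };
}

const user = elementToJson('user', { id: '123', status: 'active' }, { name: 'John Doe' });
console.log(JSON.stringify(user));
// {"user":{"@id":"123","@status":"active","name":"John Doe"}}
```

Keeping the prefix consistent across a codebase is what makes the reverse conversion possible, so it is worth documenting whichever prefix you choose.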
Challenge 3: Type Coercion
Problem: YAML auto-converts types:
zip_code: 12345    # Interpreted as number
country: NO        # Interpreted as boolean false by YAML 1.1 parsers (e.g. PyYAML)
version: 1.0       # Interpreted as number
Solution: Always quote ambiguous values:
zip_code: "12345"
country: "NO"
version: "1.0"
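When YAML is generated by hand or by templates, a small check can flag strings that are likely to be coerced. The sketch below is deliberately non-exhaustive: the keyword list covers common YAML 1.1 boolean and null spellings, and the numeric pattern covers only plain integers and decimals. The names `YAML_11_KEYWORDS`, `LOOKS_NUMERIC`, and `needsQuoting` are illustrative, not part of any library.

```javascript
// Words that YAML 1.1 parsers resolve to booleans or null when unquoted.
const YAML_11_KEYWORDS = new Set([
  'y', 'n', 'yes', 'no', 'on', 'off', 'true', 'false', 'null', '~'
]);

// Matches plain integers and decimals such as "12345" or "1.0".
// Exponents, hex, and octal forms are not covered by this sketch.
const LOOKS_NUMERIC = /^[-+]?(\d+\.?\d*|\.\d+)$/;

// Returns true when a string should be quoted to stay a string.
function needsQuoting(text) {
  if (text === '') return true; // an empty plain scalar reads as null
  return YAML_11_KEYWORDS.has(text.toLowerCase()) || LOOKS_NUMERIC.test(text);
}

for (const sample of ['12345', 'NO', '1.0', 'San Francisco']) {
  console.log(sample, '->', needsQuoting(sample));
}
```

This only addresses type coercion. Strings containing characters such as `:` or `#` have their own quoting rules, which a real YAML serializer handles for you.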
Challenge 4: Circular References
Problem: Some formats don't support circular references:
const obj = { name: "test" };
obj.self = obj; // Circular reference
JSON.stringify(obj); // Error: Converting circular structure to JSON
Solution:
- Break circular references before conversion
- Use references/anchors in YAML:
person: &person_ref
  name: John
friend: *person_ref
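For the first option, breaking cycles before conversion, one possible approach is to copy the structure while tracking the current chain of ancestors. This is a sketch under that assumption: `breakCycles` is a hypothetical helper, and the `'[Circular]'` placeholder is an arbitrary choice.

```javascript
// Return a copy of `value` in which any reference back to an ancestor
// is replaced by a placeholder string, so JSON.stringify cannot throw.
function breakCycles(value, ancestors = new Set()) {
  if (value === null || typeof value !== 'object') {
    return value;
  }
  if (ancestors.has(value)) {
    return '[Circular]'; // placeholder text is an arbitrary choice
  }
  ancestors.add(value);
  const copy = Array.isArray(value) ? [] : {};
  for (const [key, child] of Object.entries(value)) {
    copy[key] = breakCycles(child, ancestors);
  }
  ancestors.delete(value); // siblings may share this object legitimately
  return copy;
}

const obj = { name: 'test' };
obj.self = obj; // Circular reference
console.log(JSON.stringify(breakCycles(obj)));
// {"name":"test","self":"[Circular]"}
```

Tracking ancestors rather than every object seen so far means that an object referenced twice from different branches is copied twice instead of being mistaken for a cycle.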
Best Practices
1. Choose the Right Format for the Job
- APIs: JSON (fastest parsing, universal support)
- Config files: YAML or TOML (human-friendly, comments)
- Enterprise systems: XML (mature tooling, validation)
- Simple configs: TOML (clear, type-safe)
2. Validate Before Converting
const yaml = require('js-yaml');
function convertJSONtoYAML(jsonString) {
  let data;
  try {
    // Validate JSON first
    data = JSON.parse(jsonString);
  } catch (error) {
    throw new Error(`Invalid JSON: ${error.message}`);
  }
  // Convert to YAML
  return yaml.dump(data);
}
3. Preserve Type Information
// Add metadata for lossless conversion
const data = {
_meta: {
types: {
"age": "integer",
"score": "float"
}
},
age: 30,
score: 95.5
};
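The `_meta.types` map does not have to be written by hand. The following sketch derives it from the top-level values. The type names match the ones used above where possible, and `describeType` and `withTypeMetadata` are hypothetical helpers.

```javascript
// Describe a value so a consumer can restore its type later.
function describeType(value) {
  if (value === null) return 'null';
  if (Array.isArray(value)) return 'array';
  if (value instanceof Date) return 'datetime';
  if (typeof value === 'number') {
    return Number.isInteger(value) ? 'integer' : 'float';
  }
  return typeof value; // 'string', 'boolean', 'object', ...
}

// Attach a _meta.types map covering the top-level keys of `data`.
function withTypeMetadata(data) {
  const types = {};
  for (const [key, value] of Object.entries(data)) {
    types[key] = describeType(value);
  }
  return { _meta: { types }, ...data };
}

console.log(JSON.stringify(withTypeMetadata({ age: 30, score: 95.5 })));
// {"_meta":{"types":{"age":"integer","score":"float"}},"age":30,"score":95.5}
```

One limitation to be aware of: JavaScript has a single number type, so a float that happens to be whole, such as 95.0, is reported as an integer. If that distinction matters, the type map needs to come from the source format rather than from the parsed values.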
4. Handle Large Files Efficiently
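const fs = require('fs');
const stream = require('stream');
const yaml = require('js-yaml');
// Stream the file from disk, but parse only once the whole document has
// arrived: a chunk can end mid-line, so parsing chunks individually
// would fail or silently produce wrong data.
const readStream = fs.createReadStream('large-file.yaml', 'utf8');
const writeStream = fs.createWriteStream('output.json');
let buffered = '';
readStream.pipe(new stream.Transform({
  transform(chunk, encoding, callback) {
    buffered += chunk.toString();
    callback();
  },
  flush(callback) {
    try {
      callback(null, JSON.stringify(yaml.load(buffered)));
    } catch (error) {
      callback(error);
    }
  }
})).pipe(writeStream);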
5. Test Edge Cases
// convertFormat stands in for whichever conversion function you are testing
const testCases = [
  '',                    // Empty string
  '{}',                  // Empty object
  '[]',                  // Empty array
  'null',                // Null value
  '"12345"',             // Quoted number
  '{"key": undefined}',  // Undefined value (not valid JSON)
];
testCases.forEach(test => {
  try {
    convertFormat(test);
    console.log('✓ Handled:', test);
  } catch (error) {
    console.error('✗ Failed:', test, error.message);
  }
});
Tools and Libraries
JavaScript/Node.js
- JSON: Native JSON.parse()/JSON.stringify()
- YAML: js-yaml, yaml
- XML: xml2js, fast-xml-parser, js2xmlparser
- TOML: @iarna/toml, @ltd/j-toml
Python
- JSON: Native json module
- YAML: PyYAML, ruamel.yaml
- XML: xml.etree.ElementTree, lxml
- TOML: toml, tomli/tomllib (Python 3.11+)
Command Line
# JSON to YAML
yq eval -P file.json > file.yaml
# YAML to JSON
yq eval -o=json file.yaml > file.json
# XML to JSON
xmlstarlet sel -t -c "." file.xml | python -c "import sys, json, xmltodict; print(json.dumps(xmltodict.parse(sys.stdin.read())))"
Online Tools
Use toolcli Format Converter to:
- Convert between JSON, YAML, XML, and TOML instantly
- Validate syntax before conversion
- Format and beautify output
- Handle large files client-side (no uploads required)
Real-World Examples
Example 1: Kubernetes Config Migration
# Original Docker Compose (YAML)
version: "3.8"
services:
  web:
    image: nginx:latest
    ports:
      - "80:80"
    environment:
      - NODE_ENV=production
# Convert to Kubernetes (YAML with different structure)
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 80
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 1
  # apps/v1 Deployments require a selector that matches the pod labels
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:latest
          ports:
            - containerPort: 80
          env:
            - name: NODE_ENV
              value: production
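Both files are YAML, so the work here is restructuring rather than format conversion. Two of the field mappings can be sketched as small functions. They handle only the short forms used in this example (`KEY=value` environment entries and `HOST:CONTAINER` port strings), and `composeEnvToK8s` and `composePortToServicePort` are hypothetical helper names.

```javascript
// "NODE_ENV=production" -> { name: "NODE_ENV", value: "production" }
function composeEnvToK8s(environment) {
  return environment.map((entry) => {
    const separator = entry.indexOf('=');
    if (separator === -1) {
      // Simplification: Compose would inherit the value from the host
      return { name: entry, value: '' };
    }
    return {
      name: entry.slice(0, separator),
      value: entry.slice(separator + 1) // keeps any further "=" in the value
    };
  });
}

// "80:80" (host:container) -> { port: 80, targetPort: 80 }
function composePortToServicePort(mapping) {
  const [hostPort, containerPort] = mapping.split(':').map(Number);
  return { port: hostPort, targetPort: containerPort };
}

console.log(JSON.stringify(composeEnvToK8s(['NODE_ENV=production'])));
// [{"name":"NODE_ENV","value":"production"}]
console.log(JSON.stringify(composePortToServicePort('80:80')));
// {"port":80,"targetPort":80}
```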
Example 2: API Response Transformation
// XML SOAP response
const xmlResponse = `
<soap:Envelope>
<soap:Body>
<GetUserResponse>
<User>
<ID>123</ID>
<Name>John Doe</Name>
</User>
</GetUserResponse>
</soap:Body>
</soap:Envelope>
`;
// Convert to JSON REST format
const jsonResponse = {
user: {
id: 123,
name: "John Doe"
}
};
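The mapping step between the two shapes might look like the sketch below. It assumes the SOAP response has already been parsed into a plain nested object with string values. That shape, the `parsedSoap` sample, and the `soapUserToRest` helper are assumptions for illustration, since the exact output depends on the XML parser and its options.

```javascript
// Hypothetical parsed form of the SOAP response above.
const parsedSoap = {
  'soap:Envelope': {
    'soap:Body': {
      GetUserResponse: {
        User: { ID: '123', Name: 'John Doe' }
      }
    }
  }
};

// Unwrap the envelope, rename fields, and restore the numeric ID.
function soapUserToRest(parsed) {
  const user = parsed?.['soap:Envelope']?.['soap:Body']?.GetUserResponse?.User;
  if (!user) {
    throw new Error('Unexpected SOAP structure: User element not found');
  }
  const id = Number(user.ID);
  if (!Number.isInteger(id)) {
    throw new Error(`User ID is not an integer: ${user.ID}`);
  }
  return { user: { id, name: user.Name } };
}

console.log(JSON.stringify(soapUserToRest(parsedSoap)));
// {"user":{"id":123,"name":"John Doe"}}
```

XML text content is always a string, so the numeric conversion of `ID` has to be explicit. Failing loudly on an unexpected structure is usually safer than returning a partially filled object.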
Example 3: Configuration File Migration
# Migrate from .toml to .json for Node.js project
[server]
host = "0.0.0.0"
port = 3000
[database]
host = "localhost"
port = 5432
name = "myapp"
Converts to:
{
  "server": {
    "host": "0.0.0.0",
    "port": 3000
  },
  "database": {
    "host": "localhost",
    "port": 5432,
    "name": "myapp"
  }
}
Conclusion
Understanding data format conversion is crucial for modern development. Key takeaways:
- Choose the right format for your use case
- Understand limitations of each format
- Validate data before and after conversion
- Test edge cases thoroughly
- Use established libraries instead of rolling your own
- Document conversion decisions for your team
By mastering these conversions, you'll be able to work seamlessly across different ecosystems and integrate systems that use different data formats.