Complete Guide to Data Format Conversion: JSON, YAML, XML, TOML

Modern applications often need to work with data in multiple formats. Whether you're migrating configurations, integrating with APIs, or working across different ecosystems, understanding how to convert between JSON, YAML, XML, and TOML is essential. This comprehensive guide covers syntax, use cases, and conversion best practices.

Overview of Data Formats

JSON (JavaScript Object Notation)

Characteristics:

  • Simple, lightweight syntax
  • Native JavaScript support
  • Widely used in APIs
  • Strict formatting rules

Best for:

  • REST API responses
  • Configuration files in JavaScript projects
  • Data interchange between systems
  • NoSQL database storage
{
  "name": "John Doe",
  "age": 30,
  "active": true,
  "tags": ["developer", "designer"],
  "address": {
    "city": "San Francisco",
    "zip": "94102"
  }
}
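
The "strict formatting rules" are easy to trip over when editing JSON by hand. As a rough illustration (the helper name `isValidJson` and the sample strings are made up for this sketch), the built-in parser rejects several things that JavaScript object literals allow:

```javascript
// Hypothetical helper: returns true only if the text parses as JSON.
function isValidJson(text) {
  try {
    JSON.parse(text);
    return true;
  } catch {
    return false;
  }
}

const samples = {
  valid: '{"name": "John Doe", "age": 30}',
  trailingComma: '{"name": "John Doe",}',
  singleQuotes: "{'name': 'John Doe'}",
  unquotedKey: '{name: "John Doe"}',
  comment: '{"age": 30 /* years */}',
};

// Only the first sample is accepted; the rest throw a SyntaxError internally.
for (const [label, text] of Object.entries(samples)) {
  console.log(label, isValidJson(text));
}
```

Only the `valid` sample passes; trailing commas, single quotes, unquoted keys, and comments are all rejected.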

YAML (YAML Ain't Markup Language)

Characteristics:

  • Human-readable, whitespace-sensitive
  • Supports comments
  • More concise than JSON
  • Superset of JSON (as of YAML 1.2)

Best for:

  • Configuration files (Docker, Kubernetes, CI/CD)
  • Data serialization
  • Complex nested structures
  • Human-edited files
name: John Doe
age: 30
active: true
tags:
  - developer
  - designer
address:
  city: San Francisco
  zip: "94102"

# Comments are supported

XML (eXtensible Markup Language)

Characteristics:

  • Verbose, tag-based syntax
  • Strong schema validation
  • Supports attributes
  • Industry standard for many domains

Best for:

  • Enterprise systems (SOAP APIs)
  • Document markup
  • Configuration files (Maven, Spring)
  • Data with metadata
<?xml version="1.0" encoding="UTF-8"?>
<person id="123">
  <name>John Doe</name>
  <age>30</age>
  <active>true</active>
  <tags>
    <tag>developer</tag>
    <tag>designer</tag>
  </tags>
  <address>
    <city>San Francisco</city>
    <zip>94102</zip>
  </address>
</person>

TOML (Tom's Obvious, Minimal Language)

Characteristics:

  • Designed for configuration files
  • Clear, minimal syntax
  • Supports comments
  • Strong typing

Best for:

  • Configuration files (Rust Cargo, Python Poetry)
  • Simple, flat structures
  • Human-maintained configs
  • Version control friendly
name = "John Doe"
age = 30
active = true
tags = ["developer", "designer"]

[address]
city = "San Francisco"
zip = "94102"

# Comments are supported

Format Comparison

Feature           | JSON       | YAML       | XML        | TOML
Human-readable    | ⭐⭐⭐     | ⭐⭐⭐⭐⭐ | ⭐⭐       | ⭐⭐⭐⭐
Compact           | ⭐⭐⭐⭐   | ⭐⭐⭐⭐⭐ |            | ⭐⭐⭐⭐
Comments          | No         | Yes        | Yes        | Yes
Schema validation | Limited    | Limited    | Strong     | Limited
Language support  | ✅✅✅     | ✅✅       | ✅✅✅     |
Parsing speed     | ⭐⭐⭐⭐⭐ | ⭐⭐⭐     | ⭐⭐       | ⭐⭐⭐⭐

Common Conversion Scenarios

JSON ↔ YAML

JSON to YAML:

// JavaScript
const yaml = require('js-yaml');
const json = {
  name: "API Config",
  version: "1.0",
  endpoints: [
    { path: "/users", method: "GET" },
    { path: "/posts", method: "POST" }
  ]
};

const yamlString = yaml.dump(json);
console.log(yamlString);

Output:

name: API Config
version: '1.0'
endpoints:
  - path: /users
    method: GET
  - path: /posts
    method: POST

YAML to JSON:

const yaml = require('js-yaml');
const yamlString = `
name: API Config
version: "1.0"
endpoints:
  - path: /users
    method: GET
`;

const json = yaml.load(yamlString);
console.log(JSON.stringify(json, null, 2));

JSON ↔ XML

JSON to XML:

const js2xml = require('js2xmlparser');
const data = {
  user: {
    name: "John Doe",
    email: "john@example.com",
    roles: {
      role: ["admin", "editor"]
    }
  }
};

const xml = js2xml.parse("root", data);
console.log(xml);

Output (the XML declaration and indentation may differ depending on the library version and options):

<?xml version="1.0" encoding="UTF-8"?>
<root>
  <user>
    <name>John Doe</name>
    <email>john@example.com</email>
    <roles>
      <role>admin</role>
      <role>editor</role>
    </roles>
  </user>
</root>

XML to JSON:

const xml2js = require('xml2js');
const xmlString = `
<root>
  <user>
    <name>John Doe</name>
    <email>john@example.com</email>
  </user>
</root>
`;

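xml2js.parseString(xmlString, (err, result) => {
  // Surface malformed XML instead of silently logging "undefined"
  if (err) throw err;
  // With default options xml2js wraps child elements in arrays,
  // e.g. result.root.user[0].name[0]
  console.log(JSON.stringify(result, null, 2));
});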

JSON ↔ TOML

JSON to TOML:

const toml = require('@iarna/toml');
const data = {
  package: {
    name: "my-app",
    version: "1.0.0"
  },
  dependencies: {
    "express": "^4.18.0",
    "lodash": "^4.17.21"
  }
};

const tomlString = toml.stringify(data);
console.log(tomlString);

Output:

[package]
name = "my-app"
version = "1.0.0"

[dependencies]
express = "^4.18.0"
lodash = "^4.17.21"

Handling Complex Data Structures

Nested Objects

JSON:
{
  "database": {
    "host": "localhost",
    "port": 5432,
    "credentials": {
      "username": "admin",
      "password": "secret"
    }
  }
}

YAML (more readable):
database:
  host: localhost
  port: 5432
  credentials:
    username: admin
    password: secret

TOML (nested tables with dotted names):
[database]
host = "localhost"
port = 5432

[database.credentials]
username = "admin"
password = "secret"
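
TOML can also express the same nesting with dotted keys, for example `database.credentials.username = "admin"`. The sketch below shows how nested object paths map to such dotted names. The helper `flattenKeys` is hypothetical, and it assumes plain objects whose keys are valid TOML bare keys (letters, digits, `_`, and `-`):

```javascript
// Hypothetical helper: flattens nested plain objects into dotted paths.
// Assumes every key is a valid TOML bare key; arrays are kept as values.
function flattenKeys(obj, prefix = '') {
  const out = {};
  for (const [key, value] of Object.entries(obj)) {
    const path = prefix ? `${prefix}.${key}` : key;
    if (value !== null && typeof value === 'object' && !Array.isArray(value)) {
      Object.assign(out, flattenKeys(value, path));
    } else {
      out[path] = value;
    }
  }
  return out;
}

const config = {
  database: {
    host: 'localhost',
    port: 5432,
    credentials: { username: 'admin', password: 'secret' },
  },
};

const flat = flattenKeys(config);
console.log(flat);
```

Each key in `flat` (such as `database.credentials.username`) is the path a TOML dotted key or table header would use for that value.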

Arrays and Lists

JSON:
{
  "servers": [
    { "name": "web1", "ip": "192.168.1.1" },
    { "name": "web2", "ip": "192.168.1.2" }
  ]
}

YAML (cleaner):
servers:
  - name: web1
    ip: 192.168.1.1
  - name: web2
    ip: 192.168.1.2

TOML (arrays of tables):
[[servers]]
name = "web1"
ip = "192.168.1.1"

[[servers]]
name = "web2"
ip = "192.168.1.2"
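
To make the mapping concrete, here is a minimal sketch of how an array of flat objects turns into `[[...]]` blocks. The helper `toArrayOfTables` is hypothetical and only meant as an illustration: it assumes bare keys and string, number, or boolean values, and a real project should use a TOML library instead.

```javascript
// Hypothetical helper: renders an array of flat objects as TOML arrays of tables.
// Assumes bare keys and only string, number, or boolean values.
function toArrayOfTables(name, rows) {
  return rows
    .map(row => {
      const lines = Object.entries(row).map(([key, value]) => {
        // For ordinary text, JSON.stringify yields a double-quoted, escaped
        // string that is also acceptable as a TOML basic string.
        const rendered = typeof value === 'string' ? JSON.stringify(value) : String(value);
        return `${key} = ${rendered}`;
      });
      return [`[[${name}]]`, ...lines].join('\n');
    })
    .join('\n\n');
}

const servers = [
  { name: 'web1', ip: '192.168.1.1' },
  { name: 'web2', ip: '192.168.1.2' },
];

const tomlText = toArrayOfTables('servers', servers);
console.log(tomlText);
```

Every element of the array becomes its own `[[servers]]` block, which is why long lists are more verbose in TOML than in JSON or YAML.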

Mixed Types

# YAML supports multiple types naturally
config:
  string: "Hello"
  number: 42
  float: 3.14
  boolean: true
  null_value: null
  date: 2025-01-14
  list: [1, 2, 3]
  multiline: |
    This is a
    multiline string
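
Not all of these types have a JSON counterpart. The sketch below uses only the built-in JSON functions and assumes the YAML parser has already produced a JavaScript `Date` for the date value (js-yaml's default schema does this, as far as I know). It shows what is lost or changed on the way to JSON:

```javascript
const config = {
  date: new Date(Date.UTC(2025, 0, 14)), // JSON has no date type
  ratio: NaN,                            // not representable in JSON
  missing: undefined,                    // dropped entirely
  nothing: null,                         // preserved as null
};

const jsonText = JSON.stringify(config);
console.log(jsonText);
// {"date":"2025-01-14T00:00:00.000Z","ratio":null,"nothing":null}

// After parsing, the date is just a string and must be revived explicitly.
const roundTripped = JSON.parse(jsonText);
console.log(typeof roundTripped.date); // string
```

Dates become ISO 8601 strings, `NaN` becomes `null`, and `undefined` properties disappear, so the consumer needs to know which fields to convert back.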

Conversion Challenges and Solutions

Challenge 1: Comments Don't Survive

Problem:

# Important: Don't change this value
max_connections: 100

When converted to JSON, comments are lost:

{
  "max_connections": 100
}

Solution:

  • Add comments as special keys with _comment prefix
  • Maintain separate documentation
  • Use description fields in schema
{
  "_comment": "Important: Don't change this value",
  "max_connections": 100
}
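
If comments travel as `_comment` keys, the consuming code usually needs to drop them again before using the data. A minimal sketch, assuming JSON-compatible data; the helper name `stripCommentKeys` and the sample `pool` section are made up for illustration:

```javascript
// Hypothetical helper: returns a deep copy without keys that start with "_comment".
// Assumes JSON-compatible input (plain objects, arrays, and primitives).
function stripCommentKeys(value) {
  if (Array.isArray(value)) return value.map(stripCommentKeys);
  if (value !== null && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value)
        .filter(([key]) => !key.startsWith('_comment'))
        .map(([key, child]) => [key, stripCommentKeys(child)])
    );
  }
  return value;
}

const raw = {
  _comment: "Important: Don't change this value",
  max_connections: 100,
  pool: { _comment_size: 'tuned for production', size: 10 },
};

const clean = stripCommentKeys(raw);
console.log(clean);
```

The original object is left untouched, so the annotated version can still be written back to disk.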

Challenge 2: XML Attributes vs Elements

Problem: XML attributes and elements represent data differently:

<user id="123" status="active">
  <name>John Doe</name>
</user>

Solution: Use a convention for attributes in JSON:

{
  "user": {
    "@id": "123",
    "@status": "active",
    "name": "John Doe"
  }
}
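
The same convention works in the other direction. The following is a simplified sketch rather than a replacement for an XML library: the helpers `escapeXml` and `toXml` are hypothetical, arrays and namespaces are not handled, and keys are assumed to be valid XML names.

```javascript
// Escape the characters that are special in XML text and attribute values.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical helper: keys starting with "@" become attributes,
// all other keys become child elements. Arrays are not handled here.
function toXml(name, value) {
  if (value === null || typeof value !== 'object') {
    return `<${name}>${escapeXml(value ?? '')}</${name}>`;
  }
  let attrs = '';
  let children = '';
  for (const [key, child] of Object.entries(value)) {
    if (key.startsWith('@')) {
      attrs += ` ${key.slice(1)}="${escapeXml(child)}"`;
    } else {
      children += toXml(key, child);
    }
  }
  return `<${name}${attrs}>${children}</${name}>`;
}

const xml = toXml('user', { '@id': '123', '@status': 'active', name: 'John Doe' });
console.log(xml);
// <user id="123" status="active"><name>John Doe</name></user>
```

Whatever prefix is chosen (`@` here), both directions of the conversion have to agree on it, so it is worth documenting alongside the schema.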

Challenge 3: Type Coercion

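Problem: YAML parsers auto-convert unquoted values, and the rules differ between YAML versions:

zip_code: 12345     # Interpreted as number
country: NO          # Interpreted as boolean false by YAML 1.1 parsers (e.g. PyYAML)
version: 1.0         # Interpreted as number (a float, so the trailing ".0" can be lost)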

Solution: Always quote ambiguous values:

zip_code: "12345"
country: "NO"
version: "1.0"
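
When generating YAML from string data, a small check can flag the values that need quotes. This is a conservative heuristic under stated assumptions, not a full implementation of YAML's resolution rules; the names `RESERVED` and `needsQuoting` are hypothetical:

```javascript
// Words that YAML 1.1 parsers may read as booleans or null.
// Matching is case-insensitive here, which is deliberately stricter than the spec.
const RESERVED = new Set(['y', 'n', 'yes', 'no', 'on', 'off', 'true', 'false', 'null', '~']);

// Hypothetical helper: true if a string should be quoted to remain a string.
function needsQuoting(text) {
  if (text === '') return true;
  if (RESERVED.has(text.toLowerCase())) return true;
  // Plain decimal integers and floats, e.g. 12345, 1.0, -3.14
  return /^[-+]?\d+(\.\d*)?$/.test(text);
}

for (const value of ['12345', 'NO', '1.0', 'Norway', 'San Francisco']) {
  console.log(value, needsQuoting(value));
}
```

Established YAML libraries apply their own quoting rules when dumping; a check like this is mainly useful for linting hand-written files or templates.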

Challenge 4: Circular References

Problem: Some formats don't support circular references:

const obj = { name: "test" };
obj.self = obj;  // Circular reference

JSON.stringify(obj);  // TypeError: Converting circular structure to JSON

Solution:

  • Break circular references before conversion
  • Use anchors and aliases in YAML, which can express a self-reference (the parsed result still cannot be serialized as JSON):
person: &person_ref
  name: John
  friend: *person_ref
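
Breaking the cycle before conversion can be sketched as follows. The helper `toJsonSafe` and the `'[Circular]'` placeholder are assumptions made for this example. It tracks only the current chain of ancestors, so an object that is merely shared in two places is copied normally rather than flagged:

```javascript
// Hypothetical helper: deep-copies a value, replacing any reference back to
// an ancestor with the string '[Circular]'. Shared, non-circular references are kept.
function toJsonSafe(value, ancestors = new Set()) {
  if (value === null || typeof value !== 'object') return value;
  if (ancestors.has(value)) return '[Circular]';

  ancestors.add(value);
  const copy = Array.isArray(value)
    ? value.map(item => toJsonSafe(item, ancestors))
    : Object.fromEntries(
        Object.entries(value).map(([key, child]) => [key, toJsonSafe(child, ancestors)])
      );
  ancestors.delete(value);
  return copy;
}

const obj = { name: 'test' };
obj.self = obj; // Circular reference

const shared = { id: 1 };
const safeText = JSON.stringify(toJsonSafe({ obj, a: shared, b: shared }));
console.log(safeText);
// {"obj":{"name":"test","self":"[Circular]"},"a":{"id":1},"b":{"id":1}}
```

Whether a placeholder string, `null`, or an ID-based reference is the right replacement depends on what the consumer of the JSON expects.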

Best Practices

1. Choose the Right Format for the Job

  • APIs: JSON (fastest parsing, universal support)
  • Config files: YAML or TOML (human-friendly, comments)
  • Enterprise systems: XML (mature tooling, validation)
  • Simple configs: TOML (clear, type-safe)

2. Validate Before Converting

const yaml = require('js-yaml');

function convertJSONToYAML(jsonString) {
  let data;
  try {
    // Validate JSON first
    data = JSON.parse(jsonString);
  } catch (error) {
    throw new Error(`Invalid JSON: ${error.message}`);
  }

  // Convert to YAML (errors here are not JSON errors, so let them propagate)
  return yaml.dump(data);
}

3. Preserve Type Information

// Add metadata for lossless conversion
const data = {
  _meta: {
    types: {
      "age": "integer",
      "score": "float"
    }
  },
  age: 30,
  score: 95.5
};
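
JavaScript itself cannot tell `95.0` from `95`, which is why the declared types are useful to a serializer. They can also be checked before conversion. A minimal sketch, assuming the `_meta.types` layout shown above and a hypothetical helper `checkDeclaredTypes` that only looks at top-level keys:

```javascript
// Hypothetical helper: compares top-level values with the types declared in
// _meta.types and returns a list of mismatches (empty when everything matches).
function checkDeclaredTypes(data) {
  const checks = {
    integer: v => Number.isInteger(v),
    float: v => typeof v === 'number' && Number.isFinite(v),
    string: v => typeof v === 'string',
    boolean: v => typeof v === 'boolean',
  };
  const problems = [];
  for (const [key, type] of Object.entries(data._meta?.types ?? {})) {
    const check = checks[type];
    if (!check) problems.push(`${key}: unknown type "${type}"`);
    else if (!check(data[key])) problems.push(`${key}: expected ${type}`);
  }
  return problems;
}

const good = { _meta: { types: { age: 'integer', score: 'float' } }, age: 30, score: 95.5 };
const bad = { _meta: { types: { age: 'integer', score: 'float' } }, age: '30', score: 95.5 };

console.log(checkDeclaredTypes(good)); // []
console.log(checkDeclaredTypes(bad));  // [ 'age: expected integer' ]
```

A string such as `'30'` where an integer was declared is a typical sign that a value passed through a format or parser that lost its type.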

4. Handle Large Files Efficiently

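const fs = require('fs');
const readline = require('readline');
const yaml = require('js-yaml');

// yaml.load() needs a complete document, so parsing arbitrary stream chunks fails
// whenever a chunk boundary falls mid-document. For a multi-document file, buffer
// lines up to each '---' separator and write one JSON document per line (NDJSON).
// A single very large document still has to fit in memory.
const lines = readline.createInterface({
  input: fs.createReadStream('large-file.yaml', 'utf8'),
});
const writeStream = fs.createWriteStream('output.ndjson');

let buffer = [];
function flush() {
  const doc = yaml.load(buffer.join('\n'));
  if (doc !== undefined) writeStream.write(JSON.stringify(doc) + '\n');
  buffer = [];
}

lines.on('line', line => (line === '---' ? flush() : buffer.push(line)));
lines.on('close', () => {
  flush();
  writeStream.end();
});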

5. Test Edge Cases

// convertFormat stands for whichever converter is under test
const testCases = [
  '',                    // Empty string
  '{}',                  // Empty object
  '[]',                  // Empty array
  'null',                // Null value
  '"12345"',             // Quoted number
  '{"key": undefined}',  // Undefined value (invalid JSON, should be rejected)
];

testCases.forEach(test => {
  try {
    convertFormat(test);
    console.log('✓ Handled:', test);
  } catch (error) {
    console.error('✗ Failed:', test, error.message);
  }
});

Tools and Libraries

JavaScript/Node.js

  • JSON: Native JSON.parse() / JSON.stringify()
  • YAML: js-yaml, yaml
  • XML: xml2js, fast-xml-parser, js2xmlparser
  • TOML: @iarna/toml, @ltd/j-toml

Python

  • JSON: Native json module
  • YAML: PyYAML, ruamel.yaml
  • XML: xml.etree.ElementTree, lxml
  • TOML: toml, tomli (read-only backport), tomllib (read-only, standard library in Python 3.11+)

Command Line

# JSON to YAML (mikefarah/yq v4 syntax)
yq eval -P file.json > file.yaml

# YAML to JSON
yq eval -o=json file.yaml > file.json

# XML to JSON (requires xmlstarlet and the third-party Python package xmltodict)
xmlstarlet sel -t -c "." file.xml | python -c "import sys, json, xmltodict; print(json.dumps(xmltodict.parse(sys.stdin.read())))"

Online Tools

Use toolcli Format Converter to:

  • Convert between JSON, YAML, XML, and TOML instantly
  • Validate syntax before conversion
  • Format and beautify output
  • Handle large files client-side (no uploads required)

Real-World Examples

Example 1: Kubernetes Config Migration

# Original Docker Compose (YAML)
version: "3.8"
services:
  web:
    image: nginx:latest
    ports:
      - "80:80"
    environment:
      - NODE_ENV=production

# Convert to Kubernetes (YAML with different structure)
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 80
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 1
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:latest
          ports:
            - containerPort: 80
          env:
            - name: NODE_ENV
              value: production

Example 2: API Response Transformation

// XML SOAP response
const xmlResponse = `
<soap:Envelope>
  <soap:Body>
    <GetUserResponse>
      <User>
        <ID>123</ID>
        <Name>John Doe</Name>
      </User>
    </GetUserResponse>
  </soap:Body>
</soap:Envelope>
`;

// Convert to JSON REST format
const jsonResponse = {
  user: {
    id: 123,
    name: "John Doe"
  }
};
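
The mapping step between the two shapes can be sketched as follows. The input object is an assumption: it mirrors the shape xml2js yields with its default options, where each child element is wrapped in an array. The function name `toRestUser` is hypothetical.

```javascript
// Assumed input: the parsed form of the SOAP response above, in the shape
// xml2js produces by default (every child element wrapped in an array).
const parsed = {
  'soap:Envelope': {
    'soap:Body': [
      { GetUserResponse: [{ User: [{ ID: ['123'], Name: ['John Doe'] }] }] },
    ],
  },
};

// Hypothetical mapper from the parsed SOAP envelope to the REST-style shape.
function toRestUser(envelope) {
  const user =
    envelope['soap:Envelope']?.['soap:Body']?.[0]?.GetUserResponse?.[0]?.User?.[0];
  if (!user) throw new Error('Unexpected SOAP response shape');
  return {
    user: {
      id: Number(user.ID[0]), // XML text is always a string; convert explicitly
      name: user.Name[0],
    },
  };
}

const restBody = toRestUser(parsed);
console.log(JSON.stringify(restBody, null, 2));
```

Keeping the mapping in one small function makes the type decisions (such as turning the ID into a number) explicit and easy to test.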

Example 3: Configuration File Migration

# Migrate from .toml to .json for Node.js project
[server]
host = "0.0.0.0"
port = 3000

[database]
host = "localhost"
port = 5432
name = "myapp"

Converts to:

{
  "server": {
    "host": "0.0.0.0",
    "port": 3000
  },
  "database": {
    "host": "localhost",
    "port": 5432,
    "name": "myapp"
  }
}

Conclusion

Understanding data format conversion is crucial for modern development. Key takeaways:

  1. Choose the right format for your use case
  2. Understand limitations of each format
  3. Validate data before and after conversion
  4. Test edge cases thoroughly
  5. Use established libraries instead of rolling your own
  6. Document conversion decisions for your team
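
Validating after conversion can be as simple as a round-trip check. The sketch below uses only Node.js built-ins; the helper `survivesRoundTrip` is hypothetical, and the `encode`/`decode` pair could be any converter functions, with JSON used here so the example stays self-contained:

```javascript
const { isDeepStrictEqual } = require('util');

// Hypothetical helper: encodes and decodes a value, then reports whether
// the result is deeply equal to the original.
function survivesRoundTrip(data, encode, decode) {
  return isDeepStrictEqual(decode(encode(data)), data);
}

const plain = { name: 'my-app', port: 3000, tags: ['a', 'b'] };
const withDate = { name: 'my-app', created: new Date(0) };

console.log(survivesRoundTrip(plain, JSON.stringify, JSON.parse));    // true
console.log(survivesRoundTrip(withDate, JSON.stringify, JSON.parse)); // false
```

The second case fails because the date comes back as a string, which is exactly the kind of silent change such a check is meant to catch.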

By mastering these conversions, you'll be able to work seamlessly across different ecosystems and integrate systems that use different data formats.

Additional Resources