XML Syntax, Namespaces, and Well-Formedness

Unit 1

Learning Objectives

Course Learning Outcomes

CLO01

Course Outcomes

CO01

Introduction

Real XML systems depend on strict syntax rules so that any conforming parser produces the same tree. This topic focuses on well-formedness rules, escaping, and namespaces. Namespaces allow multiple vocabularies to coexist without name collisions, which is essential for XSD and many industry XML formats.

The Basics

Well-formedness rules

A document is well-formed only if it satisfies these core rules (the XML specification adds further constraints):

  1. Exactly one root element exists.
  2. Tags are properly nested.
  3. Every start tag has a matching end tag (or uses empty-element syntax).
  4. Attribute values are quoted.
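
The rules above can be checked mechanically. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree parser is an acceptable strict parser for this purpose; the helper name is_well_formed and the sample documents are illustrative and not part of the course material.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as well-formed XML, else False."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# One sample per rule; each broken sample violates exactly one rule.
samples = {
    "ok": "<a><b/></a>",
    "two roots": "<a/><b/>",               # rule 1: exactly one root
    "bad nesting": "<a><b></a></b>",       # rule 2: proper nesting
    "missing end tag": "<a><b></a>",       # rule 3: matching end tag
    "unquoted attribute": "<a id=1/>",     # rule 4: quoted attribute values
}

results = {name: is_well_formed(doc) for name, doc in samples.items()}
for name, ok in results.items():
    print(f"{name}: {ok}")
```

Only the first sample should be accepted; a strict parser rejects every other sample rather than attempting to repair it.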

Escaping special characters

In text content:

  • & becomes &amp;
  • < becomes &lt;

Entities (basic)

Predefined entities: &amp; (&), &lt; (<), &gt; (>), &quot; ("), &apos; (')
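
When XML is generated from a program, escaping is usually delegated to a library rather than done by hand. The sketch below assumes Python's standard xml.sax.saxutils module; the sample string and variable names are made up for illustration.

```python
from xml.sax.saxutils import escape, unescape

raw = 'Tom & Jerry <3 "cartoons"'

# escape() handles &, < and > by default, which is enough for text content.
text_safe = escape(raw)

# Quotes matter inside attribute values, so they are passed as extra entities.
attr_safe = escape(raw, {'"': "&quot;", "'": "&apos;"})

# unescape() reverses the mapping; extra entities must be listed again.
round_trip = unescape(attr_safe, {"&quot;": '"', "&apos;": "'"})

print(text_safe)
print(attr_safe)
print(round_trip == raw)
```

Note that the ampersand has to be escaped before the other characters; otherwise the & introduced by &lt; would itself be escaped a second time. The library takes care of that ordering.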

Technical Details

Namespaces

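Namespaces prevent name collisions by qualifying element and attribute names with a URI. A prefix such as bk is only a local shorthand, bound to the URI by an xmlns:bk declaration; the URI is what identifies the vocabulary.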

<bk:book xmlns:bk="http://example.com/book">
  <bk:title>XML</bk:title>
</bk:book>
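
To see how a parser treats the prefix, the sketch below loads the document above with Python's standard xml.etree.ElementTree, which reports each name as {URI}local-name. The second document and the prefixes x and b are illustrative assumptions, chosen to show that the prefix itself carries no meaning.

```python
import xml.etree.ElementTree as ET

BOOK_NS = "http://example.com/book"

doc_bk = (
    '<bk:book xmlns:bk="http://example.com/book">'
    "<bk:title>XML</bk:title>"
    "</bk:book>"
)
# Same vocabulary, different (hypothetical) prefix.
doc_x = (
    '<x:book xmlns:x="http://example.com/book">'
    "<x:title>XML</x:title>"
    "</x:book>"
)

root_bk = ET.fromstring(doc_bk)
root_x = ET.fromstring(doc_x)

# The parser expands prefixes to the URI, so both roots have the same name.
print(root_bk.tag)
print(root_bk.tag == root_x.tag)

# Lookups use their own prefix-to-URI mapping; "b" here is local to this code.
title = root_bk.find("b:title", {"b": BOOK_NS})
print(title.text)
```

The two documents are equivalent to a namespace-aware parser because both prefixes map to the same URI.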

Default namespace

<book xmlns="http://example.com/book">
  <title>XML</title>
</book>

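Note: a default namespace declaration applies to unprefixed element names only. Under Namespaces in XML 1.0, an unprefixed attribute is in no namespace, even when its element is in the default namespace.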
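
The difference between elements and attributes can be observed directly. This is a small sketch using Python's standard xml.etree.ElementTree; the id and lang attributes are added here purely for illustration and do not appear in the example above.

```python
import xml.etree.ElementTree as ET

doc = (
    '<book xmlns="http://example.com/book" id="b1">'
    '<title lang="en">XML</title>'
    "</book>"
)

root = ET.fromstring(doc)
title = root[0]

# Unprefixed elements pick up the default namespace...
print(root.tag)
print(title.tag)

# ...but unprefixed attributes keep plain names with no namespace.
# The xmlns declaration itself is not reported as an ordinary attribute.
print(root.attrib)
print(title.attrib)
```

The element names come back qualified with the URI, while the attribute keys are the bare names id and lang.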

Examples

Valid nesting

<a>
  <b>
    <c />
  </b>
</a>

Invalid nesting (b is still open when a is closed, so the elements overlap)

<a><b></a></b>

Escaping

<msg>Use &amp; to represent an ampersand.</msg>

Real-World Use

Practical

  • Create an XML file that uses a namespace prefix (e.g., lib, inv).
  • Validate well-formedness with any strict parser.
  • Intentionally introduce 3 errors and note the error messages.
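
One possible way to carry out this exercise is sketched below, assuming Python's standard xml.etree.ElementTree as the strict parser. The lib prefix, its URI, and the three broken documents are hypothetical examples; the exact wording of the error messages depends on the parser and its version, so none is quoted here.

```python
import xml.etree.ElementTree as ET

# A correct document using a (hypothetical) lib namespace.
good = (
    '<lib:library xmlns:lib="http://example.com/library">'
    "<lib:book>XML Basics</lib:book>"
    "</lib:library>"
)

# Three deliberately broken variants of the same document.
broken = {
    "undeclared prefix": "<lib:library><lib:book>XML Basics</lib:book></lib:library>",
    "overlapping tags": (
        '<lib:library xmlns:lib="http://example.com/library">'
        "<lib:book>XML Basics</lib:library></lib:book>"
    ),
    "unescaped ampersand": (
        '<lib:library xmlns:lib="http://example.com/library">'
        "<lib:book>Tom & Jerry</lib:book>"
        "</lib:library>"
    ),
}

ET.fromstring(good)  # parses without raising

messages = {}
for label, doc in broken.items():
    try:
        ET.fromstring(doc)
        messages[label] = "no error"
    except ET.ParseError as err:
        line, column = err.position  # where the parser gave up
        messages[label] = f"{err} (line {line}, column {column})"

for label, message in messages.items():
    print(f"{label}: {message}")
```

Comparing the reported position with the place where each error was introduced is a useful part of the exercise, since a parser often detects a problem later than where it occurs.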

📝 For exams

Exam

  • Define namespace and default namespace.
  • Explain well-formedness rules.
  • Explain namespaces (prefix/URI mapping) and why they are needed.

✨ Key points

Takeaways

  • Well-formedness is required for parsing.
  • Escape special characters correctly.
  • Namespaces prevent name collisions and let a validator match each element to the schema for its vocabulary.