XML Syntax, Namespaces, and Well-Formedness

Unit 1

Learning Objectives

Course Learning Outcomes: CLO01
Course Outcomes: CO01

Introduction

Syntax discipline is what lets any conforming parser build the same tree. This topic sharpens the core rules—proper nesting, entity handling, and namespace hygiene—so downstream validation and transforms stay predictable.

The Basics

Well-formedness rules

  1. Exactly one root element.
  2. Properly nested tags; every start tag closes.
  3. Attribute values are quoted.
  4. Reserved characters are escaped inside text: < and & always; > is usually escaped as well, and must be when it ends the sequence ]]>.
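
The four rules above can be exercised with any conforming parser. The following is a minimal sketch, assuming Python's standard-library xml.etree.ElementTree (a non-validating parser); the helper name is_well_formed and the sample strings are illustrative only.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts xml_text, False on a syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# One accepted document, then one violation per rule (hypothetical samples).
samples = {
    "ok": '<order id="1"><item>A &amp; B</item></order>',
    "two roots": "<a/><b/>",
    "bad nesting": "<a><b></a></b>",
    "unquoted attribute": "<a id=1/>",
    "unescaped ampersand": "<a>A & B</a>",
}

results = {name: is_well_formed(text) for name, text in samples.items()}
for name, ok in results.items():
    print(f"{name}: {'well-formed' if ok else 'rejected'}")
```

Only the first sample should be accepted; each of the others breaks exactly one rule.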

Predefined entities table

Character   Entity    When to use
<           &lt;      Less-than symbol in text
>           &gt;      Greater-than symbol in text
&           &amp;     Ampersand in text
"           &quot;    Double quote in attributes
'           &apos;    Single quote (apostrophe) in attributes
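
When XML is generated by a program, the entities in the table are usually applied by a helper rather than typed by hand. Here is a small sketch, assuming Python's xml.sax.saxutils module; the sample strings are made up for illustration.

```python
from xml.sax.saxutils import escape, unescape

raw = 'if value < 10 & flag = "true"'

# escape() handles &, < and > by default.
text_safe = escape(raw)
print(text_safe)

# The two quote entities are opt-in: pass them as extra replacements.
quote_entities = {'"': "&quot;", "'": "&apos;"}
attr_safe = escape('Tom\'s "A&B"', quote_entities)
print(attr_safe)

# unescape() reverses the default three entities.
round_trip = unescape(text_safe)
print(round_trip == raw)
```

The ampersand is replaced first, so the & inside an already produced entity such as &quot; is not escaped a second time.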

Validity reminder

Valid = well-formed + conforms to a schema (DTD/XSD).

Technical Details

Namespaces in action

Namespaces prevent name collisions when mixing vocabularies.

With prefix:

<bk:book xmlns:bk="http://example.com/book">
  <bk:title>XML Handbook</bk:title>
</bk:book>

Default namespace:

<book xmlns="http://example.com/book">
  <title>XML Handbook</title>
</book>

Notes:

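  • The default namespace applies to elements only. An unprefixed attribute is in no namespace; an attribute is namespaced only when it carries a prefix.
  • Choose stable URIs that you control for namespaces. Parsers treat the URI as an opaque identifier and do not fetch it, so it does not have to resolve.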
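
To see that the prefixed and default forms name the same elements, it helps to look at the expanded names a parser reports. This is a sketch assuming Python's xml.etree.ElementTree, which writes expanded names as {namespace-uri}local-name; the id attribute is added here purely for illustration.

```python
import xml.etree.ElementTree as ET

prefixed = ET.fromstring(
    '<bk:book xmlns:bk="http://example.com/book" id="b1">'
    "<bk:title>XML Handbook</bk:title>"
    "</bk:book>"
)
defaulted = ET.fromstring(
    '<book xmlns="http://example.com/book" id="b1">'
    "<title>XML Handbook</title>"
    "</book>"
)

# Both spellings produce the same expanded element names.
print(prefixed.tag)
print(defaulted.tag)
print(prefixed[0].tag == defaulted[0].tag)

# The unprefixed attribute keeps a bare name: it is in no namespace,
# even though the element sits in the default namespace.
print(list(defaulted.attrib))
```

The prefix itself is never part of the name; only the URI and the local name are compared.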

Examples

Visual: Good vs bad nesting

✅ Correct nesting (tags open and close in the right order)

<order>
  <item>
    <sku>123</sku>
  </item>
</order>

❌ Broken nesting (item closes after order—parser error)

<order><item></order></item>

Entity usage examples

Problem: You want to show "if value < 10 & flag = true" in XML

❌ Wrong (parser treats < and & as markup)

<condition>if value < 10 & flag = true</condition>

✅ Correct (entities escape special characters)

<condition>if value &lt; 10 &amp; flag = true</condition>
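
Entities exist only in the serialized document; an application reading the element gets the original characters back. A short sketch, assuming Python's xml.etree.ElementTree:

```python
import xml.etree.ElementTree as ET

correct = "<condition>if value &lt; 10 &amp; flag = true</condition>"

# Parsing decodes the entities into plain characters.
element = ET.fromstring(correct)
decoded = element.text
print(decoded)

# Serializing plain text applies the escapes again.
rebuilt = ET.Element("condition")
rebuilt.text = "if value < 10 & flag = true"
serialized = ET.tostring(rebuilt, encoding="unicode")
print(serialized)
```

Letting the library serialize the text is usually safer than concatenating strings, because the escaping step cannot be forgotten.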

Visual: Namespace collision

Problem: Two vocabularies both use <title>

<!-- Without namespaces—ambiguous -->
<document>
  <title>Book Title</title>
  <title>Page Title</title>
</document>

Solution: Use prefixes to distinguish

<doc xmlns:bk="http://books.com" xmlns:pg="http://pages.com">
  <bk:title>Book Title</bk:title>
  <pg:title>Page Title</pg:title>
</doc>
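
Once the two titles live in different namespaces, code can ask for exactly one of them. The sketch below assumes Python's xml.etree.ElementTree and its prefix-to-URI mapping argument; the short prefixes b and p are chosen on purpose to show that query prefixes need not match the ones in the document.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    '<doc xmlns:bk="http://books.com" xmlns:pg="http://pages.com">'
    "<bk:title>Book Title</bk:title>"
    "<pg:title>Page Title</pg:title>"
    "</doc>"
)

# Prefixes here are local to the query; only the URIs must match.
ns = {"b": "http://books.com", "p": "http://pages.com"}

book_title = doc.find("b:title", ns).text
page_title = doc.find("p:title", ns).text
print(book_title)
print(page_title)

# A lookup without a namespace finds neither element.
unqualified = doc.find("title")
print(unqualified)
```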

Real-World Use

Practice

  • Add a namespace prefix to an existing XML file and run it through a validator.
  • Intentionally break three rules (two root elements, bad nesting, unescaped <) and read the parser errors.
  • Add meaningful attributes (id, status) and confirm they remain quoted.
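
For the second exercise, one possible way to read the errors is to print the position and message the parser reports. This is a sketch assuming Python's xml.etree.ElementTree; describe_error is a hypothetical helper, and the exact message wording depends on the underlying parser.

```python
import xml.etree.ElementTree as ET


def describe_error(xml_text):
    """Return a readable error location and message, or None if the text parses."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        line, column = err.position  # where the parser gave up
        return f"line {line}, column {column}: {err}"
    return None


# The three deliberate mistakes from the exercise (illustrative samples).
broken = {
    "two roots": "<a/>\n<b/>",
    "bad nesting": "<order><item></order></item>",
    "unescaped <": "<condition>value < 10</condition>",
}

reports = {name: describe_error(text) for name, text in broken.items()}
for name, report in reports.items():
    print(f"{name} -> {report}")
```

The reported position marks where the parser stopped, which is often a little after the actual mistake.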

📝 For exams

Exam focus

  • Define namespace and default namespace.
  • List the five predefined entities.
  • Explain the difference between well-formed and valid with one-liner examples.

✨ Key points

Takeaways

  • Namespaces are non-negotiable when combining vocabularies.
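  • Nesting mistakes and unescaped characters are common sources of parser errors.
  • Declare UTF-8 in the XML declaration (<?xml version="1.0" encoding="UTF-8"?>) and save the file as UTF-8, so the declared and actual encodings match.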
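
The encoding point can be checked by serializing and re-parsing a document with non-ASCII text. A minimal sketch, assuming Python 3.8 or later (for the xml_declaration argument of ElementTree.tostring); the sample text is made up.

```python
import xml.etree.ElementTree as ET

root = ET.Element("title")
root.text = "Café ☕"

# Bytes are encoded as UTF-8 and the declaration says so.
data = ET.tostring(root, encoding="utf-8", xml_declaration=True)
print(data)

# Declared and actual encodings agree, so the text survives the round trip.
restored = ET.fromstring(data).text
print(restored)

# Mismatch: the declaration says UTF-8 but the bytes are Latin-1.
mismatched = b"<?xml version='1.0' encoding='UTF-8'?><title>Caf\xe9</title>"
try:
    ET.fromstring(mismatched)
    mismatch_rejected = False
except ET.ParseError:
    mismatch_rejected = True
print(mismatch_rejected)
```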