XML Overview & Markup Languages

Unit 1

Learning Objectives

Course Learning Outcomes: CLO01

Course Outcomes: CO01

Introduction

XML (Extensible Markup Language) is a standards-based way to package meaning and structure in plain text. It was defined as a simplified subset of SGML to make data exchange predictable across tools, platforms, and decades. Think of it as a contract for how information is organized, not how it looks.

The Basics

Markup in practice

A markup language labels parts of a document so humans and programs agree on what each part means.

  • HTML has a fixed tag set aimed at displaying documents in a browser; XML tags describe what the data is.
  • XML is a meta-language: you mint tags that match your domain (student, invoice, sensor, book).

Why extensible matters

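You are not locked to a fixed tag set. New elements can be added without breaking existing parsers as long as the document stays well-formed. Consumers that validate against a schema additionally need that schema to allow the new elements.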
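A minimal sketch of that idea in Python, using only the standard library. The two documents and the <email> element are made up for illustration: a reader that only looks up the elements it knows about returns the same result before and after the new element appears.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents: version 2 adds an <email> element
# that the reader below was never told about.
DOC_V1 = "<student id='S1'><name>Meera</name><dept>CSE</dept></student>"
DOC_V2 = (
    "<student id='S1'><name>Meera</name><dept>CSE</dept>"
    "<email>meera@example.org</email></student>"
)


def read_name_and_dept(xml_text):
    """Look up only the elements this consumer knows about."""
    root = ET.fromstring(xml_text)
    return root.findtext("name"), root.findtext("dept")


old_result = read_name_and_dept(DOC_V1)
new_result = read_name_and_dept(DOC_V2)
print(old_result == new_result)  # the extra element is simply ignored
```

This reader does not validate. A consumer that checks documents against a strict DTD or XSD would reject the new element until the schema is updated.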

Well-formed vs valid

  • Well-formed: syntax rules are respected (single root, proper nesting, quoted attributes).
  • Valid: well-formed and obeys a schema (DTD or XSD).
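The difference can be sketched in Python. Python's xml.etree.ElementTree checks well-formedness only; it does not validate against a DTD or XSD. In the sketch below, REQUIRED_CHILDREN and obeys_toy_rules are hypothetical stand-ins for a schema, used only to show a document that is well-formed yet would fail validation.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule standing in for a schema: a <student> must
# contain <name> and <dept>. This is NOT real DTD/XSD validation.
REQUIRED_CHILDREN = ("name", "dept")


def is_well_formed(xml_text):
    """True if the parser accepts the text as XML syntax."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def obeys_toy_rules(xml_text):
    """Toy 'validity' check against REQUIRED_CHILDREN."""
    root = ET.fromstring(xml_text)
    has_children = all(root.find(tag) is not None for tag in REQUIRED_CHILDREN)
    return root.tag == "student" and has_children


complete = "<student id='S1'><name>Meera</name><dept>CSE</dept></student>"
no_dept = "<student id='S1'><name>Meera</name></student>"
crossed = "<student><name>Meera</student></name>"  # tags overlap

for label, doc in [("complete", complete), ("no_dept", no_dept), ("crossed", crossed)]:
    well_formed = is_well_formed(doc)
    rules_ok = well_formed and obeys_toy_rules(doc)
    print(label, "well-formed:", well_formed, "passes toy rules:", rules_ok)
```

For real validation a schema-aware library is needed; the toy rule above only mimics the outcome.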

Technical Details

Tree mental model

XML documents are ordered trees:

  • A single root element contains all other elements.
  • Elements may hold text, other elements, and attributes.
  • Attributes are name/value metadata on elements; a name may appear only once per element, and attribute order carries no meaning.
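The tree model above can be made visible with a short Python sketch that walks a document and prints each node with its depth. The describe helper is a hypothetical name introduced here; the document is the student sample used in these notes.

```python
import xml.etree.ElementTree as ET

doc = """<student id="S1">
  <name>Meera</name>
  <dept>CSE</dept>
  <cgpa>9.1</cgpa>
</student>"""

root = ET.fromstring(doc)


def describe(element, depth=0):
    """Return one indented line per element: tag, attributes, text."""
    text = (element.text or "").strip()
    lines = [f"{'  ' * depth}{element.tag} attrs={element.attrib} text={text!r}"]
    for child in element:  # children come back in document order
        lines.extend(describe(child, depth + 1))
    return lines


outline = describe(root)
print("\n".join(outline))
```

The root appears once at depth 0, its three children at depth 1, and the id attribute shows up as metadata on the root rather than as a node of its own.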

When to use elements vs attributes

  • Use elements for repeatable or structured data (titles, items, nested records).
  • Use attributes for identifiers, flags, short metadata (id, type, status).
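One way to see the guideline in practice is to build a small document in Python. The course record below is invented for illustration: the identifier and status flag go in attributes, while the title and the repeatable topic list become child elements.

```python
import xml.etree.ElementTree as ET

# Short metadata goes in attributes (hypothetical values).
course = ET.Element("course", {"id": "C101", "status": "active"})

# Structured or repeatable data goes in child elements.
ET.SubElement(course, "title").text = "Distributed Systems"
topics = ET.SubElement(course, "topics")
for name in ("Consensus", "Replication", "Sharding"):
    ET.SubElement(topics, "topic").text = name

serialized = ET.tostring(course, encoding="unicode")
print(serialized)
```

Had the topics been squeezed into a single attribute, they could neither repeat nor carry nested structure of their own.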

XML ecosystem map

  • Structure guardrails: DTD / XSD
  • Navigation: XPath
  • Transformation: XSLT

Parsing flow

  1. Parse for well-formedness.
  2. (Optional) Validate against DTD/XSD.
  3. Query or transform with XPath/XSLT.
  4. Serialize output for transport or storage.
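The four steps can be sketched end to end in Python under some stated simplifications: the standard library does not validate, so step 2 is only marked with a comment; ElementTree supports a limited subset of XPath rather than the full language; and the hand-written loop that builds <report> is a stand-in for an XSLT transformation, not XSLT itself. The second student record is made up for the example.

```python
import xml.etree.ElementTree as ET

raw = """<students>
  <student id="S1"><name>Meera</name><dept>CSE</dept><cgpa>9.1</cgpa></student>
  <student id="S2"><name>Arun</name><dept>ECE</dept><cgpa>8.4</cgpa></student>
</students>"""

# 1. Parse: raises ET.ParseError if the text is not well-formed.
root = ET.fromstring(raw)

# 2. Validate: skipped here, the standard library has no DTD/XSD validator.

# 3a. Query with ElementTree's XPath subset: students whose <dept> is CSE.
cse_names = [s.findtext("name") for s in root.findall("./student[dept='CSE']")]

# 3b. Transform by hand (stand-in for XSLT): build a slimmer <report> tree.
report = ET.Element("report")
for student in root.findall("student"):
    entry = ET.SubElement(report, "entry", {"ref": student.get("id")})
    entry.text = student.findtext("name")

# 4. Serialize the result for transport or storage.
output = ET.tostring(report, encoding="unicode")
print(cse_names)
print(output)
```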

Examples

Tiny XML sample

<?xml version="1.0" encoding="UTF-8"?>
<student id="S1">
  <name>Meera</name>
  <dept>CSE</dept>
  <cgpa>9.1</cgpa>
</student>

HTML vs XML side by side

  • HTML: <h1>Title</h1> (a generic top-level heading, chosen with display in mind)
  • XML: <title>Distributed Systems</title> (a domain label that says what the text is)

Real-World Use

Quick hands-on

  • Draft one XML document for a dataset you like (books, courses, products).
  • Check well-formedness with any conforming XML parser (not a lenient HTML parser, which silently repairs errors).
  • Add comments and attributes sparingly to see how tools display them.

Common pitfalls to avoid: missing end tags, multiple roots, unescaped & or < (write &amp; and &lt; instead), and unquoted attributes.
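Each of these pitfalls can be reproduced with a short Python sketch. The snippets in PITFALLS are invented examples; a conforming parser rejects every one of them. The last lines show one way to escape text before placing it in a document, using escape from xml.sax.saxutils.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Invented snippets, one per pitfall.
PITFALLS = {
    "missing end tag": "<student><name>Meera</student>",
    "multiple roots": "<student/><student/>",
    "unescaped ampersand": "<dept>R&D</dept>",
    "unescaped less-than": "<note>a < b</note>",
    "unquoted attribute": "<student id=S1/>",
}


def parses(xml_text):
    """True if the text is accepted as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


results = {label: parses(text) for label, text in PITFALLS.items()}
for label, ok in results.items():
    print(f"{label}: {'accepted' if ok else 'rejected'}")

# Escaping the text first makes the ampersand case well-formed.
fixed = "<dept>" + escape("R&D") + "</dept>"
print(fixed, parses(fixed))
```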

📝 For exams

Exam checkpoints

  • Short: define XML; list 3 differences vs HTML; explain "extensible".
  • Medium: describe well-formed vs valid with one example each.
  • Long: explain how XML + DTD/XSD + XPath + XSLT form a pipeline.

✨ Key points

Takeaways

  • XML captures meaning; presentation is handled elsewhere.
  • Treat every XML document as a tree.
  • Validation is optional but crucial when systems exchange data.