XML Overview & Markup Languages

Unit 1 • CLO01

Learning Objectives

Course Learning Outcomes

CLO01

Course Outcomes

CO01

Introduction

XML (eXtensible Markup Language) is a text-based standard for representing structured information. Unlike HTML, whose tags mostly describe how a document is structured and displayed, XML tags describe what the data means and how it is organized. This topic builds the foundation: what markup is, why XML was created, where XML is used, and how DTD, XSD, XPath, and XSLT work together for validation, querying, and transformation.

The Basics

Markup: structure + meaning

A markup language embeds tags in text to describe structure.

  • In HTML, tags mostly describe presentation and document structure.
  • In XML, tags primarily describe data meaning and relationships.

Why "extensible" matters

XML does not provide a fixed vocabulary. You create tags that match your domain, for example:

  • <student>, <course>, <invoice>, <patient>

This is why XML is used in data exchange: different systems can agree on a shared vocabulary and validate documents against it.

Well-formed vs valid

  • Well-formed: follows the XML syntax rules (a single root element, properly nested and closed tags, quoted attribute values).
  • Valid: well-formed and also conforms to a schema (DTD or XSD).
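The distinction can be checked in code. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` module; the helper name `is_well_formed` and both sample strings are made up for illustration. It tests well-formedness only: the standard library does not validate against a DTD or XSD, so checking validity would need a separate validating tool.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents.
good = "<student id='S1'><name>Vivek</name></student>"
bad = "<student><name>Vivek</student></name>"  # tags closed in the wrong order

print(is_well_formed(good))  # True
print(is_well_formed(bad))   # False
```

A document that passes this check may still be invalid, for example if the schema requires a `<dept>` element that is missing.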

The XML family in one line

  • DTD/XSD: define and validate structure
  • XPath: select nodes
  • XSLT: transform XML into another format

Technical Details

XML as a tree model

XML documents form a rooted, ordered tree:

  • Document root (contains the single root element)
  • Elements (the nodes that can contain other nodes)
  • Attributes (name/value metadata attached to an element)
  • Text nodes
  • Comments and processing instructions

Thinking in trees is essential for XPath and XSLT.
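To make the tree model concrete, here is a small sketch that walks a document depth-first with Python's standard `xml.etree.ElementTree`. The `walk` helper is hypothetical. Note that this particular library stores text on the element itself (`element.text`) rather than as separate text nodes, and its default parser skips comments and processing instructions, so it is a simplified view of the full XML tree.

```python
import xml.etree.ElementTree as ET

doc = """<student id="S1">
  <name>Vivek</name>
  <dept>CSE</dept>
</student>"""


def walk(element, depth=0):
    """Collect one (depth, tag, attributes, text) tuple per element, in document order."""
    text = (element.text or "").strip()
    rows = [(depth, element.tag, dict(element.attrib), text)]
    for child in element:  # children are visited in document order
        rows.extend(walk(child, depth + 1))
    return rows


rows = walk(ET.fromstring(doc))
for depth, tag, attrs, text in rows:
    print("  " * depth + f"{tag} {attrs} {text!r}")
```

The indentation of the printed lines mirrors the nesting depth, which is the same parent/child structure that XPath expressions navigate.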

Elements vs attributes (exam-friendly guidance)

Use elements for repeatable or structured content and nested data.

Use attributes for identifiers, flags, and small metadata.
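As an illustration of this guidance, the sketch below builds a small document with Python's standard `xml.etree.ElementTree`. The `<courses>` and `<course>` element names and the course codes are assumptions chosen for the example: the identifier goes in an attribute, while the repeatable course list is modelled with nested elements.

```python
import xml.etree.ElementTree as ET

# Attribute: a short identifier for the record.
student = ET.Element("student", {"id": "S1"})

# Elements: content that repeats, one <course> per enrolment (hypothetical names).
courses = ET.SubElement(student, "courses")
for code in ["DBMS", "OS"]:
    ET.SubElement(courses, "course").text = code

xml_text = ET.tostring(student, encoding="unicode")
print(xml_text)
# <student id="S1"><courses><course>DBMS</course><course>OS</course></courses></student>
```

An attribute can appear only once per element and holds a single string, which is why a list of courses would be awkward to store as an attribute.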

XML ecosystem mapping

Need                    | Technology
Enforce structure       | DTD / XSD
Enforce datatypes       | XSD
Select data             | XPath
Generate output formats | XSLT

Parsing overview

Typical steps:

  1. Parse XML (must be well-formed)
  2. Validate (optional): DTD/XSD
  3. Query/Transform: XPath/XSLT
  4. Serialize output
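The steps above can be sketched with Python's standard library. This is an approximation under stated assumptions, not a full pipeline: `xml.etree.ElementTree` supports only a limited subset of XPath and has no DTD/XSD validator or XSLT processor, so validation is skipped and the transformation is done by building a new tree in Python. The second student record and the `<toppers>` output format are invented for the example.

```python
import xml.etree.ElementTree as ET

source = """<students>
  <student id="S1"><name>Vivek</name><cgpa>9.1</cgpa></student>
  <student id="S2"><name>Asha</name><cgpa>8.4</cgpa></student>
</students>"""

# 1. Parse: raises ET.ParseError if the text is not well-formed.
root = ET.fromstring(source)

# 2. Validate: skipped here (no DTD/XSD validator in the standard library).

# 3a. Query: select every <student> child with a simple XPath-style path,
#     then filter on the numeric value in Python.
toppers = [s for s in root.findall("./student") if float(s.findtext("cgpa")) >= 9.0]

# 3b. Transform: build a new tree in code (standing in for an XSLT stylesheet).
report = ET.Element("toppers")
for s in toppers:
    ET.SubElement(report, "name", {"id": s.get("id")}).text = s.findtext("name")

# 4. Serialize the result back to text.
output = ET.tostring(report, encoding="unicode")
print(output)  # <toppers><name id="S1">Vivek</name></toppers>
```

Because parsing comes first, a document that is not well-formed stops the pipeline before any query or transformation runs.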

Examples

Example: a simple XML document

<?xml version="1.0" encoding="UTF-8"?>
<student id="S1">
  <name>Vivek</name>
  <dept>CSE</dept>
  <cgpa>9.1</cgpa>
</student>

Example: XML vs HTML

  • HTML: <h1>Title</h1> (display heading)
  • XML: <title>DBMS</title> (data meaning)

Real-World Use

Practical

  • Create an XML file for any dataset (students, books, movies).
  • Make sure it is well-formed:
    • single root
    • properly nested tags
    • quoted attribute values
    • matching start/end tags

Common pitfalls

  • Mismatched tags
  • Unescaped special characters (& and < must be written as &amp; and &lt; in text; > is usually written as &gt; as well)
  • Multiple root elements
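Two of these pitfalls are easy to reproduce. The sketch below uses `escape` from Python's standard `xml.sax.saxutils` module together with `xml.etree.ElementTree`; the sample text and the `parses` helper are hypothetical. It shows that unescaped text breaks parsing, that escaping fixes it, and that a second root element is rejected.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def parses(text: str) -> bool:
    """Return True if text is well-formed XML according to ElementTree."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


raw = "R&D <draft> notes"  # hypothetical text containing special characters
safe = escape(raw)         # replaces &, < and > with entity references
print(safe)                # R&amp;D &lt;draft&gt; notes

broken = f"<title>{raw}</title>"   # raw text pasted straight into markup
fixed = f"<title>{safe}</title>"   # escaped text
two_roots = "<a/><b/>"             # more than one root element

print(parses(broken))     # False
print(parses(fixed))      # True
print(parses(two_roots))  # False
```

When the fixed document is parsed, the parser converts the entity references back, so the element text equals the original string.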

📝 For exams

Exam points

2-mark

  • Define XML.
  • List differences between XML and HTML.
  • What is well-formed vs valid?

5-mark

  • Explain the XML ecosystem and where DTD/XSD/XPath/XSLT fit.

10-mark

  • Discuss XML as a meta-language and compare it to HTML with examples.

✨ Key points

Takeaways

  • XML describes structure and meaning, not presentation.
  • Well-formedness is syntax; validity requires DTD/XSD.
  • Model XML as a tree for XPath/XSLT.