XML Overview & Markup Languages
Introduction
XML (eXtensible Markup Language) is a text-based standard for representing structured information. Unlike HTML (which focuses on presentation), XML focuses on meaning and structure. This topic builds the foundation: what markup is, why XML was created, where XML is used, and how DTD, XSD, XPath, and XSLT form a pipeline for validation, querying, and transformation.
The Basics
Markup: structure + meaning
A markup language embeds tags in text to describe structure.
- In HTML, tags mostly describe presentation and document structure.
- In XML, tags primarily describe data meaning and relationships.
Why "extensible" matters
XML does not provide a fixed vocabulary. You create tags that match your domain, for example:
<student>, <course>, <invoice>, <patient>
This is why XML is used in data exchange: it lets different systems agree on a vocabulary and validate it.
Well-formed vs valid
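- Well-formed: follows the XML syntax rules (a single root element, properly nested and closed tags, quoted attribute values).
- Valid: well-formed and, in addition, conforms to the structure declared in a schema (DTD or XSD).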
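The distinction above can be checked mechanically. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` parser; the helper name `is_well_formed` and the sample strings are illustrative assumptions, not part of the course material. It tests well-formedness only: the standard library parser does not validate against a DTD or XSD, so checking validity would need a separate validating library.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as XML, i.e. it is well-formed."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample inputs for illustration.
good = "<student id='S1'><name>Vivek</name></student>"
bad_nesting = "<student><name>Vivek</student></name>"  # tags close in the wrong order
two_roots = "<student/><student/>"                     # more than one root element

print(is_well_formed(good))         # True
print(is_well_formed(bad_nesting))  # False
print(is_well_formed(two_roots))    # False
```

A document that passes this check may still be invalid, for example if a schema requires a `<dept>` element that is missing.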
The XML family in one line
- DTD/XSD: define and validate structure
- XPath: select nodes
- XSLT: transform XML into another format
Technical Details
XML as a tree model
XML documents form a rooted, ordered tree:
- Document root
- Elements (nodes)
- Attributes (name/value metadata)
- Text nodes
- Comments and processing instructions
Thinking in trees is essential for XPath and XSLT.
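To make the tree model concrete, here is a small sketch that walks a document depth-first and lists each element with its attributes and text. It assumes Python's standard `xml.etree.ElementTree`; the function name `walk` is a hypothetical helper. Note that ElementTree stores text as a property of each element rather than as separate text nodes, and by default it drops comments and processing instructions, so this is a simplified view of the full tree.

```python
import xml.etree.ElementTree as ET

doc = """<student id="S1">
  <name>Vivek</name>
  <dept>CSE</dept>
  <cgpa>9.1</cgpa>
</student>"""


def walk(element, depth=0):
    """Return one outline line per element, indented by its depth in the tree."""
    text = (element.text or "").strip()
    lines = [f"{'  ' * depth}{element.tag} attrs={element.attrib} text={text!r}"]
    for child in element:  # children are visited in document order
        lines.extend(walk(child, depth + 1))
    return lines


root = ET.fromstring(doc)
outline = walk(root)
print("\n".join(outline))
# student attrs={'id': 'S1'} text=''
#   name attrs={} text='Vivek'
#   dept attrs={} text='CSE'
#   cgpa attrs={} text='9.1'
```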
Elements vs attributes (exam-friendly guidance)
Use elements for repeatable or structured content and nested data.
Use attributes for identifiers, flags, and small metadata.
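One way to see this guidance in practice is to build a small document programmatically. The sketch below uses Python's standard `xml.etree.ElementTree`; the `active` flag, the `<courses>` wrapper, and the course codes are hypothetical values chosen for illustration. The identifier and the flag go in attributes, while the repeatable course list is modeled as nested elements.

```python
import xml.etree.ElementTree as ET

# Identifier and flag: small metadata, so they become attributes.
student = ET.Element("student", {"id": "S1", "active": "true"})

# Simple content as a child element.
ET.SubElement(student, "name").text = "Vivek"

# Repeatable, structured content: one <course> element per item.
courses = ET.SubElement(student, "courses")
for code in ["DBMS", "OS"]:
    ET.SubElement(courses, "course").text = code

xml_text = ET.tostring(student, encoding="unicode")
print(xml_text)
```

Trying to store the course list in a single attribute (for example `courses="DBMS,OS"`) would force every consumer to split the string by hand, which is why repeatable data usually belongs in elements.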
XML ecosystem mapping
| Need | Technology |
|---|---|
| enforce structure | DTD / XSD |
| enforce datatypes | XSD |
| select data | XPath |
| generate output formats | XSLT |
Parsing overview
Typical steps:
- Parse XML (must be well-formed)
- Validate (optional): DTD/XSD
- Query/Transform: XPath/XSLT
- Serialize output
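The steps above can be sketched end to end with Python's standard library. This is an illustration under stated assumptions, not a full pipeline: the sample data (including the second student) is hypothetical, the validation step is skipped because `xml.etree.ElementTree` does not validate against DTD or XSD, its query support covers only a limited subset of XPath, and the transformation is written by hand as a stand-in for XSLT.

```python
import xml.etree.ElementTree as ET

source = """<students>
  <student id="S1"><name>Vivek</name><cgpa>9.1</cgpa></student>
  <student id="S2"><name>Asha</name><cgpa>8.4</cgpa></student>
</students>"""

# 1. Parse: raises ET.ParseError if the input is not well-formed.
root = ET.fromstring(source)

# 2. Validate: omitted here (the standard library has no DTD/XSD validator).

# 3a. Query: path expressions from ElementTree's limited XPath subset.
names = [n.text for n in root.findall("./student/name")]
second = root.find("./student[@id='S2']")

# 3b. Transform: build a new tree by hand (a stand-in for an XSLT stylesheet).
report = ET.Element("report")
for student in root.findall("student"):
    row = ET.SubElement(report, "row", {"id": student.get("id")})
    row.text = f"{student.findtext('name')}: {student.findtext('cgpa')}"

# 4. Serialize the result back to text.
output = ET.tostring(report, encoding="unicode")

print(names)   # ['Vivek', 'Asha']
print(output)  # <report><row id="S1">Vivek: 9.1</row><row id="S2">Asha: 8.4</row></report>
```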
Examples
Example: a simple XML document
<?xml version="1.0" encoding="UTF-8"?>
<student id="S1">
  <name>Vivek</name>
  <dept>CSE</dept>
  <cgpa>9.1</cgpa>
</student>
Example: XML vs HTML
- HTML: <h1>Title</h1> (display heading)
- XML: <title>DBMS</title> (data meaning)
Real-World Use
Practical
- Create an XML file for any dataset (students, books, movies).
- Make sure it is well-formed:
- single root
- properly nested tags
- quoted attribute values
- matching start/end tags
Common pitfalls
- Mismatched tags
- Unescaped special characters (&, <, >) in text or attribute values; write them as &amp;, &lt;, and &gt;
- Multiple root elements
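The escaping pitfall is easy to reproduce. The sketch below, which assumes Python's standard `xml.sax.saxutils.escape` and `xml.etree.ElementTree`, shows that raw text containing `&` or `<` breaks the document, while the escaped version parses and round-trips to the original text. The sample string and the helper name `parses` are illustrative.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "R&D: a < b"                        # text containing special characters
unsafe = f"<note>{raw}</note>"            # inserted as-is: not well-formed
safe = f"<note>{escape(raw)}</note>"      # & and < replaced by entity references


def parses(xml_text: str) -> bool:
    """Return True if the parser accepts xml_text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(escape(raw))               # R&amp;D: a &lt; b
print(parses(unsafe))            # False
print(parses(safe))              # True
print(ET.fromstring(safe).text)  # R&D: a < b
```

When a document is built through an XML library rather than by string concatenation, the library normally performs this escaping during serialization.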
For exams
Exam points
2-mark
- Define XML.
- List differences between XML and HTML.
- What is well-formed vs valid?
5-mark
- Explain the XML ecosystem and where DTD/XSD/XPath/XSLT fit.
10-mark
- Discuss XML as a meta-language and compare it to HTML with examples.
Key points
Takeaways
- XML describes structure and meaning, not presentation.
- Well-formedness is about syntax; validity additionally requires conformance to a DTD or XSD.
- Model XML as a tree for XPath/XSLT.