XML Overview & Markup Languages
Unit 1•CLO01
Learning Objectives
Course Learning Outcomes: CLO01
Course Outcomes: CO01
ℹ️
Introduction
XML (Extensible Markup Language) is a W3C standard for packaging meaning and structure in plain text. It was derived from SGML as a simpler subset, with the goal of making data exchange predictable across tools and platforms over long periods. Think of it as a contract for how information is organized, not how it looks.
The Basics
Markup in practice
A markup language labels parts of a document so humans and programs agree on what each part means.
- HTML has a fixed tag set geared toward displaying documents; XML tags describe what the data is.
- XML is a meta-language: you mint tags that match your domain (student, invoice, sensor, book).
Why extensible matters
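You are not locked to a fixed tag set. New elements can be added without breaking existing parsers as long as the document stays well-formed. Applications that validate against a schema may still reject elements the schema does not declare.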
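A minimal sketch of that idea, using Python's standard xml.etree.ElementTree module. The reader function read_name and the added email element are made up for illustration. The reader only looks for name, so a newer document with an extra element still parses and still yields the same answer.

```python
import xml.etree.ElementTree as ET


def read_name(xml_text):
    """Hypothetical consumer: it only cares about the <name> child."""
    return ET.fromstring(xml_text).findtext("name")


# Version 1 of the document, and a later version with one extra element.
v1 = "<student><name>Meera</name></student>"
v2 = "<student><name>Meera</name><email>meera@example.org</email></student>"

# The old reader keeps working because both documents are well-formed
# and the element it depends on is still present.
print(read_name(v1))
print(read_name(v2))
```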
Well-formed vs valid
- Well-formed: syntax rules are respected (single root, proper nesting, quoted attributes).
- Valid: well-formed and obeys a schema (DTD or XSD).
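The well-formedness half of this distinction can be checked with nothing but Python's standard library. The sketch below assumes a helper named is_well_formed (not part of any library) and a few made-up sample strings. Checking validity is a separate step: xml.etree.ElementTree does not validate against a DTD or XSD, so that part needs a validating parser (the third-party lxml package is one option).

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as well-formed XML, else False."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


samples = {
    "good": "<student id='S1'><name>Meera</name></student>",
    "bad_nesting": "<student><name>Meera</student></name>",
    "two_roots": "<a/><b/>",
    "unquoted_attr": "<student id=S1/>",
}

for label, doc in samples.items():
    print(label, is_well_formed(doc))
```

Only the first sample passes; each of the others breaks one of the syntax rules listed above.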
Technical Details
Tree mental model
XML documents are ordered trees:
- Root element contains everything.
- Elements may hold text, other elements, and attributes.
- Attributes are name/value metadata on elements.
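To make the tree model concrete, here is a small sketch that walks a document and prints one line per element. The outline function is a made-up helper; it relies only on the standard xml.etree.ElementTree API (tag, attrib, text, and iteration over children).

```python
import xml.etree.ElementTree as ET

doc = """<student id="S1">
  <name>Meera</name>
  <dept>CSE</dept>
</student>"""


def outline(element, depth=0):
    """Return one line per element: indentation, tag, attributes, and text."""
    text = (element.text or "").strip()
    lines = [f"{'  ' * depth}{element.tag} {element.attrib} {text!r}"]
    for child in element:  # children are visited in document order
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(doc)
tree_lines = outline(root)
print("\n".join(tree_lines))
```

The root line shows the id attribute and no text of its own; the indented lines are its children, in the order they appear in the document.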
When to use elements vs attributes
- Use elements for repeatable or structured data (titles, items, nested records).
- Use attributes for identifiers, flags, short metadata (id, type, status).
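One way to see this guideline in action is to build a small record in code. The order document below is invented for illustration: the identifier and status go in attributes, while the repeatable, nested items become child elements.

```python
import xml.etree.ElementTree as ET

# Short metadata (identifier, flag) as attributes on the element.
order = ET.Element("order", {"id": "O-17", "status": "open"})

# Repeatable, structured data as child elements.
for sku, qty in [("pen", 2), ("notebook", 1)]:
    item = ET.SubElement(order, "item")
    ET.SubElement(item, "sku").text = sku
    ET.SubElement(item, "qty").text = str(qty)

xml_text = ET.tostring(order, encoding="unicode")
print(xml_text)
```

An attribute name can appear only once on an element, which is why anything that repeats (here, item) has to be modeled as an element.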
XML ecosystem map
- Structure guardrails: DTD / XSD
- Navigation: XPath
- Transformation: XSLT
Parsing flow
- Parse for well-formedness.
- (Optional) Validate against DTD/XSD.
- Query or transform with XPath/XSLT.
- Serialize output for transport or storage.
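The flow above can be sketched with Python's standard library, under two stated simplifications: the validation step is skipped because xml.etree.ElementTree does not validate, and a hand-written loop stands in for XSLT, which the standard library does not provide. ElementTree supports only a limited subset of XPath. The student records and the honours report are made-up sample data.

```python
import xml.etree.ElementTree as ET

raw = """<students>
  <student id="S1"><name>Meera</name><cgpa>9.1</cgpa></student>
  <student id="S2"><name>Arun</name><cgpa>7.8</cgpa></student>
</students>"""

# 1. Parse: raises ET.ParseError if the text is not well-formed.
root = ET.fromstring(raw)

# 2. Validation against a DTD/XSD would go here (skipped in this sketch).

# 3. Query with ElementTree's XPath subset.
s1_name = root.find("./student[@id='S1']/name").text
all_names = [n.text for n in root.findall("./student/name")]

#    Transform: build a new tree from the query results (stand-in for XSLT).
report = ET.Element("honours")
for student in root.findall("student"):
    if float(student.findtext("cgpa")) >= 9.0:
        ET.SubElement(report, "name").text = student.findtext("name")

# 4. Serialize the result for transport or storage.
out = ET.tostring(report, encoding="unicode")
print(s1_name, all_names, out)
```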
Examples
Tiny XML sample
<?xml version="1.0" encoding="UTF-8"?>
<student id="S1">
  <name>Meera</name>
  <dept>CSE</dept>
  <cgpa>9.1</cgpa>
</student>
HTML vs XML side by side
- HTML: <h1>Title</h1> (presentation)
- XML: <title>Distributed Systems</title> (meaning)
Real-World Use
Quick hands-on
- Draft one XML document for a dataset you like (books, courses, products).
- Check well-formedness with any strict parser.
- Add a comment and a few attributes, then see how different tools display them.
Common pitfalls to avoid: missing end tags, multiple roots, unescaped & or <, and unquoted attributes.
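The escaping pitfall is easy to reproduce and to fix. This sketch uses escape and quoteattr from Python's standard xml.sax.saxutils module; the sample strings and the dept element are invented for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

title = "Tom & Jerry: x < y"
lab = 'R&D "lab"'

# Pasting raw text into markup leaves a bare & and <, which is not well-formed.
try:
    ET.fromstring(f"<dept>{title}</dept>")
    raw_ok = True
except ET.ParseError:
    raw_ok = False

# escape() handles element text; quoteattr() escapes and adds the quotes.
safe_text = escape(title)
safe_attr = quoteattr(lab)
parsed = ET.fromstring(f"<dept name={safe_attr}>{safe_text}</dept>")

print(raw_ok)
print(safe_text)
print(parsed.get("name"), "|", parsed.text)
```

After parsing, the original strings come back unchanged, because the parser reverses the escaping. Building documents through an API such as ElementTree avoids the problem entirely, since it escapes on serialization.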
📝 For exams
Exam checkpoints
- Short: define XML; list 3 differences vs HTML; explain "extensible".
- Medium: describe well-formed vs valid with one example each.
- Long: explain how XML + DTD/XSD + XPath + XSLT form a pipeline.
✨ Key points
Takeaways
- XML captures meaning; presentation is handled elsewhere.
- Treat every XML document as a tree.
- Validation is optional but crucial when systems exchange data.