XML = Extensible Markup Language; you can create your own tags, and machines can read it
XML is based on SGML
- Tags = text between angle brackets
- Elements = starting tag, ending tag, everything in between
- Attributes = name-value pair inside starting tag
- XML simplifies data interchange
- XML enables smart code
- XML enables smart searches
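To make the tag / element / attribute definitions above concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The address document and its tag names are made up for illustration; they are not from the tutorial.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the <address>, <city> and <state> tags are invented.
doc = '<address type="home"><city>Pittsburgh</city><state>PA</state></address>'

root = ET.fromstring(doc)

# The element name comes from the start tag.
print(root.tag)
# Attributes are the name-value pairs inside the start tag.
print(root.attrib)
# The element's content is everything between its start tag and end tag.
for child in root:
    print(child.tag, "=", child.text)
```

Because the tags describe the data, a program can ask for the city directly instead of guessing where it is in a line of text.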
3 kinds of XML documents:
- Invalid docs = don't follow the XML syntax rules, or don't follow the rules in their DTD
- Valid docs = follow both the XML syntax rules and the DTD rules
- Well-formed docs = follow the XML syntax rules but have no DTD
Elements can't overlap
End tags required
Element names are case sensitive
Attributes must have quoted values
XML declarations
Also: comments, processing instructions, entities
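The syntax rules listed above (no overlapping elements, required end tags, case-sensitive names, quoted attribute values) can be tried out with any XML parser. This is a rough sketch using Python's built-in parser; the sample strings are my own, and the helper name is_well_formed is just a convenience for this example.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Made-up samples, one per rule.
samples = {
    "properly nested":    "<p><b>bold</b> text</p>",
    "overlapping":        "<p><b>bold</p></b>",
    "missing end tag":    "<p>text",
    "case mismatch":      "<p>text</P>",
    "unquoted attribute": "<p class=intro>text</p>",
}

for label, text in samples.items():
    print(label, "->", is_well_formed(text))
```

Only the first sample should be accepted; the parser rejects the other four.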
Use namespaces to tell apart tags that have the same name but come from different vocabularies
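A small sketch of why namespaces matter: two different title elements in one document, kept apart by their namespace. The example.com URIs and the cust / book prefixes are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabularies: a customer's courtesy title and a book title.
doc = """
<order xmlns:cust="http://example.com/customer"
       xmlns:book="http://example.com/book">
  <cust:title>Dr.</cust:title>
  <book:title>XML Basics</book:title>
</order>
"""

root = ET.fromstring(doc)

# Map prefixes to namespace URIs for the lookups below.
ns = {
    "cust": "http://example.com/customer",
    "book": "http://example.com/book",
}

courtesy = root.find("cust:title", ns).text
book_title = root.find("book:title", ns).text
print(courtesy, "/", book_title)

# ElementTree stores each name as {namespace-uri}localname.
print(root[0].tag)
```

The prefix is only shorthand; what actually identifies the tag is the namespace URI plus the local name.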
DTD = document type definition, specifies basic structure of XML doc
-says which elements must appear, and in what order
-says which elements contain text (#PCDATA)
-uses symbols (?, *, +, |) to say how often elements may appear and to give choices
DTD can:
- define which attributes are required
- define default values for attributes
- list all of valid values for given attribute
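Here is a hedged sketch of the attribute features above, using a made-up memo element with an internal DTD. Python's standard library parser does not validate, so it will not enforce #REQUIRED or the list of allowed values. However, the expat parser behind ElementTree in CPython does read the internal DTD subset and should fill in declared default values.

```python
import xml.etree.ElementTree as ET

# Hypothetical DTD: "id" is required, "priority" may only be low, normal
# or high, and defaults to "normal" when it is left out.
doc = """<!DOCTYPE memo [
  <!ELEMENT memo (#PCDATA)>
  <!ATTLIST memo
      id       CDATA                 #REQUIRED
      priority (low | normal | high) "normal">
]>
<memo id="m1">Lunch at noon</memo>"""

root = ET.fromstring(doc)

# "id" was written in the document itself.
print(root.get("id"))
# "priority" was not written; any value here comes from the DTD default.
print(root.get("priority"))
```

To actually reject a document that is missing a required attribute or uses a value outside the list, a validating parser is needed, which the standard library does not include.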
XML Schemas, compared with DTDs:
- use XML syntax
- support datatypes
- are extensible
- have more expressive power
Programming interfaces:
- Document Object Model
- Simple API for XML (SAX)
- JDOM
- Java API for XML Parsing (JAXP)
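The difference between the first two interfaces is easiest to see side by side. JDOM and JAXP are Java APIs, so this sketch uses Python's standard xml.dom.minidom and xml.sax modules as stand-ins for DOM and SAX; the library document and the TitleHandler class are made up for the example.

```python
import xml.sax
from xml.dom import minidom

doc = "<library><book>XML Basics</book><book>Learning XSLT</book></library>"

# DOM: the whole document is parsed into a tree in memory, then queried.
dom = minidom.parseString(doc)
dom_titles = [node.firstChild.data for node in dom.getElementsByTagName("book")]

# SAX: the parser fires events while reading; nothing is kept unless
# the handler decides to keep it.
class TitleHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_book = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "book":
            self._in_book = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so collect them.
        if self._in_book:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "book":
            self.titles.append("".join(self._buffer))
            self._in_book = False

handler = TitleHandler()
xml.sax.parseString(doc.encode("utf-8"), handler)
sax_titles = handler.titles

print(dom_titles)
print(sax_titles)
```

Both approaches find the same titles. DOM is more convenient for moving around the document, while SAX uses less memory on large files.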
-XML schema: primer, doc structures, data types
-XSL, XSLT, XPath = formatting standards
-XLink and XPointer = linking and referencing standards
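As a taste of XPath, here is a short sketch using the limited XPath subset that Python's ElementTree supports (it is not a full XPath 1.0 engine). The catalog document is invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog with two books in different languages.
doc = """<catalog>
  <book id="b1" lang="en"><title>XML Basics</title></book>
  <book id="b2" lang="de"><title>XML Grundlagen</title></book>
</catalog>"""

root = ET.fromstring(doc)

# A path of child steps: every title of every book.
all_titles = [t.text for t in root.findall("./book/title")]

# A predicate on an attribute: only books written in English.
english_ids = [b.get("id") for b in root.findall("./book[@lang='en']")]

# A positional predicate: the second book (XPath positions start at 1).
second_title = root.find("./book[2]/title").text

print(all_titles, english_ids, second_title)
```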
Web services: SOAP, WSDL, UDDI
This was a good overview of XML. I have also learned about XML in other classes, and I can see how it would be very useful and could lead to the goal of the semantic web. I can also see how it could be complicated, though, and that standardization still needs to be clarified. There also seems to be a need to get all organizations from all over the world to agree on these standards, and that can be a difficult compromise to reach.
A survey of XML standards: Part 1
Core XML technologies that are standards
XML
XML 1.0 (2nd ed.) = builds on Unicode
XML 1.1 = first revision
-Recommended intros/tutorials
-References
Catalogs
XML Catalogs = governed by RFC 2396: Uniform Resource Identifiers, RFC 2141: Uniform Resource Names
-entity
-entity catalog
-system identifiers
-URIs
-URNs
-public identifiers
OASIS Open Catalog
-Recommended intros/tutorials
XML Namespaces
Namespaces in XML 1.0
-XHTML
Namespaces in XML 1.1
-Resource Directory Description Language (RDDL)
-RDF
-TAG
-XLink
-Recommended intros/tutorials
-References
XML Base
XML Base
-Recommended intros/tutorials
XInclude
XML Inclusions (XInclude) 1.0
-Recommended intros/tutorials
XML Infoset
XML Information Set
-information items
-Recommended intros/tutorials
Canonical XML (c14n)
Canonical XML Version 1.0
-Exclusive XML Canonicalization Version 1.0
XPath
XML Path Language (XPath) 1.0
-XSLT
-W3C XML schema
-Recommended intros/tutorials
XPointer
XPointer Framework
-xpointer() scheme
-element() scheme
-xmlns() scheme
-FIXptr
-Recommended intros/tutorials
XLink
XML Linking Language (XLink) 1.0
-HLink
-simple links
-extended links
-linkbases
-Recommended intros/tutorials
-References
RELAX NG
RELAX NG
-XML schema
RELAX NG Compact Syntax
-Document Schema Definition Languages (DSDL)
-Recommended intros/tutorials
-References
W3C XML schema
XML Schema Part 1: Structures
XML Schema Part 2: Datatypes
-Recommended intros/tutorials
-References
Schematron
Schematron Assertion Language 1.5
-Recommended intros/tutorials
-References
Standards made by:
- W3C
- International Organization for Standardization (ISO)
- Organization for the Advancement of Structured Information Standards (OASIS)
- Internet Engineering Task Force (IETF)
- XML community
XML Schema Tutorial (w3schools.com)
XML schema = describes structure of XML document
= is XML-based alternative to DTD
= also called XML Schema Definition (XSD)
Need to know:
- HTML/XHTML
- XML and XML Namespaces
- Basic understanding of DTD
XML Schema defines:
- elements that can appear in a document
- attributes that can appear in a document
- which elements are child elements
- order of child elements
- number of child elements
- whether an element is empty or can include text
- data types for elements and attributes
- default and fixed values for elements and attributes
XML Schema is W3C recommendation
If data types supported, it is easier to:
- describe allowable document content
- validate the correctness of data
- work with data from a database
- define data facets (restrictions on data)
- define data patterns (data formats)
- convert data between different data types
XML schemas use XML syntax because: don't need to learn new language, can use XML editor and parser, can manipulate with XML DOM, can transform with XSLT
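Since a schema is itself an XML document, an ordinary XML parser can read it. The sketch below assumes a simple note schema with to, from, heading and body children (my own illustrative schema), and lists what it declares. It only reads the schema; it does not validate anything, because Python's standard library has no XSD validator.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema: a "note" element made of four string children.
schema_text = """<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="from" type="xs:string"/>
        <xs:element name="heading" type="xs:string"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

schema = ET.fromstring(schema_text)

# The top-level declaration: the note element, a complex type.
note = schema.find(f"{{{XS}}}element")

# Its children are declared in order inside xs:sequence.
children = note.findall(f".//{{{XS}}}sequence/{{{XS}}}element")
declared = [(c.get("name"), c.get("type")) for c in children]

print(note.get("name"), declared)
```

Here note is a complex type because it contains other elements, while its four children are simple types that hold only text.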
XML schemas secure data communication (sender and receiver agree on how content such as a date is read)
XML schemas are extensible
Well-formed is not enough
Well-formed:
- it should begin with the XML declaration (recommended, though strictly optional in XML 1.0)
- it must have one unique root element
- start-tags must have matching end-tags
- element names are case sensitive
- all elements must be closed
- all elements must be properly nested
- all attribute values must be quoted
- entities must be used for special characters
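Two of the rules in this list, the single root element and entities for special characters, can be checked with a quick sketch. The sample strings and the parses helper are my own; escape comes from Python's standard xml.sax.saxutils module.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

def parses(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Two top-level elements: there is no single root.
two_roots = "<a/><b/>"

# A raw ampersand is a special character and must be written as &amp;.
raw_ampersand = "<company>Smith & Sons</company>"

# escape() replaces &, < and > with the matching entities.
escaped = "<company>" + escape("Smith & Sons") + "</company>"

print(parses(two_roots), parses(raw_ampersand), parses(escaped))
print(escaped)
```

After parsing, the entity is turned back into the original character, so the program still sees "Smith & Sons".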
XML documents can have reference to DTD or XML Schema
Note element is complex type, other elements are simple types
<schema> element is root of every XML schema
-may contain some attributes
-doc can reference XML schema
-can specify default namespace
-can use schemaLocation attribute
XSD simple elements, attributes, and restrictions/facets
-simple elements contain only text
XSD complex elements:
-empty elements have no content
-can contain only other elements
-can contain only text (plus attributes)
-can mix text and other elements
-order, occurrence, and group indicators
-any or anyAttribute
-substitution
Data Types
- string
- date
- numeric
- misc.
- Schema references
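One benefit of typed data is that text from a document can be converted into real values. This is only a rough sketch of the idea for a few built-in XSD types; the CONVERTERS table and the convert function are my own names. It assumes plain lexical forms (for example, dates without a time zone suffix) and does not cover everything the XML Schema spec allows.

```python
from datetime import date
from decimal import Decimal

# Hypothetical mapping from a few XSD type names to Python converters.
CONVERTERS = {
    "xs:string": str,
    "xs:integer": int,
    "xs:decimal": Decimal,
    # Assumes the YYYY-MM-DD form with no time zone suffix.
    "xs:date": date.fromisoformat,
    # xs:boolean allows exactly these four lexical values.
    "xs:boolean": lambda s: {"true": True, "1": True, "false": False, "0": False}[s],
}

def convert(xsd_type, lexical):
    """Convert the text of an element or attribute to a Python value."""
    return CONVERTERS[xsd_type](lexical)

print(convert("xs:date", "2002-09-24"))
print(convert("xs:integer", "42") + 1)
print(convert("xs:decimal", "9.90"))
print(convert("xs:boolean", "true"))
```

A value that does not fit its declared type, such as "tomorrow" for xs:date, raises an error instead of quietly passing through as text.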