Sunday, March 18, 2012

Week 10 Reading Notes

Introduction to XML (IBM)

XML = Extensible Markup Language, can create own tags, machine can read it

XML based on SGML
  1. Tags = text between brackets
  2. Elements = starting tag, ending tag, everything in between
  3. Attributes = name-value pair inside starting tag
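A minimal sketch of the three terms above, using Python's standard-library xml.etree.ElementTree. The sample document and its names (book, title, id) are made up for illustration and are not from the tutorial.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: <book> is an element with one attribute and one child.
doc = '<book id="b1"><title>Learning XML</title></book>'

root = ET.fromstring(doc)

print(root.tag)                  # the name inside the start tag
print(root.attrib)               # attributes as name-value pairs
print(root.find("title").text)   # the text between <title> and </title>
```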
XML can:
  • simplify data interchange
  • enable smart code
  • enable smart searches

3 kinds of XML documents:

  1. Invalid docs = don't follow XML syntax rules, or break the rules of their DTD
  2. Valid docs = follow both XML and DTD rules
  3. Well-formed docs = follow XML syntax rules but have no DTD
Need a single root element
Elements can't overlap
End tags required
Elements are case sensitive
Attributes must have quoted values
XML declarations
Also: comments, processing instructions, entities
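The rules above can be tried out with a parser. This is a rough sketch, assuming Python's xml.etree.ElementTree, which checks well-formedness only (it does not validate against a DTD). The helper name is_well_formed and the sample strings are my own.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if text parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# One good document, then one violation per rule in the notes.
samples = {
    "ok": '<note id="1"><to>Ann</to></note>',
    "two roots": "<a/><b/>",
    "overlap": "<b><i>text</b></i>",
    "missing end tag": "<note><to>Ann</note>",
    "case mismatch": "<Note></note>",
    "unquoted attribute": "<note id=1/>",
}

for label, text in samples.items():
    print(label, is_well_formed(text))
```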

Use namespaces to tell apart tags that share the same name
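A small sketch of why namespaces matter: two different "title" tags in one document. The namespace URIs and prefixes here are hypothetical placeholders, and the parsing uses Python's xml.etree.ElementTree.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespaces: one for customer data, one for book data.
doc = """<order xmlns:cust="http://example.com/customer"
       xmlns:book="http://example.com/book">
  <cust:title>Dr.</cust:title>
  <book:title>Learning XML</book:title>
</order>"""

# Prefix-to-URI map used only for lookups in this script.
ns = {
    "cust": "http://example.com/customer",
    "book": "http://example.com/book",
}

root = ET.fromstring(doc)
print(root.find("cust:title", ns).text)   # the person's title
print(root.find("book:title", ns).text)   # the book's title
print(root[0].tag)                        # ElementTree stores the tag as {uri}name
```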

DTD = document type definition, specifies basic structure of XML doc
-some elements must appear, must appear in a certain order
-elements must contain text
-uses symbols (, | ? * +) to set order, choice, and how often an element may appear

DTD can:
  • define which attributes are required
  • define default values for attributes
  • list all valid values for a given attribute
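To make the DTD points concrete, here is a sketch of a made-up DTD written as an internal subset: a required attribute, a default value, and a list of valid values. The element and attribute names are assumptions. Note that Python's xml.etree.ElementTree only reads the document; it does not validate it against the DTD, so a real check would need a validating parser.

```python
import xml.etree.ElementTree as ET

doc = """<!DOCTYPE note [
  <!ELEMENT note (to, body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
  <!ATTLIST note id       CDATA #REQUIRED>
  <!ATTLIST note priority (low | normal | high) "normal">
]>
<note id="n1"><to>Ann</to><body>Hello</body></note>"""

# Parsing checks well-formedness only; DTD rules are NOT enforced here.
root = ET.fromstring(doc)

print(root.get("id"))                 # the required attribute, given explicitly
print([child.tag for child in root])  # children in the order the DTD asks for

# The DTD declares a default for "priority". Whether the parser fills it in
# is parser-dependent, so this line makes no claim about the result.
print(root.get("priority"))
```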
XML schemas:
use XML syntax
support datatypes
are extensible
have more expressive power

Programming interfaces:
  1. Document Object Model
  2. Simple API for XML (SAX)
  3. JDOM
  4. Java API for XML Parsing (JAXP)
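JDOM and JAXP are Java-specific, but the difference between the first two interfaces can be sketched in Python, which ships a DOM implementation (xml.dom.minidom) and a SAX one (xml.sax). The sample document and the NoteHandler class are my own illustration.

```python
import xml.dom.minidom
import xml.sax

doc = "<notes><note>first</note><note>second</note></notes>"

# DOM: the whole document is loaded into a tree that can be walked freely.
dom = xml.dom.minidom.parseString(doc)
dom_texts = [n.firstChild.data for n in dom.getElementsByTagName("note")]

# SAX: the parser streams events to a handler and keeps no tree.
class NoteHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.in_note = False
        self.texts = []

    def startElement(self, name, attrs):
        if name == "note":
            self.in_note = True
            self.texts.append("")

    def characters(self, content):
        # Text may arrive in several chunks, so append to the current entry.
        if self.in_note:
            self.texts[-1] += content

    def endElement(self, name):
        if name == "note":
            self.in_note = False

handler = NoteHandler()
xml.sax.parseString(doc.encode("utf-8"), handler)

print(dom_texts)
print(handler.texts)
```

DOM is easier when the document has to be navigated or changed; SAX uses less memory on large files because nothing is kept after each event.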
XML Standards = determined by W3C
-XML schema: primer, doc structures, data types
-XSL, XSLT, XPath = formatting, transformation, and addressing standards
-XLink and XPointer = linking and referencing standards

Web services: SOAP, WSDL, UDDI

This was a good overview of XML. I have also learned about XML in other classes, and I can see how it would be very useful and could help reach the goal of the semantic web. I can also see how it could get complicated, though, and the standards still seem to need clarifying. Organizations all over the world would also need to agree on these standards, and that can be a difficult compromise to reach.


A survey of XML standards: Part 1

Core XML technologies that are standards

XML
XML 1.0 (2nd ed.) = builds on Unicode
XML 1.1 = first revision
-Recommended intros/tutorials
-References

Catalogs
XML Catalogs = governed by RFC 2396: Uniform Resource Identifiers, RFC 2141: Uniform Resource Names
-entity
-entity catalog
-system identifiers
-URIs
-URNs
-public identifiers
OASIS Open Catalog
-Recommended intros/tutorials

XML Namespaces
Namespaces in XML 1.0
-XHTML
Namespaces in XML 1.1
-Resource Directory Description Language (RDDL)
-RDF
-TAG
-XLink
-Recommended intros/tutorials
-References

XML Base
XML Base
-Recommended intros/tutorials

XInclude
XML Inclusions (XInclude) 1.0
-Recommended intros/tutorials

XML Infoset
XML Information Set
-information items
-Recommended intros/tutorials

Canonical XML (c14n)
Canonical XML Version 1.0
-Exclusive XML Canonicalization Version 1.0
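A quick sketch of what canonicalization is for: two documents that differ only in quoting, attribute order, and empty-tag style come out as the same text. This assumes Python 3.8 or later, whose xml.etree.ElementTree.canonicalize implements Canonical XML 2.0, not the Version 1.0 listed in the article; the sample strings are made up.

```python
import xml.etree.ElementTree as ET

# Same information, different surface syntax.
a = '<item  size="2" color="red"></item>'
b = "<item color='red' size='2'/>"

canon_a = ET.canonicalize(a)
canon_b = ET.canonicalize(b)

print(canon_a)
print(canon_a == canon_b)   # compare the canonical forms, not the raw text
```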

XPath
XML Path Language (XPath) 1.0
-XSLT
-W3C XML schema
-Recommended intros/tutorials
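A short sketch of XPath-style addressing. Python's xml.etree.ElementTree supports only a limited subset of XPath 1.0, so this shows the flavor rather than the full language. The library document is hypothetical.

```python
import xml.etree.ElementTree as ET

doc = """<library>
  <book year="1999"><title>A</title></book>
  <book year="2005"><title>B</title></book>
  <magazine><title>C</title></magazine>
</library>"""

root = ET.fromstring(doc)

# Every <title> anywhere below the root.
all_titles = [t.text for t in root.findall(".//title")]

# Only titles that are children of a <book> child of the root.
book_titles = [t.text for t in root.findall("./book/title")]

# Books selected by an attribute value.
from_2005 = [b.find("title").text for b in root.findall(".//book[@year='2005']")]

print(all_titles, book_titles, from_2005)
```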

XPointer
XPointer Framework
-xpointer() scheme
-element() scheme
-xmlns() scheme
-FIXptr
-Recommended intros/tutorials

XLink
XML Linking Language (XLink) 1.0
-HLink
-simple links
-extended links
-linkbases
-Recommended intros/tutorials
-References

RELAX NG
RELAX NG
-XML schema
RELAX NG Compact Syntax
-Document Schema Definition Languages (DSDL)
-Recommended intros/tutorials
-References

W3C XML schema
XML Schema Part 1: Structures
XML Schema Part 2: Datatypes
-Recommended intros/tutorials
-References

Schematron
Schematron Assertion Language 1.5
-Recommended intros/tutorials
-References

Standards made by:
  • W3C
  • International Organization for Standardization (ISO)
  • Organization for the Advancement of Structured Information Standards (OASIS)
  • Internet Engineering Task Force (IETF)
  • XML community
This article has a lot of information on XML standards, and I think it would be very useful if I needed to get more in-depth with XML and/or its standards. The tutorials and references seem very useful, but I would need to go back to the page to get the names and links of each one. This would be a good reference for the future.


XML Schema Tutorial (w3schools.com)

XML schema = describes structure of XML document
= is XML-based alternative to DTD
= also called XML Schema Definition (XSD)

Need to know:
  • HTML/XHTML
  • XML and XML Namespaces
  • Basic understanding of DTD

XML Schema defines:

  1. elements that can appear in a document
  2. attributes that can appear in a document
  3. which elements are child elements
  4. order of child elements
  5. number of child elements
  6. whether an element is empty or can include text
  7. data types for elements and attributes
  8. default and fixed values for elements and attributes

XML Schema is W3C recommendation

If data types supported, it is easier to:

  • describe allowable document content
  • validate the correctness of data
  • work with data from a database
  • define data facets (restrictions on data)
  • define data patterns (data formats)
  • convert data between different data types
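As a rough illustration of the last bullet, declared types let a program turn the text in a document into real values. The mapping below is my own approximation and not part of any XML library: Python's int and date.fromisoformat accept slightly different input than the XSD types do (for example, xs:date may carry a time zone, which this sketch does not handle).

```python
from datetime import date
from decimal import Decimal

def parse_boolean(text):
    """xs:boolean allows the four literals true, false, 1, and 0."""
    if text in ("true", "1"):
        return True
    if text in ("false", "0"):
        return False
    raise ValueError(f"not an xs:boolean: {text!r}")

# Hypothetical lookup table from XSD type name to a Python converter.
CONVERTERS = {
    "xs:string": str,
    "xs:integer": int,
    "xs:decimal": Decimal,
    "xs:date": date.fromisoformat,
    "xs:boolean": parse_boolean,
}

def convert(xsd_type, text):
    """Convert element or attribute text using its declared XSD type."""
    return CONVERTERS[xsd_type](text)

print(convert("xs:integer", "42"))
print(convert("xs:date", "2012-03-18"))
print(convert("xs:boolean", "true"))
```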

XML schemas use XML syntax because: don't need to learn new language, can use XML editor and parser, can manipulate with XML DOM, can transform with XSLT

XML schemas secure data communication
-are extensible
-well-formed is not enough

Well-formed:
  • it must begin with the XML declaration (the tutorial's rule; in XML 1.0 the declaration is optional, but if present it must come first)
  • it must have one unique root element
  • start-tags must have matching end-tags
  • elements are case sensitive
  • all elements must be closed
  • all elements must be properly nested
  • all attribute values must be quoted
  • entities must be used for special characters

XML documents can have reference to DTD or XML Schema

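In the tutorial's example, the note element is a complex type because it contains other elements; its child elements are simple types because they hold only text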
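A sketch of a schema for a note, to show the complex/simple split. The child element names (to, from, heading, body) are assumed from the tutorial's note example. Because a schema is itself XML, an ordinary parser can read it; the Python code below only inspects the schema and does not validate any document against it.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

schema = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="from" type="xs:string"/>
        <xs:element name="heading" type="xs:string"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

root = ET.fromstring(schema)

# An element declaration with a nested <xs:complexType> is complex;
# one that only names a built-in type such as xs:string is simple.
kinds = {}
for el in root.iter(XS + "element"):
    is_complex = el.find(XS + "complexType") is not None
    kinds[el.get("name")] = "complex" if is_complex else "simple"

print(kinds)
```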

<schema> element is root of every XML schema
-may contain some attributes
-doc can reference XML schema
-can specify default namespace
-can use schemaLocation attribute
XSD simple elements, attributes, and restrictions/facets
-simple elements contain only text
XSD complex types = contain other elements/attributes
-empty elements have no content
-can contain only elements
-can contain only text (complex text-only elements)
-can mix text and other elements
-order, occurrence, and group indicators
-any or anyAttribute
-substitution

Data Types
  1. string
  2. date
  3. numeric
  4. misc.
Schema references
I really like these W3Schools tutorials, and this one gave me a lot of good information about XML schemas. I can go back for more in-depth information on the topic, but these notes will help me remember what the tutorial covers and some basic definitions. I think XML schemas could be very useful; I still don't totally understand them, but they would be a good topic to investigate in the future.
