| <html>
 | |
| <head>
 | |
| <!--
 | |
| 
 | |
|     DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS HEADER.
 | |
| 
 | |
|     Copyright (c) 2010-2013 Oracle and/or its affiliates. All rights reserved.
 | |
| 
 | |
|     The contents of this file are subject to the terms of either the GNU
 | |
|     General Public License Version 2 only ("GPL") or the Common Development
 | |
|     and Distribution License("CDDL") (collectively, the "License").  You
 | |
|     may not use this file except in compliance with the License.  You can
 | |
|     obtain a copy of the License at
 | |
|     http://glassfish.java.net/public/CDDL+GPL_1_1.html
 | |
|     or packager/legal/LICENSE.txt.  See the License for the specific
 | |
|     language governing permissions and limitations under the License.
 | |
| 
 | |
|     When distributing the software, include this License Header Notice in each
 | |
|     file and include the License file at packager/legal/LICENSE.txt.
 | |
| 
 | |
|     GPL Classpath Exception:
 | |
|     Oracle designates this particular file as subject to the "Classpath"
 | |
|     exception as provided by Oracle in the GPL Version 2 section of the License
 | |
|     file that accompanied this code.
 | |
| 
 | |
|     Modifications:
 | |
|     If applicable, add the following below the License Header, with the fields
 | |
|     enclosed by brackets [] replaced by your own identifying information:
 | |
|     "Portions Copyright [year] [name of copyright owner]"
 | |
| 
 | |
|     Contributor(s):
 | |
|     If you wish your version of this file to be governed by only the CDDL or
 | |
|     only the GPL Version 2, indicate your decision by adding "[Contributor]
 | |
|     elects to include this software in this distribution under the [CDDL or GPL
 | |
|     Version 2] license."  If you don't indicate a single choice of license, a
 | |
|     recipient has the option to distribute your version of this file under
 | |
|     either the CDDL, the GPL Version 2 or to extend the choice of license to
 | |
|     its licensees as provided above.  However, if you add GPL Version 2 code
 | |
|     and therefore, elected the GPL Version 2 license, then the option applies
 | |
|     only if the new code is made subject to such option by the copyright
 | |
|     holder.
 | |
| 
 | |
| -->
 | |
| 
 | |
| 	<title>Design of XSOM</title>
 | |
| 	<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"/>
 | |
| 	<style>
 | |
| 		pre {
 | |
| 			background-color: rgb(240,240,240);
 | |
| 			margin-left:	2em;
 | |
| 			margin-right: 2em;
 | |
| 			padding: 1em;
 | |
| 		}
 | |
| 		p {
 | |
| 			margin-left: 2em;
 | |
| 		}
 | |
| 		dt {
 | |
| 			margin-top: 0.5em;
 | |
| 			margin-left: 2em;
 | |
| 			font-weight: bold;
 | |
| 		}
 | |
| 		dd {
 | |
| 			margin-left: 3em;
 | |
| 		}
 | |
| 	</style>
 | |
| </head>
 | |
| <body>
 | |
| 
 | |
| <h1 style="text-align:center">Design of XSOM</h1>
 | |
| <div align=right style="font-size:smaller">
 | |
| By <a href="mailto:kohsuke.kawaguchi@sun.com">Kohsuke Kawaguchi</a><br>
 | |
| </div>
 | |
| 
 | |
| <p>
 | |
| 	This document describes the details you need to know to extend or maintain XSOM.
 | |
| </p>
 | |
| 
 | |
| <h1>Design Goals</h1>
 | |
| <p>
 | |
| 	The primary design goals of XSOM are:
 | |
| </p>
 | |
| <ol>
 | |
| 	<li>Expose all the information defined in the schema spec
 | |
| 	<li>Provide additional methods that help simplify client applications.
 | |
| </ol>
 | |
| <p>
 | |
| 	Providing mutation methods was a non-goal for this project, primarily because of the added complexity.
 | |
| </p>
 | |
| 
 | |
| 
 | |
| <h1>Building workspace</h1>
 | |
| <p>
 | |
| 	The workspace uses Ant as the build tool. The following are the major targets:
 | |
| </p>
 | |
| <dl>
 | |
| 	<dt>clean</dt>
 | |
| 	<dd>Remove the intermediate and output files.</dd>
 | |
| 	
 | |
| 	<dt>compile</dt>
 | |
| 	<dd>Generate a parser with RelaxNGCC and compile all the source files into the bin directory.</dd>
 | |
| 	
 | |
| 	<dt>jar</dt>
 | |
| 	<dd>Make a jar file.</dd>
 | |
| 	
 | |
| 	<dt>release</dt>
 | |
| 	<dd>Build a distribution zip file that contains everything from the source code to a binary file.</dd>
 | |
| 	
 | |
| 	<dt>src-zip</dt>
 | |
| 	<dd>Build a zip file that contains the source code.</dd>
 | |
| </dl>
 | |
| 
 | |
| <h1>Architecture</h1>
 | |
| <p>
 | |
| 	XSOM consists of roughly three parts.
 | |
| 	
 | |
| 	The first part is the public interface, which is defined in the <code>com.sun.xml.xsom</code> package. The entire functionality of XSOM is exposed via this interface. This interface is derived from a draft document submitted to W3C by some WG members.
 | |
| </p><p>
 | |
| 	The second part is the implementation of these interfaces, the <code>com.sun.xml.xsom.impl</code> package. This code is all hand-written.
 | |
| </p><p>
 | |
| 	The third part is a parser that reads the XML representation of an XML Schema and builds XSOM nodes accordingly. The package is <code>com.sun.xml.xsom.parser</code>. This part of the code is mostly generated by <a href="http://relaxngcc.sourceforge.net/">RelaxNGCC</a>.
 | |
| </p>
 | |
| <center>
 | |
| 	<img src="architecture.png"/>
 | |
| </center>
 | |
| 
 | |
| 
 | |
| 
 | |
| 
 | |
| <h1>Implementation Details</h1>
 | |
| <p>
 | |
| 	Most of the implementation classes are fairly simple. Probably the only interesting piece of code is the <code>Ref</code> class, which is a reference to other schema components.
 | |
| </p><p>
 | |
| 	The <code>Ref</code> class itself is just a placeholder; it defines a series of inner interfaces, each specialized to hold a reference to a different kind of schema component.
 | |
| 	
 | |
| 	The sole purpose of this indirection layer is to support forward references while parsing the XML representation.
 | |
| </p><p>
 | |
| 	A typical reference interface would look like this:
 | |
| </p>
 | |
| <pre>
 | |
| public static interface Term {
 | |
|     /** Obtains a reference as a term. */
 | |
|     XSTerm getTerm();
 | |
| }
 | |
| </pre>
 | |
| <p>
 | |
| 	For cases where this indirection is unnecessary, all implementation classes of <code>XSTerm</code> implement this <code>Ref.Term</code> interface. The same applies to all the other interfaces nested in <code>Ref</code>. Therefore, wherever a reference is required, you can simply pass a real object. In other words, a direct reference (<code>XS***Impl</code>) can always be treated as an indirect reference (<code>Ref.***</code>).
 | |
| </p><p>
 | |
| 	Implementations for forward references are placed in the <code>com.sun.xml.xsom.impl.parser.DelayedRef</code> class. The details are discussed later in this document.
 | |
| </p>
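<p>
	The following is a minimal sketch of this pattern, not the actual XSOM source. The names <code>RefSketch</code>, <code>TermRef</code>, <code>ElementDeclImpl</code> and <code>ParticleImpl</code> are hypothetical stand-ins chosen for illustration. It shows an implementation class that is both a term and a reference to itself, so it can be passed wherever a reference is expected.
</p>

```java
// Hypothetical, simplified stand-ins for the XSOM types; not the real classes.
final class RefSketch {
    /** Simplified stand-in for the public XSTerm interface. */
    interface XSTerm {
        String getName();
    }

    /** Indirect reference to a term, mirroring the shape of Ref.Term. */
    interface TermRef {
        XSTerm getTerm();
    }

    /** An implementation class is both a term and a reference to itself. */
    static final class ElementDeclImpl implements XSTerm, TermRef {
        private final String name;

        ElementDeclImpl(String name) {
            this.name = name;
        }

        public String getName() {
            return name;
        }

        public XSTerm getTerm() {
            return this; // a direct reference resolves to the object itself
        }
    }

    /** A component that stores a reference rather than the term itself. */
    static final class ParticleImpl {
        private final TermRef term;

        ParticleImpl(TermRef term) {
            this.term = term;
        }

        XSTerm getTerm() {
            return term.getTerm(); // resolved only when a client asks for it
        }
    }

    public static void main(String[] args) {
        ElementDeclImpl decl = new ElementDeclImpl("item");
        // The real object is passed where a reference is expected.
        ParticleImpl particle = new ParticleImpl(decl);
        System.out.println(particle.getTerm().getName());
    }
}
```

<p>
	Because the holder only ever calls <code>getTerm()</code>, it does not need to know whether it was given a real object or a reference that is resolved later.
</p>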
 | |
| 
 | |
| 
 | |
| 
 | |
| <h1>Parser</h1>
 | |
| <p>
 | |
| 	The following collaboration diagram shows various objects that participate in a parsing process.
 | |
| </p>
 | |
| <center>
 | |
| 	<img src="collaboration.png"/>
 | |
| </center>
 | |
| <p>
 | |
| 	<code>XSOMParser</code> is the only publicly visible component in this picture. This class also keeps references to various other objects that are necessary to parse schemas. These include an error handler, the root <code>SchemaSet</code> object, an entity resolver, etc.
 | |
| </p><p>
 | |
| 	Whenever the <code>parse</code> method is called, it creates a new <code>NGCCRuntimeEx</code> and configures an <code>XMLReader</code> so that the schema file is parsed into this <code>NGCCRuntimeEx</code> instance.
 | |
| 	
 | |
| 	<code>NGCCRuntimeEx</code> derives from <code>NGCCRuntime</code>, which is a class generated by RelaxNGCC. This object uses other RelaxNGCC-generated classes to parse a document and construct an XSOM object graph accordingly.
 | |
| </p><p>
 | |
| 	When a new XML document is referenced by an import or include statement, a new <code>NGCCRuntimeEx</code> is set up to parse that document. One <code>NGCCRuntimeEx</code> can only parse one XML document.
 | |
| </p>
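<p>
	The one-runtime-per-document rule can be sketched roughly as follows. This is an illustration only: <code>Document</code>, <code>Parser</code> and <code>DocumentRuntime</code> are hypothetical stand-ins, documents are held in an in-memory map instead of being read through SAX, and the check that skips documents already seen is an assumption of the sketch rather than a statement about XSOM.
</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class RuntimePerDocumentSketch {
    /** Hypothetical in-memory stand-in for a schema document. */
    static final class Document {
        final List<String> references; // documents named by import/include
        final List<String> components; // components declared in this document

        Document(List<String> references, List<String> components) {
            this.references = references;
            this.components = components;
        }
    }

    /** Plays the role of XSOMParser: owns the state shared by all runtimes. */
    static final class Parser {
        final Map<String, Document> documents;
        final List<String> parsed = new ArrayList<>();
        final List<String> components = new ArrayList<>();
        int runtimesCreated = 0;

        Parser(Map<String, Document> documents) {
            this.documents = documents;
        }

        void parse(String systemId) {
            // Assumption of this sketch: skip documents already seen,
            // so that circular includes terminate.
            if (parsed.contains(systemId)) {
                return;
            }
            runtimesCreated++;
            new DocumentRuntime(this, systemId).run();
        }
    }

    /** Plays the role of NGCCRuntimeEx: handles exactly one document. */
    static final class DocumentRuntime {
        private final Parser parser;
        private final String systemId;

        DocumentRuntime(Parser parser, String systemId) {
            this.parser = parser;
            this.systemId = systemId;
        }

        void run() {
            Document doc = parser.documents.get(systemId);
            parser.parsed.add(systemId);
            parser.components.addAll(doc.components);
            for (String ref : doc.references) {
                parser.parse(ref); // a fresh runtime for each referenced document
            }
        }
    }

    public static void main(String[] args) {
        Parser parser = new Parser(Map.of(
            "main.xsd", new Document(List.of("common.xsd"), List.of("order")),
            "common.xsd", new Document(List.of("main.xsd"), List.of("address"))));
        parser.parse("main.xsd");
        System.out.println(parser.parsed + " " + parser.components);
    }
}
```

<p>
	All runtimes write into the state owned by the parser, which is why the parser, not the runtime, holds the schema set and the error handler.
</p>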
 | |
| <h2>Forward references and back-patching</h2>
 | |
| <p>
 | |
| 	Since we use SAX to parse schemas, the referenced schema component is often unavailable when we hit a reference. Because of this, when we see a reference, we create a "delayed" reference that keeps the name of the referenced component.
 | |
| </p><p>
 | |
| 	Note that because of the way XML Schema <code>&lt;redefine&gt;</code> works, all references by name must be lazily bound, even if the component is already defined.
 | |
| </p><p>
 | |
| 	All these "delayed" references are remembered and tracked by XSOMParser. When the client calls the <code>XSOMParser.getResult</code> method, XSOMParser will make sure that they resolve to a schema component correctly.
 | |
| 	"Delayed" references are implemented in the <code>DelayedRef</code> class.
 | |
| </p>
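<p>
	A rough sketch of this mechanism is shown below. It is not the real <code>DelayedRef</code> code: <code>Component</code>, <code>Registry</code>, <code>refTo</code> and <code>checkAll</code> are hypothetical names, and the registry stands in for the tracking done by <code>XSOMParser</code>. Each reference keeps only a name, looks the component up on demand, and is checked once at the end.
</p>

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class DelayedRefSketch {
    /** Simplified stand-in for a schema component. */
    static final class Component {
        final String name;

        Component(String name) {
            this.name = name;
        }
    }

    /** Name-to-component table that also remembers every delayed reference. */
    static final class Registry {
        private final Map<String, Component> byName = new HashMap<>();
        private final List<DelayedRef> pending = new ArrayList<>();

        /** Defines a component; a later definition replaces an earlier one. */
        void define(Component c) {
            byName.put(c.name, c);
        }

        /** Creates a reference by name; the target need not exist yet. */
        DelayedRef refTo(String name) {
            DelayedRef ref = new DelayedRef(this, name);
            pending.add(ref);
            return ref;
        }

        /** Run once parsing has finished: returns the names that never resolved. */
        List<String> checkAll() {
            List<String> unresolved = new ArrayList<>();
            for (DelayedRef ref : pending) {
                if (!byName.containsKey(ref.name)) {
                    unresolved.add(ref.name);
                }
            }
            return unresolved;
        }
    }

    /** Keeps only the name and binds to the component lazily. */
    static final class DelayedRef {
        private final Registry registry;
        final String name;

        DelayedRef(Registry registry, String name) {
            this.registry = registry;
            this.name = name;
        }

        Component get() {
            Component c = registry.byName.get(name);
            if (c == null) {
                throw new IllegalStateException("undefined component: " + name);
            }
            return c;
        }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        DelayedRef ref = registry.refTo("addressType"); // forward reference
        registry.define(new Component("addressType"));  // definition arrives later
        System.out.println(registry.checkAll().isEmpty());
        System.out.println(ref.get().name);
    }
}
```

<p>
	Looking the name up on every call, instead of caching the first match, is what lets a reference see a component that was replaced after the reference was created.
</p>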
 | |
| 
 | |
| 
 | |
| <h2>RelaxNGCC</h2>
 | |
| <p>
 | |
| 	The actual parser is generated by RelaxNGCC from the <code>xsom/src/*.rng</code> files. <code>xmlschema.rng</code> is the entry point, and all the other files are referenced from this file. For more information about RelaxNGCC, see <a href="http://relaxngcc.sourceforge.net/">the RelaxNGCC website</a>, or just contact me (I'm one of the developers of RelaxNGCC).
 | |
| </p>
 | |
| 
 | |
| </body>
 | |
| </html>
 |