Research at the Center explores the use of a generalization of Keizer's canonical vectors for knowledge representation. This work has produced a canonical-based meta language (ML) dubbed CanonML.
The research is still at an early stage, but the current preliminary version of the CanonML format presents a series of advantages when compared with other languages.
As shown below, the CanonML language may be more extensible, readable, writable, concise, and elegant. The CanonML format is customizable, rigorously defined, easily parsed by machine, and equipped with semantic modules focused on science.
XML is the eXtensible Markup Language developed by the W3C for use on the Internet. However, HTML is still the preferred standard format on the Internet. Currently, XML is used in several off-line applications such as office documents and configuration files, and in server databases for online stores.
There have been several attempts to use XML for machine representation of scientific knowledge. One of the pioneering and best-studied systems is CML (the Chemical Markup Language), which is an XML application. CML addresses the encoding of only a subset of chemical knowledge. For instance, CML can be used for a non-binary encoding of the spectroscopic properties of a molecule. Peter Murray-Rust coined the neologism datuments (data + documents) to describe the next generation of scientific publications based on markup languages. Similar projects are being developed in the fields of biology, physics, geology, and others.
LaTeX is the current standard for authoring physical and mathematical publications. Almost every paper in high-energy physics is typed in this human-oriented format. Its use in other communities (e.g. organic chemists) is less widespread. LaTeX is mainly oriented to presentational issues, e.g. hyphenation of text or typesetting of mathematical formulas.
Several knowledge representation systems are based on the programming languages LISP and Scheme (a dialect of LISP). SXML represents XML documents using Scheme syntax and can draw on Scheme's extensive programming capabilities. This kind of approach is limited by the relative unpopularity of LISP and Scheme.
One popular programming language today is Python. Python has become a de facto language in some scientific and technical communities: NASA uses Python for its Workflow Automation System (WAS), Biopython is a set of Python libraries and applications oriented to bioinformatics, and Google also uses Python. SLiP is defined as a quick, alternative, Python-based syntax for creating and editing XML documents by hand.
In the following sample you can compare the readability and conciseness of the Canonical Meta Language with those of XML-, LaTeX-, LISP-, and SLiP-like syntaxes. The semantics of CanonML is still preliminary.
The structure and semantics contained in the XML, SLiP, LISP-like, and LaTeX-like codes shown below are not totally equivalent to those of the CanonML format. Some decisions were taken during the conversion process, as explained for each sample below. In general, extra structure and information contained in the CanonML format was lost during conversion to the other syntaxes. Still, the direct comparison of the formats shown can help to evaluate CanonML syntax in context. For completeness, a fully equivalent XML encoding of the original CanonML expression is provided at the end of this section.
Extensible, concise, readable, easily typed by humans and parsed by machine. Sample using a BSD/Allman like indentation style.
(\section
    (\title advantages of (\emphasis (\@\level 1) canonical science))
    (\paragraph Canonical science has the next interesting properties:)
    (\list
        (\paragraph Broad applicability.)
        (\paragraph Unified description of disparate natural systems.)
        (\paragraph Intrinsic irreversibility built-in.)
        (\paragraph Stochastic framework.)
        (\paragraph Multi-hierarchical description.)
    )
    (\paragraph Standard equations arise as a special case from the canonical theory.)
    (\paragraph For instance, one can derive the fundamental equation of mechanics.)
    (\formal
        (\equation
            (\fraction
                (\numerator
                    (\partial-differential
                        (\state-operator)
                    )
                )
                (\denominator
                    (\partial-differential
                        (\time)
                    )
                )
            )
            (\liouvillian
                (\state-operator)
            )
        )
    )
)
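The nesting of the sample above can be read mechanically. What follows is a minimal Python sketch, not the Center's parser: it assumes that groups are delimited by parentheses and that tokens are separated by whitespace, which is only what the sample itself suggests. The names `tokenize` and `parse` are hypothetical.

```python
import re

def tokenize(source):
    """Split CanonML-like text into parentheses and whitespace-delimited tokens."""
    return re.findall(r"[()]|[^\s()]+", source)

def parse(source):
    """Return the top-level items of the source as nested Python lists."""
    stack = [[]]  # stack[-1] is the group currently being filled
    for token in tokenize(source):
        if token == "(":
            stack.append([])
        elif token == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            group = stack.pop()
            stack[-1].append(group)
        else:
            stack[-1].append(token)
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    return stack[0]

# Assumption: a fragment copied from the sample above.
sample = r"(\title advantages of (\emphasis (\@\level 1) canonical science))"
tree = parse(sample)
print(tree)
```

Each parenthesized group becomes one list whose first item is the backslash name, so the tree keeps every word as a separate token.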
Text-like CanonML structures were converted to raw XML strings. This loses information but improves the readability and familiarity of the resulting XML.
<section>
<title>advantages of <emphasis level="1">canonical science</emphasis></title>
<paragraph>Canonical science has the next interesting properties:</paragraph>
<list>
<paragraph>Broad applicability.</paragraph>
<paragraph>Unified description of disparate natural systems.</paragraph>
<paragraph>Intrinsic irreversibility built-in.</paragraph>
<paragraph>Stochastic framework.</paragraph>
<paragraph>Multi-hierarchical description.</paragraph>
</list>
<paragraph>Standard equations arise as a special case from the canonical theory.</paragraph>
<paragraph>For instance, one can derive the fundamental equation of mechanics.</paragraph>
<formal>
<equation>
<fraction>
<numerator>
<partial-differential>
<state-operator/>
</partial-differential>
</numerator>
<denominator>
<partial-differential>
<time/>
</partial-differential>
</denominator>
</fraction>
<liouvillian>
<state-operator/>
</liouvillian>
</equation>
</formal>
</section>
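The conversion decisions described for the XML sample can be sketched in Python. This is an illustration under assumptions, not the tool used for this page: the input is a group already parsed into nested lists, a group whose name starts with `\@\` is treated as an attribute, and consecutive words are joined into one raw string. The function name `to_xml` is hypothetical.

```python
from xml.sax.saxutils import escape, quoteattr

def to_xml(group):
    """Convert one nested-list group into an XML string."""
    name = group[0].lstrip("\\")
    attributes = []
    children = []
    words = []  # consecutive text tokens, joined into one raw string

    def flush_words():
        if words:
            children.append(escape(" ".join(words)))
            words.clear()

    for item in group[1:]:
        if isinstance(item, list) and item[0].startswith("\\@\\"):
            # (\@\level 1) becomes the attribute level="1"
            key = item[0][3:]
            attributes.append(f" {key}={quoteattr(' '.join(item[1:]))}")
        elif isinstance(item, list):
            flush_words()
            children.append(to_xml(item))
        else:
            words.append(item)
    flush_words()

    attrs = "".join(attributes)
    if not children:
        return f"<{name}{attrs}/>"  # empty group, e.g. (\time)
    return f"<{name}{attrs}>{' '.join(children)}</{name}>"

title = [r"\title", "advantages", "of",
         [r"\emphasis", [r"\@\level", "1"], "canonical", "science"]]
print(to_xml(title))
```

Joining the words is where information is lost: the XML string no longer records that each word was a separate item in the CanonML group.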
The SLiP code is directly derived from the XML one.
section:()
    title:() "advantages of "
        emphasis:(level="1") "canonical science"
    paragraph:() "Canonical science has the next interesting properties:"
    list:()
        paragraph:() "Broad applicability."
        paragraph:() "Unified description of disparate natural systems."
        paragraph:() "Intrinsic irreversibility built-in."
        paragraph:() "Stochastic framework."
        paragraph:() "Multi-hierarchical description."
    paragraph:() "Standard equations arise as a special case from the canonical theory."
    paragraph:() "For instance, one can derive the fundamental equation of mechanics."
    formal:()
        equation:()
            fraction:()
                numerator:()
                    partial-differential:()
                        state-operator:() ""
                denominator:()
                    partial-differential:()
                        time:() ""
            liouvillian:()
                state-operator:() ""
There is no standard way to encode attributes in S-expr format; here we choose a DSSSL convention in the LISP code.
(section
(title "advantages of " (emphasis level: 1 "canonical science"))
(paragraph "Canonical science has the next interesting properties:")
(list
(paragraph "Broad applicability.")
(paragraph "Unified description of disparate natural systems.")
(paragraph "Intrinsic irreversibility built-in.")
(paragraph "Stochastic framework.")
(paragraph "Multi-hierarchical description."))
(paragraph "Standard equations arise as a special case from the canonical theory.")
(paragraph "For instance, one can derive the fundamental equation of mechanics.")
(formal
(equation
(fraction
(numerator
(partial-differential
(state-operator)))
(denominator
(partial-differential
(time))))
(liouvillian
(state-operator)))))
This LaTeX-like code uses a renamed \fraction command and standard begin-end environments for multiline code; paragraphs are explicitly encoded to avoid a common criticism of standard TeX/LaTeX. A \formal environment is used instead of the TeX $$ math mode. Semantic commands like \equation, \partial-differential, \state-operator, \liouvillian or \time are introduced.
\begin{\section}
\title{advantages of \emphasis[1]{canonical science}}
\paragraph{Canonical science has the next interesting properties:}
\begin{\list}
\paragraph{Broad applicability.}
\paragraph{Unified description of disparate natural systems.}
\paragraph{Intrinsic irreversibility built-in.}
\paragraph{Stochastic framework.}
\paragraph{Multi-hierarchical description.}
\end{\list}
\paragraph{Standard equations arise as a special case from the canonical theory.}
\paragraph{For instance, one can derive the fundamental equation of mechanics.}
\begin{\formal}
\equation{\fraction{\numerator{\partial-differential{\state-operator}}
\denominator{\partial-differential{\time}}}}
{\liouvillian{\state-operator}}
\end{\formal}
\end{\section}
Several shorthand stylesheets are defined in CanonML. Authors and communities can define their own little syntaxes and shorthand mechanisms according to needs and personal preferences.
Shorthand of the above CanonML code using a Banner-like indentation style:
(\sect
    (\titl advantages of (\emph (\@\lev 1) canonical science))
    (\par Canonical science has the next interesting properties:)
    (\list
        (\par Broad applicability.)
        (\par Unified description of disparate natural systems.)
        (\par Intrinsic irreversibility built-in.)
        (\par Stochastic framework.)
        (\par Multi-hierarchical description.) )
    (\par Standard equations arise as a special case from the canonical theory.)
    (\par For instance, one can derive the fundamental equation of mechanics.)
    (\form
        (\equ
            (\fract
                (\num
                    (\part-diff
                        (\stat-oper) ))
                (\den
                    (\part-diff
                        (\tim) )))
            (\lio
                (\stat-oper) ))))
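The relation between the shorthand and the full names can be sketched as a simple lookup. The mapping below is read off by comparing the shorthand sample with the full CanonML sample; how CanonML stylesheets actually declare such shorthands is not described here, so the table `SHORTHAND` and the function `expand` are hypothetical.

```python
import re

# Shorthand-to-full names, taken from the two CanonML samples on this page.
SHORTHAND = {
    "sect": "section",
    "titl": "title",
    "emph": "emphasis",
    "lev": "level",
    "par": "paragraph",
    "form": "formal",
    "equ": "equation",
    "fract": "fraction",
    "num": "numerator",
    "den": "denominator",
    "part-diff": "partial-differential",
    "stat-oper": "state-operator",
    "tim": "time",
    "lio": "liouvillian",
}

def expand(source):
    """Replace each backslash name that has a known shorthand by its full name."""
    def replace(match):
        name = match.group(1)
        return "\\" + SHORTHAND.get(name, name)  # unknown names pass through
    return re.sub(r"\\([A-Za-z][A-Za-z-]*)", replace, source)

print(expand(r"(\titl advantages of (\emph (\@\lev 1) canonical science))"))
```

Names without an entry, such as `\list`, are left unchanged, so a document can mix shorthand and full names.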
Some people prefer to work with little syntaxes optimized for a subset of problems. A little Unicode syntax for formatting the equation is as follows:
(\formal ((∂ σ \over ∂ t) = L σ))
A fully machine-compatible, namespaced XML encoding with zero loss of information is the X-CanonML syntax. This XML encoding illustrates the conciseness of the original CanonML format.
<c:g xmlns:c="http://www.canonicalscience.org/CanonML">
<c:d c:t="b">\section</c:d>
<c:g>
<c:d c:t="b">\title</c:d>
<c:d>advantages</c:d>
<c:d>of</c:d>
<c:g>
<c:d c:t="b">\emphasis</c:d>
<c:g>
<c:d c:t="b">
<c:n>
<c:d>\@</c:d>
<c:d>\level</c:d>
</c:n>
</c:d>
<c:d>1</c:d>
</c:g>
<c:d>canonical</c:d>
<c:d>science</c:d>
</c:g>
</c:g>
<c:g>
<c:d c:t="b">\paragraph</c:d>
<c:d>Canonical</c:d>
<c:d>science</c:d>
<c:d>has</c:d>
<c:d>the</c:d>
<c:d>next</c:d>
<c:d>interesting</c:d>
<c:d>properties:</c:d>
</c:g>
<c:g>
<c:d c:t="b">\list</c:d>
<c:g>
<c:d c:t="b">\paragraph</c:d>
<c:d>Broad</c:d>
<c:d>applicability.</c:d>
</c:g>
<c:g>
<c:d c:t="b">\paragraph</c:d>
<c:d>Unified</c:d>
<c:d>description</c:d>
<c:d>of</c:d>
<c:d>disparate</c:d>
<c:d>natural</c:d>
<c:d>systems.</c:d>
</c:g>
<c:g>
<c:d c:t="b">\paragraph</c:d>
<c:d>Intrinsic</c:d>
<c:d>irreversibility</c:d>
<c:d>built-in.</c:d>
</c:g>
<c:g>
<c:d c:t="b">\paragraph</c:d>
<c:d>Stochastic</c:d>
<c:d>framework.</c:d>
</c:g>
<c:g>
<c:d c:t="b">\paragraph</c:d>
<c:d>Multi-hierarchical</c:d>
<c:d>description.</c:d>
</c:g>
</c:g>
<c:g>
<c:d c:t="b">\paragraph</c:d>
<c:d>Standard</c:d>
<c:d>equations</c:d>
<c:d>arise</c:d>
<c:d>as</c:d>
<c:d>a</c:d>
<c:d>special</c:d>
<c:d>case</c:d>
<c:d>from</c:d>
<c:d>the</c:d>
<c:d>canonical</c:d>
<c:d>theory.</c:d>
</c:g>
<c:g>
<c:d c:t="b">\paragraph</c:d>
<c:d>For</c:d>
<c:d>instance,</c:d>
<c:d>one</c:d>
<c:d>can</c:d>
<c:d>derive</c:d>
<c:d>the</c:d>
<c:d>fundamental</c:d>
<c:d>equation</c:d>
<c:d>of</c:d>
<c:d>mechanics.</c:d>
</c:g>
<c:g>
<c:d c:t="b">\formal</c:d>
<c:g>
<c:d c:t="b">\equation</c:d>
<c:g>
<c:d c:t="b">\fraction</c:d>
<c:g>
<c:d c:t="b">\numerator</c:d>
<c:g>
<c:d c:t="b">\partial-differential</c:d>
<c:g>
<c:d c:t="b">\state-operator</c:d>
</c:g>
</c:g>
</c:g>
<c:g>
<c:d c:t="b">\denominator</c:d>
<c:g>
<c:d c:t="b">\partial-differential</c:d>
<c:g>
<c:d c:t="b">\time</c:d>
</c:g>
</c:g>
</c:g>
</c:g>
<c:g>
<c:d c:t="b">\liouvillian</c:d>
<c:g>
<c:d c:t="b">\state-operator</c:d>
</c:g>
</c:g>
</c:g>
</c:g>
</c:g>
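The regularity of the X-CanonML listing suggests a mechanical encoding. The Python sketch below reproduces the pattern visible in the listing and nothing more: each group becomes `c:g`, each token becomes `c:d`, the first token of a group carries `c:t="b"`, and a compound name such as `\@\level` is split into parts inside `c:n`. The meaning of these element and attribute names is not documented here, so the rules are inferred assumptions, and `encode` and `q` are hypothetical names.

```python
import re
import xml.etree.ElementTree as ET

NS = "http://www.canonicalscience.org/CanonML"
ET.register_namespace("c", NS)

def q(local):
    """Qualified ElementTree name for a local name in the CanonML namespace."""
    return f"{{{NS}}}{local}"

def encode(group, parent=None):
    """Encode a nested-list group as a c:g element with c:d children."""
    if parent is None:
        element = ET.Element(q("g"))
    else:
        element = ET.SubElement(parent, q("g"))
    for index, item in enumerate(group):
        if isinstance(item, list):
            encode(item, element)
            continue
        datum = ET.SubElement(element, q("d"))
        if index == 0:
            datum.set(q("t"), "b")  # head token of the group, as in the listing
        parts = re.findall(r"\\[^\\]+", item)
        if len(parts) > 1:
            # compound name such as \@\level: one c:d per part inside c:n
            compound = ET.SubElement(datum, q("n"))
            for part in parts:
                ET.SubElement(compound, q("d")).text = part
        else:
            datum.text = item
    return element

tree = [r"\emphasis", [r"\@\level", "1"], "canonical", "science"]
root = encode(tree)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Because every word keeps its own `c:d` element, nothing from the CanonML group is merged or dropped, which is also why the XML form is so much longer.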
CanonML is still under research, and there is no browser for displaying .cnml files. Due to the popularity of HTML4 on the Internet, this website uses a filter to transform a subset of CanonML into HTML4 classes. For instance, the CanonML code (\fraction ...) is converted to the HTML4 <SPAN CLASS="FRACTION">...</SPAN> on the fly when serving information. Several techniques are available for the conversion, including server-side and client-side scripts. Your browser then renders the equation following a special (but fully standards-compliant) CSS stylesheet.
This technique allows the Center to publish scientific online content for broad audiences. For instance, by using a standard browser such as Microsoft Internet Explorer, Mozilla Firefox, or Opera, you can display fractions and other mathematical formulas in an accessible and simple HTML format. This mixed CanonML-HTML4-CSS technique avoids the need for a special browser supporting a specialized XML language; it also avoids the need to download and install special plugins or fonts.
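The conversion of a CanonML group into an HTML4 element with a class can be sketched as follows. This is a minimal Python illustration, not the filter used by this website: only the `FRACTION` class is documented above, so deriving every class name by upper-casing the CanonML name is an assumption, and `to_html` is a hypothetical function working on an already parsed nested list.

```python
from html import escape

def to_html(group):
    """Render a nested-list group as <SPAN CLASS="NAME">...</SPAN>."""
    # Assumption: the class is the CanonML name in upper case.
    css_class = group[0].lstrip("\\").upper()
    parts = []
    for item in group[1:]:
        if isinstance(item, list):
            parts.append(to_html(item))  # nested group becomes a nested SPAN
        else:
            parts.append(escape(item))   # plain word, escaped for HTML
    return f'<SPAN CLASS="{css_class}">{" ".join(parts)}</SPAN>'

fraction = [r"\fraction", [r"\numerator", "a"], [r"\denominator", "b"]]
print(to_html(fraction))
```

A CSS stylesheet can then position the `NUMERATOR` and `DENOMINATOR` spans, so the browser needs no plugin to draw the fraction.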
(2005 – 2008) some rights reserved