SISU - MANUAL,
RALPH AMISSAH
*****************************
WHAT IS SISU?
=============
----------------------------------------
1. INTRODUCTION - WHAT IS SISU?
-------------------------------
*SiSU* is a framework for document structuring, publishing (in multiple open
standard formats) and search, comprising: (a) a lightweight document
structure and presentation markup syntax; and (b) an accompanying engine for
generating standard document format outputs from documents prepared in sisu
markup syntax, which is able to produce multiple standard outputs (including
the population of sql databases) that (can) share a common numbering system for
the citation of text within a document.
*SiSU* is developed under an open source, software libre license (GPL3). It is
designed for work with medium to large document sets and to cope with evolving
document formats and representation technologies. Documents are prepared once,
and regenerated as needed to update the technical presentation or to add
output formats. The various output formats (including search related output)
share a common mechanism for cross-output-format citation.
*SiSU* both defines a markup syntax and provides an engine that produces open
standards format outputs from documents prepared with *SiSU* markup. From a
single lightly prepared document sisu custom builds several standard output
formats which share a common (text object) numbering system for citation of
content within a document (that also has implications for search). The sisu
engine works with an abstraction of the document's structure and content from
which it is possible to generate different forms of representation of the
document. Significantly, *SiSU* markup is more sparse than html, and its
outputs include html, EPUB, LaTeX, landscape and portrait pdfs, and Open
Document Format (ODF), all of which can be added to and updated. *SiSU* is also
able to populate SQL type databases at an object level, which means that
searches can be made with that degree of granularity.
Source document preparation and output generation is a two step process: (i)
document source is prepared, that is, marked up in sisu markup syntax and (ii)
the desired output is subsequently generated by running the sisu engine against
document source. Output representations, if updated (in the sisu engine), can
be regenerated by re-running the engine against the prepared source. Using
*SiSU* markup applied to a document, *SiSU* custom builds (to take advantage of
the strengths of different ways of representing documents) various standard
open output formats including plain text, HTML, XHTML, XML, EPUB, OpenDocument,
LaTeX or PDF files, and populates an SQL database with objects[^1] (equating
generally to paragraph-sized chunks) so searches may be performed and matches
returned with that degree of granularity (e.g. your search criteria are met by
these documents and at these locations within each document). Document output
formats share a common object numbering system for locating content. This is
particularly suitable for "published" works (finalized texts as opposed to
works that are frequently changed or updated) for which it provides a fixed
means of reference of content.
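As an illustration of what object-level search makes possible, the sketch below
loads a few numbered objects into an in-memory SQLite table and returns matches
as (document, object number) pairs. The table name, its columns, the document
names and the sample text are all assumptions made for this example; this is
not the schema that sisu itself creates.

```python
import sqlite3

# Hypothetical sample data: (document, object number, text of the object).
SAMPLE_OBJECTS = [
    ("doc_a", 1, "A first heading"),
    ("doc_a", 2, "Page numbers are not well suited to the digital age."),
    ("doc_b", 1, "A second heading"),
    ("doc_b", 2, "Typical literature requires almost no markup."),
]


def build_db(rows):
    """Create an in-memory database holding one row per text object."""
    conn = sqlite3.connect(":memory:")
    # Illustrative schema only, not the one sisu generates.
    conn.execute("CREATE TABLE objects (doc TEXT, ocn INTEGER, body TEXT)")
    conn.executemany("INSERT INTO objects VALUES (?, ?, ?)", rows)
    return conn


def search(conn, term):
    """Return (doc, ocn) pairs for every object whose text contains term."""
    cur = conn.execute(
        "SELECT doc, ocn FROM objects WHERE body LIKE ? ORDER BY doc, ocn",
        (f"%{term}%",),
    )
    return cur.fetchall()


conn = build_db(SAMPLE_OBJECTS)
for doc, ocn in search(conn, "markup"):
    print(f"{doc}: object {ocn}")
```

Because each match carries an object number rather than a page number, the same
reference can be followed in any of the generated output formats.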
In preparing a *SiSU* document you optionally provide semantic information
related to the document in a document header, and in marking up the substantive
text provide information on the structure of the document, primarily indicating
heading levels and footnotes. You also provide information on basic text
attributes where used. The rest is automatic: from this information sisu custom
builds[^2] the different forms of output requested.
*SiSU* works with an abstraction of the document based on its structure which
is comprised of its headings[^3] and objects[^4], which enables *SiSU* to
represent the document in many different ways, and to take advantage of the
strengths of different ways of presenting documents. The objects are numbered,
and these numbers can be used to provide a common basis for citing material
within a document across the different output format types. This is significant
as page numbers are not well suited to the digital age: in web publishing,
changing a browser's default font or using a different browser can mean that
text will appear on a different page; and when publishing in different formats
(html, landscape and portrait pdf etc.) page numbers are again not useful for
citing text. Dealing with documents at an object level together with object
numbering also has implications for search that *SiSU* is able to take
advantage of.
One of the challenges of maintaining documents is to keep them in a format that
allows use of them independently of proprietary platforms. Consider issues
related to dealing with legacy proprietary formats today and what guarantee you
have that old proprietary formats will remain (or can be read without
proprietary software/equipment) in 15 years time, or the way in which
html has evolved over its relatively short span of existence. *SiSU* provides
the flexibility of producing documents in multiple non-proprietary open formats
including html, pdf,[^5] ODF,[^6] and EPUB.[^7] Whilst *SiSU* relies on
software, the markup is uncomplicated and minimalistic, which means that
future engines can be written to run against it. It is also easily converted to
other formats, which means documents prepared in *SiSU* can be migrated to
other document formats. Further security is provided by the fact that the
software itself, *SiSU*, is available under GPL3, a licence that guarantees
that the source code will always be open, and free as in libre, which means
that the code base can be used, updated and further developed as required under
the terms of its license. Another challenge is to keep up with a moving target.
*SiSU* permits new forms of output to be added as they become important (Open
Document Format text was added in 2006 when it became an ISO standard for
office applications and the archival of documents, and EPUB was introduced in
2009); and allows the technical representation of existing output to be updated
(html has evolved and the related module has been updated repeatedly over the
years; presumably when the World Wide Web Consortium (w3c) finalises html 5,
which is currently under development, the html module will again be updated,
allowing all existing documents to be regenerated as html 5).
The document formats are written to the file-system and available for indexing
by independent indexing tools, whether off the web like Google and Yahoo or on
the site like Lucene and Hyperestraier.
*SiSU* also provides other features such as concordance files and document
content certificates. Working against an abstraction of document structure
opens further possibilities for the research and development of other document
representations; the availability of objects is useful, for example, for topic
maps and thesauri, which together with the flexibility of *SiSU* offers great
possibilities.
*SiSU* is primarily for published works, which can take advantage of the
citation system to reliably reference its documents. *SiSU* works well in a
complementary manner with such collaborative technologies as Wikis, which can
take advantage of and be used to discuss the substance of content prepared in
*SiSU*.
----------------------------------------
2. HOW DOES SISU WORK?
----------------------
*SiSU* markup is fairly minimalistic; it consists of: a (largely optional)
document header, made up of information about the document (such as when it was
published, who authored it, and granting what rights) and any processing
instructions; and markup within the substantive text of the document, which is
related to document structure and typeface. *SiSU* must be able to discern the
structure of a document, (text headings and their levels in relation to each
other), either from information provided in the document header or from markup
within the text (or from a combination of both). Processing is done against an
abstraction of the document comprising information on the document's
structure and its objects,[2] which the program serializes (providing the
object numbers) and which are assigned hash sum values based on their content.
This abstraction of information about document structure, objects, (and hash
sums), provides considerable flexibility in representing documents in different
ways and for different purposes (e.g. search, document layout, publishing,
content certification, concordance etc.), and makes it possible to take
advantage of some of the strengths of established ways of representing
documents, (or indeed to create new ones).
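The idea of serializing objects and giving each a hash sum can be sketched as
follows. This is a simplified illustration and not the sisu engine: it assumes
that objects are simply blocks of text separated by blank lines, whereas sisu
works from its own markup and structure rules, and the field names used here
are hypothetical.

```python
import hashlib


def abstract_document(text):
    """Split text into objects, number them, and hash each one's content."""
    # Assumption for this sketch: an object is a blank-line separated block.
    blocks = [block.strip() for block in text.split("\n\n") if block.strip()]
    objects = []
    for number, block in enumerate(blocks, start=1):
        digest = hashlib.sha256(block.encode("utf-8")).hexdigest()
        objects.append({"ocn": number, "text": block, "sha256": digest})
    return objects


sample = "A heading\n\nFirst paragraph.\n\nSecond paragraph."
for obj in abstract_document(sample):
    print(obj["ocn"], obj["sha256"][:12], obj["text"])
```

The object numbers give a stable way of pointing at content, while the digests
change whenever the content of an object changes, which is what makes them
usable for content certification.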
----------------------------------------
3. SUMMARY OF FEATURES
----------------------
* sparse/minimal markup (clean utf-8 source texts). Documents are prepared in a
single UTF-8 file using a minimalistic mnemonic syntax. Typical literature
(documents like "War and Peace") requires almost no markup, and most of the
headers are optional.
* markup is easily readable/parsable by the human eye, (basic markup is simpler
and more sparse than the most basic HTML), [this may also be converted to XML
representations of the same input/source document].
* markup defines document structure (this may be done once in a header
pattern-match description, or for heading levels individually); basic text
attributes (bold, italics, underscore, strike-through etc.) as required; and
semantic information related to the document (header information, extended
beyond the Dublin core and easily further extended as required); the headers
may also contain processing instructions. *SiSU* markup is primarily an
abstraction of document structure and document metadata to permit taking
advantage of the basic strengths of existing alternative practical standard
ways of representing documents [be that browser viewing, paper publication, sql
search etc.] (html, epub, xml, odf, latex, pdf, sql)
* for output, produces reasonably elegant output in established industry and
institutionally accepted open standard formats,[3] and takes advantage of the
different strengths of various standard formats for representing documents;
amongst the output formats currently supported are:
* html - both as a single scrollable text and a segmented document
* xhtml
* epub
* XML - both in sax and dom style xml structures for further development as
required
* ODF - open document format, the iso standard for document storage
* LaTeX - used to generate pdf
* pdf (via LaTeX)
* sql - population of an sql database, (at the same object level that is used
to cite text within a document)
Also produces: concordance files; document content certificates (md5 or sha256
digests of headings, paragraphs, images etc.) and html manifests (and sitemaps
of content). Takes advantage of the strengths implicit in these very
different output types (e.g. PDFs produced using the typesetting of LaTeX,
databases populated with documents at an individual object/paragraph level,
making possible granular search (and related possibilities)).
* ensuring content can be cited in a meaningful way regardless of selected
output format. Online publishing (and publishing in multiple document formats)
lacks a useful way of citing text internally within documents (important to
academics generally and to lawyers) as page numbers are meaningless across
browsers and formats. sisu seeks to provide a common way of pinpointing the text
within a document, (which can be utilized for citation and by search engines).
The outputs share a common numbering system that is meaningful (to man and
machine) across all digital outputs whether paper, screen, or database
oriented, (pdf, HTML, EPUB, xml, sqlite, postgresql), this numbering system can
be used to reference content.
* Granular search within documents. SQL databases are populated at an object
level (roughly headings, paragraphs, verse, tables) and become searchable with
that degree of granularity, the output information provides the
object/paragraph numbers which are relevant across all generated outputs; it is
also possible to look at just the matching paragraphs of the documents in the
database; [output indexing also works well with search indexing tools like
hyperestraier].
* long term maintainability of document collections in a world of changing
formats, having a very sparsely marked-up source document base. There is a
considerable degree of future-proofing: output representations are
"upgradeable", and new document formats may be added, e.g. the addition of the
odf (open document text) module in 2006, epub in 2009, and html5 output at some
point in future, without modification of existing prepared texts
* SQL search aside, documents are generated as required and static once
generated.
* documents produced are static files, and may be batch processed, this needs
to be done only once but may be repeated for various reasons as desired
(updated content, addition of new output formats, updated technology document
presentations/representations)
* document source (plaintext utf-8) if shared on the net may be used as input
and processed locally to produce the different document outputs
* document source may be bundled together (automatically) with associated
documents (multiple language versions or master document with inclusions) and
images and sent as a zip file called a sisupod, if shared on the net these too
may be processed locally to produce the desired document outputs
* generated document outputs may automatically be posted to remote sites.
* for basic document generation, the only software dependency is *Ruby*, and a
few standard Unix tools (this covers plaintext, HTML, EPUB, XML, ODF, LaTeX).
To use a database you of course need that, and to convert the LaTeX generated
to pdf, a latex processor like tetex or texlive.
* as a developer's tool it is flexible and extensible
Syntax highlighting for *SiSU* markup is available for a number of text
editors.
*SiSU* is less about document layout than about finding a way with little
markup to be able to construct an abstract representation of a document that
makes it possible to produce multiple representations of it which may be rather
different from each other and used for different purposes, whether layout and
publishing, or search of content
i.e. to be able to take advantage from this minimal preparation starting point
of some of the strengths of rather different established ways of representing
documents for different purposes, whether for search (relational database, or
indexed flat files generated for that purpose whether of complete documents, or
say of files made up of objects), online viewing (e.g. html, xml, pdf), or
paper publication (e.g. pdf)...
The solution arrived at is to extract structural information about the
document (about headings within the document) and to track objects (which
are serialized and also given hash values) in the manner described. It makes
possible representations that are quite different from those offered at
present. For example objects could be saved individually and identified by
their hashes, with an index of how the objects relate to each other to form a
document.
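The last possibility mentioned, saving objects individually under their hashes
with an index describing how they form a document, might look something like
the sketch below. It is a thought experiment rather than a description of how
sisu stores anything: the function names are hypothetical and a plain
dictionary stands in for whatever storage would actually be used.

```python
import hashlib


def store_document(objects, store):
    """Save each object under its hash; return the ordered list of hashes."""
    index = []
    for text in objects:
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        store[key] = text  # identical objects are stored only once
        index.append(key)
    return index


def rebuild_document(index, store):
    """Reassemble the document by looking up each hash in index order."""
    return [store[key] for key in index]


store = {}
document = ["A heading", "A paragraph.", "A paragraph."]
index = store_document(document, store)
print(len(index), "objects in the index,", len(store), "objects stored")
print(rebuild_document(index, store) == document)
```

The index preserves document order (and repetition), while the store holds each
distinct object exactly once.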
----------------------------------------
4. HELP
-------
4.1 SISU MANUAL
...............
The most up to date information on sisu should be contained in the sisu_manual,
available at:
The manual can be generated from source, found respectively, either within the
*SiSU* tarball or installed locally at:
./data/doc/sisu/markup-samples/sisu_manual
/usr/share/doc/sisu/markup-samples/sisu_manual
move to the respective directory and type e.g.:
sisu sisu_manual.ssm
4.2 SISU MAN PAGES
..................
If *SiSU* is installed on your system the usual man commands should be available,
try:
man sisu
Most *SiSU* man pages are generated directly from sisu documents that are used
to prepare the sisu manual, the source files for which are located within the
*SiSU* tarball at:
./data/doc/sisu/markup-samples/sisu_manual
Once installed, directory equivalent to:
/usr/share/doc/sisu/markup-samples/sisu_manual
Available man pages are converted back to html using man2html:
/usr/share/doc/sisu/html/
./data/doc/sisu/html
An online version of the sisu man page is available here:
* various sisu man pages [link: ] [^8]
* sisu.1 [link: ] [^9]
4.3 SISU BUILT-IN INTERACTIVE HELP
..................................
This is particularly useful for getting the current sisu setup/environment
information:
sisu --help
sisu --help [subject]
sisu --help commands
sisu --help markup
sisu --help env [for feedback on the way your system is setup with regard
to sisu]
sisu -V [environment information, same as above command]
sisu (on its own provides version and some help information)
Apart from real-time information on your current configuration the *SiSU*
manual and man pages are likely to contain more up-to-date information than the
sisu interactive help (for example on commands and markup).
NOTE: Running the command sisu (alone without any flags, filenames or
wildcards) brings up the interactive help, as does any sisu command that is not
recognised. Enter to escape.
----------------------------------------
5. COMMANDS SUMMARY
-------------------
5.1 DESCRIPTION
...............
*SiSU* is a document publishing system that, from a simple single marked-up
document, produces multiple output formats including: plaintext,
html, xhtml, XML, epub, odt (odf text), LaTeX, pdf, info, and SQL (PostgreSQL
and SQLite), which share numbered text objects ("object citation numbering")
and the same document structure information. For more see:
5.2 DOCUMENT PROCESSING COMMAND FLAGS
.....................................
*-a [filename/wildcard] *
produces plaintext with Unix linefeeds and without markup, (object numbers
are omitted), has footnotes at end of each paragraph that contains them [ -A
for equivalent dos (linefeed) output file] [see -e for endnotes]. (Options
include: --endnotes for endnotes --footnotes for footnotes at the end of each
paragraph --unix for unix linefeed (default) --msdos for msdos linefeed)
*-b [filename/wildcard] *
see --xhtml
*--color-toggle [filename/wildcard] *
toggles ansi screen colour on or off depending on the default set (unless the
-c flag is used: if sisurc colour default is set to 'true', output to screen
will be with colour, if sisurc colour default is set to 'false' or is undefined
screen output will be without colour). Alias -c
*--concordance [filename/wildcard] *
produces concordance (wordmap) a rudimentary index of all the words in a
document. (Concordance files are not generated for documents of over 260,000
words unless this limit is increased in the file sisurc.yml). Alias -w
*-C [--init-site] *
configure/initialise shared output directory files (config files such as css
and dtd files are not updated if they already exist unless the modifier is
used). -C --init-site configure/initialise site, more extensive than -C on its
own: forces update of shared output directory files, so existing shared output
config files such as css and dtd files are updated if this modifier is used.
*-CC *
configure/initialise shared output directory files (config files such as css
and dtd files are not updated if they already exist unless the modifier is
used). The equivalent of: -C --init-site configure/initialise site, more
extensive than -C on its own: forces update of shared output directory files,
so existing shared output config files such as css and dtd files are updated if
-CC is used.
*-c [filename/wildcard] *
see --color-toggle
*--dal [filename/wildcard/url] *
creates new intermediate files for processing (document abstraction) that are
used in all subsequent processing of other output. This step is assumed for
most processing flags. To skip it see -n. Alias -m
*--delete [filename/wildcard] *
see --zap
*-D [instruction] [filename] *
see --pg
*-d [--db-[database type (sqlite|pg)]] --[instruction] [filename] *
see --sqlite
*--epub [filename/wildcard] *
produces an epub document, [sisu version 2 only] (filename.epub). Alias -e
*-e [filename/wildcard] *
see --epub
*-F [--webserv=webrick] *
see --sample-search-form
*--git [filename/wildcard] *
produces or updates markup source file structure in a git repo (experimental
and subject to change). Alias -g
*-g [filename/wildcard] *
see --git
*--harvest *.ss[tm] *
makes two lists of sisu output based on the sisu markup documents in a
directory: a list of authors and their works (year and titles); and a list by
topic with titles and authors. Makes use of header metadata fields (author,
title, date, topic_register). Can be used with maintenance (-M) and remote
placement (-R) flags.
*--help [topic] *
provides help on the selected topic, where topics (keywords) include: list,
(com)mands, short(cuts), (mod)ifiers, (env)ironment, markup, syntax, headers,
headings, endnotes, tables, example, customise, skin, (dir)ectories, path,
(lang)uage, db, install, setup, (conf)igure, convert, termsheet, search, sql,
features, license
*--html [filename/wildcard] *
produces html output, segmented text with table of contents (toc.html and
index.html) and the document in a single file (scroll.html). Alias -h
*-h [filename/wildcard] *
see --html
*-I [filename/wildcard] *
see --texinfo
*-i [filename/wildcard] *
see --manpage
*-L *
prints license information.
*--machine [filename/wildcard/url] *
see --dal (document abstraction level/layer)
*--maintenance [filename/wildcard/url] *
maintenance mode: files created for processing are preserved and their locations
indicated. (also see -V). Alias -M
*--manpage [filename/wildcard] *
produces man page of file, not suitable for all outputs. Alias -i
*-M [filename/wildcard/url] *
see --maintenance
*-m [filename/wildcard/url] *
see --dal (document abstraction level/layer)
*--no-ocn *
[with --html --pdf or --epub] switches off object citation numbering. Produces
output without identifying numbers in margins of html or LaTeX/pdf output.
*-N [filename/wildcard/url] *
document digest or document content certificate (DCC) as md5 digest tree of
the document: the digest for the document, and digests for each object
contained within the document (together with information on software versions
that produced it) (digest.txt). -NV for verbose digest output to screen.
*-n [filename/wildcard/url] *
skip the creation of intermediate processing files (document abstraction) if
they already exist, this skips the equivalent of -m which is otherwise assumed
by most processing flags.
*--odf [filename/wildcard/url] *
see --odt
*--odt [filename/wildcard/url] *
output basic document in opendocument file format (opendocument.odt). Alias
-o
*-o [filename/wildcard/url] *
see --odt
*--pdf [filename/wildcard] *
produces LaTeX pdf (portrait.pdf & landscape.pdf). Default paper size is set
in config file, or document header, or provided with additional command line
parameter, e.g. --papersize-a4; preset sizes include: 'A4', U.S. 'letter' and
'legal' and book sizes 'A5' and 'B5' (system defaults to A4). Alias -p
*--pg [instruction] [filename] *
database postgresql (--pgsql may be used instead); possible instructions
include: --createdb; --create; --dropall; --import [filename]; --update
[filename]; --remove [filename]; see database section below. Alias -D
*--po [language_directory/filename language_directory] *
see --po4a
*--po4a [language_directory/filename language_directory] *
produces .pot and po files for the file in the languages specified by the
language directory. *SiSU* markup is placed in subdirectories named with the
language code, e.g. en/ fr/ es/. The sisu config file must set the output
directory structure to multilingual. v3, experimental
*-P [language_directory/filename language_directory] *
see --po4a
*-p [filename/wildcard] *
see --pdf
*--quiet [filename/wildcard] *
quiet, less output to screen.
*-q [filename/wildcard] *
see --quiet
*--rsync [filename/wildcard] *
copies sisu output files to remote host using rsync. This requires that
sisurc.yml has been provided with information on hostname and username, and
that you have your "keys" and ssh agent in place. Note that the behavior of
rsync is different when -R is used with other flags than when it is used alone.
Alone, the rsync --delete parameter is sent, useful for cleaning the remote
directory (when -R is used together with other flags, it is not). Also see
--scp. Alias -R
*-R [filename/wildcard] *
see --rsync
*-r [filename/wildcard] *
see --scp
*--sample-search-form [--webserv=webrick] *
generates examples of (naive) cgi search form for sqlite and pgsql; depends on
your already having used sisu to populate an sqlite and/or pgsql database, (the
sqlite version scans the output directories for existing sisu_sqlite databases,
so it is first necessary to create them, before generating the search form) see
-d -D and the database section below. If the optional parameter
--webserv=webrick is passed, the cgi examples created will be set up to use the
default port set for use by the webrick server, (otherwise the port is left
blank and the system setting used, usually 80). The samples are dumped in the
present work directory which must be writable, (with screen instructions given
that they be copied to the cgi-bin directory). -Fv (in addition to the above)
provides some information on setting up hyperestraier for sisu. Alias -F
*--scp [filename/wildcard] *
copies sisu output files to remote host using scp. This requires that
sisurc.yml has been provided with information on hostname and username, and
that you have your "keys" and ssh agent in place. Also see --rsync. Alias -r
*--sqlite --[instruction] [filename] *
database type default set to sqlite (for which --sqlite may be used instead),
or to specify another database --db-[pgsql, sqlite] (however see -D); possible
instructions include: --createdb; --create; --dropall; --import [filename];
--update [filename]; --remove [filename]; see database section below. Alias -d
*--sisupod *
produces a sisupod, a zipped sisu directory of markup files including sisu
markup source files and the directory's local configuration file, images and
skins. Note: this only includes the configuration files or skins contained in
./_sisu not those in ~/.sisu. See also the -S [filename/wildcard] option. Note:
this option is tested only with zsh. Alias -S
*--sisupod [filename/wildcard] *
produces a zipped file of the prepared document specified along with
associated images, by default named sisupod.zip; they may alternatively be
named with the filename extension .ssp. This provides a quick way of gathering
the relevant parts of a sisu document which can then for example be emailed. A
sisupod includes sisu markup source file, (along with associated documents if a
master file, or available in multilingual versions), together with related
images and skin. *SiSU* commands can be run directly against a sisupod
contained in a local directory, or provided as a url on a remote site. As there
is a security issue with skins provided by other users, they are not applied
unless the flag --trust or --trusted is added to the command instruction; it is
recommended that files that are not your own are treated as untrusted. The
directory structure of the unzipped file is understood by sisu, and sisu
commands can be run within it. (Note: if you wish to send multiple files, it
quickly becomes more space efficient to zip the sisu markup directory, rather
than the individual files for sending). See the -S option without
[filename/wildcard]. Alias -S
*--source [filename/wildcard] *
copies sisu markup file to output directory. Alias -s
*-S *
see --sisupod
*-S [filename/wildcard] *
see --sisupod
*-s [filename/wildcard] *
see --source
*--texinfo [filename/wildcard] *
produces texinfo and info file, (view with pinfo). Alias -I
*--txt [filename/wildcard] *
produces plaintext with Unix linefeeds and without markup, (object numbers
are omitted), has footnotes at end of each paragraph that contains them [ -A
for equivalent dos (linefeed) output file] [see -e for endnotes]. (Options
include: --endnotes for endnotes --footnotes for footnotes at the end of each
paragraph --unix for unix linefeed (default) --msdos for msdos linefeed). Alias
-t
*-T [filename/wildcard (*.termsheet.rb)] *
standard form document builder, preprocessing feature
*-t [filename/wildcard] *
see --txt
*--urls [filename/wildcard] *
prints url output list/map for the available processing flags options and
resulting files that could be requested, (can be used to get a list of
processing options in relation to a file, together with information on the
output that would be produced), -u provides url output mapping for those flags
requested for processing. The default assumes sisu_webrick is running and
provides webrick url mappings where appropriate, but these can be switched to
file system paths in sisurc.yml. Alias -U
*-U [filename/wildcard] *
see --urls
*-u [filename/wildcard] *
provides url mapping of output files for the flags requested for processing,
also see -U
*--v2 [filename/wildcard] *
invokes the sisu v2 document parser/generator. This is the default and is
normally omitted.
*--v3 [filename/wildcard] *
invokes the sisu v3 document parser/generator. Currently under development
and incomplete, v3 requires >= ruby1.9.2p180. You may run sisu3 instead.
*--verbose [filename/wildcard] *
provides verbose output of what is being generated, where output is placed
(and error messages if any), as with -u flag provides a url mapping of files
created for each of the processing flag requests. Alias -v
*-V *
on its own, provides *SiSU* version and environment information (sisu --help
env)
*-V [filename/wildcard] *
even more verbose than the -v flag.
*-v *
on its own, provides *SiSU* version information
*-v [filename/wildcard] *
see --verbose
*--webrick *
starts ruby's webrick webserver, pointed at sisu output directories; the
default port is set to 8081 and can be changed in the resource configuration
files. [tip: the webrick server requires link suffixes, so html output should
be created using the -h option rather than -H ; also, note -F webrick ]. Alias
-W
*-W *
see --webrick
*--wordmap [filename/wildcard] *
see --concordance
*-w [filename/wildcard] *
see --concordance
*--xhtml [filename/wildcard] *
produces xhtml/XML output for browser viewing (sax parsing). Alias -b
*--xml-dom [filename/wildcard] *
produces XML output with deep document structure, in the nature of dom. Alias
-X
*--xml-sax [filename/wildcard] *
produces XML output shallow structure (sax parsing). Alias -x
*-X [filename/wildcard] *
see --xml-dom
*-x [filename/wildcard] *
see --xml-sax
*-Y [filename/wildcard] *
produces a short sitemap entry for the document, based on html output and the
sisu_manifest. --sitemaps generates/updates the sitemap index of existing
sitemaps. (Experimental, [g,y,m announcement this week])
*-y [filename/wildcard] *
produces an html summary of output generated (hyperlinked to content) and
document specific metadata (sisu_manifest.html). This step is assumed for most
processing flags.
*--zap [filename/wildcard] *
Zap, if used with other processing flags deletes output files of the type
about to be processed, prior to processing. If -Z is used as the lone
processing related flag (or in conjunction with a combination of -[mMvVq]),
it will remove the related document output directory. Alias -Z
*-Z [filename/wildcard] *
see --zap
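As a reading aid for the flag list above, the short sketch below expands a
bundle of single-letter flags into the long names given in this section. It is
not part of sisu and does not call it; the function name is hypothetical, and
letters for which this section gives no long form (such as -N or -y) are simply
passed through unchanged.

```python
# Short-to-long aliases as given in the flag descriptions in this section.
ALIASES = {
    "b": "--xhtml", "c": "--color-toggle", "D": "--pg", "d": "--sqlite",
    "e": "--epub", "F": "--sample-search-form", "g": "--git", "h": "--html",
    "I": "--texinfo", "i": "--manpage", "M": "--maintenance", "m": "--dal",
    "o": "--odt", "P": "--po4a", "p": "--pdf", "q": "--quiet",
    "R": "--rsync", "r": "--scp", "S": "--sisupod", "s": "--source",
    "t": "--txt", "U": "--urls", "v": "--verbose", "W": "--webrick",
    "w": "--concordance", "X": "--xml-dom", "x": "--xml-sax", "Z": "--zap",
}


def expand(bundle):
    """Expand a bundle such as '-hwp' into one flag per letter."""
    if not bundle.startswith("-") or bundle.startswith("--"):
        raise ValueError(f"not a short-flag bundle: {bundle!r}")
    # Letters without a long form listed above are kept as short flags.
    return [ALIASES.get(letter, f"-{letter}") for letter in bundle[1:]]


print(" ".join(expand("-hwp")))
```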
----------------------------------------
6. COMMAND LINE MODIFIERS
-------------------------
*--no-ocn *
[with --html --pdf or --epub] switches off object citation numbering. Produces
output without identifying numbers in margins of html or LaTeX/pdf output.
*--no-annotate *
strips output text of editor endnotes[^*1] denoted by asterisk or dagger/plus
sign
*--no-asterisk *
strips output text of editor endnotes[^*2] denoted by asterisk sign
*--no-dagger *
strips output text of editor endnotes[^+1] denoted by dagger/plus sign
----------------------------------------
7. DATABASE COMMANDS
--------------------
dbi - database interface
-D or --pgsql: set for postgresql; -d or --sqlite: default, set for sqlite. -d
is modifiable with --db=[database type (pgsql or sqlite)]
*--pg -v --createall *
initial step, creates required relations (tables, indexes) in an existing
postgresql database (a database should be created manually and given the same
name as the working directory, as requested) (rb.dbi) [ -dv --createall sqlite
equivalent]. It may be necessary to run sisu -Dv --createdb initially. NOTE: at
the present time for postgresql it may be necessary to manually create the
database. The command would be 'createdb [database name]' where database name
would be SiSU_[present working directory name (without path)]. Please use only
alphanumerics and underscores.
*--pg -v --import *
[filename/wildcard] imports data specified to postgresql db (rb.dbi) [ -dv
--import sqlite equivalent]
*--pg -v --update *
[filename/wildcard] updates/imports specified data to postgresql db (rb.dbi)
[ -dv --update sqlite equivalent]
*--pg --remove *
[filename/wildcard] removes specified data from postgresql db (rb.dbi) [ -d
--remove sqlite equivalent]
*--pg --dropall *
kills data and drops (postgresql or sqlite) db, tables & indexes [ -d
--dropall sqlite equivalent]
The -v is for verbose output.
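The database naming rule given above for postgresql can be sketched in Python.
This is an illustration only and is not part of *SiSU*; the function name
sisu_db_name is hypothetical, and the check simply restates the request to use
only alphanumerics and underscores.

```python
import os
import re


def sisu_db_name(working_dir):
    """Return 'SiSU_' followed by the directory name (without path)."""
    base = os.path.basename(os.path.normpath(working_dir))
    name = "SiSU_" + base
    # The manual asks for alphanumerics and underscores only.
    if not re.fullmatch(r"[A-Za-z0-9_]+", name):
        raise ValueError("use only alphanumerics and underscores: %r" % name)
    return name


print(sisu_db_name("/home/user/sisu_docs"))  # SiSU_sisu_docs
```

The resulting name is what would be passed to 'createdb [database name]' when
the database has to be created manually.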
----------------------------------------
8. SHORTCUTS, SHORTHAND FOR MULTIPLE FLAGS
------------------------------------------
*--update [filename/wildcard] *
Checks existing file output and runs the flags required to update this
output. This means that if only html and pdf output was requested on previous
runs, only the -hp flags will be applied, and only these will be generated this
time, together with the summary. This can be very convenient if you offer
different outputs of different files, and just want to do the same again.
*-0 to -5 [filename or wildcard] *
Default shorthand mappings (note that the defaults can be changed/configured
in the sisurc.yml file):
*-0 *
-mNhwpAobxXyYv [this is the default action run when no options are given, i.e.
on 'sisu [filename]']
*-1 *
-mhewpy
*-2 *
-mhewpaoy
*-3 *
-mhewpAobxXyY
*-4 *
-mhewpAobxXDyY --import
*-5 *
-mhewpAobxXDyY --update
add -v for verbose mode and -c for color, e.g. sisu -2vc [filename or wildcard]
consider -u for appended url info or -v for verbose output
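The default shorthand mappings listed above can be written as a small lookup
table. The Python sketch below only restates those defaults for illustration;
the names DEFAULT_SHORTCUTS and expand_shortcut are hypothetical, and an actual
installation may differ because the defaults can be changed in sisurc.yml.

```python
# Default mappings as listed in this manual; sisurc.yml may override them.
DEFAULT_SHORTCUTS = {
    "0": "-mNhwpAobxXyYv",
    "1": "-mhewpy",
    "2": "-mhewpaoy",
    "3": "-mhewpAobxXyY",
    "4": "-mhewpAobxXDyY --import",
    "5": "-mhewpAobxXDyY --update",
}


def expand_shortcut(flag):
    """Return the default flag string behind a shortcut such as '-2'."""
    key = flag.lstrip("-")
    if key not in DEFAULT_SHORTCUTS:
        raise KeyError("no default shortcut for %r" % flag)
    return DEFAULT_SHORTCUTS[key]


print(expand_shortcut("-2"))  # -mhewpaoy
```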
8.1 COMMAND LINE WITH FLAGS - BATCH PROCESSING
..............................................
In the data directory run sisu -mh filename or wildcard eg. "sisu -h cisg.sst"
or "sisu -h *.{sst,ssm}" to produce html version of all documents.
Running sisu (alone without any flags, filenames or wildcards) brings up the
interactive help, as does any sisu command that is not recognised. Enter to
escape.
----------------------------------------
9. INTRODUCTION TO SISU MARKUP[^10]
-----------------------------------
9.1 SUMMARY
...........
*SiSU* source documents are plaintext (UTF-8)[^11] files
All paragraphs are separated by an empty line.
Markup is comprised of:
* at the top of a document, the document header made up of semantic meta-data
about the document and if desired additional processing instructions (such as an
instruction to automatically number headings from a particular level down)
* followed by the prepared substantive text of which the most important single
characteristic is the markup of different heading levels, which define the
primary outline of the document structure. Markup of substantive text includes:
* heading levels, which define document structure
* text basic attributes, italics, bold etc.
* grouped text (objects), which are to be treated differently, such as code
blocks or poems.
* footnotes/endnotes
* linked text and images
* paragraph actions, such as indent, bulleted, numbered-lists, etc.
Some interactive help on markup is available, by typing sisu and selecting
markup or sisu --help markup
To check the markup in a file:
sisu --identify [filename].sst
For a brief descriptive summary of markup history:
sisu --query-history
or for a particular version:
sisu --query-0.38
9.2 MARKUP EXAMPLES
...................
9.2.1 ONLINE
............
Online markup examples are available together with the respective outputs
produced from or from
There is of course this document, which provides a cursory overview of sisu
markup and the respective output produced:
an alternative presentation of markup syntax:
/usr/share/doc/sisu/on_markup.txt.gz
9.2.2 INSTALLED
...............
With *SiSU* installed sample skins may be found in:
/usr/share/doc/sisu/markup-samples (or equivalent directory) and if
sisu-markup-samples is installed also under:
/usr/share/doc/sisu/markup-samples-non-free
----------------------------------------
10. MARKUP OF HEADERS
---------------------
Headers contain either: semantic meta-data about a document, which can be used
by any output module of the program, or; processing instructions.
Note: the first line of a document may include information on the markup
version used in the form of a comment. Comments are a percentage mark at the
start of a paragraph (and as the first character in a line of text) followed by
a space and the comment:
% this would be a comment
10.1 SAMPLE HEADER
..................
This current document is loaded by a master document that has a header similar
to this one:
% SiSU master 2.0
@title: SiSU
:subtitle: Manual
@creator: :author: Amissah, Ralph
@rights: Copyright (C) Ralph Amissah 2007, part of SiSU documentation, License GPL 3
@classify:
:type: information
:topic_register: SiSU:manual;electronic documents:SiSU:manual
:subject: ebook, epublishing, electronic book, electronic publishing,
electronic document, electronic citation, data structure,
citation systems, search
% used_by: manual
@date:
:published: 2008-05-22
:created: 2002-08-28
:issued: 2002-08-28
:available: 2002-08-28
:modified: 2010-03-03
@make:
:num_top: 1
:breaks: new=C; break=1
:skin: skin_sisu_manual
:bold: /Gnu|Debian|Ruby|SiSU/
:manpage: name=sisu - documents: markup, structuring, publishing in multiple standard formats, and search;
synopsis=sisu [-abcDdeFhIiMmNnopqRrSsTtUuVvwXxYyZz0-9] [filename/wildcard ]
. sisu [-Ddcv] [instruction]
. sisu [-CcFLSVvW]
. sisu --v2 [operations]
. sisu --v3 [operations]
@links:
{ SiSU Homepage }http://www.sisudoc.org/
{ SiSU Manual }http://www.sisudoc.org/sisu/sisu_manual/
{ Book Samples & Markup Examples }http://www.jus.uio.no/sisu/SiSU/examples.html
{ SiSU Download }http://www.jus.uio.no/sisu/SiSU/download.html
{ SiSU Changelog }http://www.jus.uio.no/sisu/SiSU/changelog.html
{ SiSU Git repo }http://git.sisudoc.org/?p=code/sisu.git;a=summary
{ SiSU List Archives }http://lists.sisudoc.org/pipermail/sisu/
{ SiSU @ Debian }http://packages.qa.debian.org/s/sisu.html
{ SiSU Project @ Debian }http://qa.debian.org/developer.php?login=sisu@lists.sisudoc.org
{ SiSU @ Wikipedia }http://en.wikipedia.org/wiki/SiSU
10.2 AVAILABLE HEADERS
......................
Header tags appear at the beginning of a document and provide meta information
on the document (such as the Dublin Core), or information as to how the
document as a whole is to be processed. All header instructions take the form
@headername: or on the next line and indented by one space :subheadername: All
Dublin Core meta tags are available
*@identifier:* information or instructions
where the "identifier" is a tag recognised by the program, and the
"information" or "instructions" belong to the tag/identifier specified
Note: a header where used should only be used once; all headers apart from
@title: are optional; the @structure: header is used to describe document
structure, and can be useful to know.
This is a sample header
% SiSU 2.0 [declared file-type identifier with markup version]
@title: [title text] [this header is the only one that is mandatory]
:subtitle: [subtitle if any]
:language: English
@creator:
:author: [Lastname, First names]
:illustrator: [Lastname, First names]
:translator: [Lastname, First names]
:prepared_by: [Lastname, First names]
@date:
:published: [year or yyyy-mm-dd]
:created: [year or yyyy-mm-dd]
:issued: [year or yyyy-mm-dd]
:available: [year or yyyy-mm-dd]
:modified: [year or yyyy-mm-dd]
:valid: [year or yyyy-mm-dd]
:added_to_site: [year or yyyy-mm-dd]
:translated: [year or yyyy-mm-dd]
@rights:
:copyright: Copyright (C) [Year and Holder]
:license: [Use License granted]
:text: [Year and Holder]
:translation: [Name, Year]
:illustrations: [Name, Year]
@classify:
:topic_register: SiSU:markup sample:book;book:novel:fantasy
:type:
:subject:
:description:
:keywords:
:abstract:
:isbn: [ISBN]
:loc: [Library of Congress classification]
:dewey: [Dewey classification]
:pg: [Project Gutenberg text number]
@links: { SiSU }http://www.sisudoc.org
{ FSF }http://www.fsf.org
@make:
:skin: skin_name [skins change default settings related to the appearance of documents generated]
:num_top: 1
:headings: [text to match for each level
(e.g. PART; Chapter; Section; Article; or another: none; BOOK|FIRST|SECOND; none; CHAPTER;)]
:breaks: new=:C; break=1
:promo: sisu, ruby, sisu_search_libre, open_society
:bold: [regular expression of words/phrases to be made bold]
:italics: [regular expression of words/phrases to italicise]
@original:
:language: [language]
@notes:
:comment:
:prefix: [prefix is placed just after table of contents]
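The header layout shown above (an @headername: line, optionally followed by
indented :subheadername: lines, with % marking comments) can be read with a few
lines of Python. This is a minimal sketch for illustration, not the parser used
by *SiSU*; the function name parse_headers and the "_" key used for text placed
directly after @headername: are assumptions made here.

```python
import re


def parse_headers(text):
    """Collect @header: and :subheader: lines into nested dictionaries."""
    headers = {}
    current = None
    for line in text.splitlines():
        if not line.strip() or line.startswith("% "):
            continue  # blank line or comment
        m = re.match(r"@(\w+):\s*(.*)$", line)
        if m:
            current = headers.setdefault(m.group(1), {})
            if m.group(2):
                current["_"] = m.group(2)  # text given on the header line
            continue
        m = re.match(r"\s*:(\w+):\s*(.*)$", line)
        if m and current is not None:
            current[m.group(1)] = m.group(2)
    return headers


sample = """% SiSU 2.0
@title: My Book
 :subtitle: A Sample
@creator:
 :author: Doe, Jane
"""
headers = parse_headers(sample)
print(headers["title"]["subtitle"])  # A Sample
```

The sketch does not handle values continued over several lines, such as the
:subject: entry in the sample header of the master document.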
----------------------------------------
11. MARKUP OF SUBSTANTIVE TEXT
------------------------------
11.1 HEADING LEVELS
...................
Heading levels are :A~ ,:B~ ,:C~ ,1~ ,2~ ,3~ ... :A - :C being part / section
headings, followed by other heading levels, and 1-6 being headings followed by
substantive text or sub-headings. :A~ is usually the title; :A~? is a conditional
level 1 heading (used where a stand-alone document may be imported into another)
*:A~ [heading text]* Top level heading [this usually has similar content to the
title @title: ] NOTE: the heading levels described here are in 0.38 notation,
see heading
*:B~ [heading text]* Second level heading [this is a heading level divider]
*:C~ [heading text]* Third level heading [this is a heading level divider]
*1~ [heading text]* Top level heading preceding substantive text of document or
sub-heading 2, the heading level that would normally be marked 1. or 2. or 3.
etc. in a document, and the level on which sisu by default would break html
output into named segments, names are provided automatically if none are given
(a number), otherwise takes the form 1~my_filename_for_this_segment
*2~ [heading text]* Second level heading preceding substantive text of document
or sub-heading 3 , the heading level that would normally be marked 1.1 or 1.2
or 1.3 or 2.1 etc. in a document.
*3~ [heading text]* Third level heading preceding substantive text of document,
that would normally be marked 1.1.1 or 1.1.2 or 1.2.1 or 2.1.1 etc. in a
document
1~filename level 1 heading,
% the primary division such as Chapter that is followed by substantive text, and may be further subdivided (this is the level on which by default html segments are made)
11.2 FONT ATTRIBUTES
....................
*markup example:*
normal text, *{emphasis}*, !{bold text}!, /{italics}/, _{underscore}_, "{citation}",
^{superscript}^, ,{subscript},, +{inserted text}+, -{strikethrough}-, #{monospace}#
normal text
*{emphasis}* [note: can be configured to be represented by bold, italics or underscore]
!{bold text}!
/{italics}/
_{underscore}_
"{citation}"
^{superscript}^
,{subscript},
+{inserted text}+
-{strikethrough}-
#{monospace}#
*resulting output:*
normal text, *emphasis*, *bold text*, /italics/, _underscore_, "citation",
^superscript^, [subscript], +inserted text+, -strikethrough-, monospace
normal text
*emphasis* [note: can be configured to be represented by bold, italics or
underscore]
*bold text*
/italics/
_underscore_
"citation"
^superscript^
[subscript]
+inserted text+
-strikethrough-
monospace
11.3 INDENTATION AND BULLETS
............................
*markup example:*
ordinary paragraph
_1 indent paragraph one step
_2 indent paragraph two steps
_9 indent paragraph nine steps
*resulting output:*
ordinary paragraph
indent paragraph one step
indent paragraph two steps
indent paragraph nine steps
*markup example:*
_* bullet text
_1* bullet text, first indent
_2* bullet text, two step indent
*resulting output:*
* bullet text
* bullet text, first indent
* bullet text, two step indent
Numbered List (not to be confused with headings/titles, (document structure))
*markup example:*
# numbered list numbered list 1., 2., 3, etc.
_# numbered list numbered list indented a., b., c., d., etc.
11.4 FOOTNOTES / ENDNOTES
.........................
Footnotes and endnotes are marked up at the location where they would be
indicated within a text. They are automatically numbered. The output type
determines whether footnotes or endnotes will be produced
*markup example:*
~{ a footnote or endnote }~
*resulting output:*
[^12]
*markup example:*
normal text~{ self contained endnote marker & endnote in one }~ continues
*resulting output:*
normal text[^13] continues
*markup example:*
normal text ~{* unnumbered asterisk footnote/endnote, insert multiple asterisks if required }~ continues
normal text ~{** another unnumbered asterisk footnote/endnote }~ continues
*resulting output:*
normal text [^*] continues
normal text [^**] continues
*markup example:*
normal text ~[* editors notes, numbered asterisk footnote/endnote series ]~ continues
normal text ~[+ editors notes, numbered plus footnote/endnote series ]~ continues
*resulting output:*
normal text [^*3] continues
normal text [^+2] continues
*Alternative endnote pair notation for footnotes/endnotes:*
% note the endnote marker "~^"
normal text~^ continues
^~ endnote text following the paragraph in which the marker occurs
the standard and pair notation cannot be mixed in the same document
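As an illustration of how the standard ~{ ... }~ notation maps to automatically
numbered markers, here is a short Python sketch. It is not the *SiSU*
implementation; the function name number_endnotes is hypothetical, and the
sketch deliberately leaves the asterisk and plus variants untouched.

```python
import re


def number_endnotes(text, start=1):
    """Replace each ~{ note }~ with a numbered marker; return text and notes."""
    notes = []

    def replace(match):
        notes.append(match.group(1))
        return "[^%d]" % (start + len(notes) - 1)

    # (?![*+]) skips the unnumbered asterisk and plus variants.
    body = re.sub(r"~\{(?![*+])\s*(.*?)\s*\}~", replace, text)
    return body, notes


body, notes = number_endnotes(
    "normal text~{ self contained endnote marker & endnote in one }~ continues"
)
print(body)   # normal text[^1] continues
print(notes)  # ['self contained endnote marker & endnote in one']
```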
11.5 LINKS
..........
11.5.1 NAKED URLS WITHIN TEXT, DEALING WITH URLS
................................................
urls found within text are marked up automatically. A url within text is
automatically hyperlinked to itself and by default decorated with angled
braces, unless they are contained within a code block (in which case they are
passed as normal text), or escaped by a preceding underscore (in which case the
decoration is omitted).
*markup example:*
normal text http://www.sisudoc.org/ continues
*resulting output:*
normal text continues
An escaped url without decoration
*markup example:*
normal text _http://www.sisudoc.org/ continues
deb http://www.jus.uio.no/sisu/archive unstable main non-free
*resulting output:*
normal text http://www.sisudoc.org/ continues
deb http://www.jus.uio.no/sisu/archive unstable main non-free
where a code block is used there is neither decoration nor hyperlinking, code
blocks are discussed later in this document
*resulting output:*
deb http://www.jus.uio.no/sisu/archive unstable main non-free
deb-src http://www.jus.uio.no/sisu/archive unstable main non-free
11.5.2 LINKING TEXT
...................
To link text or an image to a url the markup is as follows
*markup example:*
about { SiSU }http://url.org markup
*resulting output:*
about SiSU [link: ] markup
A shortcut notation is available so the url link may also be provided
automatically as a footnote
*markup example:*
about {~^ SiSU }http://url.org markup
*resulting output:*
about SiSU [link: ] [^14] markup
Internal document links to a tagged location, including an ocn
*markup example:*
about { text links }#link_text
*resulting output:*
about text links
Shared document collection link
*markup example:*
about { SiSU book markup examples }:SiSU/examples.html
*resulting output:*
about *SiSU* book markup examples
11.5.3 LINKING IMAGES
.....................
*markup example:*
{ tux.png 64x80 }image
% various url linked images
[image: "a better way"]
[image: "Way Better - with Gnu/Linux, Debian and Ruby"]
{~^ ruby_logo.png "Ruby" }http://www.ruby-lang.org/en/
*resulting output:*
tux.png 64x80 [link: local image]
tux.png 64x80 "Gnu/Linux - a better way" [link: ]
GnuDebianLinuxRubyBetterWay.png 100x101 "Way Better - with Gnu/Linux, Debian
and Ruby" [link: ]
ruby_logo.png 70x90 "Ruby" [link: ] [^15]
*linked url footnote shortcut*
{~^ [text to link] }http://url.org
% maps to: { [text to link] }http://url.org ~{ http://url.org }~
% which produces hyper-linked text within a document/paragraph, with an endnote providing the url for the text location used in the hyperlink
text marker
note at a heading level the same is automatically achieved by providing names
to headings 1, 2 and 3 i.e. 2~[name] and 3~[name] or in the case of
auto-heading numbering, without further intervention.
11.6 GROUPED TEXT
.................
11.6.1 TABLES
.............
Tables may be prepared in either of two forms
*markup example:*
table{ c3; 40; 30; 30;
This is a table
this would become column two of row one
column three of row one is here
And here begins another row
column two of row two
column three of row two, and so on
}table
*resulting output:*
This is a table┆this would become column two of row one┆column three of row one is here
And here begins another row┆column two of row two┆column three of row two, and so on
a second form may be easier to work with in cases where there is not much
information in each column
*markup example:*[^16]
!_ Table 3.1: Contributors to Wikipedia, January 2001 - June 2005
{table~h 24; 12; 12; 12; 12; 12; 12;}
|Jan. 2001|Jan. 2002|Jan. 2003|Jan. 2004|July 2004|June 2006
Contributors* | 10| 472| 2,188| 9,653| 25,011| 48,721
Active contributors** | 9| 212| 846| 3,228| 8,442| 16,945
Very active contributors*** | 0| 31| 190| 692| 1,639| 3,016
No. of English language articles| 25| 16,000| 101,000| 190,000| 320,000| 630,000
No. of articles, all languages | 25| 19,000| 138,000| 490,000| 862,000|1,600,000
\* Contributed at least ten times; \** at least 5 times in last month; \*\** more than 100 times in last month.
*resulting output:*
*Table 3.1: Contributors to Wikipedia, January 2001 - June 2005*
┆Jan. 2001┆Jan. 2002┆Jan. 2003┆Jan. 2004┆July 2004┆June 2006
Contributors*┆10┆472┆2,188┆9,653┆25,011┆48,721
Active contributors**┆9┆212┆846┆3,228┆8,442┆16,945
Very active contributors***┆0┆31┆190┆692┆1,639┆3,016
No. of English language articles┆25┆16,000┆101,000┆190,000┆320,000┆630,000
No. of articles, all languages┆25┆19,000┆138,000┆490,000┆862,000┆1,600,000
* Contributed at least ten times; ** at least 5 times in last month; *** more
than 100 times in last month.
11.6.2 POEM
...........
*basic markup:*
poem{
Your poem here
}poem
Each verse in a poem is given an object number.
*markup example:*
poem{
`Fury said to a
mouse, That he
met in the
house,
"Let us
both go to
law: I will
prosecute
YOU. --Come,
I'll take no
denial; We
must have a
trial: For
really this
morning I've
nothing
to do."
Said the
mouse to the
cur, "Such
a trial,
dear Sir,
With
no jury
or judge,
would be
wasting
our
breath."
"I'll be
judge, I'll
be jury,"
Said
cunning
old Fury:
"I'll
try the
whole
cause,
and
condemn
you
to
death."'
}poem
*resulting output:*
`Fury said to a
mouse, That he
met in the
house,
"Let us
both go to
law: I will
prosecute
YOU. --Come,
I'll take no
denial; We
must have a
trial: For
really this
morning I've
nothing
to do."
Said the
mouse to the
cur, "Such
a trial,
dear Sir,
With
no jury
or judge,
would be
wasting
our
breath."
"I'll be
judge, I'll
be jury,"
Said
cunning
old Fury:
"I'll
try the
whole
cause,
and
condemn
you
to
death."'
11.6.3 GROUP
............
*basic markup:*
group{
Your grouped text here
}group
A group is treated as an object and given a single object number.
*markup example:*
group{
`Fury said to a
mouse, That he
met in the
house,
"Let us
both go to
law: I will
prosecute
YOU. --Come,
I'll take no
denial; We
must have a
trial: For
really this
morning I've
nothing
to do."
Said the
mouse to the
cur, "Such
a trial,
dear Sir,
With
no jury
or judge,
would be
wasting
our
breath."
"I'll be
judge, I'll
be jury,"
Said
cunning
old Fury:
"I'll
try the
whole
cause,
and
condemn
you
to
death."'
}group
*resulting output:*
`Fury said to a
mouse, That he
met in the
house,
"Let us
both go to
law: I will
prosecute
YOU. --Come,
I'll take no
denial; We
must have a
trial: For
really this
morning I've
nothing
to do."
Said the
mouse to the
cur, "Such
a trial,
dear Sir,
With
no jury
or judge,
would be
wasting
our
breath."
"I'll be
judge, I'll
be jury,"
Said
cunning
old Fury:
"I'll
try the
whole
cause,
and
condemn
you
to
death."'
11.6.4 CODE
...........
Code tags code{ ... }code (used as with other group tags described above) are
used to escape regular sisu markup, and have been used extensively within this
document to provide examples of *SiSU* markup. You cannot however use code tags
to escape code tags. They are however used in the same way as group or poem
tags.
A code-block is treated as an object and given a single object number. [an
option to number each line of code may be considered at some later time]
*use of code tags instead of poem compared, resulting output:*
`Fury said to a
mouse, That he
met in the
house,
"Let us
both go to
law: I will
prosecute
YOU. --Come,
I'll take no
denial; We
must have a
trial: For
really this
morning I've
nothing
to do."
Said the
mouse to the
cur, "Such
a trial,
dear Sir,
With
no jury
or judge,
would be
wasting
our
breath."
"I'll be
judge, I'll
be jury,"
Said
cunning
old Fury:
"I'll
try the
whole
cause,
and
condemn
you
to
death."'
From *SiSU* 2.7.7 on you can number codeblocks by placing a hash after the
opening code tag code{# as demonstrated here:
1 ┆ `Fury said to a
2 ┆ mouse, That he
3 ┆ met in the
4 ┆ house,
5 ┆ "Let us
6 ┆ both go to
7 ┆ law: I will
8 ┆ prosecute
9 ┆ YOU. --Come,
10 ┆ I'll take no
11 ┆ denial; We
12 ┆ must have a
13 ┆ trial: For
14 ┆ really this
15 ┆ morning I've
16 ┆ nothing
17 ┆ to do."
18 ┆ Said the
19 ┆ mouse to the
20 ┆ cur, "Such
21 ┆ a trial,
22 ┆ dear Sir,
23 ┆ With
24 ┆ no jury
25 ┆ or judge,
26 ┆ would be
27 ┆ wasting
28 ┆ our
29 ┆ breath."
30 ┆ "I'll be
31 ┆ judge, I'll
32 ┆ be jury,"
33 ┆ Said
34 ┆ cunning
35 ┆ old Fury:
36 ┆ "I'll
37 ┆ try the
38 ┆ whole
39 ┆ cause,
40 ┆ and
41 ┆ condemn
42 ┆ you
43 ┆ to
44 ┆ death."'
11.7 BOOK INDEX
...............
To make an index, append to the paragraph the book index term that relates to
it, using an equal sign and curly braces.
Currently two levels are provided, a main term and if needed a sub-term.
Sub-terms are separated from the main term by a colon.
Paragraph containing main term and sub-term.
={Main term:sub-term}
The index syntax starts on a new line, but there should not be an empty line
between paragraph and index markup.
The structure of the resulting index would be:
Main term, 1
sub-term, 1
Several terms may relate to a paragraph, they are separated by a semicolon. If
the term refers to more than one paragraph, indicate the number of paragraphs.
Paragraph containing main term, second term and sub-term.
={first term; second term: sub-term}
The structure of the resulting index would be:
First term, 1,
Second term, 1,
sub-term, 1
If multiple sub-terms appear under one paragraph, they are separated under the
main term heading from each other by a pipe symbol.
Paragraph containing main term, second term and sub-term.
={Main term:sub-term+1|second sub-term}
A paragraph that continues discussion of the first sub-term
The plus one in the example provided indicates the first sub-term spans one
additional paragraph. The logical structure of the resulting index would be:
Main term, 1,
sub-term, 1-2,
second sub-term, 1,
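The index syntax described in this section (terms separated by semicolons, a
sub-term after a colon, sub-terms separated by a pipe, and +n for additional
paragraphs) can be sketched as a small Python parser. This is an illustration
under stated assumptions, not the *SiSU* implementation: the function name
parse_index is hypothetical, and +n is taken to extend the range by n
paragraphs, as in the example given in the markup syntax history where
Debian+2 on paragraph 1 yields 1-3.

```python
import re


def parse_index(markup, ocn):
    """Parse '={...}' attached to paragraph number ocn.

    Returns {main term: (range, {sub-term: range})}.
    """
    m = re.fullmatch(r"=\{(.+?)\}", markup.strip())
    if not m:
        raise ValueError("not book index markup: %r" % markup)

    def split_span(term):
        # 'Debian+2' -> ('Debian', '1-3') when ocn is 1; no '+n' -> one paragraph
        t = re.fullmatch(r"(.*?)(?:\+(\d+))?", term.strip())
        name, extra = t.group(1).strip(), int(t.group(2) or 0)
        span = str(ocn) if extra == 0 else "%d-%d" % (ocn, ocn + extra)
        return name, span

    index = {}
    for entry in m.group(1).split(";"):
        main, _, subs = entry.partition(":")
        name, span = split_span(main)
        sub_index = dict(split_span(s) for s in subs.split("|") if s.strip())
        index[name] = (span, sub_index)
    return index


example = ("={GNU/Linux community distribution:Debian+2|Fedora|Gentoo;"
           "Free Software Foundation+5}")
for term, (span, subs) in parse_index(example, 1).items():
    print(term, span, subs)
```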
----------------------------------------
12. COMPOSITE DOCUMENTS MARKUP
------------------------------
It is possible to build a document by creating a master document that requires
other documents. The documents required may be complete documents that could be
generated independently, or they could be markup snippets, prepared so as to be
easily available to be placed within another text. If the calling document is a
master document (built from other documents), it should be named with the
suffix *.ssm*. Within this document you would provide information on the other
documents that should be included within the text. These may be other documents
that would be processed in a regular way (*.sst* regular markup files), or
markup bits prepared only for inclusion within a master document (*.ssi*
insert/information files). A secondary file of the composite document is built
prior to processing with the same prefix and the suffix *._sst*
basic markup for importing a document into a master document
<< filename1.sst
<< filename2.ssi
The form described above should be relied on. Within the Vim editor it results
in the text thus linked becoming hyperlinked to the document it is calling in,
which is convenient for editing. Alternative forms of markup for the
importation of documents, which have been under consideration and occasionally
supported, are:
<< filename.ssi
<<{filename.ssi}
% using textlink alternatives
<< |filename.ssi|@|^|
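To see which files a master document requires, the basic import form
(<< followed by a filename) can be picked out with a short Python sketch. This
is for illustration only and is not how *SiSU* itself loads documents; the
function name included_files is hypothetical, and only the basic form with an
.sst or .ssi suffix is recognised, not the alternative notations.

```python
import re

# Basic import form only: '<< filename.sst' or '<< filename.ssi'
INCLUDE = re.compile(r"^<<\s+(\S+\.ss[ti])\s*$")


def included_files(master_text):
    """List the files named by '<< filename' lines in a master document."""
    files = []
    for line in master_text.splitlines():
        m = INCLUDE.match(line)
        if m:
            files.append(m.group(1))
    return files


master = ":A~ A Composite Document\n\n<< filename1.sst\n\n<< filename2.ssi\n"
print(included_files(master))  # ['filename1.sst', 'filename2.ssi']
```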
----------------------------------------
MARKUP SYNTAX HISTORY
=====================
----------------------------------------
13. NOTES RELATED TO FILES-TYPES AND MARKUP SYNTAX
--------------------------------------------------
2.0 introduced new headers and is therefore incompatible with 1.0 though
otherwise the same with the addition of a couple of tags (i.e. a superset)
0.38 is substantially current for version 1.0
deprecated 0.16 supported, though file names were changed at 0.37
* sisu --query=[sisu version [0.38] or 'history']
provides a short history of changes to *SiSU* markup
*SiSU 2.0* (2010-03-06:09/6) same as 1.0, apart from the changing of headers
and the addition of a monospace tag; related headers are now grouped, e.g.
@title:
:subtitle:
@creator:
:author:
:translator:
:illustrator:
@rights:
:text:
:illustrations:
see document markup samples, and sisu --help headers
the monospace tag takes the form of a hash '#'
#{ this enclosed text would be monospaced }#
*1.0* (2009-12-19:50/6) same as 0.69
*0.69* (2008-09-16:37/2) (same as 1.0) and as previous (0.57) with the addition
of book index tags
/^={.+?}$/
e.g. appended to a paragraph, on a new-line (without a blank line in between)
logical structure produced assuming this is the first text "object"
={GNU/Linux community distribution:Debian+2|Fedora|Gentoo;Free Software Foundation+5}
Free Software Foundation, 1-6
GNU/Linux community distribution, 1
Debian, 1-3
Fedora, 1
Gentoo, 1
*0.66* (2008-02-24:07/7) same as previous, adds semantic tags, [experimental
and not-used]
/[:;]{.+?}[:;][a-z+]/
*0.57* (2007w34/4) *SiSU* 0.57 is the same as 0.42 with the introduction of
a shortcut to use the headers @title and @creator in the first heading
[expanded using the contents of the headers @title: and @author:]
:A~ @title by @author
*0.52* (2007w14/6) declared document type identifier at start of text/document:
*SiSU* 0.52
or, backward compatible using the comment marker:
% *SiSU* 0.38
variations include '*SiSU* (text|master|insert) [version]' and 'sisu-[version]'
*0.51* (2007w13/6) skins changed (simplified), markup unchanged
*0.42* (2006w27/4) * (asterisk) type endnotes, used e.g. in relation to author
*SiSU* 0.42 is the same as 0.38 with the introduction of some additional
endnote types,
Introduces some variations on endnotes, in particular the use of the asterisk
~{* for example for describing an author }~ and ~{** for describing a second author }~
* for example for describing an author
** for describing a second author
and
~[* my note ]~ or ~[+ another note ]~
which numerically increments an asterisk and plus respectively
*1 my note +1 another note
*0.38* (2006w15/7) introduced new/alternative notation for headers, e.g.
@title: (instead of 0~title), and accompanying document structure markup,
:A,:B,:C,1,2,3 (maps to previous 1,2,3,4,5,6)
*SiSU* 0.38 introduced alternative experimental header and heading/structure
markers,
@headername: and headers :A~ :B~ :C~ 1~ 2~ 3~
as the equivalent of:
0~headername and headers 1~ 2~ 3~ 4~ 5~ 6~
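The equivalence stated above can be sketched as a line-by-line conversion in
Python. This is an illustration only and not a tool shipped with *SiSU*; the
names OLD_TO_NEW and convert_line are hypothetical, the sketch assumes that a
0.16 header is written as 0~headername followed by its text, and it should only
be applied to text that is still in 0.16 notation.

```python
import re

# 0.16 heading levels 1-6 map to 0.38 levels :A, :B, :C, 1, 2, 3
OLD_TO_NEW = {"1": ":A", "2": ":B", "3": ":C", "4": "1", "5": "2", "6": "3"}


def convert_line(line):
    """Convert one line of 0.16 header/heading markup to 0.38 notation."""
    m = re.match(r"0~(\w+)\s*(.*)$", line)
    if m:  # header: 0~title My Title -> @title: My Title
        return ("@%s: %s" % (m.group(1), m.group(2))).rstrip()
    m = re.match(r"([1-6])~(.*)$", line)
    if m:  # heading: 4~ Chapter One -> 1~ Chapter One
        return OLD_TO_NEW[m.group(1)] + "~" + m.group(2)
    return line  # ordinary text is left unchanged


print(convert_line("0~title My Title"))  # @title: My Title
print(convert_line("1~ My Title"))       # :A~ My Title
print(convert_line("4~ Chapter One"))    # 1~ Chapter One
```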
The internal document markup of *SiSU* 0.16 remains valid and standard, though
note that *SiSU* 0.37 introduced a new file naming convention
*SiSU* has in effect two sets of levels to be considered. Using 0.38 notation
these are: A-C headings/levels, which precede ordinary paragraphs/substantive
text; and 1-3 headings/levels, which are followed by ordinary text. This may be
conceptualised as levels A,B,C, 1,2,3, and using such letter number notation,
in effect: A must exist, optional B and C may follow in sequence (not strict); 1
must exist, optional 2 and 3 may follow in sequence. That is, there are two
independent heading level sequences A,B,C and 1,2,3 (using the 0.16 standard
notation 1,2,3 and 4,5,6). On the positive side, the 0.38 A,B,C,1,2,3
alternative makes explicit an aspect of structuring documents in *SiSU* that is
not otherwise obvious to the newcomer (though it appears more complicated, it is
more in your face and likely to be understood fairly quickly); the substantive
text follows levels 1,2,3 and it is 'nice' to do most work in those levels.
*0.37* (2006w09/7) introduced new file naming convention, .sst (text), .ssm
(master), .ssi (insert), markup syntax unchanged
*SiSU* 0.37 introduced new file naming convention, using the file extensions
.sst .ssm and .ssi to replace .s1 .s2 .s3 .r1 .r2 .r3 and .si
this is captured by the following file 'rename' instruction:
rename 's/\.s[123]$/\.sst/' *.s{1,2,3}
rename 's/\.r[123]$/\.ssm/' *.r{1,2,3}
rename 's/\.si$/\.ssi/' *.si
The internal document markup remains unchanged, from *SiSU* 0.16
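The same renaming rules can be expressed in Python for systems where the
'rename' utility is not available. This is a minimal sketch for illustration;
the names RULES and new_name are hypothetical, and the function only computes
the new filename without touching any files.

```python
import re

# Same substitutions as the three 'rename' instructions above.
RULES = [
    (r"\.s[123]$", ".sst"),
    (r"\.r[123]$", ".ssm"),
    (r"\.si$", ".ssi"),
]


def new_name(filename):
    """Return the 0.37 style filename for a pre-0.37 filename."""
    for pattern, suffix in RULES:
        if re.search(pattern, filename):
            return re.sub(pattern, suffix, filename)
    return filename  # already uses the new convention, or not a sisu file


print(new_name("document.s1"))  # document.sst
print(new_name("book.r3"))      # book.ssm
print(new_name("part.si"))      # part.ssi
```

To rename files on disk, the result could be passed to os.rename together with
the original name.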
*0.35* (2005w52/3) sisupod, zipped content file introduced
*0.23* (2005w36/2) utf-8 for markup file
*0.22* (2005w35/3) image dimensions may be omitted if rmagick is available to
be relied upon
*0.20.4* (2005w33/4) header 0~links
*0.16* (2005w25/2) substantial changes introduced to make markup cleaner,
header 0~title type, and headings [1-6]~ introduced, also percentage sign (%)
at start of a text line as comment marker
*SiSU* 0.16 (0.15 development branch) introduced the use of
the header 0~ and headings/structure 1~ 2~ 3~ 4~ 5~ 6~
in place of the 0.1 header, heading/structure notation
*SiSU* 0.1 headers and headings structure represented by header 0{~ and
headings/structure 1{ 2{ 3{ 4{~ 5{ 6{
----------------------------------------
14. SISU FILETYPES
------------------
*SiSU* has plaintext and binary filetypes, and can process either type of
document.
14.1 .SST .SSM .SSI MARKED UP PLAIN TEXT
........................................
*SiSU* documents are prepared as plain-text (utf-8) files with *SiSU* markup.
They may make reference to and contain images (for example), which are stored
in the directory beneath them _sisu/image. *SiSU* plaintext markup files are of
three types that may be distinguished by the file extension used: regular text
.sst; master documents .ssm, composite documents that incorporate other text,
which can be any regular text or text insert; and inserts, the contents of
which are like regular text except these are marked .ssi and are not processed.
*SiSU* processing can be done directly against sisu documents, which may be
located locally or on a remote server for which a url is provided.
*SiSU* source markup can be shared with the command:
sisu -s [filename]
14.1.1 SISU TEXT - REGULAR FILES (.SST)
.......................................
The most common form of document in *SiSU*, see the section on *SiSU* markup.
14.1.2 SISU MASTER FILES (.SSM)
...............................
Composite documents which incorporate other *SiSU* documents which may be
either regular *SiSU* text .sst which may be generated independently, or
inserts prepared solely for the purpose of being incorporated into one or more
master documents.
The mechanism by which master files incorporate other documents is described as
one of the headings under *SiSU* markup in the *SiSU* manual.
Note: Master documents may be prepared in a similar way to regular documents,
and processing will occur normally if a .sst file is renamed .ssm without
requiring any other documents; the .ssm marker flags that the document may
contain other documents.
Note: a secondary file of the composite document is built prior to processing
with the same prefix and the suffix ._sst [^17]
14.1.3 SISU INSERT FILES (.SSI)
...............................
Inserts are documents prepared solely for the purpose of being incorporated
into one or more master documents. They resemble regular *SiSU* text files,
except that the *SiSU* processor ignores them when they are not part of a
master document. Making a file a .ssi file is a quick and convenient way of
flagging that the file is not intended to be processed on its own.
14.2 SISUPOD, ZIPPED BINARY CONTAINER (SISUPOD.ZIP, .SSP)
.........................................................
A sisupod is a zipped *SiSU* text file or set of *SiSU* text files, together
with any associated images that they contain (this will be extended to include
sound and multimedia files).
*SiSU* plaintext files rely on a recognised directory structure to find
associated content such as images: all images for all documents contained in a
directory, for example, are located in the sub-directory _sisu/image. Without
the ability to create a sisupod it can be inconvenient to manually identify all
the other files associated with a document. A sisupod automatically bundles all
associated files with the document that is turned into a pod.
The structure of the sisupod is such that it may for example contain a single
document and its associated images; a master document and its associated
documents and anything else; or the zipped contents of a whole directory of
prepared *SiSU* documents.
The command to create a sisupod is:
sisu -S [filename]
Alternatively, make a pod of the contents of a whole directory:
sisu -S
*SiSU* processing can be done directly against a sisupod, which may be located
locally or on a remote server for which a url is provided.
----------------------------------------
15. EXPERIMENTAL ALTERNATIVE INPUT REPRESENTATIONS
--------------------------------------------------
15.1 ALTERNATIVE XML
....................
*SiSU* offers alternative XML input representations of documents as an
experimental, proof of concept feature. They are, however, incomplete and not
strictly maintained, and should be handled with care.
*convert from sst to simple xml representations (sax, dom and node):*
sisu --to-sax [filename/wildcard] or sisu --to-sxs [filename/wildcard]
sisu --to-dom [filename/wildcard] or sisu --to-sxd [filename/wildcard]
sisu --to-node [filename/wildcard] or sisu --to-sxn [filename/wildcard]
*convert to sst from any sisu xml representation (sax, dom and node):*
sisu --from-xml2sst [filename/wildcard [.sxs.xml,.sxd.xml,.sxn.xml]]
or the same:
sisu --from-sxml [filename/wildcard [.sxs.xml,.sxd.xml,.sxn.xml]]
15.1.1 XML SAX REPRESENTATION
.............................
To convert from sst to simple xml (sax) representation:
sisu --to-sax [filename/wildcard] or sisu --to-sxs [filename/wildcard]
To convert from any sisu xml representation back to sst:
sisu --from-xml2sst [filename/wildcard [.sxs.xml,.sxd.xml,.sxn.xml]]
or the same:
sisu --from-sxml [filename/wildcard [.sxs.xml,.sxd.xml,.sxn.xml]]
15.1.2 XML DOM REPRESENTATION
.............................
To convert from sst to simple xml (dom) representation:
sisu --to-dom [filename/wildcard] or sisu --to-sxd [filename/wildcard]
To convert from any sisu xml representation back to sst:
sisu --from-xml2sst [filename/wildcard [.sxs.xml,.sxd.xml,.sxn.xml]]
or the same:
sisu --from-sxml [filename/wildcard [.sxs.xml,.sxd.xml,.sxn.xml]]
15.1.3 XML NODE REPRESENTATION
..............................
To convert from sst to simple xml (node) representation:
sisu --to-node [filename/wildcard] or sisu --to-sxn [filename/wildcard]
To convert from any sisu xml representation back to sst:
sisu --from-xml2sst [filename/wildcard [.sxs.xml,.sxd.xml,.sxn.xml]]
or the same:
sisu --from-sxml [filename/wildcard [.sxs.xml,.sxd.xml,.sxn.xml]]
----------------------------------------
16. CONFIGURATION
-----------------
16.1 DETERMINING THE CURRENT CONFIGURATION
..........................................
Information on the current configuration of *SiSU* should be available with the
help command:
sisu -v
which is an alias for:
sisu --help env
Either of these should be executed from within a directory that contains sisu
markup source documents.
16.2 CONFIGURATION FILES (SISURC.YML)
.....................................
*SiSU* configuration parameters are adjusted in the configuration file, which
can be used to override the defaults set. This includes such things as the
directory in which interim processing should be done and where the generated
output should be placed.
The *SiSU* configuration file is a yaml file, which means indentation is
significant.
*SiSU* resource configuration is determined by looking at the following files
if they exist:
./_sisu/sisurc.yml
~/.sisu/sisurc.yml
/etc/sisu/sisurc.yml
The search is in the order listed, and the first one found is used.
In the absence of instructions in any of these it falls back to the internal
program defaults.
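The lookup order above can be sketched in Ruby as follows. This is a minimal
illustration under stated assumptions, not *SiSU*'s own code: the names
SISURC_CANDIDATES, find_sisurc and load_sisurc are hypothetical, and an empty
hash merely stands in for the internal program defaults.

```ruby
require 'yaml'

# Candidate locations, in the search order listed above.
SISURC_CANDIDATES = [
  './_sisu/sisurc.yml',
  '~/.sisu/sisurc.yml',
  '/etc/sisu/sisurc.yml'
].freeze

# Return the expanded path of the first candidate that exists, or nil.
def find_sisurc(candidates = SISURC_CANDIDATES)
  candidates.map { |path| File.expand_path(path) }
            .find { |path| File.file?(path) }
end

# Load the first file found; an empty hash stands in for the internal
# program defaults (an assumption made for this sketch only).
def load_sisurc(candidates = SISURC_CANDIDATES)
  path = find_sisurc(candidates)
  path ? (YAML.load_file(path) || {}) : {}
end

path = find_sisurc
puts(path ? "using #{path}" : 'no sisurc.yml found, using program defaults')
```

Only the first file found is read; settings from the later locations are not
merged in, which matches the "first one found is used" rule stated above.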
Configuration determines the output and processing directories and the database
access details.
If *SiSU* is installed, a sample sisurc.yml may be found in
/etc/sisu/sisurc.yml.
----------------------------------------
17. SKINS
---------
Skins modify the default appearance of document output on a document,
directory, or site wide basis. Skins are looked for in the following locations:
./_sisu/skin
~/.sisu/skin
/etc/sisu/skin
*Within the skin directory* are the following default sub-directories for
document skins:
./skin/doc
./skin/dir
./skin/site
A skin is placed in the appropriate directory, in a file named skin_[name].rb
The skin itself is a ruby file which modifies the default appearances set in
the program.
17.1 DOCUMENT SKIN
..................
Documents take on a document skin, if the header of the document specifies a
skin to be used.
@skin: skin_united_nations
17.2 DIRECTORY SKIN
...................
A directory may be mapped on to a particular skin, so that all documents within
that directory take on a particular appearance. If a skin exists in skin/dir
with the same name as the document directory, it will automatically be used for
each of the documents in that directory (except where a document specifies the
use of another skin, in the skin/doc directory).
A personal habit is to place all skins within the doc directory, and to make
symbolic links to them from the site or dir directories as required.
17.3 SITE SKIN
..............
A site skin modifies the program default skin.
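The precedence between document and directory skins described in this section
can be sketched in Ruby. This is an illustration only and *SiSU*'s own
selection logic may differ: the function select_skin and its parameters are
hypothetical, and the sketch assumes a directory skin is named
skin_[directory name].rb.

```ruby
# Hypothetical sketch of skin selection; SiSU's own logic may differ.
# skin_root: a skin directory such as ./_sisu/skin
# doc_skin:  skin named in the document header (@skin:), or nil
# dir_name:  name of the directory that holds the document, or nil
def select_skin(skin_root, doc_skin: nil, dir_name: nil)
  candidates = []
  # A skin named in the document header takes precedence.
  candidates << File.join(skin_root, 'doc', "#{doc_skin}.rb") if doc_skin
  # Otherwise a skin named after the document directory is used.
  candidates << File.join(skin_root, 'dir', "skin_#{dir_name}.rb") if dir_name
  # nil means neither applies, leaving the site skin or the program
  # defaults in effect.
  candidates.find { |path| File.file?(path) }
end

skin = select_skin('./_sisu/skin',
                   doc_skin: 'skin_united_nations', dir_name: 'poems')
puts(skin || 'no document or directory skin found')
```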
17.4 SAMPLE SKINS
.................
With *SiSU* installed sample skins may be found in:
/etc/sisu/skin/doc and
/usr/share/doc/sisu/markup-samples/samples/_sisu/skin/doc
(or equivalent directory) and if sisu-markup-samples is installed also under:
/usr/share/doc/sisu/markup-samples-non-free/samples/_sisu/skin/doc
Samples of list.yml and promo.yml (which are used to create the right column
list) may be found in:
/usr/share/doc/sisu/markup-samples-non-free/samples/_sisu/skin/yml (or
equivalent directory)
----------------------------------------
18. CSS - CASCADING STYLE SHEETS (FOR HTML, XHTML AND XML)
----------------------------------------------------------
CSS files to modify the appearance of *SiSU* html, XHTML or XML may be placed
in the configuration directory: ./_sisu/css ; ~/.sisu/css or; /etc/sisu/css and
these will be copied to the output directories with the command sisu -CC.
The basic CSS file for html output is html.css; placing a file of that name in
the directory _sisu/css or equivalent will result in the default file of that
name being overwritten.
HTML: html.css
XML DOM: dom.css
XML SAX: sax.css
XHTML: xhtml.css
The default homepage may use homepage.css or html.css
Under consideration is to permit the placement of a CSS file with a different
name in the directory _sisu/css or equivalent, and to change the default CSS
file that is looked for in a skin.[^18]
----------------------------------------
19. ORGANISING CONTENT
----------------------
19.1 DIRECTORY STRUCTURE AND MAPPING
....................................
The output directory root can be set in the sisurc.yml file. Under the root,
subdirectories are made for each directory in which a document set resides. If
you have a directory named poems or conventions, a directory of that name will
be created under the output directory root, and the output for all documents
contained in that directory will be generated to subdirectories beneath it
(poems or conventions). Each document will be placed in a subdirectory of the
same name as the document, with the filetype identifier (.sst .ssm) stripped.
The last part of a directory path, representing the sub-directory in which a
document set resides, is the directory name that will be used for the output
directory. This has implications for the organisation of document collections,
as it could make sense to place documents of a particular subject or type
within a directory identifying them. This grouping could be by subject
(sales_law, english_literature), or just as conveniently by some other
classification (X University). The mapping also means that documents which are
kept separately for organisational purposes can be placed in the same output
directory. For example, documents on a given subject from two different
institutions may be kept in two different directories of the same name, each
under a directory named after its institution, and these would be output to the
same output directory. Skins could be associated with each institution on a
directory basis, and the resulting documents will take on the appropriate
different appearance.
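The mapping described above can be sketched in Ruby. This is an illustration
of the rule only, not *SiSU*'s implementation: the function output_dir, the
output root /var/www and the source paths are all hypothetical.

```ruby
# Hypothetical sketch of the source-to-output mapping; all paths are examples.
def output_dir(output_root, source_path)
  source = File.expand_path(source_path)
  # Only the last directory in the source path names the output directory.
  collection = File.basename(File.dirname(source))
  # The document sub-directory is the filename with .sst or .ssm stripped.
  document = File.basename(source).sub(/\.ss[tm]\z/, '')
  File.join(output_root, collection, document)
end

puts output_dir('/var/www', '/docs/x_university/sales_law/contracts.sst')
puts output_dir('/var/www', '/docs/y_university/sales_law/carriage.ssm')
```

Both example documents land under /var/www/sales_law, because the institution
directories above sales_law play no part in the mapping.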
19.1.1 GENERAL DIRECTORIES
..........................
./subject_name/
% files stored at this level e.g. sisu_manual.sst
./subject_name/_sisu
% configuration file e.g. sisurc.yml
./subject_name/_sisu/skin
% skins in various skin directories doc, dir, site, yml
./subject_name/_sisu/css
./subject_name/_sisu/image
% images for documents contained in this directory
./subject_name/_sisu/mm
19.1.2 REMOTE DIRECTORIES
.........................
./subject_name/
% containing sub_directories named after the generated files from which they are made
./subject_name/src
% contains shared source files text and binary e.g. sisu_manual.sst and sisu_manual.sst.zip
./subject_name/_sisu
% configuration file e.g. sisurc.yml
./subject_name/_sisu/skin
% skins in various skin directories doc, dir, site, yml
./subject_name/_sisu/css
./subject_name/_sisu/image
% images for documents contained in this directory
./subject_name/_sisu/mm
19.1.3 SISUPOD
..............
./sisupod/
% files stored at this level e.g. sisu_manual.sst
./sisupod/_sisu
% configuration file e.g. sisurc.yml
./sisupod/_sisu/skin
% skins in various skin directories doc, dir, site, yml
./sisupod/_sisu/css
./sisupod/_sisu/image
% images for documents contained in this directory
./sisupod/_sisu/mm
19.2 ORGANISING CONTENT
.......................
----------------------------------------
20. HOMEPAGES
-------------
*SiSU* is about the ability to auto-generate documents. Home pages are regarded
as custom built items, and are not created by *SiSU*. More accurately, *SiSU*
has a default home page, which will not be appropriate for use with other
sites, and the means to provide your own home page instead in one of two ways
as part of a site's configuration, these being:
1. through placing your home page and other custom built documents in the
subdirectory _sisu/home/ (this probably being the easier and more convenient
option)
2. through providing what you want as the home page in a skin.
Document sets are contained in directories, usually organised by site or
subject. Each directory can/should have its own homepage. See the section on
directory structure and organisation of content.
20.1 HOME PAGE AND OTHER CUSTOM BUILT PAGES IN A SUB-DIRECTORY
..............................................................
Custom built pages, including the home page index.html, may be placed within
the configuration directory _sisu/home/ in any of the locations that are
searched for the configuration directory, namely ./_sisu ; ~/.sisu ; /etc/sisu.
From there they are copied to the root of the output directory with the command:
sisu -CC
20.2 HOME PAGE WITHIN A SKIN
............................
Skins are described in a separate section, but basically are files written in
the programming language *Ruby* that may be provided to change the defaults
that are provided with sisu with respect to individual documents, a directory's
contents or a site.
If you wish to provide a homepage within a skin, the skin should be in the
directory _sisu/skin/dir and have the name of the directory for which it is to
become the home page. Documents in the directory commercial_law would have the
homepage modified in skin_commercial_law.rb ; or the directory poems in
skin_poems.rb
class Home
def homepage
# place the html content of your homepage here, this will become index.html
<
this is my new homepage.