WikiHTML
WikiHTML is a Java library and executable standalone application that converts a document in wiki text format to an HTML document.
Download
- executable JAR file
- The Central Repository
- as dependency:
<dependency>
  <groupId>de.tu-dresden.inf.lat.wikihtml</groupId>
  <artifactId>wikihtml</artifactId>
  <version>0.1.0</version>
</dependency>
Use
It can be used as a Java library or from the command line. For example, use:
java -jar wikihtml-0.1.0.jar inputfile.text outputfile.html
to create a new HTML file from the command line, and use
java -jar wikihtml-0.1.0.jar inputoutputfile.html
to just update an HTML file with embedded wiki text.
Description
Wiki markup, also called wikitext or wikicode, is a markup language for wiki-based pages. It is a simplified, human-friendly substitute for HTML. This library reads text written in this markup language and produces an HTML document. There are several “dialects” of wiki markup; this library implements a subset of the language used by the MediaWiki software.
The application generates the HTML document with the original wiki markup source code inside. Technically, the source code is placed between the markers <!--begin_wiki_text and end_wiki_text-->. This makes it possible to update an HTML file using the source embedded in the same file.
This could be useful, for example, when maintaining documentation of a project. The files can be easily edited using a text editor, but after processing them with this library, they can be viewed with a browser.
When using only the wiki formatting, the produced document is an XHTML 1.1 document.
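The embedding convention described above can be illustrated with a minimal sketch that locates the wiki source inside an HTML file. The class and method names (EmbeddedWikiText, extract) are hypothetical and are not part of the WikiHTML API; only the two markers are taken from the description.

```java
import java.util.Optional;

/** Hypothetical helper for illustration; not part of the WikiHTML API. */
final class EmbeddedWikiText {

    // Markers taken from the description above.
    static final String BEGIN = "<!--begin_wiki_text";
    static final String END = "end_wiki_text-->";

    /** Returns the text between the two markers, or empty if a marker is missing. */
    static Optional<String> extract(String html) {
        int start = html.indexOf(BEGIN);
        if (start < 0) {
            return Optional.empty();
        }
        int from = start + BEGIN.length();
        int end = html.indexOf(END, from);
        if (end < 0) {
            return Optional.empty();
        }
        // Surrounding whitespace is dropped; this is a choice of the sketch.
        return Optional.of(html.substring(from, end).trim());
    }

    public static void main(String[] args) {
        String html = "<!--begin_wiki_text\n== Title ==\nend_wiki_text-->\n<html></html>";
        System.out.println(extract(html).orElse("(no embedded wiki text)"));
    }
}
```

A tool following this idea would re-render the extracted text and write it back together with the same comment block, which is presumably what the single-argument command-line form does.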
Sections
Sections are marked at the beginning of a line. The heading should be between a sequence of equals signs (=). Using more equals signs makes the heading smaller. For example:
| wiki markup | HTML |
|---|---|
| = heading 1 = | <h1>heading 1</h1> |
| == heading 2 == | <h2>heading 2</h2> |
| === heading 3 === | <h3>heading 3</h3> |
| ==== heading 4 ==== | <h4>heading 4</h4> |
| ===== heading 5 ===== | <h5>heading 5</h5> |
| ====== heading 6 ====== | <h6>heading 6</h6> |
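The mapping in the table can be sketched with a single regular expression that counts the equals signs. This is only an illustration under assumptions: HeadingSketch is a hypothetical name, not the library's HeadingRenderer, and the sketch trims the spaces around the heading text, which the library may handle differently.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch; not the library's own heading renderer. */
final class HeadingSketch {

    // One to six '=' at the start and the same run at the end (back-reference \1).
    private static final Pattern HEADING =
            Pattern.compile("^(={1,6})\\s*(.*?)\\s*\\1\\s*$");

    /** Converts a heading line to an HTML heading; other lines are returned unchanged. */
    static String render(String line) {
        Matcher m = HEADING.matcher(line);
        if (!m.matches()) {
            return line;
        }
        int level = m.group(1).length();
        return "<h" + level + ">" + m.group(2) + "</h" + level + ">";
    }

    public static void main(String[] args) {
        System.out.println(render("== heading 2 ==")); // <h2>heading 2</h2>
    }
}
```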
Line breaks
A line break is marked with a blank line, that is, two consecutive newlines. For example,
Two lines
together
are not considered different lines.
is rendered
Two lines together are not considered different lines.
but:
One line.

Another line.
is rendered
One line.
Another line.
Indented text
Text can be indented using colons (:) at the beginning of the line. For example:
: item 1
: item 2
:: item 2.1
:: item 2.2
::: item 2.2.1
: item 3
produces:
    item 1
    item 2
        item 2.1
        item 2.2
            item 2.2.1
    item 3
Unordered lists
Items in a list are marked with asterisks (*) at the beginning of the line. A subitem is marked with more asterisks. For example:
* item 1
* item 2
** item 2.1
** item 2.2
*** item 2.2.1
* item 3
is rendered as
- item 1
- item 2
  - item 2.1
  - item 2.2
    - item 2.2.1
- item 3
Ordered lists
Numbered items are marked with hash signs (#) at the beginning of the line. A subitem is marked with more hash signs. For example:
# item 1
# item 2
## item 2.1
## item 2.2
### item 2.2.1
# item 3
is rendered as
1. item 1
2. item 2
   1. item 2.1
   2. item 2.2
      1. item 2.2.1
3. item 3
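The way the number of markers determines nesting, for both unordered and ordered lists, can be sketched as follows. ListSketch and its parameters are hypothetical names, not the library's API, and the sketch assumes that every input line is a list item and that the depth grows by one marker at a time.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of nested list rendering; not the library's own code. */
final class ListSketch {

    /** Renders list lines using the given marker ('*' or '#') and tag ("ul" or "ol"). */
    static String render(List<String> lines, char marker, String tag) {
        StringBuilder sb = new StringBuilder();
        int depth = 0;
        for (String line : lines) {
            // The number of leading markers is the nesting level of the item.
            int level = 0;
            while (level < line.length() && line.charAt(level) == marker) {
                level++;
            }
            String text = line.substring(level).trim();
            if (level > depth) {
                // Deeper item: open a sublist inside the still-open parent item.
                for (; depth < level; depth++) {
                    sb.append('<').append(tag).append('>');
                }
            } else {
                // Same or shallower level: close the previous item and any sublists.
                sb.append("</li>");
                for (; depth > level; depth--) {
                    sb.append("</").append(tag).append("></li>");
                }
            }
            sb.append("<li>").append(text);
        }
        for (; depth > 0; depth--) {
            sb.append("</li></").append(tag).append('>');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(render(Arrays.asList("* a", "** b", "* c"), '*', "ul"));
    }
}
```

Keeping each parent item open until its sublist is closed is what makes the output well-formed, since a sublist has to sit inside a list item.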
Text formatting
The text can be formatted using apostrophes (') according to the following table:
| wiki markup | HTML |
|---|---|
| ''italics'' | italics |
| '''bold''' | bold |
| '''''bold italics''''' | bold italics |
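A minimal sketch of this substitution replaces the longest apostrophe runs first, so that five apostrophes are not mistaken for three plus two. FormattingSketch is a hypothetical name, and the nesting chosen for bold italics (<b><i>…</i></b>) is an assumption rather than the library's documented output.

```java
/** Hypothetical sketch of apostrophe-based formatting; not the library's own code. */
final class FormattingSketch {

    /** Replaces '''''x''''', '''x''' and ''x'' with HTML tags, longest marker first. */
    static String render(String text) {
        return text
                .replaceAll("'''''(.+?)'''''", "<b><i>$1</i></b>") // assumed nesting
                .replaceAll("'''(.+?)'''", "<b>$1</b>")
                .replaceAll("''(.+?)''", "<i>$1</i>");
    }

    public static void main(String[] args) {
        System.out.println(render("''x''III''y''")); // <i>x</i>III<i>y</i>
    }
}
```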
Links
Links can be marked with square brackets ([ ]). For example:
[https://www.wikipedia.org Wikipedia] renders Wikipedia.
If the brackets are omitted, the URI is shown directly. For example: https://www.wikipedia.org renders https://www.wikipedia.org .
The double square brackets ([[ ]]) are rendered as local links.
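The bracketed form of an external link can be sketched with one regular expression. LinkSketch is a hypothetical name; the sketch only covers [URI label] with an http or https URI and deliberately leaves bare URIs and double square brackets untouched.

```java
/** Hypothetical sketch of external link rendering; not the library's own code. */
final class LinkSketch {

    // "[", an http(s) URI without spaces, whitespace, a label, "]".
    private static final String EXTERNAL_LINK =
            "\\[(https?://[^\\s\\]]+)\\s+([^\\]]+)\\]";

    /** Replaces each [URI label] by an HTML anchor. */
    static String render(String text) {
        return text.replaceAll(EXTERNAL_LINK, "<a href=\"$1\">$2</a>");
    }

    public static void main(String[] args) {
        System.out.println(render("[https://www.wikipedia.org Wikipedia]"));
    }
}
```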
Tables
This wiki text:
{| border="1"
| 4 || 9 || 2
|-
| 3 || 5 || 7
|-
| 8 || 1 || 6
|}
produces the following table:
| 4 | 9 | 2 |
| 3 | 5 | 7 |
| 8 | 1 | 6 |
(without the white and gray alternation of lines)
The following wiki text is not implemented in MediaWiki, but it also produces the same table:
- using semicolon:
{||; border="1"
4;9;2
3;5;7
8;1;6
||}
- using comma:
{||, border="1"
4,9,2
3,5,7
8,1,6
||}
- using tabs:
{|| border="1"
4 9 2
3 5 7
8 1 6
||}
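The idea behind the delimiter-separated variants can be sketched by splitting each row on the chosen separator. SimpleTableSketch is a hypothetical name, the handling of the opening and closing lines ({|| and ||}) is left out, and the exact HTML that the library emits for tables is not shown in this document, so the markup below is an assumption.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical sketch of delimiter-separated tables; not the library's own code. */
final class SimpleTableSketch {

    /** Renders rows whose cells are separated by the given separator (";", "," or a tab). */
    static String render(List<String> rows, String separator) {
        StringBuilder sb = new StringBuilder("<table border=\"1\">\n");
        for (String row : rows) {
            sb.append("<tr>");
            // Pattern.quote keeps the separator literal; -1 keeps trailing empty cells.
            for (String cell : row.split(Pattern.quote(separator), -1)) {
                sb.append("<td>").append(cell.trim()).append("</td>");
            }
            sb.append("</tr>\n");
        }
        return sb.append("</table>").toString();
    }

    public static void main(String[] args) {
        System.out.println(render(Arrays.asList("4;9;2", "3;5;7", "8;1;6"), ";"));
    }
}
```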
nowiki
The pair of tags <nowiki>…</nowiki> is used to mark text without using the wiki formatting. For example:
<nowiki>'''</nowiki>non-bold<nowiki>'''</nowiki> is not in bold.
Variables
The following MediaWiki variables are implemented:
| name | example | meaning |
|---|---|---|
| {{CURRENTDAY}} | 1 | Displays the current day in numeric form. |
| {{CURRENTDAY2}} | 01 | Same as {{CURRENTDAY}}, but with leading zero (01 .. 31). |
| {{CURRENTDAYNAME}} | Friday | Name of the day in the language of the project or English. |
| {{CURRENTDOW}} | 5 | Same as {{CURRENTDAYNAME}}, but as a number (0=Sunday, 1=Monday…). |
| {{CURRENTMONTH}} | 01 | The number 01 .. 12 of the month. |
| {{CURRENTMONTHABBREV}} | Jan | Same as {{CURRENTMONTH}}, but in abbreviated form as Jan .. Dec. |
| {{CURRENTMONTHNAME}} | January | Same as {{CURRENTMONTH}}, but in named form January .. December. |
| {{CURRENTTIME}} | 16:03 | The current time (00:00 .. 23:59). |
| {{CURRENTHOUR}} | 16 | The current hour (00 .. 23). |
| {{CURRENTWEEK}} | 1 | Number of the current week (1-53) according to ISO 8601 with no leading zero. |
| {{CURRENTYEAR}} | 2016 | Returns the current year. |
| {{CURRENTTIMESTAMP}} | 20160101160345 | ISO 8601 time stamp |
In addition, the {{LOCAL…}} variables are also implemented: {{LOCALDAY}}, {{LOCALDAY2}}, … , {{LOCALTIMESTAMP}}. For example, in UTC+1, {{CURRENTTIMESTAMP}} returns 20160101160345, while {{LOCALTIMESTAMP}} returns 20160101170345.
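The difference between the CURRENT and LOCAL variants can be reproduced with java.time, as in the following sketch. DateVariablesSketch and its methods are hypothetical and are not the library's DateVariableRenderer; the formatter patterns are assumptions chosen to match the examples in the table.

```java
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

/** Hypothetical sketch of date variables; not the library's own code. */
final class DateVariablesSketch {

    private static final DateTimeFormatter TIMESTAMP =
            DateTimeFormatter.ofPattern("yyyyMMddHHmmss");
    private static final DateTimeFormatter DAY_NAME =
            DateTimeFormatter.ofPattern("EEEE", Locale.ENGLISH);

    /** Time stamp in the style of {{CURRENTTIMESTAMP}}. */
    static String timestamp(ZonedDateTime time) {
        return TIMESTAMP.format(time);
    }

    /** English day name in the style of {{CURRENTDAYNAME}}. */
    static String dayName(ZonedDateTime time) {
        return DAY_NAME.format(time);
    }

    /** Day of week in the style of {{CURRENTDOW}}: 0=Sunday, 1=Monday, ..., 6=Saturday. */
    static int dayOfWeek(ZonedDateTime time) {
        return time.getDayOfWeek().getValue() % 7;
    }

    public static void main(String[] args) {
        ZonedDateTime utc = ZonedDateTime.of(2016, 1, 1, 16, 3, 45, 0, ZoneOffset.UTC);
        ZonedDateTime local = utc.withZoneSameInstant(ZoneOffset.ofHours(1));
        System.out.println(timestamp(utc));   // 20160101160345
        System.out.println(timestamp(local)); // 20160101170345
    }
}
```

Both values describe the same instant; only the zone used for formatting differs.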
HTML
HTML code can also be inserted directly. For example:
<b>bold</b> is the same as '''bold''', and &lambda; is rendered λ.
Example
The file mupuzzle.text has the following wiki text:
== MIU system ==
(see [https://en.wikipedia.org/wiki/MU_puzzle MU puzzle])
# ''x''I → ''x''IU
# M''x'' → M''xx''
# ''x''III''y'' → ''x''U''y''
# ''x''UU''y'' → ''xy''
and is translated to the following HTML document:
<?xml version="1.0" encoding="utf-8"?>
<!--begin_wiki_text
== MIU system ==
(see [https://en.wikipedia.org/wiki/MU_puzzle MU puzzle])
# ''x''I → ''x''IU
# M''x'' → M''xx''
# ''x''III''y'' → ''x''U''y''
# ''x''UU''y'' → ''xy''
end_wiki_text-->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "https://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="https://www.w3.org/1999/xhtml" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title></title>
</head>
<body>
<div>
<h2> MIU system </h2>
(see <a href="https://en.wikipedia.org/wiki/MU_puzzle">MU puzzle</a>)<br />
<ol>
<li> <i>x</i>I → <i>x</i>IU</li>
<li> M<i>x</i> → M<i>xx</i></li>
<li> <i>x</i>III<i>y</i> → <i>x</i>U<i>y</i> </li>
<li> <i>x</i>UU<i>y</i> → <i>xy</i></li>
</ol>
<br />
</div>
</body>
</html>
The file example.text has more examples.
Source code
To check out and compile the project, use:
$ git clone https://github.com/julianmendez/wikihtml.git
$ cd wikihtml
$ mvn clean install
The created executable library, its sources, and its Javadoc will be in wikihtml/target.
To compile the project offline, first download the dependencies:
$ mvn dependency:go-offline
and once offline, use:
$ mvn --offline clean install
The bundles uploaded to Sonatype are created with:
$ mvn clean install -DperformRelease=true
and then:
$ cd wikihtml/target
$ jar -cf bundle.jar wikihtml-*
The version number is updated with:
$ mvn versions:set -DnewVersion=NEW_VERSION
where NEW_VERSION is the new version.
Architecture
The library reads a wiki text and creates a WikiDocument.
It extracts the wiki text from the given input and processes it line by line.
Each line is transformed into a ConversionToken. Each token is processed by a pipeline of objects, each of which is a Renderer. Each renderer (-Renderer) processes each conversion token, producing a list of conversion tokens. These are the input for the next renderer, if any. Some renderers are parameterized and grouped (-GroupRenderer). Some renderers process whole lines (in package ...line) and some renderers process pieces of lines (in package ...part).
For example, all variables are processed by ...part.DateVariableRenderer, but the headings are processed by a group of renderers (...line.HeadingGroupRenderer) composed of six renderers (h1, h2, …, h6), where each one is a ...line.HeadingRenderer.
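The pipeline idea can be sketched as follows, with each stage mapping one token to a list of tokens that feed the next stage. The types here are simplified stand-ins: tokens are plain strings, and the Renderer interface is hypothetical, so the signatures of the library's ConversionToken and Renderer may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of a renderer pipeline; not the library's own types. */
final class PipelineSketch {

    /** Stand-in for a renderer: one token in, a list of tokens out. */
    interface Renderer {
        List<String> render(String token);
    }

    /** Passes a line through every renderer in order. */
    static List<String> run(String line, List<Renderer> pipeline) {
        List<String> tokens = Collections.singletonList(line);
        for (Renderer renderer : pipeline) {
            List<String> next = new ArrayList<>();
            for (String token : tokens) {
                // The output of this renderer is the input of the next one.
                next.addAll(renderer.render(token));
            }
            tokens = next;
        }
        return tokens;
    }

    public static void main(String[] args) {
        Renderer splitWords = token -> Arrays.asList(token.split(" "));
        Renderer upperCase = token -> Collections.singletonList(token.toUpperCase());
        System.out.println(run("two words", Arrays.asList(splitWords, upperCase)));
    }
}
```

In this shape, a group renderer could simply be a renderer that holds its own inner list of renderers and delegates to them.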
Author
Licenses
Apache License Version 2.0, GNU Lesser General Public License version 3
Release notes
See release notes.
Contact
In case you need more information, please contact @julianmendez .