WikiHTML
WikiHTML is a Java library and executable standalone application that converts a document in wiki text format to an HTML document.
Download
- executable JAR file
- The Central Repository
- as dependency:
<dependency>
<groupId>de.tu-dresden.inf.lat.wikihtml</groupId>
<artifactId>wikihtml</artifactId>
<version>0.1.0</version>
</dependency>
Use
It can be used as a Java library or from the command line. For example, use:
java -jar wikihtml-0.1.0.jar inputfile.text outputfile.html
to create a new HTML file from the command line, and use
java -jar wikihtml-0.1.0.jar inputoutputfile.html
to just update an HTML file with embedded wiki text.
Description
Wiki markup, also called wikitext or wikicode, is a markup language for wiki-based pages. It is a simplified, human-friendly substitute for HTML. This library reads text written in this markup language and produces an HTML document. There are several “dialects” of wiki markup. This library implements a subset of the language used by the MediaWiki software.
The application generates the HTML document with the original wiki markup source code inside. Technically, the source code is placed between <!--begin_wiki_text and end_wiki_text-->. This makes it possible to update an HTML file using the source embedded in the same file.
This could be useful, for example, when maintaining documentation of a project. The files can be easily edited using a text editor, but after processing them with this library, they can be viewed with a browser.
When using only the wiki formatting, the produced document is an XHTML 1.1 document.
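As an illustration of how the embedded source can be located, here is a minimal sketch that pulls the wiki text out of an HTML string using the two markers described above. The class and method names are hypothetical and are not part of the WikiHTML API; the library's own extraction logic may differ.

```java
final class EmbeddedWikiTextSketch {
    // The two markers named in the description above.
    static final String BEGIN = "<!--begin_wiki_text";
    static final String END = "end_wiki_text-->";

    /** Returns the text between the markers, or null if either marker is missing. */
    static String extract(String html) {
        int start = html.indexOf(BEGIN);
        if (start < 0) {
            return null;
        }
        start += BEGIN.length();
        int end = html.indexOf(END, start);
        if (end < 0) {
            return null;
        }
        return html.substring(start, end).trim();
    }

    public static void main(String[] args) {
        String html = "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n"
                + "<!--begin_wiki_text\n== MIU system ==\nend_wiki_text-->\n"
                + "<html></html>";
        System.out.println(extract(html));
    }
}
```

Because the source travels inside an HTML comment, browsers ignore it, while a tool can read it back and regenerate the body.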
Sections
Sections are marked at the beginning of a line. The heading should be between a sequence of equals signs (=). Using more equals signs makes the heading smaller. For example:
wiki markup | HTML |
---|---|
= heading 1 = | <h1>heading 1</h1> |
== heading 2 == | <h2>heading 2</h2> |
=== heading 3 === | <h3>heading 3</h3> |
==== heading 4 ==== | <h4>heading 4</h4> |
===== heading 5 ===== | <h5>heading 5</h5> |
====== heading 6 ====== | <h6>heading 6</h6> |
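The mapping in the table can be sketched with a single regular expression. This is only an illustration under stated assumptions: the class name is hypothetical, the sketch trims the spaces around the heading text as the table does, and the library's own heading renderers may behave differently.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HeadingSketch {
    // One to six equals signs, the heading text, then the same run of equals signs.
    private static final Pattern HEADING =
            Pattern.compile("^(={1,6})\\s*(.+?)\\s*\\1\\s*$");

    /** Converts a heading line to HTML; any other line is returned unchanged. */
    static String toHtml(String line) {
        Matcher m = HEADING.matcher(line);
        if (!m.matches()) {
            return line;
        }
        int level = m.group(1).length();
        return "<h" + level + ">" + m.group(2) + "</h" + level + ">";
    }

    public static void main(String[] args) {
        System.out.println(toHtml("== heading 2 =="));
    }
}
```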
Line breaks
A line break is marked with a blank line, that is, two consecutive newline characters. For example,
Two lines
together
are not considered different lines.
is rendered as
Two lines together are not considered different lines.
but:
One line.

Another line.
is rendered as
One line.
Another line.
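The rule can be sketched as follows: consecutive lines are joined with a space, and a blank line becomes a line break. The class name is hypothetical, and emitting <br /> for each break is an assumption made for illustration; the exact markup produced by the library may differ.

```java
final class LineBreakSketch {
    /** Joins consecutive lines and turns each blank line into a line break. */
    static String render(String wikiText) {
        StringBuilder out = new StringBuilder();
        // A blank line (possibly containing spaces) separates two blocks of text.
        for (String block : wikiText.split("\\n\\s*\\n")) {
            String joined = block.trim().replaceAll("\\s*\\n\\s*", " ");
            if (joined.isEmpty()) {
                continue;
            }
            if (out.length() > 0) {
                out.append("<br />\n");
            }
            out.append(joined);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(render("One line.\n\nAnother line."));
    }
}
```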
Indented text
Text can be indented using colons (:) at the beginning of the line. For example:
: item 1
: item 2
:: item 2.1
:: item 2.2
::: item 2.2.1
: item 3
produces:
item 1
item 2
    item 2.1
    item 2.2
        item 2.2.1
item 3
Unordered lists
Items in a list are marked with asterisks (*) at the beginning of the line. A subitem is marked with more asterisks. For example:
* item 1
* item 2
** item 2.1
** item 2.2
*** item 2.2.1
* item 3
is rendered as
- item 1
- item 2
  - item 2.1
  - item 2.2
    - item 2.2.1
- item 3
Ordered lists
Numbered items are marked with hash signs (#) at the beginning of the line. A subitem is marked with more hash signs. For example:
# item 1
# item 2
## item 2.1
## item 2.2
### item 2.2.1
# item 3
is rendered as
1. item 1
2. item 2
   1. item 2.1
   2. item 2.2
      1. item 2.2.1
3. item 3
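Both kinds of list follow the same nesting rule: the number of leading markers gives the depth. The sketch below shows one way to turn such lines into nested HTML lists. The class name is hypothetical, and the sketch assumes that every line starts with at least one marker and that the depth grows by at most one level at a time. It does not claim to reproduce the library's exact output.

```java
import java.util.Arrays;
import java.util.List;

final class ListSketch {
    /**
     * Converts list lines to nested HTML.
     * marker is '*' or '#', tag is the matching element name ("ul" or "ol").
     */
    static String toHtml(List<String> lines, char marker, String tag) {
        StringBuilder html = new StringBuilder();
        int depth = 0;
        for (String line : lines) {
            // Count the leading markers to find the nesting level of this item.
            int level = 0;
            while (level < line.length() && line.charAt(level) == marker) {
                level++;
            }
            String text = line.substring(level).trim();
            if (level > depth) {
                // Deeper item: open a nested list inside the current item.
                while (depth < level) {
                    html.append('<').append(tag).append('>');
                    depth++;
                }
            } else {
                // Same or shallower item: close the previous item and any nested lists.
                html.append("</li>");
                while (depth > level) {
                    html.append("</").append(tag).append("></li>");
                    depth--;
                }
            }
            html.append("<li>").append(text);
        }
        // Close everything that is still open.
        while (depth > 0) {
            html.append("</li></").append(tag).append('>');
            depth--;
        }
        return html.toString();
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("# item 1", "# item 2", "## item 2.1", "# item 3");
        System.out.println(toHtml(lines, '#', "ol"));
    }
}
```

Passing '*' and "ul" handles unordered lists with the same code, which is why the marker and the tag are parameters.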
Text formatting
The text can be formatted using apostrophes (') according to the following table:
wiki markup | HTML |
---|---|
''italics'' | italics |
'''bold''' | bold |
'''''bold italics''''' | bold italics |
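A minimal sketch of this formatting, assuming <i> and <b> as the output tags: the longest run of apostrophes is replaced first, so that five apostrophes are not misread as bold followed by italics. The class name is hypothetical, and the nesting order chosen for bold italics is an assumption.

```java
final class FormattingSketch {
    /** Replaces apostrophe markup with HTML tags, longest marker first. */
    static String format(String text) {
        return text
                .replaceAll("'''''(.+?)'''''", "<b><i>$1</i></b>")
                .replaceAll("'''(.+?)'''", "<b>$1</b>")
                .replaceAll("''(.+?)''", "<i>$1</i>");
    }

    public static void main(String[] args) {
        System.out.println(format("'''bold''' and ''italics''"));
    }
}
```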
Links
Links can be marked with square brackets ([ ]). For example:
[https://www.wikipedia.org Wikipedia]
renders Wikipedia.
If the brackets are omitted, the URI is shown directly. For example:
https://www.wikipedia.org
renders the link https://www.wikipedia.org.
The double square brackets ([[ ]]) are rendered as local links.
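The bracketed form can be sketched with one regular expression that captures the URI and the label. The class name is hypothetical; the sketch handles only external links in single brackets, and leaves bare URIs and double-bracket local links untouched.

```java
final class LinkSketch {
    /** Converts [URI label] into an HTML anchor; other text is left as it is. */
    static String render(String text) {
        return text.replaceAll(
                "\\[(https?://[^\\s\\]]+)\\s+([^\\]]+)\\]",
                "<a href=\"$1\">$2</a>");
    }

    public static void main(String[] args) {
        System.out.println(render("[https://www.wikipedia.org Wikipedia]"));
    }
}
```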
Tables
This wiki text:
{| border="1"
| 4 || 9 || 2
|-
| 3 || 5 || 7
|-
| 8 || 1 || 6
|}
produces the following table:
4 | 9 | 2 |
3 | 5 | 7 |
8 | 1 | 6 |
(without the white and gray alternation of lines)
The following syntax is not part of MediaWiki, but this library accepts it and produces the same table:
- using semicolon:
{||; border="1"
4;9;2
3;5;7
8;1;6
||}
- using comma:
{||, border="1"
4,9,2
3,5,7
8,1,6
||}
- using tabs:
{|| border="1"
4 9 2
3 5 7
8 1 6
||}
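To make the delimiter-based syntax concrete, here is a sketch that parses such a block. It assumes, based only on the three examples above, that the character right after {|| is the cell delimiter, that a tab is used when no delimiter is given, and that the rest of the first line holds the table attributes. The class name and the exact HTML emitted are hypothetical.

```java
import java.util.regex.Pattern;

final class DelimitedTableSketch {
    /** Converts a block that starts with "{||" and ends with "||}" into an HTML table. */
    static String convert(String block) {
        String[] lines = block.split("\n");
        String header = lines[0].substring(3); // text after "{||"
        String delimiter = "\t";               // assumed default: tab
        if (!header.isEmpty() && !Character.isWhitespace(header.charAt(0))) {
            delimiter = header.substring(0, 1);
            header = header.substring(1);
        }
        String attributes = header.trim();

        StringBuilder html = new StringBuilder("<table");
        if (!attributes.isEmpty()) {
            html.append(' ').append(attributes);
        }
        html.append('>');
        for (int i = 1; i < lines.length; i++) {
            if (lines[i].trim().equals("||}")) {
                break;
            }
            html.append("<tr>");
            // Pattern.quote keeps delimiters such as "|" or "." from acting as regex syntax.
            for (String cell : lines[i].split(Pattern.quote(delimiter), -1)) {
                html.append("<td>").append(cell.trim()).append("</td>");
            }
            html.append("</tr>");
        }
        return html.append("</table>").toString();
    }

    public static void main(String[] args) {
        System.out.println(convert("{||; border=\"1\"\n4;9;2\n3;5;7\n8;1;6\n||}"));
    }
}
```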
nowiki
The pair of tags <nowiki> … </nowiki> is used to mark text that should not be interpreted as wiki formatting. For example:
<nowiki>'''</nowiki>non-bold<nowiki>'''</nowiki>
is not in bold.
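One way to picture this behavior: the text is split at the nowiki spans, formatting is applied only outside them, and the spans are copied through unchanged. The sketch below uses a bold-only formatter as a stand-in for the full set of rules. All names are hypothetical, and the library may implement the feature differently.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NowikiSketch {
    private static final Pattern NOWIKI =
            Pattern.compile("<nowiki>(.*?)</nowiki>", Pattern.DOTALL);

    /** Stand-in formatter: handles only bold markup. */
    static String format(String text) {
        return text.replaceAll("'''(.+?)'''", "<b>$1</b>");
    }

    /** Formats everything outside nowiki spans and copies their content verbatim. */
    static String render(String text) {
        Matcher m = NOWIKI.matcher(text);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(format(text.substring(last, m.start())));
            out.append(m.group(1));
            last = m.end();
        }
        out.append(format(text.substring(last)));
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(render("<nowiki>'''</nowiki>non-bold<nowiki>'''</nowiki>"));
    }
}
```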
Variables
The following MediaWiki variables are implemented:
name | example | meaning |
---|---|---|
{{CURRENTDAY}} | 1 | Displays the current day in numeric form. |
{{CURRENTDAY2}} | 01 | Same as {{CURRENTDAY}}, but with leading zero (01 .. 31). |
{{CURRENTDAYNAME}} | Friday | Name of the day in the language of the project or English. |
{{CURRENTDOW}} | 5 | Same as {{CURRENTDAYNAME}}, but as a number (0=Sunday, 1=Monday…). |
{{CURRENTMONTH}} | 01 | The number 01 .. 12 of the month. |
{{CURRENTMONTHABBREV}} | Jan | Same as {{CURRENTMONTH}}, but in abbreviated form as Jan .. Dec. |
{{CURRENTMONTHNAME}} | January | Same as {{CURRENTMONTH}}, but in named form January .. December. |
{{CURRENTTIME}} | 16:03 | The current time (00:00 .. 23:59). |
{{CURRENTHOUR}} | 16 | The current hour (00 .. 23). |
{{CURRENTWEEK}} | 1 | Number of the current week (1-53) according to ISO 8601 with no leading zero. |
{{CURRENTYEAR}} | 2016 | Returns the current year. |
{{CURRENTTIMESTAMP}} | 20160101160345 | ISO 8601 time stamp |
In addition, the {{LOCAL…}} variables are also implemented: {{LOCALDAY}}, {{LOCALDAY2}}, … , {{LOCALTIMESTAMP}}. For example, in UTC+1 {{CURRENTTIMESTAMP}} returns 20160101160345, while {{LOCALTIMESTAMP}} returns 20160101170345.
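The difference between the CURRENT and LOCAL variables can be reproduced with java.time. This sketch formats the same instant once in UTC and once in a given time zone, and also shows the day-of-week numbering from the table. The class and method names are hypothetical and only illustrate the semantics; they are not the library's code.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

final class TimestampSketch {
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyyMMddHHmmss");

    /** Value in the style of {{CURRENTTIMESTAMP}}: the instant expressed in UTC. */
    static String currentTimestamp(Instant now) {
        return STAMP.withZone(ZoneOffset.UTC).format(now);
    }

    /** Value in the style of {{LOCALTIMESTAMP}}: the instant expressed in the given zone. */
    static String localTimestamp(Instant now, ZoneId zone) {
        return STAMP.withZone(zone).format(now);
    }

    /** Day of week numbered as in the table: 0=Sunday, 1=Monday, and so on. */
    static int dayOfWeek(Instant now, ZoneId zone) {
        return now.atZone(zone).getDayOfWeek().getValue() % 7;
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2016-01-01T16:03:45Z");
        System.out.println(currentTimestamp(now));
        System.out.println(localTimestamp(now, ZoneOffset.ofHours(1)));
    }
}
```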
HTML
HTML code can also be inserted directly. For example, <b>bold</b> is the same as '''bold''', and a character entity such as &lambda; is rendered λ.
Example
The file mupuzzle.text has the following wiki text:
== MIU system ==
(see [https://en.wikipedia.org/wiki/MU_puzzle MU puzzle])
# ''x''I → ''x''IU
# M''x'' → M''xx''
# ''x''III''y'' → ''x''U''y''
# ''x''UU''y'' → ''xy''
and is translated to the following HTML document:
<?xml version="1.0" encoding="utf-8"?>
<!--begin_wiki_text
== MIU system ==
(see [https://en.wikipedia.org/wiki/MU_puzzle MU puzzle])
# ''x''I → ''x''IU
# M''x'' → M''xx''
# ''x''III''y'' → ''x''U''y''
# ''x''UU''y'' → ''xy''
end_wiki_text-->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "https://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="https://www.w3.org/1999/xhtml" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title></title>
</head>
<body>
<div>
<h2> MIU system </h2>
(see <a href="https://en.wikipedia.org/wiki/MU_puzzle">MU puzzle</a>)<br />
<ol>
<li> <i>x</i>I → <i>x</i>IU</li>
<li> M<i>x</i> → M<i>xx</i></li>
<li> <i>x</i>III<i>y</i> → <i>x</i>U<i>y</i> </li>
<li> <i>x</i>UU<i>y</i> → <i>xy</i></li>
</ol>
<br />
</div>
</body>
</html>
The file example.text has more examples.
Source code
To check out and compile the project, use:
$ git clone https://github.com/julianmendez/wikihtml.git
$ cd wikihtml
$ mvn clean install
The created executable library, its sources, and its Javadoc will be in wikihtml/target.
To compile the project offline, first download the dependencies:
$ mvn dependency:go-offline
and once offline, use:
$ mvn --offline clean install
The bundles uploaded to Sonatype are created with:
$ mvn clean install -DperformRelease=true
and then:
$ cd wikihtml/target
$ jar -cf bundle.jar wikihtml-*
The version number is updated with:
$ mvn versions:set -DnewVersion=NEW_VERSION
where NEW_VERSION is the new version.
Architecture
The library reads a wiki text and creates a WikiDocument. It extracts the wiki text from the given input and processes it line by line.
Each line is transformed into a ConversionToken. Each token is processed by a pipeline of objects, each of which is a Renderer. Each renderer (a class whose name ends in Renderer) takes a conversion token and produces a list of conversion tokens, which are the input for the next renderer, if any. Some renderers are parameterized and grouped (classes whose names end in GroupRenderer). Some renderers process whole lines (in package ...line) and others process pieces of lines (in package ...part).
For example, all variables are processed by ...part.DateVariableRenderer, but the headings are processed by a group of renderers (...line.HeadingGroupRenderer) composed of six renderers (h1, h2, …, h6), each of which is a ...line.HeadingRenderer.
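The pipeline idea can be sketched in a few lines. The interface below is a hypothetical stand-in that uses plain strings as tokens; it does not reproduce the signatures of the library's Renderer or ConversionToken classes. It only shows how each stage maps one token to a list of tokens that feeds the next stage.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class PipelineSketch {
    /** Hypothetical renderer: one token in, a list of tokens out. */
    interface TokenRenderer {
        List<String> render(String token);
    }

    /** Example whole-line renderer: splits a level-2 heading into three tokens. */
    static List<String> headingTwo(String token) {
        if (token.length() > 6 && token.startsWith("== ") && token.endsWith(" ==")) {
            return Arrays.asList("<h2>", token.substring(3, token.length() - 3), "</h2>");
        }
        return Collections.singletonList(token);
    }

    /** Example part renderer: replaces bold markup inside a token. */
    static List<String> bold(String token) {
        return Collections.singletonList(token.replaceAll("'''(.+?)'''", "<b>$1</b>"));
    }

    /** Feeds the tokens produced by each renderer into the next one. */
    static List<String> run(List<TokenRenderer> pipeline, String line) {
        List<String> tokens = Collections.singletonList(line);
        for (TokenRenderer renderer : pipeline) {
            List<String> next = new ArrayList<>();
            for (String token : tokens) {
                next.addAll(renderer.render(token));
            }
            tokens = next;
        }
        return tokens;
    }

    public static void main(String[] args) {
        TokenRenderer first = PipelineSketch::headingTwo;
        TokenRenderer second = PipelineSketch::bold;
        List<TokenRenderer> pipeline = Arrays.asList(first, second);
        System.out.println(String.join("", run(pipeline, "== '''MIU''' system ==")));
    }
}
```

Returning a list rather than a single token lets a renderer split its input, so that later stages can treat the pieces independently.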
Author
Licenses
Apache License Version 2.0, GNU Lesser General Public License version 3
Release notes
See release notes.
Contact
In case you need more information, please contact @julianmendez .