texmath

texmath is a Haskell library for converting between formats used to represent mathematics. Currently it provides functions to read and write TeX math, presentation MathML, and OMML (Office Math Markup Language, used in Microsoft Office), and to write GNU eqn, typst, and pandoc’s native format (allowing conversion, using pandoc, to a variety of different markup formats). The TeX reader and writer support basic LaTeX and AMS extensions, and can parse and apply LaTeX macros. The package also includes several utility modules which may be useful for anyone looking to manipulate either TeX math or MathML. For example, a copy of the MathML operator dictionary is included.
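As a rough illustration of the library side, the following sketch reads a TeX formula and pretty-prints it as MathML. It assumes the texmath and xml packages are available; the helper name texToMathML is invented for this example and is not part of the library.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import qualified Data.Text as T
import qualified Data.Text.IO as TIO
import Text.TeXMath (DisplayType (..), readTeX, writeMathML)
import Text.XML.Light (ppElement)

-- | Parse a TeX formula and render it as pretty-printed MathML,
-- or return the reader's error message.
-- (Hypothetical helper, not exported by texmath.)
texToMathML :: DisplayType -> T.Text -> Either T.Text String
texToMathML dt tex = ppElement . writeMathML dt <$> readTeX tex

main :: IO ()
main =
  case texToMathML DisplayInline "2^2" of
    Left err  -> TIO.putStrLn ("parse error: " <> err)
    Right xml -> putStrLn xml
```

The other writers follow the same pattern: the reader produces a list of Exp values, and a writer turns that list into the target format.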

You can try it out online here.

By default, only the Haskell library is installed. To install a test program, texmath, use the executable Cabal flag:

cabal install -fexecutable

By default, the executable will be installed in ~/.cabal/bin.

Alternatively, texmath can be installed using stack. Install the stack binary somewhere in your path. Then, in the texmath repository,

stack setup
stack install --flag texmath:executable

The texmath binary will be put in ~/.local/bin.

Macro definitions may be included before a LaTeX formula.
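A minimal sketch of what this looks like from the library, assuming that readTeX strips and applies macro definitions that precede the formula. The macro \R and the helper expandViaRoundTrip are made up for this example.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import qualified Data.Text as T
import qualified Data.Text.IO as TIO
import Text.TeXMath (readTeX, writeTeX)

-- | A formula preceded by a macro definition.
-- The macro name \R is an assumption made for illustration.
formulaWithMacro :: T.Text
formulaWithMacro = "\\newcommand{\\R}{\\mathbb{R}} x \\in \\R"

-- | Read the formula (which should expand the macro) and write it
-- back out as TeX. (Hypothetical helper, not part of texmath.)
expandViaRoundTrip :: T.Text -> Either T.Text T.Text
expandViaRoundTrip = fmap writeTeX . readTeX

main :: IO ()
main = either (TIO.putStrLn . ("parse error: " <>)) TIO.putStrLn
              (expandViaRoundTrip formulaWithMacro)
```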

Running texmath as a server

texmath will behave as a CGI script when called under the name texmath-cgi (e.g. through a symbolic link).

But it is also possible to compile a full webserver with a JSON API. To do this, set the server cabal flag, e.g.

stack install --flag texmath:server

To run the server on port 3000:

texmath-server -p 3000

Sample of use, with httpie:

% http --verbose localhost:3000 text='2^2' from=tex to=mathml display:=false Accept:'text/plain'
POST /convert HTTP/1.1
Accept: text/plain
Accept-Encoding: gzip, deflate
Connection: keep-alive
Content-Length: 64
Content-Type: application/json
Host: localhost:3000
User-Agent: HTTPie/3.1.0

{
    "display": false,
    "from": "tex",
    "text": "2^2",
    "to": "mathml"
}


HTTP/1.1 200 OK
Content-Type: text/plain;charset=utf-8
Date: Mon, 21 Mar 2022 18:29:26 GMT
Server: Warp/3.3.17
Transfer-Encoding: chunked

<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML">
  <msup>
    <mn>2</mn>
    <mn>2</mn>
  </msup>
</math>

Possible values for from are tex, mathml, and omml. Possible values for to are tex, mathml, omml, eqn, typst and pandoc (JSON-encoded Pandoc).

Alternatively, you can use the /batch endpoint, passing in a JSON-encoded list of conversions and getting back a JSON-encoded list of results.
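The sketch below builds a request body for /batch with the aeson package. It assumes that each list element carries the same fields as the single-conversion request shown above (text, from, to, display). That shape is an assumption, so check it against the server before relying on it. The names conversion and batchBody are invented for this example.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Data.Aeson (Value, encode, object, (.=))
import qualified Data.ByteString.Lazy.Char8 as BL
import Data.Text (Text)

-- | One conversion request: source format, target format, formula.
-- The field names mirror the single-conversion example; reusing them
-- for /batch is an assumption.
conversion :: Text -> Text -> Text -> Value
conversion from to text =
  object ["text" .= text, "from" .= from, "to" .= to, "display" .= False]

-- | A list of conversions to send in one batch request.
batchBody :: [Value]
batchBody =
  [ conversion "tex" "mathml" "2^2"
  , conversion "tex" "typst" "\\frac{a}{b}"
  ]

main :: IO ()
main = BL.putStrLn (encode batchBody)
```

The printed JSON can then be sent as the POST body with any HTTP client.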

If you rename pandoc-server to pandoc-server.cgi, it will function as a CGI program that accepts POST requests.

Generating lookup tables

There are three main lookup tables which are built from externally compiled lists. This section contains information about how to modify and regenerate these tables.

In the lib directory there are three sub-directories which contain the necessary files.

MMLDict.hs

The utility program xsltproc is required. You can find these files in lib/mmldict/.

  1. If desired, replace unicode.xml with an updated version (you can download a copy from here).
  2. xsltproc -o dictionary.xml operatorDictionary.xsl unicode.xml
  3. runghc generateMMLDict.hs
  4. Replace the operator table at the bottom of src/Text/TeXMath/Readers/MathML/MMLDict.hs with the contents of mmldict.hs

ToTeXMath.hs

You can find these files in lib/totexmath/

  1. If desired, replace unimathsymbols.txt with an updated version from here
  2. runghc unicodetotex.hs
  3. Replace the record table at the bottom of src/Text/TeXMath/Unicode/ToTeXMath.hs with the contents of UnicodeToLaTeX.hs

ToUnicode.hs

You can find these files in lib/tounicode/.

  1. If desired, replace UnicodeData.txt with an updated version from here.
  2. runghc mkUnicodeTable.hs
  3. Replace the table at the bottom of src/Text/TeXMath/Unicode/ToUnicode.hs with the output.

Editing the tables

It is not necessary to edit the generated source files to add records to the tables. To add to or modify a table, it is easier to modify either unicodetotex.hs or generateMMLDict.hs by adding an item to the corresponding updates list. After making the changes, follow the above steps to regenerate the table.

The test suite

To run the test suite, do cabal test or stack test.

In its standard mode, the test suite will run golden tests of the individual readers and writers. Reader tests can be found in test/reader/{mml,omml,tex}, and writer tests in test/writer/{eqn,mml,omml,tex}. Regression tests linked to specific issues are in test/regression.

Each test file consists of an input and an expected output. The input begins after a line <<< FORMAT and the output begins after a line >>> FORMAT.
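To make the layout concrete, here is a small parser for such a file using only base. It is not the test suite's own code. The GoldenTest type, the parseGolden function, and the sample contents in main are all invented for illustration.

```haskell
module Main where

import Data.List (isPrefixOf)

-- | One golden test: input format and text, expected output format and text.
-- (Hypothetical type, not taken from the texmath test suite.)
data GoldenTest = GoldenTest
  { inputFormat  :: String
  , inputText    :: String
  , outputFormat :: String
  , outputText   :: String
  } deriving (Show, Eq)

-- | Split a test file at its "<<< FORMAT" and ">>> FORMAT" marker lines.
-- Returns Nothing if either marker is missing or they are out of order.
parseGolden :: String -> Maybe GoldenTest
parseGolden contents =
  let fromInput = dropWhile (not . ("<<< " `isPrefixOf`)) (lines contents)
  in case break (">>> " `isPrefixOf`) fromInput of
       (inMarker : inLines, outMarker : outLines) ->
         Just GoldenTest
           { inputFormat  = drop 4 inMarker
           , inputText    = unlines inLines
           , outputFormat = drop 4 outMarker
           , outputText   = unlines outLines
           }
       _ -> Nothing

main :: IO ()
main = print (parseGolden "<<< tex\nx^2\n>>> mml\n<msup>...</msup>\n")
```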

If many tests fail as a result of changes, but the test failures are all because of improvements in the output, you can pass --accept to the test suite (e.g., with --test-arguments=--accept on stack test), and the existing golden files will be overwritten. If you do this, inspect the outputs very carefully to make sure they are correct.

If you pass the --roundtrip option into the test suite (e.g., using --test-arguments=--roundtrip with stack test), round-trip tests will be run instead. Many of these will fail. In these tests, the native inputs in test/roundtrip/*.native will be converted to (respectively) mml, omml, or tex, then converted back, and the result will be compared with the starting point. Although we don’t guarantee that this kind of round-trip transformation will be the identity, looking at cases where it fails can be a guide to improvements.

Authors

John MacFarlane wrote the original TeX reader, MathML writer, Eqn writer, and OMML writer. Matthew Pickering contributed the MathML reader, the TeX writer, and many of the auxiliary modules. Jesse Rosenthal contributed the OMML reader. Thanks also to John Lenz for many contributions.

Changes

texmath (0.12.10)

  • texmath-server:

    • Change endpoints: /convert to root, and /convert-batch to /batch.
    • Allow running as CGI if renamed pandoc-server.cgi. In this mode it accepts JSON content with POST requests or parameters with GET requests, just like pandoc-server itself.
  • TeX reader:

    • Fix parsing bug with comment at beginning of braced (#258).
    • Support negative numbers in \hspace (#259).
    • Allow decimals in \hspace (#259).
    • Support \quad in \text (#260).

texmath (0.12.9)

  • Better handling of primes in eqn, typst, and tex writers. These writers now render a superscript containing prime characters using a sequence of ' characters, e.g. f''. This syntax is supported by all three formats.

  • Use Planck constant for italic h in unicodeTable. There is no regular italic h in Unicode because this already existed in Planck constant.

  • Text.TeXMath.Shared: expose isRLSequence and toPrimes and new function isUppercaseGreek [API change, non-breaking].

  • OMML reader: consolidate adjacent texts with same style. This affects EStyled and EText elements. This way we get \mathbf{123abc456} in LaTeX output instead of \mathbf{123}\mathbf{abc}\mathbf{456}. See http://github.com/jgm/pandoc/discussions/10560 for discussion.

  • OMML writer:

    • Use m:eqArr rather than m:m for arrays with alternating right, left alignments (#209). (In other writers, TeX and MathML, we presume that these are aligned equations.)
    • Fix order of scr and sty elements (#253).
    • Use upright style by default for uppercase Greek.
  • MathML reader: properly handle mmultiscripts (#252).

  • MathML writer:

    • Improve formatting of aligned equations (#207). An EArray with alternating R,L alignments will be assumed to be aligned equations and rendered without padding.
    • Use mi for EMathOperator. Also: insert the unicode 0x2061 “function application” character after a math operator, unless it’s already there. Ensure that all 0x2061 characters are in mo rather than mi.
    • Use upright style for uppercase Greek by default (#255).
    • Fix invalid displaystyle attribute. The attribute goes on enclosing mstyle, not directly on mfrac.
  • TeX reader:

    • Parse primes as superscripts (#254). This seems to be how TeX behaves under the hood: f' is equivalent to f^{\prime}.
    • Avoid implicit pairing of delimiters (reverts #172).
    • Use EIdentifier instead of ESymbol Alpha for \ell, etc (#256).
    • Use Pun for primes in TeX reader, revise type of toPrimes. This will ensure that mo is used for primes in MathML.
  • test-texmath: allow mathml in addition to mml to specify mathml.

  • Add binpath Makefile target.

texmath (0.12.8.13)

  • Remove special override for \perp in Text.TeXMath.Readers.TeX.Commands (#247). This caused \perp to be read as U+22A5 instead of U+27C2. This addresses the mismatch with the TeX writer (which associates \bot with U+22A5 and \perp with U+27C2).

  • Typst writer:

    • Fix several issues with accents and attachments (#245).
    • Fix handling of some EOver with combining accents (#245).
    • Escape backslash in text context (#245).

texmath (0.12.8.12)

  • TeX writer: render prime and superscripted prime as ' (#246).

  • TeX reader:

    • Don’t crash on array with \hline before blank cell (#244).
    • Skip whitespace in array column specifier (#244).
  • OMML writer:

    • Fix order of dPr attributes (#243).
  • Typst writer:

    • Escape commas (#242). Otherwise we can get bad results e.g. in fractions, when the commas separate arguments.
  • Require typst-symbols 0.1.7, update tests.

texmath (0.12.8.11)

  • TeX reader: Ignore @{..} and !{..} in array alignment specifiers (#241).

  • TeX reader: ignore \color instead of crashing (#225).

texmath (0.12.8.10)

  • TeX reader: allow \lVert .. \rVert to create an EDelimited (#238).

  • Typst writer: improved handling of primes (#239). Use ' instead of e.g. prime. Don’t put a space before primes.

  • Typst writer: improve rendering of EDelimited (#238).

  • Typst writer: use mid() for middle delimiters (#238).

texmath (0.12.8.9)

  • Parse TeX \mathbf as both bold and upright (#236).

texmath (0.12.8.8)

  • TeX reader: support unicode-math Greek symbols, e.g. \Alpha (#235). This includes symbols like \Alpha and \omicron that weren’t defined in original TeX.

  • Use typst-symbols 0.1.6

texmath (0.12.8.7)

  • TeX reader: convert Bin symbols to Ord when appropriate (#234). E.g. in ‘-3’, we should have an Ord rather than a Bin, so the spacing will be appropriate.

  • Pandoc writer: fix spacing inside EDelimited (#234). Previously spaces around binary operators were omitted when they occurred inside parens or brackets.

  • test-texmath: allow pandoc output.

texmath (0.12.8.6)

  • Typst writer: avoid redundant lrs (#233).

texmath (0.12.8.5)

  • Typst writer: use ASCII symbols when possible instead of symbol names (#232). E.g., + instead of plus. Add \ to characters needing escape. Enhance list of characters that need escaping.

  • Typst writer: fixed EBoxed output so it includes a border.

  • Handle \ddot better in conversion to typst (#231).

  • Use typst-symbols 0.1.5

texmath (0.12.8.4)

  • TeX reader: ignore \allowbreak (#230).

  • TeX reader: handle *{5}{lr} in array column specifier (#229).

  • OMML reader: allow m:e to be missing in m:nary (#228). Technically this is not allowed, according to the spec, but Word and LibreOffice seem to tolerate it.

texmath (0.12.8.3)

  • OMML writer: use “on” and “off” instead of “1” and “0” for m:CT_OnOff type. It is said that “1” and “0” work in Word but not PowerPoint.

texmath (0.12.8.2)

  • Typst writer: use binom instead of a fraction (jgm/pandoc#9063).

texmath (0.12.8.1)

  • Typst writer: several fixes (#223, Lleu Yang).

    • Escape quotes (") in inQuotes
    • Accent \8407 corresponds to arrow()
    • Write #none’s for matrices with blanks at the beginning of a row

texmath (0.12.8)

  • Expose Text.TeXMath.Shared [API change]

  • Typst writer: Fix bug where ‘s’ turned into ‘space’ (#219).

  • Typst writer: Fix handling of overline (#214).

  • Typst writer: Fix underbrace (#217).

  • Typst writer: Improve some accents (#216).

  • TeX writer: don’t include \\ on last line of matrix.

  • TeX writer: Remove escaping of spaces inside \text{}. It isn’t needed, and it causes problems in MathJax rendering.

  • TeX reader: allow empty matrices.

  • MathML writer: Fix rendering of vectors (#218).

  • Depend on external typst-symbols package.

texmath (0.12.7.1)

  • Typst writer:

    • Improve under/overbrace/bracket/line.
    • Fix bugs with super/subscript grouping (#212).
    • Fix case where super/subscript is on an empty element, by inserting a zws.

texmath (0.12.7)

  • Add typst writer. New module: Text.TeXMath.Writers.Typst.

  • TeX reader: Support multilined environment. Closes #210.

texmath (0.12.6)

  • MathML writer:

    • Use style with CSS as well as columnalign (#205). This seems to be needed by browser implementations of MathML.
    • Remove reliance on mstyle (#205). mstyle doesn’t seem to be supported any more, at least in browser implementations of MathML, and the documentation indicates that it is treated like mrow now that styles can go directly on child elements. This commit removes our use of mstyle. Instead of using mstyle, we change mathvariant attributes on descendent elements (and displaystyle attributes on direct children, in the case of fractions).
    • Extend our existing use of unicode replacements, since many implementations don’t properly handle mathvariant. We now get variant characters for mo, mn, and all elements that can sensibly take them, not just mi and mtext.
    • Omit mathvariant attribute unless we can’t find appropriate Unicode. When MathML is displayed by ODT, having BOTH a bold math Unicode character and a mathvariant=“bold” attribute seems to confuse it. (Browsers don’t care either way.) This gives us more compact and readable output, as well.

texmath (0.12.5.5)

  • Allow pandoc-types 1.23.

  • TeX reader: remove false positives for isConvertible (#204). “Convertible” symbols are those in which subscripts render under the symbol in display environments, and as subscripts in inline environments. Previously the TeX parser recognized all relation and binary symbols as convertible, which does not match TeX’s behavior.

  • TeX reader: Support \enspace (#203).

texmath (0.12.5.4)

  • OMML reader: fix treatment of eqArr (#196). This change also includes a change to the TeX writer: any array with an alternating sequence of R,L alignments will be rendered as an aligned environment (not just a single [R,L] as currently).

  • OMML writer: Add low line char ("_") to isBarChar (#193, Hagb). Closes jgm/pandoc#8152.

  • Eqn writer: use - for minus, cdots for cdots (#200).

texmath (0.12.5.3)

  • Eqn writer: avoid empty {} (#198). This causes an error, along the lines of
    eqn:<standard input>:73: error: syntax error
     context is
            } above { >>> } <<<
    
    which can be avoided if we use {""}.

texmath (0.12.5.2)

  • Fix bug in implementation of \mspace (#195).

texmath (0.12.5.1)

  • Compile texmath-server with -threaded. This should fix the crashes we have experienced.

  • Add apache style logging to web server.

  • Add more strictness in Unicode.ToTeX.

texmath (0.12.5)

  • TeX reader: Improve treatment of \operatorname (#147). We can now handle spaces, as in \operatorname{arg\,max}. We also now have a better fallback when the operator name contains content that can’t be turned into plain text. (In this case, we just pass through the contents, since EMathOperator takes a text argument.)

  • TeX: Support more \var... commands for greek letters (Albert Krewinkel). AMSmath defines \varGamma, \varDelta, \varTheta, \varLambda, \varXi, \varPi, \varSigma, \varUpsilon, \varPhi, \varPsi, and \varOmega, all of which are now parsed as unicode characters MATHEMATICAL ITALIC CAPITAL …. Also, \varsigma is now parsed as MATHEMATICAL SMALL FINAL SIGMA.

  • OMML writer: better handling for scaled delimiter symbols (#140). We now try to represent these using m:d when possible. This allows the parentheses to expand with the content; previously we’d often get small parentheses with large contents.

  • OMML reader:

    • Allow m:pos to be missing or lack an attribute in m:bar (#187).
    • Set the default value of pos to “bot” (Maximilian Meier).
    • Implement support for noBar fractions (#191, Meimax).
  • Add servant-based server with a JSON API.

  • Remove old cgi directory.

  • Improve test suite (#189). The existing test suite was a complicated mess, so that it was hard to add new tests. (One of the problems was that the same files were used as golden files for reader tests and as sources for writer tests.) This commit shifts the same tests to a new, easier to understand format, so that it will be simple to add new tests in the future. We now use the tasty test framework, and we use pretty-show to make the native golden tests easier to comprehend.

  • Add regression tests.

texmath (0.12.4)

  • TeX reader: handle hyperref better (#186). We don’t parse it as a link, but we pass its contents through rather than failing.

  • Update scripts and data in lib/ directory. These are not build dependencies, but they were used to produce some of the large tables in the source code. Fixed the scripts and Makefile to work with recent texmath and cabal. Removed two very large unicode data files that can be downloaded when needed. (This reduces the size of the source tarball considerably.) Remove lib/toascii (no longer used).

  • Update MMLDict using latest unicode.xml.

  • TeX reader: support siunitx \qty, \qtyrange, \unit (#185).

  • Remove Text.TeXMath.Compat. We can now safely require mtl >= 2.2.1.

  • Use symbolMap from ToTeX to shorten the long hardcoded symbols list. Now we only hard-code items that differ from what is in symbolMap. This reduces the code size by thousands of lines.

  • Unicode.ToTeX: export symbolMap [API change]. This uses the data in records to create a backwards mapping from TeX commands to Exps (ESymbol elements). This can replace most of the hardcoded list in the current TeX reader.

  • Split out TeXMath.Readers.TeX.Commands internal module. This makes the TeX reader shorter and should help compile times.

  • OMML reader: better handling of m:t nodes (#151). Previously we parsed an m:t element as an EIdentifier if it contains a single letter, but an EText TextNormal if it contains more than one. This gave bad results in some cases. It is better to reserve EText for the case where the m:nor property is specified for “normal text.”

  • Require base >= 4.11.

  • Remove network-uri flag from stack.yaml.

texmath (0.12.3.3)

  • OMML writer: use nary only for operators supported by LibreOffice (Albert Krewinkel). LibreOffice (and possibly Word, too) can handle only a small set of operators in an nary element.

  • TeX writer: use \xleftarrow, \xrightarrow where sensible (Albert Krewinkel). The commands are generated for expressions over or . Besides being more idiomatic, this change also prevents the generation of invalid LaTeX, as \leftarrow and \rightarrow are not math operators and hence may not be followed by \limits. Both commands are part of amsmath.sty.

  • TeX reader:

    • Improve angled-bracket support (Albert Krewinkel). The amsmath package allows \left< and \right> as alternatives to \left\langle and \right\rangle, respectively.
    • Ignore starred version of \tag (Albert Krewinkel).
    • Support \dots{c,b,m,i,o} from amsmath (#179).
    • Change symbol returned for \dots{b,i,m} from to (Albert Krewinkel).

texmath (0.12.3.2)

  • OMML writer: remove m:nor element in math operators (#178). This caused the document’s main font, rather than the math font, to be used in formatting operators, which is undesirable.

texmath (0.12.3.1)

  • MathML reader: don’t allow mfenced attributes to inherit (#177). When open and close attributes aren’t given on an mfenced, we should use defaults rather than inheriting these from a parent mfenced.

texmath (0.12.3)

  • TeX reader: implement logic to convert a Bin symbol to an Op when it occurs at the beginning of a group, or after an Open, Pun, or Op symbol. This will give much better results for unary - (#176).

  • OMML writer: fixed rendering of EDelimited (#173). We now properly render “middles” (separators).

texmath (0.12.2)

  • MathML input: support mmultiscripts element (#158, #100).

  • Make MathML tag/attr recognition case-insensitive (#158).

  • Pandoc writer: better handling of styling such as \mathrm (#145). Previously identifiers were always italic, no matter what styling was applied.

  • Ignore \tag in TeX input (#162).

  • TeX writer: avoid unneeded \left and \right for delimited. We don’t need \left and \right when the contents are “standard height.”

  • TeX reader: parse implicit EDelimited sections (#172). We now parse (x) as EDelimited, even though \right and \left are not used.

texmath (0.12.1.1)

texmath (0.12.1)

  • OMML writer: explicitly mark symbols as non-italic (#109). Otherwise, for some reason, they appear as italic by default.
  • Improve error messages in reading tex arrays.
  • Improve support for \bmod, \mod, etc. (#165). Allow them to take complex arguments like \left( 1 \right).
  • Improve support for \genfrac (#164).
  • Ignore \textstyle, \scriptstyle, \scriptscriptstyle, as we currently ignore \displaystyle.
  • Parse siunitx commands in reading tex (#157).
  • Improve handling of \not in reading tex (#161). Previously we only handled \not in front of certain symbols.
  • Support \pod and \pmod and clean up spacing and font for \mod and \bmod (#160).

texmath (0.12.0.3)

  • Allow pandoc-types 1.22.

texmath (0.12.0.2)

  • Allow pandoc-types 1.21.
  • Pandoc output: omit empty Emph for sub/superscript without base (#155).
  • tex writer: Use \overline{\overline{B}} instead of unicode double line accent (#153).

texmath (0.12.0.1)

  • OMML writer: Fix overline and accent rendering (#152).
  • OMML reader: Fix dropped arrows (#153). Add tests.

texmath (0.12)

  • Use Text instead of String in data types and functions (Christian Despres) [API change]. Note that there are still a few places where we unpack Text to String with a view pattern: performance could likely be increased with further rewriting.
  • Avoid use of !! with negative index (jgm/pandoc#5853).

texmath (0.11.3)

  • Use error instead of fail to allow building with ghc 8.8.
  • Test output: remove superfluous spaces after control sequences, superfluous groups, and unicode VARIATION SELECTOR 1.
  • renderTeX: add space between control sequence and any non-ASCII character. There are differences in behavior of isAlphaNum between different ghc versions that would affect test output otherwise.
  • charToLaTeXString: Ignore 65024 VARIATION SELECTOR 1 to avoid putting it literally in the output; it is used in mathml output and occurs in many of the test cases.
  • Add cabal.project.
  • Use actions rather than travis for CI.

texmath (0.11.2.3)

  • OMML reader: properly distinguish normal text from math (#136). If m:nor or m:lit is set in m:rPr, we interpret the contents as literal text and not as math.
  • TeX reader: use different symbol (_) for \underline (#142). This gets the right accent properties on MathML output, so that the underline is not lower than it should be.
  • TeX reader: Treat \bmod as a relational symbol rather than an operator (#143). This fixes spacing problems in several output formats.

texmath (0.11.2.2)

  • OMML writer: use m:nor for normal text (#135).

texmath (0.11.2.1)

  • OMML reader: Don’t collapse fName to a string (#133). This fixes cases where fName has some complexity, e.g. a subscript or limit.

texmath (0.11.2)

  • Improved handling of \mathop etc (#126). We now allow