PdfToTextViaOCR (Basic plaintext extraction via OCR from bitmap images rendered from a PDF document. Contact: Radim Hatlapatka; e-mail: 208155@mail.muni.cz)
PDFTester * (Tests whether a PDF document contains multiple layers or page bitmaps, and/or whether it is born digital.)
Maxtract (Analyses a PDF document and extracts the mathematical expressions from it as a list of MathML structures. Contact: Volker Sorge; e-mail: v.sorge@cs.bham.ac.uk)
TeX2NLM (Identifies TeX math in a CDATA UTF-8 string and returns an NLM-conformant formula structure with dual streams: the original TeX and MathML. Contact: Nicolas Houillon; e-mail: nicolas.houillon@ujf-grenoble.fr)
EnhanceNLMTeXwMathML (Takes an XML file containing NLM-conformant formula structures with a <texmath> stream and adds MathML translations for all mathematical expressions encoded as LaTeX sources in the file. Contact: nicolas.houillon@ujf-grenoble.fr)
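
The kind of enrichment described for EnhanceNLMTeXwMathML can be illustrated with a minimal sketch. It is not the tool's actual implementation or interface. The function names `enhance_with_mathml` and `toy_tex_to_mathml` are hypothetical, the converter is a placeholder that only wraps the TeX source in an `mtext` element rather than translating it, and the element name `tex-math` (as used in the NLM/JATS tag sets) is an assumption that can be changed through the `tex_tag` parameter if the input uses `texmath` instead.

```python
import xml.etree.ElementTree as ET

MML_NS = "http://www.w3.org/1998/Math/MathML"
ET.register_namespace("mml", MML_NS)


def toy_tex_to_mathml(tex):
    """Hypothetical placeholder converter: wraps the TeX source in <mml:mtext>.

    A real service would call an actual TeX-to-MathML translator here.
    """
    math = ET.Element("{%s}math" % MML_NS)
    mtext = ET.SubElement(math, "{%s}mtext" % MML_NS)
    mtext.text = tex
    return math


def enhance_with_mathml(xml_text, convert=toy_tex_to_mathml, tex_tag="tex-math"):
    """Add a MathML stream next to every TeX stream that lacks one."""
    root = ET.fromstring(xml_text)
    math_tag = "{%s}math" % MML_NS
    # Materialise the element list first, because the tree is modified below.
    for parent in list(root.iter()):
        children = list(parent)
        if any(child.tag == math_tag for child in children):
            continue  # this formula already carries a MathML stream
        for child in children:
            if child.tag == tex_tag:
                parent.append(convert(child.text or ""))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = "<disp-formula><tex-math>x^2 + y^2</tex-math></disp-formula>"
    print(enhance_with_mathml(sample))
```

The original TeX stream is left untouched and the MathML is appended beside it, which mirrors the dual-stream structure described for TeX2NLM. Formulas that already contain a MathML element are skipped, so running the function twice does not duplicate streams.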