Open Source Text Processing Software - Page 7

Text Processing Software

  • 1
    DOMIT! is a Document Object Model (DOM) XML parser for PHP, written purely in PHP. It is mostly compliant with the DOM Level 2 specification.
    Downloads: 0 This Week
  • 2
    DOMIT! RSS is an RSS parser for PHP, written purely in PHP. Unlike most existing PHP RSS clients, it uses a DOM XML parser -- DOMIT! -- to convert an RSS feed into a DOM document that can be traversed using the standard DOM methods.
    Downloads: 0 This Week
  • 3
    Devchekio is a GTK+ 2 FTP check in/out code editor. It was inspired by the use of DreamWeaver's system to access files like a library system.
    Downloads: 0 This Week
  • 4
    An experimental set of tools for text analysis and dictionary construction. One goal is to improve text input, for example on touchscreen devices, using dictionary-based symbolic on-screen keyboards.
    Downloads: 0 This Week
  • 5
    Distributed Proofreaders
    Project has moved to https://github.com/DistributedProofreaders/dproofreaders. Distributed Proofreaders is a web application intended to ease the process of converting public domain books and other printed materials into e-texts. The main site is at http://www.pgdp.net. By breaking the work into individual pages, many proofreaders can work on the same book at the same time. This significantly speeds up the proofreading and e-text creation process. When a proofreader elects to proofread a page for a particular project, the text and image file are displayed on a single webpage. This allows the text file to be easily reviewed and compared to the image file. The edited text is then submitted back to the site via the same webpage on which it was edited. Once all pages for a particular book have been processed, a concatenated text file is made available for final clean-up and submitted to a Project Gutenberg site.
    Downloads: 0 This Week
  • 6
    The DocBook Publishing Utilities are tools that make creating and publishing DocBook documents easier. The tools are: a Maven plug-in to transform HTML into XML (use after docbkx); an Eclipse DocBook table editor; and Eclipse wizards for initial DocBook files.
    Downloads: 0 This Week
  • 7
    JSESOFT-DB2PDF provides a transformer for a limited (but expanding) subset of DocBook to PDF. The transformation from DocBook is done via iText directly to PDF. Priority is given to predictability and stability rather than to completeness.
    Downloads: 0 This Week
  • 8
    The DocConversion project provides a distributed document conversion solution with a well-defined API that makes use of existing conversion tools and/or a centralized conversion server. This is part of the PRONIR research at http://www.pronir.nl
    Downloads: 0 This Week
  • 9
    Doco is a simple but feature-rich and powerful markup language for converting text documents into highly presentable and navigable web content.
    Downloads: 0 This Week
  • 10
    A modular system for extracting and converting Python docstrings into useful structured formats like HTML, XML, and TeX. Project inactive. Development taken over by Docutils, http://docutils.sourceforge.net/.
    Downloads: 0 This Week
  • 11
    EasyCite is an easy to use, complete citation project that will assist researchers and students in generating bibliographies and footers.
    Downloads: 0 This Week
  • 12
    (The Bliki engine Wikipedia API has now moved to http://code.google.com/p/gwtwiki ). Eclipse plugin: Converts Wikipedia syntax to HTML. Features: syntax highlighting, content outline and assist, templates, HTML preview and insert, Java/PHP link2wiki, PDF creation.
    Downloads: 0 This Week
  • 13
    EditAll is a simple yet powerful program that allows users to manipulate text with various options. Future versions will allow for text translation, spell-checking, and much more.
    Downloads: 0 This Week
  • 14
    For the latest version, please download from: http://www.splashportal.net/Editor
    Downloads: 0 This Week
  • 15
    preview-latex, the highly addictive and productive LaTeX previewing and folding tool for Emacs, has become part of the AUCTeX project at http://savannah.gnu.org/projects/auctex and has been integrated since version 11.80.
    Downloads: 0 This Week
  • 16
    Embeddable Predictive Text Library
    A C (and JavaScript) library providing predictive text functions. The API is very simple and provides dictionary autocomplete and partial/full matching. Sample cellphone-like examples are included.
    Downloads: 0 This Week
  • 17
    This is a very simple Perl script that takes a file with unknown line endings (UNIX, Mac, or DOS/Windows) and, by default, converts it to UNIX-style line endings.
    Downloads: 0 This Week
  • 18
    English C is a meta-language. The project has changed over time from a programming language that pretended to understand texts written in an English-like language to a self-describing language, as MIME and the C programming language are.
    Downloads: 0 This Week
  • 19
    Ents is a small Java package designed to simplify the process of converting XML entity references to character references and vice versa. Ents uses XML to specify lists of equivalent names and character references. Ents supports single character entities
    Downloads: 0 This Week
  • 20
    EsTexte is a text-to-HTML converter based on an intuitive text format akin to various wiki formats and ASCII text files. Written in Java, it can be used from the command line or from other Java programs.
    Downloads: 0 This Week
  • 21
    Prototype for a framework and user interface for combining various structured search and document clustering techniques.
    Downloads: 0 This Week
  • 22
    Project to create a unified FAQ XML format, with all applicable software to convert it to various formats, such as multiple forms of HTML, TeX, PDF, and text files. Useful for most "FAQ keepers" on various forums and discussion lists.
    Downloads: 0 This Week
  • 23
    This project is powered by FCKeditor and the iText library. It promotes iText to a new WYSIWYG editor level: FCKeditor is now not only an HTML editor but also a PDF editor. Here is the FCKitext with Korean font support... Please download the English version at http://
    Downloads: 0 This Week
  • 24

    FOray

    Modular XSL-FO Implementation for Java.

    FOray is an open-source XSL-FO publishing system that is suitable for converting XML content into PDF and other document formats. Although not yet fully conformant with the XSL-FO standard, it is very useful for many applications.
    Downloads: 0 This Week
  • 25
    General information, and a pack of tools for manipulating the Persian (Farsi) language and script, on different platforms and operating systems.
    Downloads: 0 This Week
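As an illustration of the line-ending conversion described in entry 17, here is a minimal Python sketch. It is not the listed Perl script: the function name `to_unix_line_endings` is hypothetical, and the approach (replace CRLF first, then any remaining lone CR) is an assumption about how such a tool might reasonably work.

```python
def to_unix_line_endings(data: bytes) -> bytes:
    """Convert DOS/Windows (CRLF) and classic Mac (CR) line endings to UNIX (LF).

    Hypothetical helper for illustration; not taken from the listed project.
    """
    # Replace CRLF first so the lone-CR pass does not turn one CRLF into two LFs.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
```

Working on bytes rather than text avoids guessing the file's character encoding; input that already uses LF passes through unchanged.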