Open Source Linux Text Processing Software - Page 4

Text Processing Software for Linux

  • 1
    gleditor

    A small programmer's editor.

    A small programmer's editor with syntax highlighting, extended search features, and code completion (Ctrl+Space). Supported languages: HTML, SQL, Pascal, C/C++, C#, Java, BASIC, JavaScript, CSS, PHP, Python.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 2
    OOoLatex is no longer maintained. Please consider using TexMaths (http://roland65.free.fr/texmaths/). OOoLatex is a set of macros designed to provide LaTeX support in OpenOffice. Complex equations can be inserted as images, with the LaTeX code saved in the image attribute, while simpler equations are expanded into symbol characters and inserted as text.
    Downloads: 14 This Week
    Last Update:
    See Project
  • 3
    Emerald Text Editor (jEditor)

    Emerald Text Editor is a tabbed text editor with heavy customizability

    Emerald Text Editor (Emerald Editor, or Emerald as I call it), formerly called jEditor, is a text editor similar to Notepad in that it lets you edit text, but it uses tabbed panes so you can have multiple tabs open and edit several files at once. Emerald Text Editor also comes with a toolbar that shows how quickly you are typing and how many characters are in your current document. The program is also customizable, meaning you can change some of its main features. The name was changed to fit a future naming scheme I'm going to have.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 4
    SciTECO

    Advanced TECO dialect and interactive screen editor based on Scintilla

    SciTECO is an interactive TECO dialect, similar to Video TECO. It also adds features from classic TECO-11, as well as unique new ideas. Project development takes place here: https://git.fmsbw.de/sciteco The download archive is mirrored at Sourceforge, but for nightly builds check out: https://sciteco.fmsbw.de/downloads/nightly/
    Downloads: 7 This Week
    Last Update:
    See Project
  • 5
    Web Book Downloader

    Download websites as e-book: pdf, txt, epub.

    This application lets the user download chapters from a website in three ways: from a table of contents; from a range (first chapter address, last chapter address); or by crawling from the first chapter to chapter n. In the settings you can customize the language and the input (website) encoding; for simplicity, the output uses the same encoding. To add your language, add a new class to the strings package and new fields to the Settings class and the GUI menu (initialize method).
    Downloads: 7 This Week
    Last Update:
    See Project
  • 6
    RefDB is a reference database and bibliography tool for SGML, XML, and LaTeX documents, sort of a Reference Manager or BibTeX for markup languages. It is portable and known to run on Linux, Free/NetBSD, OSX, Solaris, and Windows/Cygwin.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 7
    Unihanconver

    Traditional/Simplified Chinese conversion with CLI or GUI

    Tool to convert between Traditional/Simplified Chinese directly in Unicode (not GB/Big5 conversion). It is written in Perl and does not use any external libraries. It provides a command-line utility as well as a GTK+ interface for X Window.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 8
    NunniMCAX is a minimal (19 KB) C library for parsing XML. The API resembles SAX: it is sequential and event-driven. The parser strives to verify that the XML is well-formed, but performs no validation. NunniMCAX's FSM has been generated using NunniFSMGen.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 9
    Mathematics formula renderer for Microsoft Word. Easy to use and really fast. Also includes a mathematics drawing toolbar for Microsoft Word and a math exercises storage database for Microsoft Access. Available in French and English.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 10
    PyRtfLib is a Python library that provides an RTF parser and a few translators, such as RTF to HTML and RTF to plain text.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 11
    cpDetector is a proxy for codepage detection of documents. It delegates to multiple instances that try to detect the codepage by different techniques. A command-line executable is shipped that lets you sort documents by codepage.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 12

    odtPHP

    Open Document Templating System for PHP

    OdtPHP is a PHP library designed to use an OpenDocument file as a template for PHP. It's a kind of PHPLib for OpenOffice documents.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 13
    DocWire SDK

    Award-winning modern data processing SDK in C++20

    DocWire SDK, a standout C++20, AI-driven data processing tool, has received an award from SourceForge and strong backing from Microsoft. It handles nearly 100 file types, empowering efficient text extraction, web data extraction, and document analysis. For businesses, the shift to DocWire SDK signifies a leap forward. It promises comprehensive document format support and the ability to extract valuable insights from email boxes, databases, and websites using cutting-edge AI. DocWire SDK aims to expand its capabilities, focusing on versatile data extraction, platform support, and seamless integration with various systems. DocWire SDK is dedicated to streamlining data processing, reducing development time and costs, and harnessing the potential of AI. Its advancements promise a superior experience compared to its predecessor, DocToText.
    Leader badge
    Downloads: 9 This Week
    Last Update:
    See Project
  • 14
    NoteCase is a portable (Linux, Win32) hierarchical notes manager (aka outliner) coded in C++ using the GTK+ toolkit. The project is inactive as of 2008/12/09; read more at: http://factoriel.blogspot.com/2008_12_01_archive.html
    Downloads: 9 This Week
    Last Update:
    See Project
  • 15

    dbacl - digramic Bayesian classifier

    commandline multiclass email and text filter

    dbacl is a general purpose digramic Bayesian text classifier. It can learn text documents you provide, and then compare new input with the learned categories. It can be used for spam filtering, or within your own shell scripts. Sometimes it plays che
    Leader badge
    Downloads: 9 This Week
    Last Update:
    See Project
  • 16
    A C++ library to read and write PDF files, plus a GUI editor.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 17
    A Perl script that splits a long HTML file into separate inter-linked pages, according to the headings in the original file. Useful for maintaining both a print version and a browsable version of a site.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 18
    The Xerlin project is a Java™ based XML editor that can run on any Java 2 virtual machine. The application is extensible via custom editor interfaces. Xerlin can be used to provide simple, intuitive interfaces for users who know nothing about XML.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 19
    JODReports is a solution for generating dynamic documents and reports in Java based on the OpenDocument format (ODF). Templates can be easily composed with a word processor such as OpenOffice.org Writer. Data sources include POJOs and XML.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 20

    omegat-plugins

    OBSOLETE AS OF OMEGAT 3.0.3. DO NOT USE.

    OBSOLETE AS OF OMEGAT 3.0.3. DO NOT USE. Third-party plugins for OmegaT (https://sourceforge.net/projects/omegat)
    Downloads: 2 This Week
    Last Update:
    See Project
  • 21
    ansifilter

    ANSI sequence filter

    Ansifilter handles text files containing ANSI terminal escape codes. The command sequences may be stripped or be interpreted to generate formatted output (HTML, RTF, TeX, LaTeX, BBCode and Pango Markup).
    Downloads: 4 This Week
    Last Update:
    See Project
  • 22
    Conky GUI
    Conky GUI eases the customization of Conky configuration files.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 23
    PyRTF is a pure python module for the efficient creation of RTF documents.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 24
    TEA is a text editor that provides a wide range of text-processing functions (over 100) and syntax highlighting. There are two branches of TEA: Qt-based and GTK-based.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 25
    RTF to HTML converter for use both with your applications and as a standalone tool. Small and fast. Processes tables better than any other tool I've seen.
    Downloads: 7 This Week
    Last Update:
    See Project
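
To illustrate the kind of processing the ansifilter entry above describes, here is a minimal Python sketch of the "strip" case: removing ANSI terminal escape sequences from text. It is not ansifilter's code and does not use its interface; the names ANSI_CSI and strip_ansi are hypothetical, and the pattern only covers CSI sequences (ESC followed by "["), which include the common color and cursor codes.

```python
import re

# Hypothetical helper, not part of ansifilter.
# A CSI sequence is ESC '[' followed by optional parameter bytes (0x30-0x3F),
# optional intermediate bytes (0x20-0x2F), and one final byte (0x40-0x7E).
ANSI_CSI = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def strip_ansi(text: str) -> str:
    """Return text with CSI escape sequences removed."""
    return ANSI_CSI.sub("", text)


if __name__ == "__main__":
    sample = "\x1b[1;31mError:\x1b[0m file not found"
    print(strip_ansi(sample))  # prints "Error: file not found"
```

Interpreting the sequences to produce formatted output such as HTML or RTF, as the entry mentions, would require tracking the state each code sets rather than discarding it.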