Oct4Pie/zero-zerogpt

Zero-ZeroGPT

License: MIT React Material-UI

What is Zero-ZeroGPT?

Zero-ZeroGPT is a demonstration application that shows how replacing standard spaces with various Unicode space characters can affect the detection of AI-generated text by common AI detection tools such as GPTZero and ZeroGPT. The project aims to explore the limitations of current AI detection methods and to promote discussion about more robust processing techniques.

Note: This version is a fork of the original project. It has been enhanced by MasuRii to include full PDF document support, whereas the original application only supported plain and rich text.

Live Demo

Experience Zero-ZeroGPT in action (deployed via GitHub Pages): https://oct4pie.github.io/zero-zerogpt

AI Detection Approach

Most tools designed to identify text generated by AI models use several techniques:

  1. Pattern Analysis: Detects unusual word choices, repetitive patterns, and syntactic structures.
  2. Linguistic Analysis: Examines grammatical structures, coherence, and context to measure inconsistencies.
  3. Statistical Analysis: Compares the statistical distribution of words and phrases to identify anomalies.

Unicode Spacing Technique

Many AI detection pipelines appear to rely on the standard space (U+0020) as the word boundary when tokenizing text. Replacing those spaces with other Unicode space characters can disrupt that process in three ways:

  1. Tokenization Disruption: A tokenizer that splits only on standard spaces may not treat a Unicode space as a separator, so adjacent words can be merged into a single unfamiliar token.
  2. Statistical Alteration: Swapping the space characters changes the statistical features of the text, which makes it harder for the model to match the text against its learned patterns.
  3. Pattern Interference: Unicode spaces can interfere with the detection model's ability to identify typical text patterns.
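
The tokenization effect described above can be illustrated with a minimal sketch. The function names (`replaceSpaces`, `naiveTokenize`, `unicodeAwareTokenize`) are hypothetical and are not taken from the Zero-ZeroGPT source, and real detectors use far more elaborate tokenizers. The sketch only shows why a splitter that looks for U+0020 alone sees a single token, while a whitespace-aware splitter recovers the words.

```javascript
// Illustrative only: these names are assumptions, not the project's actual code.
const EM_SPACE = "\u2003"; // one of many Unicode space characters

// Replace every standard space (U+0020) with the given Unicode space.
function replaceSpaces(text, space = EM_SPACE) {
  return text.replace(/ /g, space);
}

// A naive tokenizer that treats only U+0020 as a word boundary.
function naiveTokenize(text) {
  return text.split(" ").filter(Boolean);
}

// A more tolerant tokenizer: in JavaScript, \s also matches Unicode space separators.
function unicodeAwareTokenize(text) {
  return text.split(/\s+/u).filter(Boolean);
}

const original = "this text has five words";
const modified = replaceSpaces(original);

console.log(naiveTokenize(original).length);        // 5
console.log(naiveTokenize(modified).length);        // 1
console.log(unicodeAwareTokenize(modified).length); // 5
```

The last line also hints at one possible countermeasure: normalizing or splitting on all Unicode whitespace before analysis. Note that `\s` does not match zero-width characters such as U+200B, so a robust normalizer would need to handle those separately.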

PDF Support

Developed by MasuRii, this fork adds PDF document processing to Zero-ZeroGPT, whereas the original tool was limited to plain and rich text input. Extraction is powered by pdfjs-dist, and pdf-lib handles high-fidelity generation.

Features:

  • Layout Preservation: Maintains the original document's structure, including text positioning, columns, and page dimensions.
  • Font Awareness: Intelligently maps original fonts to standard PDF fonts or compatible fallbacks to preserve the document's visual style.
  • Seamless Transformation: Apply any Unicode spacing pattern to the text while keeping the document layout intact.
  • Custom Font Embedding: Includes high-quality Noto Sans embedding for full Unicode support (including special spaces).
  • Column Detection: Automatically detects and preserves multi-column layouts (2-4 columns) commonly found in research papers and articles.
  • Color Preservation: Accurately extracts and reproduces text colors from the original document using advanced operator list parsing.
  • Hybrid Generation:
    • Layout Preserved Mode: Generates a PDF that mirrors the input (default).
    • Simple Text Mode: Generates a clean, simple text document (fallback).
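
As a rough sketch of the "Seamless Transformation" idea, the snippet below rewrites only the text of each extracted item and leaves its position data untouched. The item objects mimic the shape of the text items returned by pdfjs-dist's `getTextContent()`, which carry a `str` string and a `transform` matrix. The helper `applySpacingToItems` and the sample data are hypothetical, and the fork's actual pipeline may work quite differently.

```javascript
// Hypothetical helper: not taken from the project's source.
// Only `str` is rewritten; `transform` (the item's position matrix) is kept as-is,
// so a later drawing step could place the text where it originally appeared.
function applySpacingToItems(items, space) {
  return items.map((item) => ({
    ...item,
    str: item.str.replace(/ /g, space),
  }));
}

// Sample items shaped like pdf.js text items (values are made up).
const items = [
  { str: "Hello wide world", transform: [12, 0, 0, 12, 72, 720] },
  { str: "Second line", transform: [12, 0, 0, 12, 72, 704] },
];

const THIN_SPACE = "\u2009";
const spaced = applySpacingToItems(items, THIN_SPACE);

console.log(spaced[0].str.includes(" "));                // false
console.log(spaced[0].transform === items[0].transform); // true
```

Because different space characters have different advance widths, a real implementation would probably also need to account for the changed text width when drawing each item.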

Testing: The PDF feature suite is backed by a comprehensive testing strategy with 217 tests covering all utility functions. To run tests:

npm test

Limitations:

  • Text-Based Only: The feature supports text-based PDFs. Scanned documents or image-only PDFs are not supported (OCR is not implemented).
  • Font Subsetting: While font families and styles are preserved, exact custom font files are not re-embedded to avoid copyright and size issues; high-quality standard fallbacks are used instead.

Examples

Here are some visual examples demonstrating the effect of Unicode spacing on AI detection tools:

Example 1: Original text detected as AI-generated

Example 2: Text with Unicode spaces bypassing detection

Example 3: Another instance of detection evasion

Example 4: Comparison of different Unicode space effects

Installation and Usage

Prerequisites

  • Node.js (v14.0.0 or later)
  • npm (v6.0.0 or later)

Installation

  1. Clone the repository:

    git clone https://github.com/oct4pie/zero-zerogpt.git
    cd zero-zerogpt

  2. Install dependencies:

    npm install

Running Locally

  1. Start the development server:

    npm start

  2. Open your browser and navigate to http://localhost:3000

Usage Instructions

  1. Select Input Mode: Choose between "Plain Text", "Rich Text", or "PDF" using the toggle buttons.
  2. Input Content:
    • Plain/Rich Text: Enter or paste your text in the input field.
    • PDF: Click the "PDF" mode button, then drag & drop a file or click to upload.
  3. Apply Spacing: Experiment with different Unicode spaces using the preview cards or create your own combination.
  4. Export Results:
    • Copy: Click the copy icon on any card to copy the modified text.
    • Download (PDF Mode): When in PDF mode, click the download icon on any spacing card to generate a new PDF with that specific Unicode spacing applied.
  5. Use the "Clear Text" button to reset the application and clear any uploaded files.
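
For readers curious what a custom combination from step 3 might look like in code, here is a minimal sketch that cycles through a list of Unicode spaces, one per replaced space. The `applyPattern` function and the chosen pattern are assumptions for illustration; the app's own combination logic may differ.

```javascript
// Hypothetical pattern: en space, thin space, no-break space.
const pattern = ["\u2002", "\u2009", "\u00A0"];

// Replace each standard space with the next character in the pattern, wrapping around.
function applyPattern(text, spaces) {
  let i = 0;
  return text.replace(/ /g, () => spaces[i++ % spaces.length]);
}

const result = applyPattern("a b c d e", pattern);

// List the code point of every whitespace character that was inserted.
const codes = [...result]
  .filter((ch) => /\s/u.test(ch))
  .map((ch) => "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0"));

console.log(codes.join(" ")); // U+2002 U+2009 U+00A0 U+2002
```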

Contributing

We welcome contributions to Zero-ZeroGPT! Please follow these steps to contribute:

  1. Fork the repository
  2. Create a new branch: git checkout -b feature/your-feature-name
  3. Make your changes and commit them: git commit -m 'Add some feature'
  4. Push to the branch: git push origin feature/your-feature-name
  5. Submit a pull request

Please read our Contributing Guidelines for more details.

Disclaimer

This project does not promote plagiarism or the misuse of AI technology. It is intended solely for educational and demonstration purposes to show the limitations of current AI detection methods and encourage the development of more reliable techniques. Users are responsible for ensuring their use of this tool complies with relevant policies and regulations.

License

This project is licensed under the MIT License - see the LICENSE file for details.
