
Text process #80

Open
EzraBrand wants to merge 15 commits into main from text-process

Conversation

@EzraBrand
Owner

No description provided.

ezrabrand added 15 commits January 3, 2026 07:05
Replit-Commit-Author: Deployment
Replit-Commit-Session-Id: 3d2a0d38-5a66-4db9-9e12-d1749c458bf8
Replit-Commit-Checkpoint-Type: full_checkpoint
Replit-Commit-Event-Id: a38ea066-ccbb-4438-a512-222b78316495
Replit-Commit-Screenshot-Url: https://storage.googleapis.com/screenshot-production-us-central1/d45a8400-c767-4411-ae94-e845d0995313/3d2a0d38-5a66-4db9-9e12-d1749c458bf8/eheA5Hl
Replit-Commit-Deployment-Build-Id: 6d254b90-b3e0-48f5-808f-03083395a150
Replit-Helium-Checkpoint-Created: true
Update the OpenAI library dependency in package.json from version 6.9.1 to 6.15.0.

Replit-Commit-Author: Agent
Replit-Commit-Session-Id: fa381873-e272-4c61-9993-ef7ef13d0edb
Replit-Commit-Checkpoint-Type: full_checkpoint
Replit-Commit-Event-Id: fd0dcaba-49c5-4e8e-9f94-5215312e4d7b
Replit-Helium-Checkpoint-Created: true
Comment out OpenAI and chat-related imports and routes in server/routes.ts.

Replit-Commit-Author: Agent
Replit-Commit-Session-Id: fa381873-e272-4c61-9993-ef7ef13d0edb
Replit-Commit-Checkpoint-Type: full_checkpoint
Replit-Commit-Event-Id: 607050d5-074d-4d2e-bc72-96cbaafaec68
Replit-Commit-Screenshot-Url: https://storage.googleapis.com/screenshot-production-us-central1/8736e152-fbe4-4af8-bb25-5d3bc617e901/fa381873-e272-4c61-9993-ef7ef13d0edb/PClRtFY
Replit-Helium-Checkpoint-Created: true
…iling regex patterns

Enables an optimized text processing pipeline (V2) with pre-compiled regex patterns for enhanced performance, using a feature flag that checks both Node.js and browser environments.

Replit-Commit-Author: Agent
Replit-Commit-Session-Id: fa381873-e272-4c61-9993-ef7ef13d0edb
Replit-Commit-Checkpoint-Type: intermediate_checkpoint
Replit-Commit-Event-Id: 7a6c2fe0-a9d1-4b21-b3d4-f0e18ea52c67
Replit-Commit-Screenshot-Url: https://storage.googleapis.com/screenshot-production-us-central1/8736e152-fbe4-4af8-bb25-5d3bc617e901/fa381873-e272-4c61-9993-ef7ef13d0edb/PClRtFY
Replit-Helium-Checkpoint-Created: true
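
For readers without the diff open, the V2 commit above describes two ideas: a feature flag that is read in both Node.js and the browser, and regex patterns compiled once rather than on every call. A minimal sketch of that shape follows. The flag name `TEXT_PROCESSING_V2`, the use of `localStorage` on the browser side, and the two patterns are all assumptions for illustration; none of them are taken from this PR.

```typescript
// Hypothetical flag name; the PR does not state the real one.
const FLAG_NAME = "TEXT_PROCESSING_V2";

// Returns true when the flag is set to "true" in either environment.
function isV2Enabled(): boolean {
  // Node.js: read process.env when a `process` global exists.
  const env = (globalThis as any).process?.env;
  if (env && env[FLAG_NAME] === "true") return true;

  // Browser: read localStorage when it exists (assumed storage location).
  try {
    const storage = (globalThis as any).localStorage;
    if (storage && storage.getItem(FLAG_NAME) === "true") return true;
  } catch {
    // localStorage access can throw in restricted contexts; treat as off.
  }
  return false;
}

// Patterns are built once at module load instead of inside the function.
const MULTI_SPACE = /[ \t]{2,}/g;
const SPACE_BEFORE_PUNCT = /\s+([,.;:!?])/g;

// Illustrative normalization step that reuses the pre-compiled patterns.
function normalizeV2(text: string): string {
  return text
    .replace(MULTI_SPACE, " ")
    .replace(SPACE_BEFORE_PUNCT, "$1")
    .trim();
}
```

A caller would branch on `isV2Enabled()` and fall back to the older code path otherwise. Note that a later commit in this PR removes the V1 feature flag, so this branching would apply only to the intermediate state.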
Replit-Commit-Author: Agent
Replit-Commit-Session-Id: fa381873-e272-4c61-9993-ef7ef13d0edb
Replit-Commit-Checkpoint-Type: full_checkpoint
Replit-Commit-Event-Id: f24004b6-298e-43e5-a524-bf2e4e1b9c71
Replit-Commit-Screenshot-Url: https://storage.googleapis.com/screenshot-production-us-central1/8736e152-fbe4-4af8-bb25-5d3bc617e901/fa381873-e272-4c61-9993-ef7ef13d0edb/itjB6C8
Replit-Helium-Checkpoint-Created: true
Refactor and add tests to `tests/text-processing.test.ts` and `tests/text-processing-v1-v2-parity.test.ts` to verify parity and isolate V1/V2 module loading.

Replit-Commit-Author: Agent
Replit-Commit-Session-Id: fa381873-e272-4c61-9993-ef7ef13d0edb
Replit-Commit-Checkpoint-Type: intermediate_checkpoint
Replit-Commit-Event-Id: 06285c3e-fced-4bad-846e-45f01dc3fab6
Replit-Commit-Screenshot-Url: https://storage.googleapis.com/screenshot-production-us-central1/8736e152-fbe4-4af8-bb25-5d3bc617e901/fa381873-e272-4c61-9993-ef7ef13d0edb/E0iqLEO
Replit-Helium-Checkpoint-Created: true
Replit-Commit-Author: Agent
Replit-Commit-Session-Id: fa381873-e272-4c61-9993-ef7ef13d0edb
Replit-Commit-Checkpoint-Type: full_checkpoint
Replit-Commit-Event-Id: 642f9332-e550-4377-93c9-298a08910c3e
Replit-Commit-Screenshot-Url: https://storage.googleapis.com/screenshot-production-us-central1/8736e152-fbe4-4af8-bb25-5d3bc617e901/fa381873-e272-4c61-9993-ef7ef13d0edb/E0iqLEO
Replit-Helium-Checkpoint-Created: true
Replit-Commit-Author: Agent
Replit-Commit-Session-Id: fa381873-e272-4c61-9993-ef7ef13d0edb
Replit-Commit-Checkpoint-Type: full_checkpoint
Replit-Commit-Event-Id: e018b224-17cd-4a4b-b1ca-61a2f2bfb982
Replit-Commit-Screenshot-Url: https://storage.googleapis.com/screenshot-production-us-central1/8736e152-fbe4-4af8-bb25-5d3bc617e901/fa381873-e272-4c61-9993-ef7ef13d0edb/E0iqLEO
Replit-Helium-Checkpoint-Created: true
Refactors term replacement logic by moving hardcoded terms to a JSON configuration file and improving the regex builder to correctly handle terms ending with punctuation.

Replit-Commit-Author: Agent
Replit-Commit-Session-Id: fa381873-e272-4c61-9993-ef7ef13d0edb
Replit-Commit-Checkpoint-Type: intermediate_checkpoint
Replit-Commit-Event-Id: 8a6625dc-6003-4111-92ff-4e02fcc108d8
Replit-Commit-Screenshot-Url: https://storage.googleapis.com/screenshot-production-us-central1/8736e152-fbe4-4af8-bb25-5d3bc617e901/fa381873-e272-4c61-9993-ef7ef13d0edb/gYhR4L0
Replit-Helium-Checkpoint-Created: true
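
To illustrate why terms ending in punctuation need special handling: `\b` only matches between a word character and a non-word character, so a pattern like `\bR\.\b` fails when the period is followed by a space. One possible approach, sketched below, is to add `\b` only on the side of a term that starts or ends with a word character. The function names and the sample term map are hypothetical; the actual JSON config and regex builder in this PR may differ.

```typescript
// Escape characters that have special meaning inside a regular expression.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build one alternation covering every term (hypothetical helper).
function buildTermRegex(termList: string[]): RegExp {
  const alternatives = [...termList]
    // Longest first so a longer term wins over a shorter one that prefixes it.
    .sort((a, b) => b.length - a.length)
    .map((term) => {
      const body = escapeRegExp(term);
      // Add \b only where the term's edge is a word character.
      const start = /^\w/.test(term) ? "\\b" : "";
      const end = /\w$/.test(term) ? "\\b" : "";
      return `${start}${body}${end}`;
    });
  return new RegExp(alternatives.join("|"), "g");
}

// Replace each matched term with its mapped value.
function replaceTerms(text: string, map: Record<string, string>): string {
  const pattern = buildTermRegex(Object.keys(map));
  return text.replace(pattern, (match) => map[match] ?? match);
}
```

With a sample map such as `{ "R.": "Rabbi" }`, the term `R.` is matched even though it is followed by a space. Since `\w` is ASCII-only in JavaScript, terms in other scripts would need a different boundary check.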
Add tests for punctuation-terminated terms in the text processing module.

Replit-Commit-Author: Agent
Replit-Commit-Session-Id: fa381873-e272-4c61-9993-ef7ef13d0edb
Replit-Commit-Checkpoint-Type: full_checkpoint
Replit-Commit-Event-Id: 7412c834-783d-4a34-b460-5623e6a91271
Replit-Commit-Screenshot-Url: https://storage.googleapis.com/screenshot-production-us-central1/8736e152-fbe4-4af8-bb25-5d3bc617e901/fa381873-e272-4c61-9993-ef7ef13d0edb/Czng585
Replit-Helium-Checkpoint-Created: true
Replit-Commit-Author: Agent
Replit-Commit-Session-Id: fa381873-e272-4c61-9993-ef7ef13d0edb
Replit-Commit-Checkpoint-Type: full_checkpoint
Replit-Commit-Event-Id: 80de8db7-bb1d-47f7-95fc-faf0f4ccd288
Replit-Commit-Screenshot-Url: https://storage.googleapis.com/screenshot-production-us-central1/8736e152-fbe4-4af8-bb25-5d3bc617e901/fa381873-e272-4c61-9993-ef7ef13d0edb/Czng585
Replit-Helium-Checkpoint-Created: true
Install the Compromise library to compare its sentence boundary detection capabilities against the current implementation.

Replit-Commit-Author: Agent
Replit-Commit-Session-Id: fa381873-e272-4c61-9993-ef7ef13d0edb
Replit-Commit-Checkpoint-Type: full_checkpoint
Replit-Commit-Event-Id: bac35fe8-0910-414f-93ba-337d391153a9
Replit-Commit-Screenshot-Url: https://storage.googleapis.com/screenshot-production-us-central1/8736e152-fbe4-4af8-bb25-5d3bc617e901/fa381873-e272-4c61-9993-ef7ef13d0edb/Czng585
Replit-Helium-Checkpoint-Created: true
Update `splitHebrewText` in `shared/text-processing.ts` to correctly handle ellipses (`...`) by protecting them before splitting on periods, and add a corresponding test case in `tests/text-processing.test.ts`.

Replit-Commit-Author: Agent
Replit-Commit-Session-Id: fa381873-e272-4c61-9993-ef7ef13d0edb
Replit-Commit-Checkpoint-Type: full_checkpoint
Replit-Commit-Event-Id: d317ba3c-58f1-462b-b801-5ee8f98ed05d
Replit-Commit-Screenshot-Url: https://storage.googleapis.com/screenshot-production-us-central1/8736e152-fbe4-4af8-bb25-5d3bc617e901/fa381873-e272-4c61-9993-ef7ef13d0edb/Czng585
Replit-Helium-Checkpoint-Created: true
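
The ellipsis fix described above follows a protect, split, restore pattern. A minimal sketch is shown here, assuming a private-use placeholder character and a simple split on periods. This is not the body of `splitHebrewText`; the function name and the placeholder are hypothetical.

```typescript
// Private-use code point assumed not to occur in the input text.
const DOT_PLACEHOLDER = "\uE000";

function splitOnPeriodsKeepingEllipses(text: string): string[] {
  // 1. Protect: swap each dot in a run of three or more for the placeholder.
  const protectedText = text.replace(/\.{3,}/g, (run) =>
    run.replace(/\./g, DOT_PLACEHOLDER)
  );

  // 2. Split: take everything up to and including each remaining period,
  //    plus any trailing text that has no closing period.
  const pieces = protectedText.match(/[^.]*\.|[^.]+$/g) || [];

  // 3. Restore: turn placeholders back into dots and drop empty pieces.
  return pieces
    .map((piece) => piece.replace(/\uE000/g, ".").trim())
    .filter((piece) => piece.length > 0);
}
```

Because each dot is replaced one for one, a run of four or more dots is restored at its original length rather than normalized to three.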
…eferences

Update shared/text-processing.ts with detailed comments and architecture overview, removing V1 feature flag and related test file.

Replit-Commit-Author: Agent
Replit-Commit-Session-Id: fa381873-e272-4c61-9993-ef7ef13d0edb
Replit-Commit-Checkpoint-Type: intermediate_checkpoint
Replit-Commit-Event-Id: 285ac52f-910e-4b6a-96fe-826e1b1c1143
Replit-Commit-Screenshot-Url: https://storage.googleapis.com/screenshot-production-us-central1/8736e152-fbe4-4af8-bb25-5d3bc617e901/fa381873-e272-4c61-9993-ef7ef13d0edb/11wTPOR
Replit-Helium-Checkpoint-Created: true
Replit-Commit-Author: Agent
Replit-Commit-Session-Id: fa381873-e272-4c61-9993-ef7ef13d0edb
Replit-Commit-Checkpoint-Type: full_checkpoint
Replit-Commit-Event-Id: 9523d352-a8a0-48cf-bb6f-33b25ba11f96
Replit-Commit-Screenshot-Url: https://storage.googleapis.com/screenshot-production-us-central1/8736e152-fbe4-4af8-bb25-5d3bc617e901/fa381873-e272-4c61-9993-ef7ef13d0edb/Pjc5cjI
Replit-Helium-Checkpoint-Created: true
