
zaid404/video2pdf2csv


Video Table/Text Extraction Tool

This repository contains tools and scripts for extracting tables or text from YouTube videos. The extraction process involves multiple steps, including video download, screenshot capture, PDF generation, OCR processing, and table extraction.

Workflow Overview

  1. Download the video using yt-dlp.
  2. Capture screenshots at specified intervals using FFmpeg.
  3. Combine screenshots into a PDF file using PyPDF2.
  4. (Optional) Perform OCR on the PDF to extract text using OCRmyPDF.
  5. Extract tables or text from the OCR-processed PDF using pytesseract. For more accurate table extraction, Adobe Acrobat’s PDF to Excel tool is recommended instead.
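
The command-line steps above (1, 2, and 4) can be sketched as a few small Python helpers that only build the argument lists, which makes it easy to inspect them before running anything. This is a minimal sketch, not the repository's actual scripts: the function names, the file names (`video.mp4`, `shots/`, `frame_%04d.png`), and the 10-second interval are all assumptions chosen for illustration. It assumes `yt-dlp`, `ffmpeg`, and `ocrmypdf` are installed and on the `PATH`.

```python
import subprocess


def download_cmd(url: str, video_path: str) -> list[str]:
    """Step 1: yt-dlp command that saves the video to video_path (assumed name)."""
    return ["yt-dlp", "-o", video_path, url]


def screenshot_cmd(video_path: str, out_dir: str, interval_s: int) -> list[str]:
    """Step 2: FFmpeg command that writes one frame every interval_s seconds.

    The fps filter with a rate of 1/interval_s emits one frame per interval.
    The output pattern frame_%04d.png is a hypothetical naming scheme.
    """
    pattern = f"{out_dir}/frame_%04d.png"
    return ["ffmpeg", "-i", video_path, "-vf", f"fps=1/{interval_s}", pattern]


def ocr_cmd(pdf_in: str, pdf_out: str) -> list[str]:
    """Step 4 (optional): OCRmyPDF command that adds a text layer to the PDF."""
    return ["ocrmypdf", pdf_in, pdf_out]


def run_all(commands: list[list[str]]) -> None:
    """Run each command in order and stop at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

A possible call sequence would be `run_all([download_cmd(url, "video.mp4"), screenshot_cmd("video.mp4", "shots", 10)])`, with the `shots` directory created beforehand. Steps 3 and 5 (building the PDF and extracting tables) are left out here because they depend on the Python libraries named in the list rather than on a single command.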

About

parsing video to table
