miyako/opc-parser

Dependencies and Licensing

  • The source code of this CLI tool is licensed under the MIT license.
  • See libopc for the licensing of libopc (BSD).

opc-parser

CLI tool to extract text from OOXML

text extractor for ooxml documents

 -i path  : document to parse
 -o path  : text output (default=stdout)
 -        : use stdin for input
 -r       : raw text output (default=json)
 -p pass  : password
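
The options above can be assembled into a command line from a script. The following is a minimal sketch, not part of the tool itself: the helper `build_args` is hypothetical, the executable name `opc-parser` is an assumption based on the repository name, and the order in which the flags are emitted is assumed not to matter.

```python
from typing import List, Optional


def build_args(
    input_path: Optional[str],
    output_path: Optional[str] = None,
    raw: bool = False,
    password: Optional[str] = None,
    exe: str = "opc-parser",  # assumption: name of the built executable
) -> List[str]:
    """Build an argument list from the documented flags (hypothetical helper)."""
    args = [exe]
    if input_path is None:
        args.append("-")  # "-": use stdin for input
    else:
        args += ["-i", input_path]  # "-i path": document to parse
    if output_path is not None:
        args += ["-o", output_path]  # "-o path": text output (default=stdout)
    if raw:
        args.append("-r")  # "-r": raw text output (default=json)
    if password is not None:
        args += ["-p", password]  # "-p pass": password
    return args


if __name__ == "__main__":
    # Only prints the command; pass the list to subprocess.run() to execute it.
    print(build_args("report.docx", raw=True))
```

Building a list rather than a single string avoids shell quoting issues when paths or passwords contain spaces.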

JSON (XLSX)

| Property | Level | Type | Description |
|---|---|---|---|
| document | 0 | | |
| document.type | 0 | Text | |
| document.pages | 0 | Array | =sheets |
| document.pages[].meta | 1 | Object | |
| document.pages[].meta.name | 1 | Text | sheet name |
| document.pages[].paragraphs | 1 | Array | =rows |
| document.pages[].paragraphs[].values | 2 | Array | =cells |
| document.pages[].paragraphs[].text | 2 | Text | JSON representation of .values |
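
As an illustration of the XLSX layout, the sketch below groups rows by sheet name. It rests on several assumptions: the sample data is invented, the `type` value `"xlsx"` is a guess, and it is not clear from the table whether `document` is a top-level key or the root object itself, so the hypothetical helper `sheets_to_rows` accepts both.

```python
import json


def sheets_to_rows(data: dict) -> dict:
    """Map each sheet name to its list of rows (hypothetical helper)."""
    # Assumption: "document" may be a wrapping key or the root object itself.
    doc = data.get("document", data)
    return {
        page["meta"]["name"]: [row["values"] for row in page["paragraphs"]]
        for page in doc["pages"]
    }


# Invented sample shaped after the property table; not real tool output.
rows = [["Name", "Qty"], ["Widget", 3]]
sample = {
    "document": {
        "type": "xlsx",  # assumption: the actual value is not documented
        "pages": [
            {
                "meta": {"name": "Sheet1"},
                "paragraphs": [
                    {"values": r, "text": json.dumps(r)} for r in rows
                ],
            }
        ],
    }
}

if __name__ == "__main__":
    print(sheets_to_rows(sample))
    # .text is described as the JSON representation of .values,
    # so decoding it should give the same cells back.
    first = sample["document"]["pages"][0]["paragraphs"][0]
    print(json.loads(first["text"]) == first["values"])
```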

JSON (PPTX)

| Property | Level | Type | Description |
|---|---|---|---|
| document | 0 | | |
| document.type | 0 | Text | |
| document.pages | 0 | Array | =slides |
| document.pages[].paragraphs | 1 | Array | |
| document.pages[].paragraphs[].text | 2 | Text | |
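
For PPTX output, the paragraph texts can be collected per slide. This is a sketch under the same assumptions as any consumer of this JSON: the sample is invented, the `type` value is a guess, and the hypothetical helper `slide_texts` tolerates `document` being either a wrapping key or the root object.

```python
def slide_texts(data: dict) -> list:
    """Return one list of paragraph texts per slide (hypothetical helper)."""
    # Assumption: "document" may be a wrapping key or the root object itself.
    doc = data.get("document", data)
    return [[p["text"] for p in page["paragraphs"]] for page in doc["pages"]]


# Invented sample shaped after the property table; not real tool output.
sample = {
    "document": {
        "type": "pptx",  # assumption: the actual value is not documented
        "pages": [
            {"paragraphs": [{"text": "Title"}, {"text": "Subtitle"}]},
            {"paragraphs": [{"text": "Agenda"}]},
        ],
    }
}

if __name__ == "__main__":
    # Number slides from 1 for display; the JSON itself carries no slide number.
    for number, texts in enumerate(slide_texts(sample), start=1):
        print(number, texts)
```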

JSON (DOCX)

| Property | Level | Type | Description |
|---|---|---|---|
| document | 0 | | |
| document.type | 0 | Text | |
| document.pages | 0 | Array | |
| document.pages[].paragraphs | 1 | Array | |
| document.pages[].paragraphs[].text | 2 | Text | |
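
For DOCX output, one way to get a single block of text is to join every paragraph across all entries of `pages`. The sketch below is an assumption-laden illustration: the table does not say what a page corresponds to in a DOCX file, the sample is invented, and `document_text` is a hypothetical helper, not a reproduction of what `-r` prints.

```python
def document_text(data: dict, sep: str = "\n") -> str:
    """Join all paragraph texts in document order (hypothetical helper)."""
    # Assumption: "document" may be a wrapping key or the root object itself.
    doc = data.get("document", data)
    return sep.join(
        p["text"] for page in doc["pages"] for p in page["paragraphs"]
    )


# Invented sample shaped after the property table; not real tool output.
sample = {
    "document": {
        "type": "docx",  # assumption: the actual value is not documented
        "pages": [
            {"paragraphs": [{"text": "Hello"}, {"text": "World"}]},
        ],
    }
}

if __name__ == "__main__":
    print(document_text(sample))
```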
