antonbalucha/text-processing

Text-Processing

This project performs basic text processing. It can download web pages, clean their text, and compute TF-IDF (term frequency-inverse document frequency) values.

Project modules

The project is divided into the following modules according to their functionality:

  • lib-api-classes - contains shared interfaces and classes
  • lib-calculator - contains classes that provide computations
  • lib-configuration - contains classes for reading configuration and connecting to the database, plus some helper classes
  • app-text-preparation - module that cleans and prepares text
  • app-tf-idf-processor - module that computes TF-IDF for each word of each downloaded text
  • app-web-downloader - module that downloads web pages
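
To illustrate what the TF-IDF step computes, here is a minimal, self-contained Java sketch. It is not taken from app-tf-idf-processor. The class name TfIdfSketch, the methods tokenize and tfIdf, the cleaning rule (lowercase, keep only letters and digits), and the weighting (tf = count / document length, idf = ln(N / df)) are all assumptions made for this example, and the real modules may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical example class; not part of this repository.
public class TfIdfSketch {

    // Assumed cleaning rule: lowercase, turn every run of characters that
    // are not letters or digits into a single space, then split on spaces.
    public static List<String> tokenize(String text) {
        String cleaned = text.toLowerCase(Locale.ROOT)
                .replaceAll("[^\\p{L}\\p{Nd}]+", " ")
                .trim();
        if (cleaned.isEmpty()) {
            return new ArrayList<>();
        }
        return Arrays.asList(cleaned.split(" "));
    }

    // Returns one map per document: term -> tf * idf, where
    //   tf(t, d) = occurrences of t in d / number of tokens in d
    //   idf(t)   = ln(number of documents / documents containing t)
    public static List<Map<String, Double>> tfIdf(List<List<String>> documents) {
        // Count in how many documents each term appears.
        Map<String, Integer> documentFrequency = new HashMap<>();
        for (List<String> document : documents) {
            for (String term : new HashSet<>(document)) {
                documentFrequency.merge(term, 1, Integer::sum);
            }
        }

        List<Map<String, Double>> result = new ArrayList<>();
        for (List<String> document : documents) {
            // Count term occurrences inside this document.
            Map<String, Integer> counts = new HashMap<>();
            for (String term : document) {
                counts.merge(term, 1, Integer::sum);
            }

            Map<String, Double> scores = new HashMap<>();
            for (Map.Entry<String, Integer> entry : counts.entrySet()) {
                double tf = (double) entry.getValue() / document.size();
                double idf = Math.log(
                        (double) documents.size() / documentFrequency.get(entry.getKey()));
                scores.put(entry.getKey(), tf * idf);
            }
            result.add(scores);
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> documents = Arrays.asList(
                tokenize("The cat sat on the mat."),
                tokenize("The dog sat on the log."));
        List<Map<String, Double>> scores = tfIdf(documents);
        for (int i = 0; i < scores.size(); i++) {
            System.out.println("Document " + i + ": " + scores.get(i));
        }
    }
}
```

In this sketch, a word that occurs in every document (such as "the" in the two sample sentences) gets a score of 0, because its idf is ln(1). Words that occur in only one document get a positive score.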

In conclusion

License

This project is provided under the Apache License 2.0.

Contact

If you have questions about Text Processing, suggestions for improvements, or any other feedback, you can contact me at projects@yss.sk.

Keywords

Java, Text Processing, simple, basic text processing, examples, clean text, TF-IDF, stemmer, downloader
