Tags · Mingu113/KanjiDictVN

trungnt2910.hannom.20230110-134950.kanjidic2.zip

feat: Huge rework

- Total rework of the fetching engine:
  + Instead of 8 threads in a pool processing the work sequentially,
    the whole Kanji bank is divided into 8 (nearly) equal chunks, and
    each thread sequentially downloads every page in its own chunk.
  + Instead of relying on web.archive.org as a proxy and fetching data
    as old as 2018, we now have access to the latest data from the
    original website through a Tor proxy.
  + New hvdic parsing logic. The messy code is replaced with an
    object-oriented approach. This allows type-safe scraping of the
    dictionary, as well as serializing the whole hvdic as JSON or
    another format for future use.
  + The old WebArchiveClient is still kept as a useful reference (I
    don't have the time or enthusiasm to make it a separate NuGet
    package yet).
- Refreshed hvcache with the new pages obtained by this method.
- A new out_vn folder is built.
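
The chunked fetching scheme described in the first bullet can be sketched roughly as follows. This is a minimal illustration in Python, not the project's actual code: `split_into_chunks`, `fetch_chunk`, `fetch_all`, and the placeholder `fetch_page` are hypothetical names, and the real download step (through the proxy) is replaced by a stub.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, parts):
    """Split items into `parts` contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks take one additional item each.
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def fetch_page(kanji):
    # Hypothetical stand-in for the real page download.
    return f"page for {kanji}"


def fetch_chunk(chunk):
    # Each worker walks its own chunk sequentially.
    return [fetch_page(kanji) for kanji in chunk]


def fetch_all(kanji_bank, workers=8):
    chunks = split_into_chunks(kanji_bank, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # One task per chunk; map() yields results in chunk order.
        results = pool.map(fetch_chunk, chunks)
    return [page for chunk_pages in results for page in chunk_pages]
```

Because each chunk is contiguous and `map()` preserves order, the combined result keeps the original ordering of the Kanji bank.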

Jan 10, 2023
b6f3db3

trungnt2910.hannom.20230109-152759.kanjidic2

fix: Read multiple meanings from same source

Jan 9, 2023
8767d50

trungnt2910.hannom.20230109-132817.kanjidic2

feat: Initial commit

Jan 9, 2023
f9808c3
