zdimension/ade-scraper

ADE Scraper

Scrapes entries from ADE Planning (ADE Web Direct Planning system) to extract resource IDs.

Requirements

  • Python 3.11+
  • uv (recommended) or pip

Installation

Using uv (recommended)

Install uv if you haven't already, then run:

uv sync

Configuration

  1. Copy .env.example to .env:

    cp .env.example .env
  2. Edit .env and fill in your credentials:

    • URL_ROOT: The base URL of your ADE Planning instance, e.g., https://ade.example.com/
    • COOKIE: Your session cookie (JSESSIONID)
    • RPC_TOKEN: The RPC token for API requests

The last two parameters can be obtained by opening ADE in your web browser and inspecting the network requests made by the ADE web application. Look for any request whose URL ends with "Proxy".

The request should have a "Cookie" header containing a string that starts with "JSESSIONID=", and its body should end with something like:

|NAME|1|2|3|4|4|5|6|7|8|Zqcskna|6|9|7|10|0|10|1|10|2|10|3|10|4|10|5|10|6|124|11|12|0|0|9|1|10|49129|9|1|10|16|7|13|1|14|9|0|

Look for a 7-character string, usually starting with 'Z' (here, Zqcskna): that is the RPC token. If you get a GWT "Invalid identifier" error, either the cookie or the token has expired.

ADE structure overview

ADE behaves like a file system containing resources which can be either folders (categories) or files (trainees, instructors, classrooms, etc.). Each resource has a unique ID and a name. Resources can be nested within folders, creating a hierarchical structure.

A folder resource behaves like the union of all of its children for time-planning purposes.

The hierarchy starts at a special "root" section with no name and ID -100. Its direct children are the main sections such as trainee, instructor, classroom, etc. These have negative IDs starting at -1.

The root section behaves as if it were at depth -2, and its children (the main sections) are at depth -1.
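The depth convention is easier to see on a small tree. The sketch below is a hypothetical in-memory model of the structure described above, not the scraper's actual data model: the Resource class and the walk function are illustrative names, and the sample IDs are taken from the usage examples.

```python
from dataclasses import dataclass, field

ROOT_ID = -100   # the unnamed root section
ROOT_DEPTH = -2  # the root behaves as if it were at depth -2


@dataclass
class Resource:
    id: int
    name: str
    is_folder: bool = False
    children: list["Resource"] = field(default_factory=list)


def walk(node: Resource, max_depth: int, depth: int = ROOT_DEPTH):
    """Yield (depth, resource) for each descendant, down to max_depth inclusive."""
    child_depth = depth + 1
    if child_depth > max_depth:
        return
    for child in node.children:
        yield child_depth, child
        yield from walk(child, max_depth, child_depth)


root = Resource(ROOT_ID, "", is_folder=True, children=[
    Resource(-1, "trainee", is_folder=True, children=[
        Resource(101, "John Doe"),
        Resource(102, "Jane Smith"),
    ]),
    Resource(-2, "instructor", is_folder=True),
])

# Depth -1 stops at the main sections; depth 0 also includes their children.
print([r.name for _, r in walk(root, -1)])  # ['trainee', 'instructor']
print([r.name for _, r in walk(root, 0)])   # ['trainee', 'John Doe', 'Jane Smith', 'instructor']
```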

Basic usage

When no path is provided, the script starts at the root with a maximum depth of -1. The -l flag displays the results as a table.

uv run python main.py -l
 Browsing root (-100): total 6 , got [6]
  ID  Name
----  ----------
  -1  trainee
  -2  instructor
  -3  classroom
  -5  category5
  -6  category6
  -8  category8

We can then list the contents of a specific main section, e.g., trainee:

uv run python main.py -l -d 0 trainee
 Browsing root (-100): total 6 , got [6]
 Browsing trainee (-100): total 45 , got [45]
   ID  Name
-----  ---------------------------------------------
  101  John Doe
  102  Jane Smith
...

Scraping the whole "category5" section recursively and saving to output.json:

uv run python main.py category5 -o output.json

CLI arguments

Note: the .env settings can also be passed on the command line.

script [--depth/-d DEPTH] [--only {files,folders,both}] [--list/-l] [--output/-o file.json] [path]

  • --depth/-d DEPTH: Maximum depth (inclusive) to recurse. Default is -1 (main sections) if no path is given, otherwise 100000.
  • --only {files,folders,both}: Filter to include only files, only folders, or both. Default is folders if no path is given, otherwise files.
  • --list/-l: List results in a table format instead of writing to JSON.
  • --output/-o file.json: Output results to a JSON file. Default is output.json.
  • path: The starting path in ADE Planning (e.g., trainee, instructor/thing123, etc.). If omitted, starts at root.
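The path-dependent defaults can be illustrated with a minimal argparse sketch. It mirrors the options listed above but is not the actual main.py, which may be written differently. The build_parser and parse_args names are hypothetical, and the options for passing .env settings are left out because their names are not documented here.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="script")
    # Defaults that depend on whether a path was given are left as None
    # and resolved after parsing.
    parser.add_argument("-d", "--depth", type=int, default=None)
    parser.add_argument("--only", choices=["files", "folders", "both"], default=None)
    parser.add_argument("-l", "--list", action="store_true")
    parser.add_argument("-o", "--output", default="output.json", metavar="file.json")
    parser.add_argument("path", nargs="?", default=None)
    return parser


def parse_args(argv=None) -> argparse.Namespace:
    args = build_parser().parse_args(argv)
    if args.depth is None:
        # -1 (main sections) without a path, otherwise effectively unlimited
        args.depth = -1 if args.path is None else 100000
    if args.only is None:
        args.only = "folders" if args.path is None else "files"
    return args


args = parse_args(["category5", "-o", "output.json"])
print(args.depth, args.only, args.output)  # 100000 files output.json
```

Resolving the defaults after parsing keeps the two cases (path given or omitted) in one place, which argparse's static default= values cannot express on their own.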

License

See LICENSE file for details.
