Tags: MIT-IR/simeon

v0.0.26

chore: Add support for Python 3.12 and 3.13 and fix bug related to parsing --extra-args files

REPORT: Fix the person_enrollment_verified query to account for people whose only event is a deactivation

v0.0.25

Release v0.0.25

v0.0.24

CLI:

      Log messages in JSON format and de-duplicate extraction results
CORE:
      Make all simeon exceptions inherit from a base SimeonError exception
EMAIL:
      Handle empty email opt-in packages

EXTRACT:
      Add a timestamp column to the extracted youtube and geoip data
      Capture both load and query jobs' errors
      Fix bug that failed to deduplicate ip addresses extracted using simeon-geoip
PUSH:
      Upload table schemas as well during loading, if --update-description is given
REPORT:
      Capture errors from merge queries
      Use safe_cast() to account for blank strings in studentmodule
      Add filters to the jinja2 environment of queries to dynamically check if tables exist
      Export compiled SQL queries in files under the given --target directory

SPLIT:
      Remove extra fhandle.write line to de-duplicate records from generated JSON log files
SQL:
      Adapt person course to be more tolerant of missing tables
      New query file for problem_check table
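
The CORE change above (a shared base exception) can be illustrated with a minimal sketch. Only the name SimeonError comes from the release notes; the two subclasses and the helper functions below are hypothetical and are not taken from the simeon code base.

```python
class SimeonError(Exception):
    """Base class that every package-specific error derives from."""


# Hypothetical subclasses, named here only for illustration.
class DemoExtractError(SimeonError):
    """Raised when an extraction step fails."""


class DemoLoadError(SimeonError):
    """Raised when a loading step fails."""


def run_step(step):
    """Run a callable and describe any package error it raises."""
    try:
        step()
    except SimeonError as excp:
        # A single handler covers every subclass of the base exception.
        return f"{type(excp).__name__}: {excp}"
    return "ok"


def failing_load():
    """Hypothetical step that always fails with a package error."""
    raise DemoLoadError("table schema not found")
```

With a common base class, callers can catch SimeonError once instead of listing each error type, while unrelated exceptions still propagate.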

v0.0.23

PUSH:

    Handle file paths that are not unix glob patterns, but still contain asterisks
CLI:
    Let the CLI handler set the logging level of the Logger
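
The PUSH change above concerns file names that contain a literal asterisk. One plausible way to handle them, sketched with the standard library only, is to prefer an exact match before falling back to glob expansion. The function name expand_path is hypothetical, and this is not necessarily how simeon implements it.

```python
import glob
import os


def expand_path(path):
    """Return the files a path refers to, preferring a literal match."""
    if os.path.exists(path):
        # The asterisk is part of a real file name, not a wildcard.
        return [path]
    # Otherwise treat the path as a unix glob pattern.
    return sorted(glob.glob(path))
```

If the literal file does not exist, the path is expanded as usual, so ordinary glob patterns keep working.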

v0.0.22

CLI:

    Make simeon-youtube and simeon-geoip invokable as python modules

v0.0.21

CORE:

    Define the logger at the package level, not in the scripts
    Update batch_split_tracking_logs to fix any use of undefined variables

v0.0.20

LOG:

    Check to make sure that only dict objects from process_line are used for further processing.
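
The check described above can be sketched as a simple type guard. The process_line below is a hypothetical stand-in that parses JSON and returns the raw text on failure; the real function's signature and return values are not documented in these notes.

```python
import json


def process_line(line):
    """Hypothetical stand-in: parse one log line, or return it unparsed."""
    try:
        return json.loads(line)
    except ValueError:
        return line


def usable_records(lines):
    """Yield only the parsed results that are dict objects."""
    for line in lines:
        record = process_line(line)
        if not isinstance(record, dict):
            # Skip strings, lists, numbers and anything else that is not a dict.
            continue
        yield record
```

Valid JSON that is not an object (a list or a bare string, for example) is skipped in the same way as unparseable text.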

v0.0.19

CORE:

    Provide options for extra S3 credential attributes and bucket names
    Use UTC time for default begin date
GEOIP:
    Don't extract IPs that already have geolocation data, and update geoip when merge matches a record
    Add options to include additional conditions to WHEN MATCHED
    Remove temp tables whenever an error occurs during merge
    Continue processing (but raise warnings) when geolocation extraction raises an exception
CLI:
    Invoke the cli tool using python -m

v0.0.18

SQL:

    Respect the given batch size when batching files by directory name for decryption
    Add private function to check that a course folder has all required files for making .json.gz files
    Wait for the entirety of the package to be decrypted if the encrypted files are not to be kept around
REPORT:
    Add browser agent information to the generated person_course
    Add pc_day_agent_counts and course_modal_agent to the default list of tables to generate before person_course
CLI:
    Use a pager to display the help messages

v0.0.17

0.0.17 release

CI/CD:
    Add code analysis Action
CLI:
    Add option to ignore course ID filtering
    Replace namespace.no_create with .no_courses to skip filtering for courses
    Do not allow abbreviated options
    Provide an option to skip the reordering of table names when simeon report is invoked with a set of table names
    Stop allowing abbreviated options in the subparsers of all scripts
    Do not stop processing SQL bundles due to decryption failures unless --fail-fast is provided
REPORT:
    Catch exceptions from making a secondary report and continue when generating multiple tables
    Add simeon indicators to BigQuery job IDs, so those IDs can be queried from the information schema
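
The two CLI entries about abbreviated options can be illustrated with argparse. In the standard library, allow_abbrev is not inherited by subparsers, which is likely why the subparsers needed a separate change. The program name and parser layout below are assumptions for illustration and do not reproduce the actual simeon CLI definition.

```python
import argparse


def build_parser():
    """Build a small parser that rejects abbreviated long options."""
    parser = argparse.ArgumentParser(prog="demo", allow_abbrev=False)
    subparsers = parser.add_subparsers(dest="command")
    # allow_abbrev is not inherited, so each subparser needs it as well.
    report = subparsers.add_parser("report", allow_abbrev=False)
    report.add_argument("--fail-fast", action="store_true")
    return parser
```

With this setup, `--fail` is no longer silently read as `--fail-fast`; parse_args() rejects it as an unrecognized argument.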