Tags: MIT-IR/simeon
CLI:
Log messages in JSON format and de-duplicate extraction results
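The JSON logging half of this entry could look roughly like the sketch below. It is an illustration built only on the standard library, not simeon's actual code: the JsonFormatter class, the make_logger helper, and the field names in the payload are all assumptions.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Hypothetical formatter: render each log record as one JSON object."""

    def format(self, record):
        payload = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Attach the traceback text only when the record carries one.
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def make_logger(name="example-cli", stream=None):
    """Return a logger whose only handler writes JSON lines to `stream`."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.StreamHandler(stream)  # defaults to sys.stderr
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    logger.propagate = False
    return logger
```

One JSON object per line keeps the output easy to filter with line-oriented tools or to load into a log aggregator.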
CORE:
Make all simeon exceptions inherit from a base SimeonError exception
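A shared base class lets callers catch every package-specific failure with a single except clause. The sketch below shows the general pattern; only the SimeonError name comes from the entry above, while ExampleExtractError, ExampleLoadError, and run_step are hypothetical names used for illustration.

```python
class SimeonError(Exception):
    """Base class for the package's errors (name taken from the entry above)."""


class ExampleExtractError(SimeonError):
    """Hypothetical subclass standing in for an extraction failure."""


class ExampleLoadError(SimeonError):
    """Hypothetical subclass standing in for a load failure."""


def run_step(step):
    """Run a callable and report package errors without hiding other bugs."""
    try:
        step()
    except SimeonError as excp:
        # Any subclass lands here; unrelated exceptions still propagate.
        return "failed: {0}".format(excp)
    return "ok"
```

Unrelated exceptions such as KeyError are deliberately left uncaught, so programming errors are not mistaken for expected failures.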
EMAIL:
Handle empty email opt-in packages
EXTRACT:
Add a timestamp column to the extracted youtube and geoip data
Capture both load and query jobs' errors
Fix bug that failed to deduplicate IP addresses extracted using simeon-geoip
PUSH:
Upload table schemas as well during loading, if --update-description is given
REPORT:
Capture errors from merge queries
Use safe_cast() to account for blank strings in studentmodule
Add filters to the jinja2 environment of queries to dynamically check if tables exist
Export compiled SQL queries in files under the given --target directory
SPLIT:
Remove extra fhandle.write line to de-duplicate records from generated JSON log files
SQL:
Adapt person course to be more tolerant of missing tables
New query file for problem_check table
CORE:
Provide options for extra S3 credential attributes and bucket names
Use UTC time for default begin date
GEOIP:
Don't extract IPs that already have geolocation data, and update geoip when the merge matches a record
Add options to include additional conditions to WHEN MATCHED
Remove temp tables whenever an error occurs during merge
Continue processing (but raise warnings) when geolocation extraction raises an exception
CLI:
Invoke the cli tool using python -m
SQL:
Respect the given batch size when batching files by directory name for decryption
Add private function to check that a course folder has all required files for making .json.gz files
Wait for the entirety of the package to be decrypted if the encrypted files are not to be kept around
REPORT:
Add browser agent information to the generated person_course
Add pc_day_agent_counts and course_modal_agent to the default list of tables to generate before person_course
CLI:
Use a pager to display the help messages
0.0.17 release
CI/CD:
Add code analysis Action
CLI:
Add option to ignore course ID filtering
Replace namespace.no_create with .no_courses to skip filtering for courses
Do not allow abbreviated options
Provide an option to skip the reordering of table names when simeon report is invoked with a set of table names
Stop allowing abbreviated options in the subparsers of all scripts
Do not stop processing SQL bundles due to decryption failures unless --fail-fast is provided
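With Python's argparse, abbreviated long options are turned off through allow_abbrev=False, and the setting has to be repeated for each subparser because it is not inherited from the parent parser. The sketch below is a minimal illustration and not simeon's actual parser: the example-cli program name, the build_parser helper, and the --tables option are assumptions, while --fail-fast is borrowed from the entry above.

```python
import argparse


def build_parser():
    """Build a parser that rejects abbreviated long options at every level."""
    parser = argparse.ArgumentParser(prog="example-cli", allow_abbrev=False)
    subparsers = parser.add_subparsers(dest="command")
    # allow_abbrev must be passed again here; subparsers do not inherit it.
    report = subparsers.add_parser("report", allow_abbrev=False)
    report.add_argument("--tables", nargs="*", default=[])
    report.add_argument("--fail-fast", action="store_true")
    return parser
```

With this setup, `example-cli report --fail-fast` parses normally, while `example-cli report --fail` exits with an "unrecognized arguments" error instead of silently expanding to --fail-fast.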
REPORT:
Catch exceptions from making a secondary report and continue when generating multiple tables
Add simeon indicators to BigQuery job IDs, so those IDs can be queried from the information schema