Skip to content
This repository was archived by the owner on Nov 19, 2025. It is now read-only.
This repository was archived by the owner on Nov 19, 2025. It is now read-only.

Error scraping a previously deleted record #90

@jonbiemond

Description

@jonbiemond

Upon running the load-db command a duplicate key violation is raised by the database when the script tries to insert a record that is marked as deleted.

I believe this is happening because on merge SQLAlchemy does not see previously deleted records due to the soft delete hook. A work around may be pre-selecting objects into the identity map.

Trace:

(.venv) root@bcitflex:~# flask --app bcitflex load-db
/root/.venv/lib/python3.11/site-packages/bcitflex/scripts/scrape_and_load.py:400: SAWarning: Object of type <PrerequisiteOr> not in session, add operation along 'Course.prerequisites_of' will not proceed
  session.commit()
Traceback (most recent call last):
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1965, in _exec_single_context
    self.dialect.do_execute(
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/engine/default.py", line 921, in do_execute
    cursor.execute(statement, parameters)
psycopg2.errors.UniqueViolation: duplicate key value violates unique constraint "pk_offering"
DETAIL:  Key (offering_id)=(964) already exists.


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/root/.venv/bin/flask", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/root/.venv/lib/python3.11/site-packages/flask/cli.py", line 1064, in main
    cli.main()
  File "/root/.venv/lib/python3.11/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/root/.venv/lib/python3.11/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.venv/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.venv/lib/python3.11/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.venv/lib/python3.11/site-packages/bcitflex/scripts/scrape_and_load.py", line 454, in load_db_command
    bcit_to_sql(db_url, all_subjects)
  File "/root/.venv/lib/python3.11/site-packages/bcitflex/scripts/scrape_and_load.py", line 441, in bcit_to_sql
    raise exc
  File "/root/.venv/lib/python3.11/site-packages/bcitflex/scripts/scrape_and_load.py", line 433, in bcit_to_sql
    count = load_courses(session, courses)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.venv/lib/python3.11/site-packages/bcitflex/scripts/scrape_and_load.py", line 400, in load_courses
    session.commit()
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 1923, in commit
    trans.commit(_to_root=True)
  File "<string>", line 2, in commit
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/orm/state_changes.py", line 139, in _go
    ret_value = fn(self, *arg, **kw)
                ^^^^^^^^^^^^^^^^^^^^
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 1239, in commit
    self._prepare_impl()
  File "<string>", line 2, in _prepare_impl
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/orm/state_changes.py", line 139, in _go
    ret_value = fn(self, *arg, **kw)
                ^^^^^^^^^^^^^^^^^^^^
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 1214, in _prepare_impl
    self.session.flush()
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 4179, in flush
    self._flush(objects)
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 4314, in _flush
    with util.safe_reraise():
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/util/langhelpers.py", line 147, in __exit__
    raise exc_value.with_traceback(exc_tb)
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 4275, in _flush
    flush_context.execute()
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/orm/unitofwork.py", line 466, in execute
    rec.execute(self)
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/orm/unitofwork.py", line 642, in execute
    util.preloaded.orm_persistence.save_obj(
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/orm/persistence.py", line 93, in save_obj
    _emit_insert_statements(
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/orm/persistence.py", line 1226, in _emit_insert_statements
    result = connection.execute(
             ^^^^^^^^^^^^^^^^^^^
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1412, in execute
    return meth(
           ^^^^^
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/sql/elements.py", line 515, in _execute_on_connection
    return connection._execute_clauseelement(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1635, in _execute_clauseelement
    ret = self._execute_context(
          ^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1844, in _execute_context
    return self._exec_single_context(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1984, in _exec_single_context
    self._handle_dbapi_exception(
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 2339, in _handle_dbapi_exception
    raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1965, in _exec_single_context
    self.dialect.do_execute(
  File "/root/.venv/lib/python3.11/site-packages/sqlalchemy/engine/default.py", line 921, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.IntegrityError: (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "pk_offering"
DETAIL:  Key (offering_id)=(964) already exists.

[SQL: INSERT INTO offering (offering_id, crn, instructor, price, duration, status, course_id, term_id, created_at, updated_at, deleted_at) VALUES (%(offering_id)s, %(crn)s, %(instructor)s, %(price)s, %(duration)s, %(status)s, %(course_id)s, %(term_id)s, now(), now(), %(deleted_at)s) RETURNING offering.created_at, offering.updated_at]
[parameters: {'offering_id': 964, 'crn': '49028', 'instructor': 'Chris Harris', 'price': 509.83, 'duration': '12 weeks', 'status': 'In Progress', 'course_id': 27, 'term_id': '202410', 'deleted_at': None}]
(Background on this error at: https://sqlalche.me/e/20/gkpj)

Metadata

Metadata

Assignees

No one assigned

    Labels

    backendChange to the backendbugSomething isn't workingdatabaseChange to how data is stored

    Projects

    Status

    Idea

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions