Bulk pull, transform, push #1329
Unanswered
brianbruggeman
asked this question in
Q&A
Replies: 1 comment 2 replies
Hey @brianbruggeman, interesting use cases! Regarding the error. I think the trait bound of Then, you can create the stream by
Context:
I'm working on a tool that lets my team stream test data from a number of different sources to several different sinks (e.g. s3, db, file system, spark, kafka, etc.). In the process, I also need to transform some of the data for testing (e.g. pseudonymization and/or anonymization of personal data). In a non-testing scenario, I figure I should probably also have the option of just a straight copy/sync.
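To make the shape of that tool concrete, here is a minimal sketch of a source-to-sink copy with an optional transform. Everything in it is an assumption for illustration: the Sink trait, MemorySink, and sync are hypothetical names, rows are plain strings, and a real sink (s3, kafka, a database) would be async and carry typed rows.

```rust
/// Hypothetical sink abstraction; real targets would be s3, a db, kafka, etc.
pub trait Sink {
    type Error;
    fn push(&mut self, row: String) -> Result<(), Self::Error>;
}

/// In-memory sink standing in for a real target.
#[derive(Default)]
pub struct MemorySink {
    pub rows: Vec<String>,
}

impl Sink for MemorySink {
    type Error = std::convert::Infallible;

    fn push(&mut self, row: String) -> Result<(), Self::Error> {
        self.rows.push(row);
        Ok(())
    }
}

/// Stream rows from any iterator into a sink and return how many were pushed.
/// `transform = None` is the straight copy/sync case.
pub fn sync<I, S, F>(source: I, transform: Option<F>, sink: &mut S) -> Result<usize, S::Error>
where
    I: IntoIterator<Item = String>,
    S: Sink,
    F: Fn(String) -> String,
{
    let mut count = 0;
    for row in source {
        // Apply the transform only when one was supplied.
        let row = match &transform {
            Some(f) => f(row),
            None => row,
        };
        sink.push(row)?;
        count += 1;
    }
    Ok(count)
}
```

A straight copy would be called as `sync(rows, None::<fn(String) -> String>, &mut sink)`; the explicit type on `None` is needed because the compiler cannot infer `F` without a closure.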
Initial Approach:
I had started with sqlx, and I had a solution that allowed me to stream from a source database, but I quickly ran into ergonomic issues trying to create generic push methods. It looks to me like sea-orm should be able to solve the ergonomic issues. With raw sqlx, I had created a (set of) tokio channel(s) to do the following operations: pull, transform, capture metadata/metrics, push, validate. To allow for parallelization, I created an enumeration (Record) which had an entry for each of the types of table rows I wanted to process. This worked really well for a pull. This took the form (more or less) of:
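The original snippet did not survive on this page. As a rough illustration of the channel-plus-enum idea only, here is a sketch that uses std::sync::mpsc and threads in place of tokio channels, and covers just the pull, transform, and push stages. The User and Order row types, the transform function, and run_pipeline are hypothetical names, not the code from the post.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical row types; the real tables are not shown in the post.
#[derive(Debug, Clone, PartialEq)]
pub struct User {
    pub id: u32,
    pub email: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Order {
    pub id: u32,
    pub user_id: u32,
}

/// One variant per table row type, so a single channel can carry every table.
#[derive(Debug, Clone, PartialEq)]
pub enum Record {
    User(User),
    Order(Order),
}

/// Transform stage: swap personal data for a placeholder derived from the id.
pub fn transform(record: Record) -> Record {
    match record {
        Record::User(u) => Record::User(User {
            id: u.id,
            email: format!("user{}@example.invalid", u.id),
        }),
        other => other,
    }
}

/// Wire pull -> transform -> push with channels and return what the sink saw.
pub fn run_pipeline(source: Vec<Record>) -> Vec<Record> {
    let (pulled_tx, pulled_rx) = mpsc::channel::<Record>();
    let (pushed_tx, pushed_rx) = mpsc::channel::<Record>();

    // Pull stage: the Vec stands in for a database stream.
    let puller = thread::spawn(move || {
        for record in source {
            if pulled_tx.send(record).is_err() {
                break;
            }
        }
        // Dropping pulled_tx here closes the channel and ends the next stage.
    });

    // Transform stage: runs concurrently with the pull.
    let transformer = thread::spawn(move || {
        for record in pulled_rx {
            if pushed_tx.send(transform(record)).is_err() {
                break;
            }
        }
    });

    // Push stage: collecting into a Vec stands in for writing to a sink.
    let sink: Vec<Record> = pushed_rx.into_iter().collect();
    puller.join().expect("pull stage panicked");
    transformer.join().expect("transform stage panicked");
    sink
}
```

With a single producer per channel, records arrive at the sink in the order they were pulled; adding more workers per stage would trade that ordering for throughput.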
Problem:
I realized a little late that I was going to need sqlx's .bind(...) for a push. This causes a bunch of problems for a generic push and would really expand my derive macro. Rather than pushing through, I thought maybe I could find another solution. I stumbled on sea-orm, and the interface seemed pretty close to what I wanted for the insert. So I replaced my model above with the expected sea-orm Model version and a DeriveEntityModel. This caused my where clause to require a change, and it ended up here:
This was fine, except when I called pull_table_data, it gave me an error: EntityTrait isn't implemented. Okay, easy... just add EntityTrait as a restriction.
Unfortunately, now there's a really strange (to me) error:
I'm not sure why there's a problem. Any chance someone can help me understand why I need to implement EntityTrait for MyRecord? It seems like they really should not be related.
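Since the code and the error text are missing from this page, the following is only a guess at the cause. In sea-orm, the Model struct and the Entity are separate types: DeriveEntityModel on a Model generates a unit struct Entity and implements EntityTrait for that Entity, not for the Model, while the Model gets ModelTrait with an associated Entity type. A bound of T: EntityTrait therefore cannot be satisfied by the row struct itself. The sketch below uses simplified stand-in traits (the real ones carry more associated types and methods), and the my_record module, pull_table_name, and table_name_for are hypothetical names.

```rust
// Simplified stand-ins for the sea-orm traits, for illustration only.
pub trait EntityTrait: Default {
    type Model: ModelTrait<Entity = Self>;
    fn table_name(&self) -> &'static str;
}

pub trait ModelTrait: Clone {
    type Entity: EntityTrait;
}

// Roughly what the derive produces: the row struct plus a separate
// unit struct `Entity` that carries the table-level information.
pub mod my_record {
    use super::{EntityTrait, ModelTrait};

    #[derive(Clone, Debug, PartialEq)]
    pub struct Model {
        pub id: i32,
        pub name: String,
    }

    #[derive(Default, Debug)]
    pub struct Entity;

    impl EntityTrait for Entity {
        type Model = Model;

        fn table_name(&self) -> &'static str {
            "my_record"
        }
    }

    impl ModelTrait for Model {
        type Entity = Entity;
    }
}

/// Generic over the entity: rows would come back as `E::Model`.
pub fn pull_table_name<E: EntityTrait>() -> &'static str {
    E::default().table_name()
}

/// Generic over the model: reach the entity through the associated type.
pub fn table_name_for<M: ModelTrait>(_row: &M) -> &'static str {
    <M::Entity as Default>::default().table_name()
}
```

Under this reading, a generic pull would take the entity as its type parameter (for example `pull_table_name::<my_record::Entity>()`) and name the row type as `E::Model`, rather than requiring the row struct to implement EntityTrait.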