- 
                Notifications
    
You must be signed in to change notification settings  - Fork 34
 
Registry server: wrap mutating actions in transactions #700
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
          Codecov Report
 
 @@            Coverage Diff             @@
##             main     #700      +/-   ##
==========================================
+ Coverage   55.68%   56.70%   +1.02%     
==========================================
  Files          93       93              
  Lines        7959     8008      +49     
==========================================
+ Hits         4432     4541     +109     
+ Misses       3063     3028      -35     
+ Partials      464      439      -25     
 📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more  | 
    
This ensures that associated blobs are also stored correctly.
…nd project actions.
… up after the mutating transaction completes.
| logger.Info("Success.") | ||
| case codes.Internal: | ||
| logger.Error("Internal error.") | ||
| case codes.Canceled: | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This adds support for all gRPC status codes so that logging won't report "User error" for the unrecognized ones. It's in this commit because I used it for testing but I'll make a separate PR for just this change (to interceptor.go) to hopefully commit before this one.
| if _, ok := status.FromError(err); ok { | ||
| return err | ||
| } | ||
| if err == context.DeadlineExceeded { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file adds support for new errors that were encountered in load-testing of the server to verify the changes in this PR. I think they are off-topic enough to put in a separate PR that just modifies errors.go and would be merged prior to this one.
| 
           @seaneganx if we merge this before you're back, PTAL anyway, comment on anything, and we'll address it in issues/followup PRs.  | 
    
| var response *rpc.Artifact | ||
| if err := s.runWithTransaction(ctx, func(ctx context.Context, db *storage.Client) error { | ||
| // Creation should only succeed when the parent exists. | ||
| switch parent := parent.(type) { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Probably subjective, but might this be easier to read like this?
	var err error
	switch parent := parent.(type) {
	case names.Project:
		_, err = db.GetProject(ctx, parent)
	case names.Api:
		_, err = db.GetApi(ctx, parent)
	case names.Version:
		_, err = db.GetVersion(ctx, parent)
	case names.Spec:
		_, err = db.GetSpec(ctx, parent)
	case names.Deployment:
		_, err = db.GetDeployment(ctx, parent)
	}
	if err != nil {
		return err
	}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
        
          
                server/registry/server.go
              
                Outdated
          
        
      | return s.storageClient, nil | ||
| } | ||
| 
               | 
          ||
| func (s *RegistryServer) runWithTransaction(ctx context.Context, fn func(ctx context.Context, db *storage.Client) error) error { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Personally, I would prefer runInTransaction() or asTransaction().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I'll change this to "runInTransaction"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
| 
           That last commit (a1ece2a) adds code without supporting tests. Is it reasonable to test?  | 
    
| 
           @theganyo let's discuss that -- I was just looking into retry configuration before adding a comment about that change. We can trigger the concurrency problem by running the bulk uploaders, e.g.  The change in that commit is observable here which explicitly ignores the (weird)   | 
    
| 
           I've been worried that we don't have a way of reliably triggering the concurrency / transactional conditions that we've been running into - and thus we'd be unable to verify that we've eliminated the issues. That said, if we can reliably trigger our concurrency issues through the bulk uploader and we don't have the means to more specifically test at this point, perhaps that's good enough for now.  | 
    
| // with improvements closer to the database level. | ||
| updateSpecMutex.Lock() | ||
| defer updateSpecMutex.Unlock() | ||
| log.Debugf(ctx, "Acquired lock after blocking for %v", time.Since(before)) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removing this here, but will address the underlying problem in #715
I've been testing with remote databases and very high parallelism, logging and handling errors that arose from that, and saw some issues related to our lack of transaction support.
This PR wraps all mutating operations in action handlers in transactions (stage 1 of #706). The transaction wrapper also translates errors into gRPC errors, which with other proposed changes would allow us remove the majority of
grpcErrorForDBErrorcalls in thestoragepackage.Since delete actions are wrapped in transactions, transactions are removed from the deletion support code in the
storagepackage (stage 2 of #706).Stages 3-5 of #706 are left for future PRs.