@tadast (Collaborator) commented Oct 3, 2023

SequenceServer's MAKEBLASTDB wrapper worked in two steps:

  1. invoke #scan, which eagerly scanned for formatted DBs,
    unformatted DBs, and DBs that may require reformatting, and
    stored them in instance variables
  2. whenever any makeblastdb operation was performed, it relied on
    #scan having been run beforehand to populate the instance
    variables, and used these values to perform listing, formatting
    and reformatting operations.

When SequenceServer.init was invoked (any time the web server starts or the CLI binary is launched), it called makeblastdb.scan regardless of whether it would go on to format or reformat the databases. This was rather slow on large database dirs (I saw upwards of a minute on a large dir).

This change refactors the MAKEBLASTDB wrapper to scan for DBs to format or reformat only when it is actually going to perform one of these operations.

Now the class does not rely on #scan having been run beforehand to perform any operation. It invokes the data-gathering methods lazily (i.e. only when the data is actually required), so no slow operation runs unless it is needed.
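To illustrate the lazy approach described above, here is a minimal sketch using Ruby's `||=` memoization. It is not the code from this PR: the class name `LazyScanner`, its methods, and the in-memory `entries` hash (standing in for a slow directory scan) are all hypothetical and chosen only for illustration.

```ruby
# Hypothetical sketch of lazy data gathering; not the actual
# SequenceServer MAKEBLASTDB wrapper.
class LazyScanner
  # Number of times the (simulated) slow scan has run.
  attr_reader :slow_scans

  # entries: Hash of path => status, a stand-in for what a real
  # directory scan would discover on disk (assumption for this sketch).
  def initialize(entries)
    @entries = entries
    @slow_scans = 0
  end

  # Each data-gathering method scans on first use and caches the result,
  # so nothing slow happens at construction time.
  def unformatted
    @unformatted ||= slow_scan(:unformatted)
  end

  def to_reformat
    @to_reformat ||= slow_scan(:needs_reformat)
  end

  # Operations ask for exactly the data they need, when they need it.
  def work_pending?
    unformatted.any? || to_reformat.any?
  end

  private

  # Simulates the expensive scan and counts how often it runs.
  def slow_scan(status)
    @slow_scans += 1
    @entries.select { |_path, s| s == status }.keys
  end
end

scanner = LazyScanner.new(
  'a.fa' => :unformatted,
  'b.fa' => :formatted,
  'c.fa' => :needs_reformat
)
scanner.slow_scans   # no scan has run yet
scanner.unformatted  # first call triggers one scan
scanner.unformatted  # second call reuses the cached result
```

Because an empty array is truthy in Ruby, `||=` also caches the "nothing found" case, so a directory with no pending work is still scanned at most once per category.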

@yannickwurm yannickwurm merged commit 903f1c7 into wurmlab:master Oct 4, 2023
@yannickwurm (Member)

Awesome.
