zeeshan595/random-challenge
Reedsy Challenge

Pre-Requisites

  • node v22.17.1
  • npm 11.5.2

Running the API

npm install
npm run start:dev
# visit: http://localhost:3000/doc

Testing the API

npm install
npm run test:cov
npm run test:e2e

Node.js Backend Engineer Challenge

You can submit all the answers to this assignment in a private, single repository or as a zipped folder, containing markdown and code. If you use GitHub, please share your answers with reedsyapplications.

Answering all the questions to a Production-level standard should take approximately 1 work day.

For coding questions use the latest Node LTS.

1. About you

Tell us about one of your commercial projects with Node.js.

In my current role I work on an API that authenticates users and lets them publish and deploy their binaries on machines called holoports.

It also lets users who own the holoports manage them and connect them to their account, so they can accrue revenue for hosting deployed code on their machines.

There is also a Vue.js front-end that users can use to interact with the product.

Auth

We built in-house authentication using an OAuth flow (Google, email magic link, etc.).

After you complete the OAuth flow, you receive an access token and a refresh token. The access token is short-lived and sent with every request, while the refresh token is used to obtain a new access token once the old one expires.
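The flow described above can be sketched roughly like this. All names and shapes here are illustrative assumptions, not the actual product code:

```typescript
// Minimal sketch of the access/refresh token flow described above.
// The interfaces and function names are hypothetical, for illustration only.

interface TokenPair {
  accessToken: string;  // short-lived, sent with every request
  refreshToken: string; // long-lived, used only to mint a new access token
}

type Fetcher = (url: string, accessToken: string) => Promise<{ status: number }>;
type Refresher = (refreshToken: string) => Promise<TokenPair>;

// Send the access token with every request; on a 401, exchange the
// refresh token for a new pair and retry the request once.
async function fetchWithAuth(
  url: string,
  tokens: TokenPair,
  doFetch: Fetcher,
  refresh: Refresher,
): Promise<{ status: number }> {
  let res = await doFetch(url, tokens.accessToken);
  if (res.status === 401) {
    const renewed = await refresh(tokens.refreshToken);
    tokens.accessToken = renewed.accessToken;
    tokens.refreshToken = renewed.refreshToken;
    res = await doFetch(url, tokens.accessToken);
  }
  return res;
}
```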


Personal Project

I thought I would also take this opportunity to mention that I've worked with WebSockets in a personal project, where I am building an interactive tabletop that multiple people can join in a room.

2. Document versioning

Detail how you would store several versioned novels.

Your approach should:

  • show the novel in its current state
  • show the novel at any point in its history
  • show the changes made between two versions
  • prioritize disk space efficiency
  • discuss any trade-offs made, as well as potential mitigations
  • consider any potential domain-specific issues

Let's cover domain-specific issues first so we can plan around them.

Domain Specific Issues

In a novel, only specific paragraphs or chapters are edited at any one time, so it may be better to store chapters or paragraphs as separate blobs.

In addition, authors may move paragraphs forward or back depending on how they want the story to unfold.

Some authors may experiment with branching paths; perhaps they are not sure how the story will end, so they create multiple endings.

Authors can sometimes think for a while and then write large chunks when they are in the flow.

How to store a novel

First, let's split each chapter of the novel into a separate blob and store it in S3 storage. Then we can reference that blob in our database.

Whenever a change is made to a chapter, we store the diff as a separate blob in S3. This blob contains only the changes made to the chapter, not the full text. Again, a reference to it is stored in the database.

When we need the text of a specific version, we can replay all of the changes to build up the chapter at that state.
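That replay step can be sketched as follows, assuming a hypothetical diff format in which each version blob holds a list of `{ pos, del, ins }` edits:

```typescript
// Sketch only: the { pos, del, ins } edit shape is an assumption,
// not a fixed schema from the design above.

interface Edit {
  pos: number;  // character offset in the current text
  del: number;  // number of characters to delete at pos
  ins: string;  // text to insert at pos
}

// Apply one version's edits to the previous text.
function applyEdits(text: string, edits: Edit[]): string {
  let out = text;
  for (const e of edits) {
    out = out.slice(0, e.pos) + e.ins + out.slice(e.pos + e.del);
  }
  return out;
}

// Rebuild a chapter at version n by replaying every diff from the start.
function reconstruct(base: string, versions: Edit[][], n: number): string {
  return versions.slice(0, n).reduce(applyEdits, base);
}
```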

To avoid reconstructing chapters over and over again, we can keep the latest version fully constructed in a cache, and additionally keep every nth version, where the value of n can be tuned for the best performance. We can start by caching every 50th version (n = 50).

The cache is long-lived and may be large, so it may be better to store it outside application memory, perhaps in Redis, or even as another blob in S3 storage. The main thing we want to avoid is reconstructing the version.
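The snapshot lookup then becomes: find the highest cached version at or below the target, and replay only the diffs after it. A self-contained sketch (reusing the hypothetical `{ pos, del, ins }` edit shape):

```typescript
// Sketch: with a fully constructed snapshot kept every nth version,
// rebuild version v from the nearest snapshot at or below it.

interface Edit {
  pos: number;
  del: number;
  ins: string;
}

function applyEdits(text: string, edits: Edit[]): string {
  let out = text;
  for (const e of edits) {
    out = out.slice(0, e.pos) + e.ins + out.slice(e.pos + e.del);
  }
  return out;
}

// `snapshots` maps a version number to its fully constructed text;
// diffs[i] turns version i into version i + 1.
function reconstructFromSnapshot(
  v: number,
  snapshots: Map<number, string>,
  diffs: Edit[][],
): string {
  // Find the highest snapshotted version <= v.
  let snapVersion = 0;
  for (const key of snapshots.keys()) {
    if (key <= v && key > snapVersion) snapVersion = key;
  }
  let text = snapshots.get(snapVersion) ?? "";
  // Replay only the diffs between the snapshot and the target version.
  for (let i = snapVersion; i < v; i++) {
    text = applyEdits(text, diffs[i]);
  }
  return text;
}
```

With n = 50, any version costs at most 49 diff replays on top of one snapshot read.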

Lastly, we should avoid recording lots of small changes, as this adds many unnecessary versions.

Instead of creating a new version for every character changed, we should group the changes so a new version is only created for bigger chunks of edits.
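That grouping can be done with a simple quiet-period batcher: buffer incoming edits and only cut a version once the author pauses. A sketch (the 2-second default window is my own illustrative choice):

```typescript
// Sketch: buffer edits and commit one version after a quiet period,
// so individual keystrokes don't each become a version.

function makeVersionBatcher(
  commit: (edits: string[]) => void,
  quietMs = 2000, // illustrative default, would be tuned in practice
) {
  let buffer: string[] = [];
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (edit: string) => {
    buffer.push(edit);
    // Each new edit resets the quiet-period timer.
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(() => {
      commit(buffer); // one version for the whole burst of edits
      buffer = [];
    }, quietMs);
  };
}
```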

---
title: Entity Relationship
---
erDiagram
  Novel ||--|{ Chapter : OneToMany
  Chapter ||--|{ Version : OneToMany

  Novel {
    int id
    string name
  }
  Chapter {
    int id
    int novel_id
    string name
  }
  Version {
    int id
    int chapter_id
    string blob_uuid
    int timestamp
  }

The actual blob in the storage account can be stored as JSON containing all of the changes made to the text in that version.

We can minify the JSON so it takes up fewer bytes when transferring. If we need to transfer it even faster, we can look at alternative serialization formats to JSON.
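As an illustration, one version blob might look like this. The field names are my assumptions, not a fixed schema:

```typescript
// Hypothetical layout for a single version blob stored in S3.
const versionBlob = {
  chapterId: 12,      // foreign key into the Chapter table
  baseVersion: 41,    // the version these edits apply on top of
  edits: [
    { pos: 120, del: 0, ins: "She hesitated. " }, // insertion
    { pos: 455, del: 17, ins: "" },               // deletion
  ],
};

// Minified JSON drops all whitespace, shaving bytes off every transfer.
const minified = JSON.stringify(versionBlob);
```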

Comparing Versions

We can now easily reconstruct any two versions, and also compute the changes between them by walking through the intervening revisions.

Trade-offs

There is a trade-off between using more storage or more compute.

We can cache fewer constructed chapters, but then walking through versions takes more compute to reconstruct them.

OR

We can store more constructed chapters, which means we don't need to reconstruct them when the user requests them, leading to less compute. But we have to store that data somewhere, which means more storage space is used.

3. Node.js REST API

Implement a REST API using Express.js that handles Export and Import requests.

The API should expose endpoints to:

  • POST a request for a new Export job. Valid requests should be saved in memory. Invalid requests should return an error. The request must have the following schema:

    {
      bookId: string,
      type: "epub" | "pdf"
    }
  • GET a list of Export requests, grouped by their current state (see below).

  • POST a request for a new Import job. Valid requests should be saved in memory. Invalid requests should return an error. The request must have the following schema:

    {
      bookId: string,
      type: "word" | "pdf" | "wattpad" | "evernote",
      url: string
    }
  • GET a list of Import requests, grouped by their current state (see below).

Both export and import requests should be created with a pending state, and with a created_at timestamp. An import or export should take the amount of time outlined below. After the specified time, the state should be updated from pending to finished and update an updated_at timestamp.

| Job type     | Processing time (s) |
|--------------|---------------------|
| ePub export  | 10                  |
| PDF export   | 25                  |
| import (any) | 60                  |

Your solution should:

  • use TypeScript or modern ES features
  • have reasonable test coverage
  • be scalable — this is a small app, but write it as if it will grow into a full, Production-grade server

Scalable

I noticed the GET endpoints in the exercise assume they will return the full list of imports and exports.

However, I am not a fan of this; in my opinion, a production-grade server should paginate any list that grows with user data.

This avoids sending very large lists. I created two GET endpoints: one with pagination and filtering, and one that gives the full list grouped by status.
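The paginated variant can be sketched like this. The parameter names and defaults are my own choices for illustration, not necessarily those used in the repo:

```typescript
// Sketch of a paginated, filterable job listing.
interface Job {
  bookId: string;
  state: "pending" | "finished";
}

function paginate<T>(items: T[], page: number, pageSize: number) {
  const start = (page - 1) * pageSize;
  return {
    page,
    pageSize,
    total: items.length,                     // total matches, pre-slice
    items: items.slice(start, start + pageSize),
  };
}

// e.g. GET /exports?state=pending&page=2&pageSize=20
function listJobs(
  jobs: Job[],
  state?: Job["state"],
  page = 1,
  pageSize = 20,
) {
  const filtered = state ? jobs.filter((j) => j.state === state) : jobs;
  return paginate(filtered, page, pageSize);
}
```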

I decided to use cron jobs instead of an event-based system, so the requests act as a kind of queue and do not put too much stress on the system. We can specify how many imports and exports to process in parallel.

I've also added a caching layer: whenever we GET requests by status, the result is cached and subsequent GETs use the cache. This reduces the load on the system when it receives a large number of GET requests.
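The essence of that layer is a memoized computation that is invalidated on any write. A minimal sketch (not the repo's actual cache code):

```typescript
// Sketch: cache a computed result (e.g. jobs grouped by status) and
// invalidate it whenever a job is created or changes state.

function makeStatusCache<T>(compute: () => T) {
  let cached: T | undefined;
  return {
    get(): T {
      if (cached === undefined) cached = compute(); // rebuild on miss
      return cached;
    },
    invalidate(): void {
      cached = undefined; // call this on any write to the store
    },
  };
}
```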

Since everything is stored in memory, one concern I have is the system running out of memory.

  • be data store agnostic

Data Store Agnostic

I wasn't sure if you wanted me to connect it to a database with an ORM or not. I decided to just build an in-memory store, because the task specifically says "Valid requests should be saved in memory". You can view the file src/store, which is responsible for keeping the data in memory.

4. Operation collision

This exercise is located at src/operation-collision

When multiple users are collaborating on a document, collisions in their edits inevitably occur. Implement a module that can handle basic text update operations, and combine two colliding edits into a single operation.

An operation is described as an array of any combination of three types of edits:

  • { skip: number } to skip characters
  • { insert: string } to insert the given string
  • { delete: number } to delete a number of characters

Implement the following methods:

  • Operation.prototype.combine(operation) Updates the operation by combining it with another colliding operation
  • Operation.combine(op1, op2) Static method that returns a new operation by combining the arguments without mutating them
  • Operation.prototype.apply(string) Applies the operation to the provided argument

For example:

const s = "abcdefg";
const op1 = new Operation([{ skip: 1 }, { insert: "FOO" }]);
const op2 = new Operation([{ skip: 3 }, { insert: "BAR" }]);

expect(op1.apply(s)).to.equal('aFOObcdefg');
expect(op2.apply(s)).to.equal('abcBARdefg');

const combined1 = Operation.combine(op1, op2); // => [{ skip: 1 }, { insert: 'FOO' }, { skip: 2}, { insert: 'BAR' } ]
expect(combined1.apply(s)).to.equal('aFOObcBARdefg');

const combined2 = Operation.combine(op2, op1);
// NB: This expectation is true for this specific case, but not in the general case.
// Can you think of an example where this assertion might not be true?
expect(combined2.apply(s)).to.equal(combined1.apply(s));
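One way `apply` could work is to walk the edits with a cursor into the input string, copying skipped characters, emitting inserts, and advancing past deletes. A sketch, not necessarily the repo's implementation:

```typescript
// Sketch of applying an edit list to a string with a single cursor.
type Edit = { skip: number } | { insert: string } | { delete: number };

function apply(edits: Edit[], input: string): string {
  let out = "";
  let cursor = 0;
  for (const e of edits) {
    if ("skip" in e) {
      out += input.slice(cursor, cursor + e.skip); // copy skipped chars
      cursor += e.skip;
    } else if ("insert" in e) {
      out += e.insert;                             // emit inserted text
    } else {
      cursor += e.delete;                          // drop deleted chars
    }
  }
  return out + input.slice(cursor);                // copy the remainder
}
```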

Your solution should:

  • use TypeScript or modern ES features
  • have reasonable test coverage
  • explain any assumptions made

Assumptions

I've made the assumption that there cannot be negative skips and that skips always move forward.

I've also assumed that if there are multiple deletes at the same position, we take the bigger delete, e.g. Math.max(delete1, delete2).

There may be edge cases with interleaved skips that produce an incorrect value; however, I would need a concrete example of such an edge case to code around it.
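One concrete answer to the question in the exercise's comment: when both operations insert at the same offset, the combination order decides which insert lands first, so the two combined results differ. A self-contained illustration using the same edit shapes:

```typescript
// Two operations that both insert at offset 1: combining them in
// either order is valid, but the results are different strings.
type Edit = { skip: number } | { insert: string } | { delete: number };

function apply(edits: Edit[], input: string): string {
  let out = "";
  let cursor = 0;
  for (const e of edits) {
    if ("skip" in e) {
      out += input.slice(cursor, cursor + e.skip);
      cursor += e.skip;
    } else if ("insert" in e) {
      out += e.insert;
    } else {
      cursor += e.delete;
    }
  }
  return out + input.slice(cursor);
}

// combine(op1, op2) would place FOO first; combine(op2, op1) places BAR first.
const combinedA: Edit[] = [{ skip: 1 }, { insert: "FOO" }, { insert: "BAR" }];
const combinedB: Edit[] = [{ skip: 1 }, { insert: "BAR" }, { insert: "FOO" }];
```

So the assertion `combined2.apply(s) === combined1.apply(s)` only holds when the operations touch non-overlapping positions.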
