541 commits

Author SHA1 Message Date
Jakob Ackermann
a4ae0ea12f [ShareJsUpdateManager] double check doc size before flushing 2021-05-06 18:20:51 +01:00
Jakob Ackermann
6e551f9b34 [perf] use MGET/MSET/DEL for manipulating multiple keys in one operation
In some cases we can get rid of MULTI/EXEC operations too.
- putDocInMemory: from 10 down to 2 operations
- removeDocFromMemory: from 14+4 down to 4+4 operations
- updateDoc: from 13 down to 8 operations
2021-04-13 11:47:10 +01:00
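The batching idea in the commit above can be sketched as follows. This is a minimal illustration, not the service's code: `FakeRedis` is a hypothetical in-memory stand-in that only counts operations, and the key names are assumptions made for the example.

```javascript
// Hypothetical in-memory stand-in for a Redis client. It only counts how many
// operations are issued; it is not the client the service actually uses.
class FakeRedis {
  constructor() {
    this.store = new Map()
    this.operations = 0
  }

  // One operation per key.
  get(key) {
    this.operations++
    return this.store.has(key) ? this.store.get(key) : null
  }

  // One operation for any number of keys; missing keys come back as null.
  mget(...keys) {
    this.operations++
    return keys.map((key) => (this.store.has(key) ? this.store.get(key) : null))
  }

  // One operation for any number of key/value pairs.
  mset(entries) {
    this.operations++
    for (const [key, value] of Object.entries(entries)) {
      this.store.set(key, value)
    }
  }
}

// Assumed key layout, for illustration only.
function docKeys(docId) {
  return [`doclines:{${docId}}`, `DocVersion:{${docId}}`, `DocHash:{${docId}}`]
}

// Reads each key separately: one operation per key.
function getDocOneByOne(client, docId) {
  return docKeys(docId).map((key) => client.get(key))
}

// Reads all keys with a single MGET.
function getDocBatched(client, docId) {
  return client.mget(...docKeys(docId))
}

const demoClient = new FakeRedis()
demoClient.mset({ 'doclines:{doc-1}': '["hello"]', 'DocVersion:{doc-1}': '7' })
console.log(getDocBatched(demoClient, 'doc-1'), demoClient.operations)
```

Both read functions return the same values; only the number of operations differs, which is the saving the commit message reports.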
Jakob Ackermann
178440395f [perf] switch write sequence for doc contents and doc tracking
Doc contents are added only after the tracking has been setup.
All read paths on the tracking have been checked to gracefully handle
 the case of existing doc_id but missing doc contents.

- getDoc: -1 operation

REF: 0a2b47c660c60b95e360d8f3b3e30b862ceb6d79
2021-04-13 11:46:44 +01:00
Jakob Ackermann
50b24043b7 [perf] use MGET for fetching multiple keys in one operation
- getDoc: from 13 down to 2 operations
2021-04-09 08:42:35 +01:00
Eric Mc Sween
4d70bd664f Reintroduce Node 12 and metrics upgrades
These changes were previously merged, not deployed, and reverted. This
reverts the revert.

This reverts commit a6b8c6c658b33b6eee78b8b99e43308f32211ae2, reversing
changes made to 93c98921372eed4244d22fce800716cb27eca299.
2021-04-01 15:51:00 -04:00
Eric Mc Sween
4dd1b26b2e Revert "Merge pull request #161 from overleaf/em-upgrade-node-12"
This reverts commit d44102751b9436ad89c5b3b05e7abdff51fcc78a, reversing
changes made to 6c2f5b8d053b75c677da2b7ddd04f998d2be6fff.
2021-03-31 12:07:11 -04:00
Eric Mc Sween
1a2235a219 Upgrade to Node 12 2021-03-16 16:48:45 -04:00
Thomas
23738540ed Fix API request errors which could contain API hostname or address (#160)
Wrap errors produced by failing requests to web API, and remove the url/hostname from thrown error messages. (But keep the URL path for info.)
2021-02-24 15:09:19 +01:00
Henry Oswald
20a373d95c stop listening on the pending updates channels 10 times 2021-02-23 08:27:29 +00:00
Henry Oswald
c7e57cd28f add Dispatchers running on old queue while we migrate
revert once migrated
2021-02-15 14:16:45 +00:00
Henry Oswald
854e24bb57 remove unneeded anonymous func 2021-02-15 14:12:28 +00:00
Henry Oswald
11c8cfc939 shard the pending-updates-list queue 2021-02-02 16:38:25 +00:00
Eric Mc Sween
98f8d7f51c Set the diff-match-patch timeout to 100ms
This might result in worse diffs, but we don't want to spend a second
blocking the event loop while we figure out nicer diffs when comparing
documents.
2021-01-14 15:11:15 -05:00
Eric Mc Sween
de247302b1 Use a centralized diff-match-patch package
We use our own fork of the diff-match-patch npm package, which adds an
optimization for the semantic alignment loop.
2020-12-07 16:15:19 -05:00
Eric Mc Sween
dce5b8759a Decaf cleanup: capitalize class names 2020-12-07 15:30:02 -05:00
Eric Mc Sween
db4b0a6f38 Decaf cleanup: do not throw strings 2020-12-07 15:28:25 -05:00
Eric Mc Sween
8c70e72bfa Decaf cleanup: unused variable 2020-12-07 15:27:41 -05:00
Eric Mc Sween
9f17f3ea0a Decaf cleanup: remove default callback 2020-12-07 15:27:01 -05:00
Eric Mc Sween
b74e7f6feb Decaf cleanup: unnecessary returns 2020-12-07 15:25:52 -05:00
Eric Mc Sween
a91770e979 Decaf cleanup: remove Array.from() 2020-12-07 15:25:20 -05:00
Jakob Ackermann
5e00684dbb [misc] bump metrics module to 3.4.1
- renamed package from `metrics-sharelatex` to `@overleaf/metrics`
- drop support for statsd backend
- decaffeinate
- compress `/metrics` response using gzip
- bump debugging agents to latest versions
- expose prometheus interfaces for custom metrics (custom tags)
- cleanup of open sockets metrics
- fix deprecation warnings for header access
2020-11-25 11:57:19 +00:00
Jakob Ackermann
08ed5f6c9b [misc] bump @overleaf/redis-wrapper to version 2.0.0 2020-11-11 16:20:32 +00:00
Jakob Ackermann
16ef0d9610 [misc] mongodb: use the new db connector by default
mongojs was enabling it by default as well.
2020-09-10 10:40:05 +01:00
Jakob Ackermann
c337cf1c4f [misc] mongodb: refactor the process of setting up the db construct
Co-Authored-By: John Lees-Miller <jdleesmiller@gmail.com>
2020-09-07 09:49:06 +01:00
Jakob Ackermann
1d57706d44 [misc] migrate acceptance tests to the native mongo driver, drop mongojs 2020-08-28 17:06:25 +01:00
Jakob Ackermann
f80a92ce46 [misc] migrate the app to the native mongo driver
acceptance tests to follow in a separate commit
2020-08-28 15:12:30 +01:00
Eric Mc Sween
1d1f204021 Remove backwards-compat project update API
The project update endpoint accepted updates either in two array params,
docUpdates and fileUpdates, or in a single array, updates. This commit
removes the docUpdates/fileUpdates params now that web uses the updates
param.
2020-05-20 16:26:22 -04:00
Eric Mc Sween
9799b94752 Accept ordered doc and file updates
Add an `updates` parameter to the project update endpoint. It can be
used instead of `docUpdates` and `fileUpdates` to provide a single list
of updates in the order they should be processed.
2020-05-20 07:57:32 -04:00
Eric Mc Sween
6269ace987 Decaf cleanup: move functions to top level 2020-05-15 09:59:32 -04:00
Eric Mc Sween
3d000bcbe6 Decaf cleanup: use arrow functions for callbacks 2020-05-15 09:59:32 -04:00
Eric Mc Sween
751d9ea718 Decaf cleanup: simplify loops 2020-05-15 09:59:32 -04:00
Eric Mc Sween
7a5a782dc7 Decaf cleanup: camel case variables 2020-05-15 09:59:26 -04:00
Eric Mc Sween
ceb77f4c93 Decaf cleanup: remove unused variables 2020-05-14 16:54:08 -04:00
Eric Mc Sween
6b5760ca28 Decaf cleanup: simplify null checks 2020-05-14 16:53:22 -04:00
Eric Mc Sween
f2c67b66fa Decaf cleanup: remove default callbacks 2020-05-14 16:50:04 -04:00
Eric Mc Sween
2bff83137c Decaf cleanup: unnecessary returns 2020-05-14 16:35:10 -04:00
Eric Mc Sween
819aa378d9 Decaf cleanup: remove Array.from() 2020-05-14 16:32:05 -04:00
Eric Mc Sween
a2a1914a53 Use max_doc_length setting to limit incoming doc size 2020-05-11 11:15:37 -04:00
Eric Mc Sween
cb959ddfc1 Decaf cleanup: use arrow functions for callbacks 2020-05-11 11:14:37 -04:00
Eric Mc Sween
e4ac63dd19 Decaf cleanup: move functions to top level 2020-05-11 11:12:15 -04:00
Eric Mc Sween
64a881461f Decaf cleanup: camel case variables 2020-05-11 11:07:15 -04:00
Eric Mc Sween
fc73bbe1a5 Decaf cleanup: simplify null checks 2020-05-11 10:52:06 -04:00
Eric Mc Sween
80ea49c69c Decaf cleanup: remove __guard__ 2020-05-11 10:47:27 -04:00
Eric Mc Sween
814ac40e07 Decaf cleanup: unnecessary returns 2020-05-11 10:45:39 -04:00
Eric Mc Sween
3385ec5f26 Decaf cleanup: unnecessary Array.from() 2020-05-11 10:43:22 -04:00
Tim Alby
9f6ea07002 fix SyntaxError on export var 2020-05-06 12:17:08 +02:00
Tim Alby
e089cfc93c format config/settings.defaults.js & lib/diff_match_patch.js 2020-05-06 12:16:48 +02:00
Tim Alby
96e7a668b7 disable linting for lib/diff_match_patch.js 2020-05-06 12:16:04 +02:00
Tim Alby
dbf9e88dc3 prettier: convert app/js decaffeinated files to Prettier format 2020-05-06 12:09:33 +02:00
Tim Alby
a519980c10 decaffeinate: rename app/coffee dir to app/js 2020-05-06 12:09:23 +02:00
decaffeinate
dad1d1212f decaffeinate: Run post-processing cleanups on DeleteQueueManager.coffee and 58 other files 2020-05-06 12:09:15 +02:00
decaffeinate
1fa8882674 decaffeinate: Convert DeleteQueueManager.coffee and 58 other files to JS 2020-05-06 12:08:21 +02:00
decaffeinate
249b7247b5 decaffeinate: Rename DeleteQueueManager.coffee and 58 other files from .coffee to .js 2020-05-06 12:07:29 +02:00
Miguel Serrano
f935c392bc Merge branch 'master' into sk-upgrade-dependencies 2020-04-27 13:38:21 +02:00
Brian Gough
a51f61a555 remove redis migration code 2020-04-21 14:48:47 +01:00
Brian Gough
248edc03fa add comment about the two history clients 2020-04-21 14:44:19 +01:00
Brian Gough
af93193d6e remove new_project_history and use project_history instead 2020-04-21 14:43:48 +01:00
Brian Gough
ac68f59487 Merge branch 'master' into bg-use-separate-redis-for-project-history 2020-04-16 15:48:21 +01:00
Brian Gough
beb3691795 add metrics for redis get/update 2020-04-06 10:00:48 +01:00
Brian Gough
2b72ec49a1 add comments for redis metrics 2020-04-02 11:33:52 +01:00
Brian Gough
21824d49da Merge branch 'bg-add-queue-metrics' of github.com:overleaf/document-updater into bg-add-queue-metrics 2020-04-01 16:04:52 +01:00
Brian Gough
3a8c362fba add doclines set/del metric 2020-04-01 15:59:25 +01:00
Brian Gough
00b11bda96 use separate loop for pendingUpdates metric 2020-04-01 14:50:55 +01:00
Jakob Ackermann
17c2add0cf [misc] track redis pub/sub payload sizes on publish 2020-03-30 11:31:43 +02:00
Brian Gough
1a0550364d add metric for getdoc bytes 2020-03-25 14:27:41 +00:00
Brian Gough
891fcc696b add metric for pending updates queue 2020-03-25 14:27:41 +00:00
Brian Gough
e293d86c14 add metric for project history queue 2020-03-25 12:15:16 +00:00
Shane Kilkelly
ada4fba3dc Fix express deprecations 2020-03-19 15:39:57 +00:00
Eric Mc Sween
ff32104fe6 Merge pull request #123 from overleaf/em-doc-hard-delete
Add ignore_flush_errors option to the doc delete endpoint
2020-03-10 10:11:00 -04:00
Eric Mc Sween
d9caced0d6 Change skip_flush option to ignore_flush_errors in delete doc
Instead of skipping the flush, we'll still try to flush and proceed with
the doc deletion, even when the flush fails.
2020-03-10 09:40:49 -04:00
Eric Mc Sween
9b70eb75b3 Rename flush param to skip_flush in delete doc
Also move it to the query string instead of the body.
2020-03-09 16:27:32 -04:00
Eric Mc Sween
c09bc0e868 Add a "flush: false" option to the doc delete endpoint
This will delete the document from Redis without flushing to web,
docstore or history. To be used when something is broken.
2020-03-07 08:59:15 -05:00
nate stemen
ffd8d0745d use empty object for ranges if it doesn't exist 2020-03-06 13:49:30 -05:00
Brian Gough
0419039d4d Merge branch 'master' into bg-use-separate-redis-for-project-history 2020-02-21 14:13:33 +00:00
Brian Gough
338d3609f5 add comment about null byte check 2020-01-30 15:17:13 +00:00
Brian Gough
544ae05212 added note about rollback 2020-01-23 16:22:26 +00:00
Brian Gough
626e19ed1a add logging of migration phase at startup 2020-01-23 15:46:54 +00:00
Brian Gough
d5a2b96df9 add note about deleting the migration key entries 2020-01-23 14:36:59 +00:00
Brian Gough
7036803acf add missing argument to metrics.inc
also track retries rather than attempts (which is always 1 for a successful request)
2020-01-14 15:00:21 +00:00
Brian Gough
3caa0e7c05 add failure/retry metrics for web-api requests 2020-01-14 13:53:50 +00:00
Brian Gough
a638ef4251 add comment about locking in redis migration 2020-01-13 15:56:28 +00:00
Brian Gough
27044c2d02 allow migration phase to be modified at runtime for testing 2020-01-06 16:46:35 +00:00
Brian Gough
8ae95ebf60 fix rclient check in migration metrics 2020-01-06 16:45:36 +00:00
Brian Gough
97cbf46160 add metrics for migration 2019-12-16 11:46:35 +00:00
Brian Gough
a2e63d009e fix migration phase check 2019-12-16 09:55:26 +00:00
Brian Gough
d0c5eb5698 support migration of project history keys to separate redis instance 2019-12-13 16:38:41 +00:00
Brian Gough
ad19fee667 add setting so that double flush is the default
can be disabled to stop flushing to track-changes
2019-11-25 13:36:25 +00:00
Brian Gough
4f6583bbf2 fix getDocVersion and add tests 2019-11-25 13:28:36 +00:00
Brian Gough
68e12f4d2d add metrics for queue operations 2019-11-25 10:51:10 +00:00
Brian Gough
8b73bb9f13 Merge branch 'master' into bg-filter-track-changes-updates 2019-11-22 10:41:33 +00:00
Brian Gough
b7055eecee add metrics for history flushes 2019-11-22 09:14:32 +00:00
Brian Gough
65cf4cf7c7 make flush to track-changes failsafe 2019-11-21 14:58:35 +00:00
Brian Gough
dcd7649bad filter track-changes updates for projects using project-history 2019-11-19 10:02:56 +00:00
Brian Gough
d82b180b76 avoid project history queues building up with deferred flush 2019-10-03 04:05:24 +01:00
Brian Gough
c1454bc4ac Merge pull request #92 from overleaf/bg-flush-queue-prod-fixes
add continuous background flush
2019-10-02 13:11:00 +01:00
Brian Gough
0c14b7d2f8 add comment about background flush limit 2019-10-01 15:06:01 +01:00
Brian Gough
2845b23b70 add smoothing of delete spikes 2019-10-01 15:01:53 +01:00
Brian Gough
2c22a60052 add random jitter to cutoff time 2019-10-01 15:01:20 +01:00
Brian Gough
a32495d2b4 make background flush more adaptive 2019-10-01 14:09:41 +01:00
Brian Gough
73b4262186 add continuous background flush 2019-09-30 16:05:53 +01:00
Brian Gough
33fadf51c1 fix getDocTimestamps for multiple docs 2019-09-30 13:50:25 +01:00
Brian Gough
260923f291 keep flushQueuedProjects in the foreground 2019-09-27 10:46:24 +01:00
Brian Gough
7561e05660 check timestamps array length 2019-09-27 10:39:56 +01:00
Brian Gough
b7f3b848af remove unused dryRun option
Co-Authored-By: Jakob Ackermann <das7pad@outlook.com>
2019-09-26 15:50:55 +01:00
Brian Gough
3bc176259b fix log line 2019-09-26 15:46:54 +01:00
Brian Gough
8cdc8c410a fix error logging 2019-09-26 15:46:45 +01:00
Brian Gough
fc62abfcfa run flush of queued projects in the background 2019-09-26 15:46:14 +01:00
Brian Gough
ba35c73cb6 add comment about ZPOPMIN 2019-09-26 15:18:10 +01:00
Brian Gough
a709a0adaa for simplicity keep the cutoff time the same while flushing the queue 2019-09-26 15:05:38 +01:00
Brian Gough
eae4b352ca remove unnecessary check 2019-09-26 14:59:03 +01:00
Brian Gough
b49621b3e9 add comments 2019-09-26 10:14:49 +01:00
Brian Gough
83dd43b809 add metric for queue length 2019-09-25 17:04:36 +01:00
Brian Gough
f6b2ac7360 queue deletes for deferred processing 2019-09-25 16:42:49 +01:00
Brian Gough
912a3a7753 remove redis server-side hashing for performance
we still compute the document hash in node, and check it on retrieval
but we don't check the hash at the point of writing it in redis which
was previously done with a redis Lua script.
2019-09-09 15:27:58 +01:00
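A minimal sketch of the approach described in the commit above: hash the document in Node when writing, store the hash next to the content, and verify it when reading. The function names, the stored record shape, and the choice of SHA-1 over the JSON-serialized lines are assumptions made for illustration; the commit message does not specify them.

```javascript
const crypto = require('crypto')

// Hash the serialized document lines (assumption: SHA-1, hex encoded).
function computeHash(serializedLines) {
  return crypto.createHash('sha1').update(serializedLines, 'utf8').digest('hex')
}

// Build the record to store: the hash is computed in Node, and nothing is
// re-checked at the point of writing.
function serializeDoc(docLines) {
  const lines = JSON.stringify(docLines)
  return { lines, hash: computeHash(lines) }
}

// On retrieval, recompute the hash and compare it with the stored one.
function parseDoc(stored) {
  if (computeHash(stored.lines) !== stored.hash) {
    throw new Error('doc hash mismatch')
  }
  return JSON.parse(stored.lines)
}

const record = serializeDoc(['first line', 'second line'])
console.log(parseDoc(record))
```

With this layout, corruption between write and read is detected on the read path, without a server-side script on the write path.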
Henry Oswald
aa15a76059 added log lines for all project flushing 2019-08-30 07:38:53 +01:00
Henry Oswald
0ae838dd2d add logger into project flusher 2019-08-29 20:36:00 +01:00
Brian Gough
a76e0dca88 skip history flush when project is cleared by realtime shutdown
history is flushed by a background cron job anyway
2019-08-15 09:51:16 +01:00
Simon Detheridge
6721b904a7 Merge pull request #82 from overleaf/bg-mongo-health-check
add a combined health check for mongo and redis
2019-08-08 14:16:38 +01:00
Simon Detheridge
06444d2cc4 Improve/fix serializers for update logging (#80)
* Improve/fix serializers for update logging
2019-08-08 14:10:54 +01:00
Brian Gough
40f6494b19 add a combined health check for mongo and redis 2019-08-07 16:25:23 +01:00
Simon Detheridge
df9ca8b272 Add serializer to print only length of large fields in production 2019-07-31 16:42:28 +01:00
Brian Gough
618880f99d remove unnecessary check for doc_id 2019-07-24 16:57:43 +01:00
Brian Gough
c9ccf62d71 support per-doc pubsub channels 2019-07-22 12:20:06 +01:00
Brian Gough
97487a077e fix cluster/sentinel connection with real-time 2019-07-10 09:42:05 +01:00
Henry Oswald
06ad0f7acd Merge pull request #75 from overleaf/ho-pubsub-connection
Remove real time redis connection and consolidate on pubsub
2019-07-08 13:58:41 +01:00
Henry Oswald
3b3b2da0f5 add pubsub redis connection and remove real time redis connection 2019-07-04 13:34:31 +01:00
Brian Gough
16fb297043 Revert "skip hash check when non-BMP characters replaced" 2019-06-27 11:39:45 +01:00
Brian Gough
f37860599d skip hash check when non-BMP characters replaced 2019-06-25 16:36:10 +01:00
Henry Oswald
fdef197271 Merge branch 'master' into ho-detailed-flush-status 2019-06-13 14:33:22 +01:00
Henry Oswald
d9a737f97c return failed and successfully flushed projects when flushing everything 2019-06-13 14:21:38 +01:00
Brian Gough
e8dd1aae9c Merge pull request #70 from overleaf/bg-metric-for-invalid-hash
add metric for invalid hash and other sharejs errors
2019-06-12 13:50:34 +01:00
Brian Gough
d50b93df2f add metric for invalid hash and other sharejs errors 2019-06-11 16:48:06 +01:00
Brian Gough
966478cac4 fix hash check to use 'v' field instead of version 2019-06-11 14:11:46 +01:00
Brian Gough
e95059f98e handle non-urgent flushes in background 2019-06-03 10:01:10 +01:00
Brian Gough
0bbfa7de27 Merge branch 'master' into bg-downgrade-delete-component-error 2019-05-08 09:07:02 +01:00
Brian Gough
27a8248196 convert "Delete component" errors into warnings 2019-05-07 16:55:17 +01:00
Henry Oswald
daca83a057 add dryRun option to flush all projects 2019-05-02 16:54:22 +01:00
Henry Oswald
d5d1736a5e adds /flush_all_projects project 2019-05-02 16:43:35 +01:00
Tim Alby
c1c23e4bee record last author id on document flush
This is a multi-step process:
* get an update's `user_id` from the metadata
* store the `user_id` (`lastUpdatedBy`) and current date (`lastUpdatedAt`) for
  the document in Redis on every update
* fetch `lastUpdatedAt` and `lastUpdatedBy` from Redis on document flush
* send the data to web to be persisted in Mongo
2019-05-02 11:10:02 +01:00
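The steps listed in the commit above can be sketched roughly as below. Everything except the field names `user_id`, `lastUpdatedBy` and `lastUpdatedAt` is hypothetical: a `Map` stands in for Redis, `update.meta` is an assumed location for the metadata, and `persist` is a placeholder for the request sent to web.

```javascript
// Stand-in for the per-document tracking data kept in Redis.
const tracking = new Map()

// Called for every applied update: remember who made it and when.
function recordUpdate(docId, update, now = new Date()) {
  // Assumed shape: the user id lives under update.meta.user_id.
  const userId = update.meta && update.meta.user_id ? update.meta.user_id : null
  tracking.set(docId, {
    lastUpdatedBy: userId,
    lastUpdatedAt: now.toISOString(),
  })
}

// Called on flush: read the tracking data and hand it to the persistence layer.
function flushDoc(docId, persist) {
  const entry = tracking.get(docId) || { lastUpdatedBy: null, lastUpdatedAt: null }
  persist(docId, entry)
}

recordUpdate('doc-1', { op: [], meta: { user_id: 'user-42' } })
flushDoc('doc-1', (docId, entry) => console.log(docId, entry))
```

Only the most recent update's author and timestamp are kept, so a flush reports the last writer rather than every contributor.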
Brian Gough
68e7b9c4e9 Merge pull request #48 from sharelatex/bg-check-incoming-hash
check incoming hash when present
2019-04-29 10:15:44 +01:00
James Allen
52f3596e53 Review feedback 2019-04-16 11:05:17 +01:00
James Allen
3d76f4b9bf Record a snapshot to mongo when a doc's comments/changes get collapsed 2019-04-11 13:27:46 +01:00
Brian Gough
3c635c8d98 check version before it is modified by applyOp 2019-04-09 09:20:48 +01:00
Brian Gough
cc1f3fce5b check incoming hash when present 2019-04-08 14:12:18 +01:00
Brian Gough
fd1425d83f include a unique id in every message published to redis 2019-03-21 12:10:15 +00:00
Brian Gough
8c5d74faef use explicit json content-type to avoid security issues with text/html 2019-02-12 16:45:11 +00:00
Henry Oswald
3bc4cb492a added log line 2019-02-07 16:30:53 +00:00
Henry Oswald
ecaef6485b revert the removal of realtime keyspace 2019-02-07 15:27:51 +00:00
Henry Oswald
4e1a2c787c Revert "turn down logging, use logger.info for less important data"
This reverts commit c5f91428e3c7702fbbd3ffd1ef7a772d513f33f2.
2019-02-06 15:29:22 +00:00
Christopher Hoskin
1217d8a80a Merge branch 'master' into csh-ho-docker-issue-1338-bulk-upgrade 2019-01-04 09:18:40 +00:00