495 commits

Author SHA1 Message Date
Jakob Ackermann
45ac2bdd97 Merge pull request #153 from overleaf/jpa-backport-drainmanager-tests
[backport] 108 and 112: DrainManager acceptance tests
2020-06-10 11:28:08 +02:00
Jakob Ackermann
cfe37dcbb5 Merge pull request #152 from overleaf/jpa-vendor-cookie
[misc] vendor a patched session.socket.io middleware
2020-06-10 11:27:58 +02:00
Jakob Ackermann
5b85e0154d Merge pull request #151 from overleaf/jpa-perf-parser
[perf] add a few short cuts to the packet decoding
2020-06-10 11:27:50 +02:00
Jakob Ackermann
b2e4448992 [misc] test/acceptance: ReceiveUpdateTests: test remotely sent update 2020-06-10 09:53:48 +01:00
Jakob Ackermann
83e3ff0ed7 [misc] test/acceptance: ReceiveUpdateTests: add 2nd project/3rd client
...and check for cross project leakage.
2020-06-10 09:53:29 +01:00
Jakob Ackermann
56fda1f9b0 [misc] test/acceptance: use the correct redis instances 2020-06-10 09:43:15 +01:00
Jakob Ackermann
bc44494466 [HttpController] return 404 in case of a missing client and add tests
Add acceptance tests for the client view.
2020-06-10 09:41:36 +01:00
Jakob Ackermann
eabff1d6b2 [perf] test/acceptance: DrainManagerTests: cleanup previous clients
...before starting to drain.
2020-06-09 18:02:38 +01:00
Jakob Ackermann
de35fc5ecf [HttpApiController] implement the disconnection of a single client
The http route returns as soon as the client has fully disconnected.
2020-06-09 18:01:08 +01:00
Jakob Ackermann
91e296533f [misc] test/acceptance: add tests for the draining of connections 2020-06-09 18:00:44 +01:00
Jakob Ackermann
acb7d7df5a [misc] add test cases for the validation of the callback argument
When the user provides a function as last argument for socket.emit,
 socket.io will flag this as an RPC and add a cb as the last argument
 to the client.on('event', ...) handler on the server side.
Without a function as last argument for socket.emit, the callback
 argument on the server side is undefined, leading to invalid function
 calls (`undefined()`) and an unhandled exception.
The user can also provide lots of other arguments, so the 2nd/3rd ...
 argument is of arbitrary type, again leading to invalid function calls
 -- e.g. `1()`.
2020-06-09 16:30:03 +01:00
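The validation described in the commit above could look roughly like the following sketch. It is not the project's Router code: `splitCallback` and the event wiring in the trailing comment are hypothetical names used only for illustration. The sketch checks whether the last argument is a function before anything tries to call it.

```javascript
// Hypothetical helper, not the project's actual Router code.
// Splits the arguments of a socket.io event handler into parameters and a
// callback, and reports whether the client really supplied a callback.
function splitCallback(args) {
  const last = args[args.length - 1]
  if (typeof last === 'function') {
    return { params: args.slice(0, -1), callback: last, valid: true }
  }
  // No RPC callback was supplied: hand back a no-op so later code never
  // ends up calling `undefined()` or `1()`.
  return { params: args.slice(), callback: function noop() {}, valid: false }
}

// Example wiring with an assumed `client` object and handler name:
// client.on('joinDoc', (...args) => {
//   const { params, callback, valid } = splitCallback(args)
//   if (!valid) return client.disconnect()
//   handleJoinDoc(params, callback)
// })
```

Whether an invalid call should disconnect the client or merely be ignored is a policy choice that the commit message does not state.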
Jakob Ackermann
853ee994a6 [perf] add a few short cuts to the packet decoding 2020-06-09 15:26:17 +01:00
Jakob Ackermann
dc553c4150 [misc] vendor a patched session.socket.io middleware 2020-06-09 15:21:33 +01:00
Jakob Ackermann
1c9eaf574a Merge pull request #147 from overleaf/jpa-backport-141
[backport] 141: Router: validate the callback argument
2020-06-09 14:41:24 +02:00
Jakob Ackermann
5238e6ad36 Merge pull request #145 from overleaf/jpa-backport-140
[backport] 140: stop processing requests as we detect a client disconnect
2020-06-09 14:41:12 +02:00
Jakob Ackermann
589bedc3dd Merge pull request #146 from overleaf/jpa-downgrade-logging
[misc] downgrade logging when running tests
2020-06-09 14:40:59 +02:00
Jakob Ackermann
32af7001fc [misc] Router: prefix the publicId with 'P.' for easy differentiation 2020-06-08 11:29:40 +01:00
Jakob Ackermann
f40241a037 [misc] downgrade logging when running tests 2020-06-05 11:38:09 +01:00
Jakob Ackermann
0b2cccf1e0 [misc] apply review feedback: adjust metric names
Co-Authored-By: Brian Gough <brian.gough@overleaf.com>

(cherry-picked from commit 67674b83efb452ece05cdc39525ee3a5eeb8a4d7)
2020-06-05 11:33:00 +01:00
Jakob Ackermann
ddcb9cf8c8 [misc] downgrade a warning message from clients leaving non-joined rooms
This can now happen all the time, as we skip the join for clients that
 disconnect before joinProject/joinDoc completed.

(cherry-picked from commit f357931de74e088800f3cced3898cce4f251dad0)
2020-06-05 11:32:43 +01:00
Jakob Ackermann
7fa9061015 [misc] stop processing requests as we detect a client disconnect
v2 exposes `client.connected`; v0 exposes `client.disconnected`

(cherry-picked from commit a9d70484343ca9be367d45bf7bb949e4be449647)
2020-06-05 11:32:13 +01:00
Jakob Ackermann
0840700ffd [Router] validate the callback argument 2020-06-05 10:59:01 +01:00
Jakob Ackermann
c6d08647c7 [misc] socket.io: use a secondary publicId for public facing usages 2020-06-04 17:18:07 +01:00
Brian Gough
f973b377f0 update node to v10.21.0 2020-06-03 09:12:21 +01:00
Tim Alby
94d57f50c0 add fake lint and format targets
Highly hacky!
Lint and format steps are coming very soon thanks to the decaf, but in the
meantime we need steps to pass CI. Updating the build scripts after the
decaf will undo this change.
2020-05-28 16:09:27 +02:00
Jakob Ackermann
d13acb8ca3 [ChannelManager] port related and still mostly valid test from v2
I skipped the bulk of verifyConsistentBehaviour tests which are not
 valid for the new implementation -- there is no optimization and
 always cleanup.
2020-05-15 18:39:33 +02:00
Jakob Ackermann
41debfae0f [ChannelManager] rework (un)subscribing to redis
- send a subscribe request on every request
- wait for a pending unsubscribe request before subscribing
- wait for a pending subscribe request before unsubscribing

Co-Authored-By: Brian Gough <brian.gough@overleaf.com>
2020-05-15 18:34:33 +02:00
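The three rules listed in the commit above can be sketched as a per-channel promise chain. This is a minimal illustration, not the ChannelManager implementation: the class name `ChannelSubscriptions` is hypothetical, and the redis client is only assumed to expose promise-returning `subscribe` and `unsubscribe` methods.

```javascript
// Hypothetical sketch: serialize subscribe/unsubscribe requests per channel.
class ChannelSubscriptions {
  constructor(redisClient) {
    // Assumption: subscribe(channel) and unsubscribe(channel) return promises.
    this.redis = redisClient
    // channel name -> promise of the most recent request for that channel
    this.pending = new Map()
  }

  _enqueue(channel, action) {
    const previous = this.pending.get(channel) || Promise.resolve()
    // Wait for the pending request to settle (ignoring its failure) before
    // sending the next one.
    const next = previous.catch(() => {}).then(action)
    this.pending.set(channel, next)
    const cleanup = () => {
      // Only forget the channel if no newer request was queued meanwhile.
      if (this.pending.get(channel) === next) this.pending.delete(channel)
    }
    next.then(cleanup, cleanup)
    return next
  }

  // A subscribe request is sent on every call; there is no "already
  // subscribed" shortcut.
  subscribe(channel) {
    return this._enqueue(channel, () => this.redis.subscribe(channel))
  }

  unsubscribe(channel) {
    return this._enqueue(channel, () => this.redis.unsubscribe(channel))
  }
}
```

Chaining on the previous promise keeps the order of requests per channel deterministic, at the cost of one extra redis round trip for repeated subscribes.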
Jakob Ackermann
1095851dfe [misc] test/unit: fix typos and assertion of error messages
Sinon does not check the contents of the passed error when checked via
 sinon.stub().calledWith.
```
callback = sinon.stub()
callback(new Error("some message"))
# the assertion passes although the messages differ
callback.calledWith(new Error("completely different message")) === true
```

Cherry-pick plus an additional patch for the joinProject bail-out.
(cherry picked from commit d9570fee70701a5f431c39fdbec5f8bc5a7843fe)
2020-05-15 14:46:05 +02:00
Jakob Ackermann
17d04b9041 [misc] bump sinon to 2.x for promise support with stubs
(cherry picked from commit 3c92b937f9430175d7c51660e03c507635448e88)
2020-05-15 12:01:22 +02:00
Jakob Ackermann
b713beb7f0 Merge pull request #135 from overleaf/jpa-skip-leave-project-for-invalid-clients
[WebsocketController] skip leaveProject when joinProject didn't complete
2020-05-13 15:34:41 +02:00
Jakob Ackermann
55af5e502f [WebsocketController] skip leaveProject when joinProject didn't complete
Also drop dead code:
 - user_id bailout

   There is a check on a completed joinProject call now. It will always
    set a user_id, see Router.coffee which has a fallback `{_id:"..."}`.

 - late project_id bailout

   WebsocketLoadBalancer.emitToRoom will not work without a project_id.
   We have to bail out before the call.
2020-05-12 17:15:08 +02:00
Jakob Ackermann
684cb3903c [WebsocketController] handle redis subscribe error on joinProject
joinProject should not complete when the redis pub/sub subscribe request
 failed.
2020-05-12 13:03:50 +02:00
Brian Gough
5c28da1031 add metric for pendingUpdates queue 2020-04-07 11:53:54 +01:00
Brian Gough
5765884f38 Merge branch 'master' into jpa-pub-sub-metrics 2020-04-07 11:46:23 +01:00
Henry Oswald
8711abdb66 bump redis to 1.0.12 2020-03-31 14:04:33 +01:00
Jakob Ackermann
56628a16c6 [misc] track redis pub/sub payload sizes on publish 2020-03-30 11:31:44 +02:00
Jakob Ackermann
a9b8e9be3b [misc] upgrade metrics-sharelatex to 2.6.2 2020-03-30 10:47:01 +02:00
Jakob Ackermann
69569e3571 [misc] config: add headroom for JSON serialization in maxUpdateSize 2020-03-24 16:21:29 +01:00
Jakob Ackermann
af53d3b603 [misc] skip duplicate JSON serialization for size check 2020-03-24 11:22:28 +01:00
Jakob Ackermann
cb675d38c2 [misc] SafeJsonParse: align the size limit with the frontend->rt limit
frontend -> real-time and doc-updater -> real-time should be in sync.
Otherwise we can send a payload to doc-updater, but can not receive the
 confirmation of it -- and the client will send it again in a loop.

Also log the size of the payload.
2020-03-24 09:14:15 +01:00
Jakob Ackermann
15244a54be [misc] WebsocketController: limit the update size to 7mb
bail out early on -- especially do not push the update into redis for
 doc-updater to discard it.

Confirm the update silently, otherwise the frontend will send it again.
Broadcast a 'otUpdateError' message and disconnect the client, like
 doc-updater would do.
2020-03-24 09:12:12 +01:00
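A rough sketch of the early bail-out described in the commit above follows. The function name `handleUpdate`, the `queueUpdate` parameter, the error payload, and the exact byte value of the limit are assumptions made for illustration; only the 7mb figure, the silent confirmation, the 'otUpdateError' message, and the disconnect come from the commit message.

```javascript
// Assumed byte value for the "7mb" limit named in the commit message.
const MAX_UPDATE_SIZE = 7 * 1024 * 1024

// Hypothetical handler: `queueUpdate(update, callback)` stands in for the
// code path that pushes the update into redis for doc-updater.
function handleUpdate(client, update, queueUpdate, callback, maxSize = MAX_UPDATE_SIZE) {
  const size = Buffer.byteLength(JSON.stringify(update))
  if (size > maxSize) {
    // Confirm silently, otherwise the frontend would send the update again.
    callback()
    // Tell the client about the failure and drop the connection.
    client.emit('otUpdateError', { message: 'update is too large' })
    client.disconnect()
    return
  }
  queueUpdate(update, callback)
}
```

Rejecting the update before it reaches redis avoids storing a payload that doc-updater would discard anyway.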
Jakob Ackermann
d320c2d5f3 [misc] let proxies observe an upcoming shutdown before starting to drain
Otherwise clients may be routed to the same pod upon reconnecting.
2020-03-17 16:41:48 +01:00
Jakob Ackermann
2b1c5bf436 Merge pull request #100 from overleaf/jpa-dependencies-cleanup
[misc] cleanup dependencies
2020-02-14 10:56:11 +01:00
Jakob Ackermann
43013e0820 [misc] cleanup unused dependency on mongo 2020-02-12 14:44:01 +01:00
Jakob Ackermann
902b4fca46 [misc] rename npm-shrinkwrap.json to package-lock.json and run npm i 2020-02-12 14:39:53 +01:00
Brian Gough
1ad8315437 remove unused .travis.yml file 2020-02-12 12:37:00 +00:00
Jakob Ackermann
24d46e9d4b [misc] update the build scripts to 1.3.5 2020-02-11 12:27:56 +01:00
Brian Gough
9a4124ee11 Merge pull request #97 from overleaf/bg-update-node
Update to node:10.19.0 docker image
2020-02-10 10:00:43 +00:00
Brian Gough
98b29889bd Merge pull request #95 from overleaf/bg-revert-unnecessary-changes
Revert debugging PRs for node 10 upgrade
2020-02-10 10:00:17 +00:00
Brian Gough
e0e2090a42 update node version in nvmrc and buildscripts 2020-02-07 14:41:12 +00:00
Brian Gough
04a9d66784 use public node:10.19.0 image 2020-02-07 14:15:48 +00:00
Brian Gough
abe4d1d525 update to gcr.io/overleaf-ops/node:10.19.0 2020-02-06 03:34:30 +00:00
Brian Gough
64bd739a87 Revert "Merge pull request #91 from overleaf/spd-trycatch-all-the-things"
This reverts commit 2bf7f14f9d050c58f141f465633bb6e274b903dd, reversing
changes made to 989240812532ca43a52513339f4dda8f44a80a64.
2020-02-05 10:05:36 +00:00
Brian Gough
4ec82b1baa upgrade to local node:10.18.1 image 2020-02-04 15:02:15 +00:00
Simon Detheridge
4102aa0580 Add more detail to metric 2020-02-04 14:03:56 +00:00
Simon Detheridge
8e45a62e32 Handle ECONNRESET in the same way as EPIPE 2020-02-04 13:58:45 +00:00
Simon Detheridge
7663e9689e Merge pull request #91 from overleaf/spd-trycatch-all-the-things
Add try/catch around all client emissions
2020-02-04 13:21:49 +00:00
Simon Detheridge
fbff3fe727 Don't shut down on uncaught EPIPE 2020-02-04 12:56:43 +00:00
Simon Detheridge
9fd4699098 Merge remote-tracking branch 'origin/bg-avoid-emit-when-client-not-connected' into spd-trycatch-all-the-things 2020-02-04 12:35:35 +00:00
Simon Detheridge
216a977922 Add try/catch around all client emissions 2020-02-04 12:13:03 +00:00
Brian Gough
ebb83e4633 use disconnected property, not connected 2020-02-04 11:59:25 +00:00
Brian Gough
7380d523d5 avoid emitting when client not connected
the emit is happening asynchronously after the client list is computed,
so clients may have disconnected in the intervening time.
2020-02-04 11:39:37 +00:00
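The guard described in the commit above, combined with the try/catch added around client emissions in a neighbouring commit, might be sketched as follows. `emitToClients` is a hypothetical name; the `disconnected` property is the one the log attributes to socket.io v0 clients.

```javascript
// Hypothetical sketch: emit to a previously computed client list, skipping
// clients that disconnected in the meantime.
function emitToClients(clients, eventName, payload) {
  let sent = 0
  for (const client of clients) {
    // The list was computed earlier, so re-check the state right before emitting.
    if (client.disconnected) continue
    try {
      client.emit(eventName, payload)
      sent++
    } catch (err) {
      // A failing socket must not prevent delivery to the remaining clients.
      console.warn('emit failed', err.message)
    }
  }
  return sent
}
```

Returning the number of successful emissions is an addition for illustration; it makes the behaviour easy to observe in a metric or a test.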
Brian Gough
e263d37476 pass the signal correctly to the shutdown handler 2020-02-04 11:14:53 +00:00
Brian Gough
1fc8cc44c3 log shutdown messages as warnings 2020-02-04 11:14:14 +00:00
Simon Detheridge
c7e2b99a7b Update hybi-16 patch to work with socket.io 0.9.19 2020-02-04 10:43:06 +00:00
Simon Detheridge
ef852dfa33 Update socket.io to latest patch release 2020-02-04 10:32:54 +00:00
Brian Gough
4f94110225 Merge pull request #84 from overleaf/spd-catch-errors
Bump to Node 10 and add error handlers for socket.io
2020-02-03 14:59:04 +00:00
Brian Gough
49a8e1214b use a separate field for client errors 2020-02-03 14:47:45 +00:00
Brian Gough
e04b6e1e49 Update app/coffee/Router.coffee
Co-Authored-By: Jakob Ackermann <das7pad@outlook.com>
2020-02-03 14:46:14 +00:00
Shane Kilkelly
e63c6f4395 Merge pull request #87 from overleaf/sk-restricted-users
Filter "comments" if restricted user.
2019-10-31 10:21:11 +00:00
Shane Kilkelly
6df88ebc49 Filter "comments" if restricted user. 2019-10-30 13:54:40 +00:00
Shane Kilkelly
403caa65e8 Revert "Revert "Track the isRestrictedUser flag on clients""
This reverts commit 651e392a7c644403f199e1b03e7494b61ce71d0c.
2019-10-30 13:52:36 +00:00
Nate Stemen
9a838bd071 bump build script to 1.1.24 2019-10-25 13:23:13 -04:00
Nate Stemen
3dc7c357a5 add public link to contributing docs 2019-10-25 13:22:58 -04:00
Simon Detheridge
925a8651c1 Revert "Track the isRestrictedUser flag on clients" 2019-10-22 10:17:38 +01:00
Simon Detheridge
c31c2d292d Merge pull request #81 from overleaf/sk-restricted-users
Track the `isRestrictedUser` flag on clients
2019-10-22 09:45:00 +01:00
Simon Detheridge
ce366fdbee Bump Dockerfile to node 10 2019-10-17 12:46:07 +01:00
Simon Detheridge
7543f2fcbd Catch errors from socket.io and attempt graceful cleanup 2019-10-17 12:45:56 +01:00
Brian Gough
dff4d66209 Merge pull request #83 from overleaf/bg-upgrade-ioredis
upgrade ioredis to v4.14.1
2019-10-14 11:18:45 +01:00
Brian Gough
f028148fe2 upgrade ioredis to v4.14.1 2019-10-14 11:10:20 +01:00
Shane Kilkelly
2cc2be3d9c send messages to clients with concurrency of 2 2019-10-11 10:01:21 +01:00
Shane Kilkelly
06aa578bdc Make it an error when we get no data from joinProject 2019-10-11 09:57:16 +01:00
Simon Detheridge
85b23d7da7 Add maxRetriesPerRequest setting for redis 2019-10-10 16:56:58 +01:00
Shane Kilkelly
df6cd4a054 Also block getConnectedUsers for restricted users.
Plus refactor to use a pass list instead of a deny list.
2019-10-04 13:41:49 +01:00
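The pass-list approach mentioned in the commit above can be illustrated with a short sketch. The event names and the helper `shouldSendToClient` are hypothetical, since the log does not show the real list; only the `isRestrictedUser` flag appears in the commit history.

```javascript
// Hypothetical event names; the real pass list is not shown in the log.
const RESTRICTED_USER_PASS_LIST = new Set([
  'example.docUpdated',
  'example.cursorMoved',
])

// Restricted clients only receive events that are explicitly listed, so a
// newly introduced event type is blocked for them by default.
function shouldSendToClient(client, eventName) {
  if (!client.isRestrictedUser) return true
  return RESTRICTED_USER_PASS_LIST.has(eventName)
}
```

With a deny list, forgetting to add a new private event type would leak it; with a pass list, the same omission only withholds the event.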
Shane Kilkelly
6765d03339 Track the isRestrictedUser flag on clients
Then, don't send new chat messages and new comments to those restricted clients.
We do this because we don't want to leak private information (email addresses
and names) to "restricted" users, those who have read-only access via a
shared token.
2019-10-04 10:30:24 +01:00
Simon Detheridge
21e294c6eb Generate retryable error when hitting rate limits in web 2019-09-02 11:27:04 +01:00
Brian Gough
fe2e7b3065 minimal fix for undefined connected users 2019-08-16 10:07:30 +01:00
Henry Oswald
38ed780d80 add log line to draining 2019-08-15 14:41:22 +01:00
Brian Gough
b0f0fb64ac clean up unused variable, convert setting to number 2019-08-15 09:48:42 +01:00
Brian Gough
a7a161556f Merge branch 'bg-status-on-shutdown' 2019-08-15 09:42:00 +01:00
Brian Gough
fa94e3d5e3 Merge pull request #69 from overleaf/ho-drain-connections-timewindow
add shutdownDrainTimeWindow, drains all connections within time range
2019-08-15 09:22:10 +01:00
Brian Gough
022e47b5c8 Merge pull request #73 from overleaf/bg-connected-client-metrics
add connected client count metric
2019-08-15 09:20:03 +01:00
Brian Gough
3552fa40c2 Merge pull request #72 from overleaf/bg-refresh-client-list
refresh client list
2019-08-15 09:19:20 +01:00
Henry Oswald
78629610d5 add health check endpoint and http route logger 2019-08-14 15:38:02 +01:00
Brian Gough
f13e66b453 fix client count so that result is zero when all clients have left 2019-08-14 15:34:23 +01:00
Brian Gough
8270c14d86 add connected client count metric 2019-08-14 15:22:03 +01:00
Brian Gough
d57b229e17 update tests 2019-08-14 13:03:14 +01:00
Brian Gough
d3171e4e2e remove unwanted argument 2019-08-14 13:03:06 +01:00
Henry Oswald
2ae4c8c174 Merge branch 'ho-drain-connections-timewindow' of github.com:overleaf/real-time into ho-drain-connections-timewindow 2019-08-14 11:52:22 +01:00
Henry Oswald
4a984f533e remove forceDrainMsDelay
as soon as a pod is marked as being killed we should start draining
2019-08-14 11:51:25 +01:00
Brian Gough
20d442120f notify docupdate if the flush is from a shutdown 2019-08-13 17:36:53 +01:00
Brian Gough
7db882f339 fix unit tests 2019-08-13 17:26:49 +01:00
Brian Gough
0708f717fd reject connections when shutdown in progress
send a message to the client to reconnect immediately
2019-08-13 16:59:15 +01:00
Brian Gough
53431953fc make shutDownInProgress available via settings 2019-08-13 16:56:48 +01:00
Brian Gough
b3e5709b64 enforce a minimum drain rate 2019-08-13 16:15:30 +01:00
Henry Oswald
00cca29d9e add shutdownDrainTimeWindow, drains all connections within time range 2019-08-13 14:21:47 +01:00
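One way to turn a drain time window into a per-second drain rate, including the minimum rate enforced by a later commit, is sketched below. The formula and the name `computeDrainRate` are assumptions; the log only states that all connections are drained within the window and that a minimum rate applies.

```javascript
// Hypothetical sketch: how many clients to disconnect per second so that all
// of them are gone by the end of the drain window.
function computeDrainRate(connectedClients, drainWindowMs, minRatePerSecond) {
  // Guard against a zero or sub-second window.
  const windowSeconds = Math.max(drainWindowMs / 1000, 1)
  const rate = Math.ceil(connectedClients / windowSeconds)
  // Never drain slower than the configured minimum.
  return Math.max(rate, minRatePerSecond)
}
```

For example, 1000 clients and a 60 second window give a rate of 17 clients per second, while 10 clients in the same window fall back to the minimum rate.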
Brian Gough
5b54d36b37 fail readiness check when shutting down 2019-08-13 10:41:35 +01:00
Brian Gough
2000f478a7 refresh the client list on demand 2019-08-13 10:40:03 +01:00
Miguel Serrano
49c7bde799 Merge pull request #66 from overleaf/msm-patched-eventemitter-socketio-node7
Patched EventEmitter for socket.io compatibility with Node >= 7
2019-08-12 11:30:11 +02:00
Brian Gough
478a727c61 ignore spurious requests to leave other docs 2019-07-29 15:19:08 +01:00
Brian Gough
04a171171f fix async behaviour of join/leave 2019-07-29 11:54:02 +01:00
mserranom
cf0df28f4c Patched EventEmitter for socket.io compatibility with Node >= 7 2019-07-25 09:22:24 +00:00
Brian Gough
22d722f3e8 add metric for RoomEvents listeners 2019-07-24 16:25:45 +01:00
Brian Gough
277ec71a5b subscribe to doc updates before requesting doc content 2019-07-24 15:49:29 +01:00
Brian Gough
1c74cbbc4e add comments 2019-07-24 15:49:29 +01:00
Brian Gough
273af3f3aa refactor subscribe resolution 2019-07-24 14:30:48 +01:00
Brian Gough
e14a94906a update naming from Set -> Map 2019-07-24 14:18:15 +01:00
Brian Gough
cb53bfafd6 remove unnecessary require 2019-07-24 09:52:31 +01:00
Brian Gough
61b3a000b4 fix whitespace 2019-07-24 09:52:20 +01:00
Brian Gough
159b39c491 ensure redis channel is subscribed when joining room 2019-07-23 17:02:09 +01:00
Brian Gough
84e6ff616f whitespace fix 2019-07-22 12:25:41 +01:00
Brian Gough
bb629c27a1 rename unit test ChannelManager to ChannelManagerTests 2019-07-22 11:28:49 +01:00
Brian Gough
1afebd12a1 unit tests 2019-07-22 11:23:43 +01:00
Brian Gough
92e6910180 cleanup 2019-07-22 11:23:33 +01:00
Brian Gough
8c7b73480f upgrade sinon to 1.17.7 for onCall support 2019-07-22 11:23:02 +01:00
Brian Gough
9f7df5f10c wip unit tests 2019-07-19 11:58:40 +01:00
Brian Gough
a538d10488 extend comment re disconnection 2019-07-19 08:56:38 +01:00
Brian Gough
616014e05d add comment about automatically leaving rooms 2019-07-19 08:50:43 +01:00
Brian Gough
40353a410f fix unit tests 2019-07-19 08:49:57 +01:00
Brian Gough
3bf5dd5d6b clarify errors for subscribe/unsubscribe 2019-07-18 14:25:25 +01:00
Brian Gough
f6f6f549d9 don't publish on individual channels until explicitly set 2019-07-18 12:55:23 +01:00
Brian Gough
804f4c2bd2 listen on separate channels for each project/doc 2019-07-18 12:55:23 +01:00
Brian Gough
ae512dc9fb Merge pull request #62 from overleaf/bg-patch-socket-io-frame-bug
monkeypatch socket.io to fix frame handler in v0.9.16
2019-07-17 13:48:04 +01:00
Brian Gough
9ecce32ff9 Merge pull request #63 from overleaf/bg-log-out-of-order-events
log out of order events now that the rate is lower
2019-07-17 13:47:45 +01:00
Brian Gough
0c6ba4c1a8 monkeypatch socket.io to fix frame handler in v0.9.16 2019-07-16 14:02:52 +01:00
Brian Gough
8a7804f0a7 make event order check a configuration setting 2019-07-15 13:45:34 +01:00
Brian Gough
24a4709cff log out of order events now that the rate is lower 2019-07-15 11:14:48 +01:00
Brian Gough
e632f9f29d only create per-client metrics when there are multiple redis clients 2019-07-11 11:35:48 +01:00
Brian Gough
80f8f2465e remove unused pubsub client 2019-07-11 11:10:33 +01:00
Brian Gough
689a75f397 add logging for redis clients at start up 2019-07-09 14:18:39 +01:00
Brian Gough
dd54789e2b fix build problems 2019-07-09 12:20:59 +01:00
Brian Gough
580b100362 only publish to one redis client in WebsocketLoadBalancer
but listen to all of them
2019-07-09 12:03:13 +01:00
Brian Gough
999cbd8ee6 add a per-client metric 2019-07-09 12:01:58 +01:00
Brian Gough
cb289f2dec make redis client list dynamic based on settings 2019-07-09 11:45:00 +01:00
Brian Gough
b5f9bc422b support multiple redis instances for pubsub 2019-07-08 15:56:25 +01:00
Henry Oswald
520857cf7a simplify redis continual traffic
We can't send duplicate health check events to the same redis, as that causes
health check duplicate errors. This commit sends health check data only to the
pub/sub pair and sends non-health-check traffic to the cluster to keep
the connection open.
2019-07-08 12:07:28 +01:00
Henry Oswald
1038c5cd0d send health check to pubsub channel and use different var name 2019-07-08 11:53:42 +01:00
Henry Oswald
eadef7b133 Merge pull request #59 from overleaf/ho-redis-natmap-sentinel
Move pubsub to separate connection
2019-07-08 11:18:12 +01:00
Henry Oswald
9953c933ee Update package.json 2019-07-08 11:18:02 +01:00
Henry Oswald
487865fad3 Merge pull request #57 from overleaf/sk-dep-upgrades-2019-06
update logger and metrics
2019-07-08 11:17:29 +01:00