268 commits

Author SHA1 Message Date
Brian Gough
618cf99548 Merge pull request #4950 from overleaf/bg-realtime-log-relevelling
realtime log re-levelling

GitOrigin-RevId: 81d86ba648e7fc1436b638b67371b35fd831a62f
2021-09-15 08:03:25 +00:00
June Kelly
7b1044b8a8 Merge pull request #4870 from overleaf/jk-bg-validate-ids
RealTime: Validate IDs
GitOrigin-RevId: 884600125d362c5632faa75dc22d957cdddc101b
2021-09-02 08:03:14 +00:00
Jakob Ackermann
b8fcb265b2 [misc] EventLogger: drop explicit metrics.inc amount
The module has a hard-coded increment of 1, which is the only value the
 prometheus backend supports.
2021-07-13 12:40:46 +01:00
Jakob Ackermann
7e8e231059 [misc] run format_fix and lint:fix 2021-07-13 12:04:45 +01:00
Jakob Ackermann
a26ae73597 [misc] switch from settings-sharelatex to @overleaf/settings 2021-07-12 17:47:18 +01:00
Brian Gough
3044bc4cf5 add transport to get-connected-users metric 2021-05-13 14:51:26 +01:00
Brian Gough
9622b341b7 add transport type to metrics 2021-05-10 10:16:50 +01:00
Brian Gough
a3009d2ef2 Merge pull request #210 from overleaf/bg-real-time-status-file
add blue-green deployment support with load-balancer health-checks
2021-04-01 09:09:57 +01:00
Thomas Mees
e63b9ec5ea Lower log level of 'no project_id found on client' messages 2021-03-30 12:40:26 +02:00
Brian Gough
a55aa61d71 use .includes instead of .indexOf 2021-03-29 11:13:59 +01:00
Brian Gough
32184330d9 delay closing by 1 minute for deployment flip 2021-03-22 11:26:02 +00:00
Brian Gough
17e3068499 close real-time via a status file 2021-03-22 10:46:25 +00:00
Henry Oswald
49bed6595d added queueKey to error info when trying to push to redis 2021-02-09 12:42:58 +00:00
Henry Oswald
46b389e8b3 fix off by 1 error in key sharding 2021-02-09 10:48:40 +00:00
Henry Oswald
31e1808dd8 shard the pending-updates-list queue 2021-02-08 16:02:41 +00:00
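Illustration for the two sharding commits above (31e1808dd8 and 46b389e8b3): a minimal sketch of how a Redis list key could be spread across a fixed number of shards. The commit log does not show the code, so the function names, the `shardCount` parameter, and the choice to leave shard 0 unsuffixed are all assumptions. The point is the bounds: with `shardCount` shards the valid indexes are 0 to `shardCount - 1`, which is where an off-by-one error typically creeps in.

```javascript
// Hypothetical sketch, not the service's actual code.
const BASE_KEY = 'pending-updates-list'

// Map a shard index to a key. Assumption: shard 0 keeps the original,
// unsuffixed key so that existing consumers keep working.
function shardKey(shardIndex) {
  return shardIndex === 0 ? BASE_KEY : `${BASE_KEY}-${shardIndex}`
}

// Pick a shard uniformly at random. Math.random() returns a value in
// [0, 1), so the floored result is always in 0..shardCount-1.
function pickShardKey(shardCount, random = Math.random) {
  const shardIndex = Math.floor(random() * shardCount)
  return shardKey(shardIndex)
}

// Enumerate every key a consumer has to read from.
function allShardKeys(shardCount) {
  return Array.from({ length: shardCount }, (_, i) => shardKey(i))
}
```

A producer would call `pickShardKey(shardCount)` for each push, and a consumer would iterate over `allShardKeys(shardCount)`. Both sides must agree on the same range, otherwise one shard is never written or never drained.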
Jakob Ackermann
746c5aeb80 [misc] bump metrics module to 3.4.1
- renamed package from `metrics-sharelatex` to `@overleaf/metrics`
- drop support for statsd backend
- decaffeinate
- compress `/metrics` response using gzip
- bump debugging agents to latest versions
- expose prometheus interfaces for custom metrics (custom tags)
- cleanup of open sockets metrics
- fix deprecation warnings for header access
2020-11-25 11:57:22 +00:00
Jakob Ackermann
15af5c7977 [misc] bump @overleaf/redis-wrapper to version 2.0.0 2020-11-11 16:24:22 +00:00
Jakob Ackermann
d6ac8c14e7 [RoomManager] drop duplicate joining of entities
REF: 0437e1d03f89a058f97a8884e3532a9a58b68b9d
REF: 62be5e29e5232150e7063bc189c5ad8a1189f972
Signed-off-by: Jakob Ackermann <das7pad@outlook.com>
2020-10-19 15:54:37 +01:00
Jakob Ackermann
c6f2a3b387 Merge pull request #189 from overleaf/jpa-relevel-log
[misc] re-level expected error log messages
2020-09-02 10:46:08 +02:00
Jakob Ackermann
4960bdd6fe [misc] re-level log: 404 from web -> WARN and emit 'project not found'
A stale browser tab tried to join a deleted project.
Emitting 'project not found'/'ProjectNotFound' will trigger a page
 reload in the frontend, upon which web can render a 404.
See frontend: ConnectionManager.joinProject callback
2020-08-27 11:51:57 +01:00
Jakob Ackermann
884b340c75 [misc] re-level log: 403 from web goes to WARN and emit 'not authorized'
Users will get redirected to the login page and will see a 'restricted'
 page after logging in again.
See frontend: ConnectionManager.reportConnectionError
2020-08-27 11:51:56 +01:00
Jakob Ackermann
1ff9c1e71b [misc] add the rpc-method into the log context in Router._handleError 2020-08-27 11:51:55 +01:00
Jakob Ackermann
0647abf433 [misc] drop info-log in WebApiManager for joinProject being rate-limited
The CodedError is logged at warn-level in Router._handleError.
2020-08-27 11:51:09 +01:00
Jakob Ackermann
2ce7b36c95 [misc] drop duplicate log line for unauthorized applyOtUpdate calls
The violation is logged in Router._handleError.
2020-08-27 10:22:31 +01:00
Jakob Ackermann
dee4749e6d [misc] re-level log: properly silence unauthorized updateClientPosition 2020-08-27 10:11:40 +01:00
Jakob Ackermann
425052ff91 Merge pull request #187 from overleaf/jpa-o-error-tagging
[misc] migrate to OError tagging/wrapping
2020-08-25 11:46:28 +02:00
Jakob Ackermann
64e659bf43 Merge pull request #186 from overleaf/jpa-fix-join-project-error-context
[misc] fix join project error context
2020-08-25 11:46:18 +02:00
Jakob Ackermann
849a1cf416 Merge pull request #185 from overleaf/jpa-doc-id-in-error-context
[misc] add/bring back doc_id in error context
2020-08-25 11:42:41 +02:00
Jakob Ackermann
ee3d3b09ed [misc] wrap redis errors as tagging does not work with them
ioredis may reuse the error instance for multiple callbacks. E.g. when
 the connection to redis fails, the queue is flushed with the same
 MaxRetriesPerRequestError instance.
2020-08-24 10:12:20 +01:00
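Illustration for commit ee3d3b09ed above: a minimal sketch of why a shared error instance is better wrapped than tagged. It uses plain JavaScript rather than the real OError or ioredis APIs. `tagInPlace`, `WrappedError`, and the doc ids are hypothetical stand-ins. Tagging mutates the error it is given, so when one instance is handed to several callbacks, every callback's context piles up on the same object. Wrapping creates a fresh error per callback and keeps the shared one as its cause.

```javascript
// Hypothetical stand-in for in-place tagging: appends context to the
// error object it receives and returns that same object.
function tagInPlace(error, message) {
  error.tags = error.tags || []
  error.tags.push(message)
  return error
}

// Hypothetical wrapper: a new error per call site, original kept as cause.
class WrappedError extends Error {
  constructor(message, info, cause) {
    super(message)
    this.name = 'WrappedError'
    this.info = info
    this.cause = cause
  }
}

// One error instance delivered to several pending callbacks, as described
// in the commit message for a failed connection.
const shared = new Error('max retries per request reached')
const docIds = ['doc-1', 'doc-2', 'doc-3']

// Tagging: all three results are the same object carrying all three tags.
const tagged = docIds.map(id => tagInPlace(shared, `push failed for ${id}`))

// Wrapping: three distinct errors, each with only its own context.
const wrapped = docIds.map(
  id => new WrappedError('push to redis failed', { docId: id }, shared)
)
```

With wrapping, a log line for `doc-2` carries only the `doc-2` context, and the underlying connection error remains reachable through `cause`.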
Jakob Ackermann
537e97be73 [misc] OError.tag all the errors in async contexts
See the docs of OError.tag:
https://github.com/overleaf/o-error#long-stack-traces-with-oerrortag
(currently at 221dd902e7bfa0ee92de1ea5a3cbf3152c3ceeb4)

I am tagging all errors at each async hop. Most of the controller code
 will only ever see already tagged errors -- or new errors created in
 our app code. They should have enough info that we do not need to tag
 them again.
2020-08-24 10:12:06 +01:00
Jakob Ackermann
8e31cc5c23 [Router] _handleError: joinProject error-context may not have project_id
The ol_context patch changed the priority of client context and rpc
 context.
This led to the (possibly missing) project_id of the client context
 overwriting the project_id of the rpc context.
REF: f1d55c0a5437a518e9f4617473caed9ba928e648
2020-08-21 13:29:28 +01:00
Jakob Ackermann
f935b1881a [Router] leaveDoc: pass the doc_id into the error-context 2020-08-21 12:47:42 +01:00
Jakob Ackermann
fd88819eec [Router] _handleError: ol_context.doc_id does not exist, drop overwrite 2020-08-21 12:47:42 +01:00
Jakob Ackermann
880056d397 [Router] use a new UnexpectedArgumentsError 2020-08-21 12:47:08 +01:00
Jakob Ackermann
50140f785a [WebsocketController] use a new JoinLeaveEpochMismatchError 2020-08-21 12:47:08 +01:00
Jakob Ackermann
0462e3e437 [WebsocketController] use a new NotJoinedError 2020-08-21 12:47:07 +01:00
Jakob Ackermann
4cb8cc4a85 [DocumentUpdaterManager] use a new ClientRequestedMissingOpsError 2020-08-21 12:47:07 +01:00
Jakob Ackermann
8abfdb87ff [DocumentUpdaterManager] use a new DocumentUpdaterRequestFailedError 2020-08-21 12:47:07 +01:00
Jakob Ackermann
02a2382264 [WebApiManager] use a new CorruptedJoinProjectResponseError 2020-08-21 12:47:07 +01:00
Jakob Ackermann
68bc9d0d23 [WebApiManager] use a new WebApiRequestFailedError 2020-08-21 12:47:06 +01:00
Jakob Ackermann
59c4c884a5 [WebsocketController] use the new NotAuthorizedError 2020-08-21 12:47:06 +01:00
Jakob Ackermann
a8c51de510 [AuthorizationManager] use a new NotAuthorizedError 2020-08-21 12:47:06 +01:00
Jakob Ackermann
de518ea4eb [SessionSockets] use a new MissingSessionError 2020-08-21 12:47:05 +01:00
Jakob Ackermann
6828becb46 [DocumentUpdaterManager] use a new NullBytesInOpError 2020-08-21 12:47:05 +01:00
Jakob Ackermann
af50f9b02c [DocumentUpdaterManager] use a new UpdateTooLargeError 2020-08-21 12:47:05 +01:00
Jakob Ackermann
5950b26a42 [SafeJsonParse] migrate to OError and use a new DataTooLargeToParseError 2020-08-21 12:47:05 +01:00
Jakob Ackermann
f82177a46a [Errors] migrate to OError 2020-08-21 12:47:04 +01:00
Jakob Ackermann
ee59056c6e [misc] forcefully disconnect stale clients from shutdown process 2020-08-13 13:39:22 +01:00
Brian Gough
831d794bf4 clean up join/leave handling
Co-Authored-By: Jakob Ackermann <jakob.ackermann@overleaf.com>
2020-08-12 10:54:22 +01:00
Jakob Ackermann
562375d351 [misc] fix express deprecations 2020-07-22 09:45:14 +01:00