James Allen
dfe26262ec
Return a No-op if diff returns nothing
2016-09-16 11:50:44 +01:00
James Allen
80375ae2dd
Run a diff against big delete - insert changes which are likely copy-pastes
2016-09-16 11:33:36 +01:00
Brian Gough
6e5eadfa86
include a timeout on WebApiManager requests
2016-04-13 16:42:36 +01:00
Brian Gough
a55b72871f
don't let s3 errors stop archive worker
2016-04-13 14:39:11 +01:00
Brian Gough
b343be844e
added metrics to pack worker for archiving
2016-04-08 10:29:04 +01:00
Brian Gough
76fe194815
add a metric for append-pack
2016-04-07 15:56:11 +01:00
Brian Gough
78100e40c8
add missing metrics file
2016-04-07 15:33:20 +01:00
Brian Gough
1a1fa8798d
log attempted update when throwing error in DiffGenerator
2016-04-07 15:16:50 +01:00
Brian Gough
6db310bf6b
add insert/archive/unarchive metrics
2016-04-07 15:16:50 +01:00
Brian Gough
d0e08039da
don't modify expiry for temporary packs
2016-04-07 15:16:50 +01:00
Brian Gough
fd49601716
preserve existing history when user upgrades
2016-04-07 15:16:38 +01:00
Brian Gough
e292de5eb0
fix to avoid ever appending permanent changes to expiring packs
2016-04-06 17:00:16 +01:00
Brian Gough
8b7bdd345b
consider all packs for archiving
2016-04-06 15:17:31 +01:00
Brian Gough
ef47337c78
remove additional fields
2016-04-06 15:17:20 +01:00
Brian Gough
0b9a0730c0
mark temporary packs with a last_checked date in the far future
...
they do not need to be checked for archiving
2016-04-06 14:29:49 +01:00
Brian Gough
08fc151eee
avoid unnecessary call to insert packs into index
2016-04-06 14:29:21 +01:00
Brian Gough
719e0291aa
consider all packs for processing
...
to allow finalisation of old head packs
2016-04-06 14:27:44 +01:00
Brian Gough
79baa99634
clean up logging
2016-04-06 14:26:54 +01:00
Brian Gough
6ab75795a2
archive head packs after sufficient time
2016-04-06 13:30:09 +01:00
Brian Gough
6e18d49736
support archiving from list of project_ids/doc_ids
2016-04-04 17:00:19 +01:00
Brian Gough
31348141d8
increase logging for discarded updates and version mismatch
2016-03-24 11:55:29 +00:00
Brian Gough
181cebecef
avoid call to fetch packs unnecessarily
2016-03-24 11:55:29 +00:00
Brian Gough
98683de3ae
temporarily disable ttl behaviour
...
allow existing packs without temporary flag to expire
2016-03-24 11:38:59 +00:00
Brian Gough
3f388fb0ac
only change ttl on cached packs, not temporary ones
...
temporary = without versioning feature enabled
cached = permanent versioned retrieved from s3
2016-03-24 11:38:09 +00:00
Brian Gough
8d900013d9
record whether a pack is temporary in the pack itself
...
using the expiresAt field no longer determines if the pack is
temporary because archived packs have an expiresAt field added when
they are retrieved from s3
2016-03-24 11:02:58 +00:00
Brian Gough
98738d1344
fix for acceptance test
2016-03-10 15:15:29 +00:00
Brian Gough
f01bf99682
acceptance tests - work in progress
2016-03-09 16:56:49 +00:00
Brian Gough
f6367e21b8
give separate error for archive in progress vs completed
2016-03-09 14:44:59 +00:00
Brian Gough
7350ab531d
exclude already cached packs from archival
2016-03-09 14:44:59 +00:00
Brian Gough
28b184e0ca
fix incorrect use of _.union (argument must be array)
2016-03-09 14:44:59 +00:00
Brian Gough
8922b97bd7
avoid duplicate filling of UserInfo in getDocUpdates
2016-03-09 14:44:59 +00:00
Brian Gough
7e6ea2793b
remove startup dependency on s3 settings
2016-03-09 13:28:02 +00:00
Brian Gough
1419d20b1f
fix indentation
2016-03-04 15:43:32 +00:00
Brian Gough
3175f6d3a6
handle case where index does not exist
2016-03-03 14:36:16 +00:00
Henry Oswald
e8b3fb5be6
added more logging to failed health checks
2016-03-03 10:50:55 +00:00
Brian Gough
795f717bab
added index definitions
2016-03-01 11:38:23 +00:00
Brian Gough
3d9dfeccc3
remove pack worker
...
remove the op-specific code
remove tests for ops, now only packing
remove unused packing code
work in progress
store index for completed packs only
support archiving and unarchiving of individual packs
remove support for archiving whole document history
split out ArchiveManager, IndexManager
remove old DocArchive code
remove docHistoryStats collection
comment about archiving
added method to look at index when last pack has been archived
added start of iterator for project results
use a proper iterator
added heap module
getting it working
increase pack size since bulk operations no longer needed
remove unused MongoAWSexternal
cleanup
added doc iterator
remove old query code
added missing files
cleanup
clean up
started adding pack worker for archiving
work in progress
work in progress
getting pack worker working
updating worker
getting packworker working
added lock
use correct key name for track changes aws access
use correct key name for track changes aws access
always send back users array
fix up comparison of retrieved objects
handle op ids inside packs
log when s3 download completes
comments
cleanup, remove finalisation idea
remove logging
2016-03-01 10:10:02 +00:00
Brian Gough
a23ddf31c0
allow packing of temporary ops
2016-01-29 12:36:03 +00:00
Brian Gough
77cafa36af
support continuing from last packed doc
2016-01-28 16:40:20 +00:00
Brian - Work
666a07e5ba
move lock check into HealthChecker
...
to avoid dependency of HttpController on LockManager in unit tests
2016-01-27 16:04:55 +00:00
Brian Gough
199d2aaa92
script to pack existing docs
2016-01-27 15:14:23 +00:00
Brian Gough
b44a7b9aa6
reject very large ops
2016-01-26 14:52:40 +00:00
Brian Gough
b7a4c72f9c
avoid compressing updates if the result would be too big
2016-01-26 12:23:21 +00:00
Brian Gough
ed0aaa189d
add test for non-overlapping insert-delete case
2016-01-26 12:13:43 +00:00
Brian Gough
b3ddd839e6
add logging of raw updates
2016-01-26 11:28:02 +00:00
Brian Gough
29c7c5e249
enable packs by default for new docs
2016-01-25 09:55:55 +00:00
Brian Gough
d10123d3c4
include n parameter when packing
2016-01-25 09:45:25 +00:00
Brian Gough
9b2cd11cd4
don't try to append to packs when using the old op code
2016-01-22 10:45:24 +00:00
Brian Gough
84ace7f4c7
use packs only for temporary ops
2016-01-20 14:22:13 +00:00
Brian Gough
78b3412ca8
decrease delay when packing
2016-01-19 15:58:09 +00:00
Brian Gough
679a81564e
respect mongo 3 limit of 1000 bulk operations
2016-01-19 15:58:09 +00:00
Brian Gough
f592611cac
always create a new pack, never keep as op
2016-01-19 15:58:09 +00:00
Brian Gough
c6be12f3d5
set v_end on pack creation
2016-01-19 15:58:09 +00:00
Brian - Work
f64969c784
added comment about query memory usage for toArray
2016-01-19 15:58:09 +00:00
Brian Gough
0532a4daaa
use compound index to replace separate index for packs
2016-01-19 15:56:09 +00:00
Brian Gough
0ba00a9eb7
expire temporary packs and roll over to a new pack each day
2016-01-19 15:56:09 +00:00
Brian Gough
5e830cbbdb
put all new ops into packs
2016-01-19 15:56:09 +00:00
Brian Gough
dc564fd5d0
archiving document history now sends all changes to s3
2016-01-15 15:54:46 +00:00
Brian Gough
5153ed8217
make peekLastUpdate always return lastVersion when available
2016-01-15 15:54:44 +00:00
Brian Gough
8e53d66079
log the key for lock timeouts
2016-01-12 10:47:15 +00:00
Brian Gough
6199532d08
increase logging on s3 operations
2016-01-12 10:36:00 +00:00
Brian Gough
ca1f1dc944
handle exception in parsing retrieved json from aws
2016-01-12 09:26:29 +00:00
Brian Gough
b8862ca5af
switch to node-byline module to avoid buffering problem with readline-stream
...
for lines > 64k the readline-stream module is affected by
https://github.com/jahewson/node-byline/issues/30 which is fixed in
node-byline (readline-stream was an earlier fork of the byline module)
2016-01-11 16:51:35 +00:00
Brian Gough
cb109a27a6
allow PackWorker to shut down cleanly
2016-01-06 09:43:10 +00:00
Brian Gough
ffe30962c9
add a close() method to LockManager to allow clean shutdown
2016-01-06 09:34:39 +00:00
Brian Gough
05163837cb
add sentry error reporting to PackWorker
2016-01-05 16:00:52 +00:00
Brian Gough
6754bdca1c
log timestamp in human-readable form for inconsistent ops
2016-01-05 11:30:24 +00:00
Brian Gough
e1aa436286
respect mongo bulk operations limit of 1000 operations
2016-01-05 11:13:13 +00:00
Brian Gough
bb7153c6c1
workaround for mongojs db.close issue
...
https://github.com/mafintosh/mongojs/issues/224
2015-12-22 15:36:15 +00:00
Brian Gough
d3583b4ef6
respect limit of 1000 ops in bulk operation with mongojs 1.x
2015-12-22 14:38:04 +00:00
Brian Gough
c7b4062412
remove unsupported options argument in count() method of mongojs 1.x
2015-12-22 14:20:34 +00:00
Brian Gough
d49997d9f3
fix usage of BSON module
2015-12-21 16:56:49 +00:00
Brian Gough
b7de6f2f71
don't try to compress updates across point of broken history
2015-12-21 13:52:26 +00:00
Brian Gough
4a6374efe8
fix read order when retrieving diffs
2015-12-18 12:38:42 +00:00
Brian Gough
9f69c95192
Merge branch 'upgrade-mongojs'
2015-12-17 16:31:04 +00:00
Brian Gough
4a82dfe618
add setting trackchanges.continueOnError to allow recovery from missing ops
2015-12-17 16:28:02 +00:00
Brian Gough
b84a9e6e91
upgrade mongojs
2015-12-17 14:11:44 +00:00
Brian Gough
54d1036e37
skip ops marked as broken in database
2015-12-09 15:13:37 +00:00
Brian Gough
2a7c33d7ca
added /check endpoint for documents
2015-12-09 14:57:04 +00:00
Brian Gough
23c43b8042
skip any broken ops when viewing history diffs
2015-12-04 15:17:28 +00:00
Brian Gough
be2136de7c
fix update-in-place bug for array ops
2015-12-04 15:17:28 +00:00
Brian Gough
3842f0d1cc
Merge pull request #9 from sharelatex/only-delete-applied-ops
...
Only delete applied ops
2015-11-27 12:45:51 +00:00
Brian Gough
8ebc069ddb
modify last compressed op in place
2015-11-26 16:17:18 +00:00
Brian Gough
3432d9e91a
added comments for redis delete
2015-11-26 15:16:54 +00:00
Brian Gough
e65549099c
only delete the applied ops from redis
2015-11-25 16:01:07 +00:00
Brian Gough
992857d6a2
added redis write check to healthcheck
2015-10-29 10:52:23 +00:00
Brian Gough
c44d5b1b3d
added healthcheck
2015-10-19 10:59:39 +01:00
Brian Gough
ad144371d0
gracefully handle updates marked as broken
...
set update.broken == true to allow the user to view history without a
crash
2015-10-16 11:24:50 +01:00
Brian Gough
8961e23954
enhance LockManager to avoid accidental unlocking
2015-10-14 14:42:17 +01:00
Brian Gough
b6dae59655
fix callback logic in compressAndSaveRawUpdates
2015-10-08 16:39:13 +01:00
Brian Gough
8226bf3be4
increase lock time to 5 minutes
2015-10-08 16:11:39 +01:00
Brian Gough
add6a68fe1
add missing callback in compressAndSaveRawUpdates
2015-10-08 10:53:25 +01:00
James Allen
1a4b8f4269
API/service layout deprecation warning
2015-10-07 13:44:40 +01:00
James Allen
2a03591030
Stub out noisy/slow logger-sharelatex and mongojs modules in tests
2015-09-25 13:46:20 +01:00
James Allen
23dfe68cb8
Don't error when rewinding an insert op which is beyond the length of the document.
...
ShareJS will accept an op where p > content.length when applied,
and it applies as though p == content.length. However, the op is
passed to us with the original p > content.length. Detect if that
is the case with this op, and shift p back appropriately to match
ShareJS if so.
2015-09-25 13:44:44 +01:00
Brian Gough
92e0b0f04c
add logging to each stage of archiving
2015-09-24 09:10:06 +01:00
Brian Gough
e683b0275a
bug fix for clear archive in progress flag
2015-09-24 09:09:49 +01:00
Brian Gough
692e8c657c
Revert to the default lock timeout now we have write barriers
...
Revert "increase lock timeouts for archiving"
This reverts commit 9eee1b383772adf058130d6e5eab409f57ce03cd.
2015-09-24 08:53:09 +01:00
Brian Gough
2ab1778dd9
move default value of lastVersion into function body
2015-09-23 16:31:33 +01:00
Brian Gough
dc0044020f
only archive entries older than the current update
...
to avoid a stale version of the current update ever being pulled back
from S3
2015-09-23 14:33:40 +01:00
Brian Gough
696a866b67
pause the stream of ops, not the download
...
the download is buffered in the lineStream so a lot comes out even
after pausing the S3 download.
2015-09-23 13:38:57 +01:00
Brian Gough
847a553344
prevent double archiving by checking if any inS3 field is already present
2015-09-23 13:29:32 +01:00
Brian Gough
e49f260507
allow rollback/locking by setting inS3:false when starting the archive process
2015-09-23 13:28:07 +01:00
Brian Gough
551e8334cf
compressedUpdates are now never inserted with inS3
...
it is now always added later, and a new update is forced for any
addition to an archived update
2015-09-23 13:25:10 +01:00
Brian Gough
d6b827426c
support forcing new compressed update in popLastCompressedUpdate
...
callback with a null update, passing the version as an additional
argument
2015-09-23 13:22:38 +01:00
Brian Gough
a10dc4f898
Merge pull request #6 from heukirne/s3-archive
...
Add S3 archive track changes feature
2015-09-21 11:25:06 +01:00
Brian Gough
0e627c92d8
avoid clobbering global _ in loop
2015-09-18 16:26:05 +01:00
Henrique Dias
aa66c5ee8c
improve size function
2015-09-17 10:41:53 -03:00
Henrique Dias
3f712c452a
add size bulk limit
2015-09-17 09:23:13 -03:00
Brian Gough
7af5050370
add lock to unarchive doc
2015-09-16 16:18:36 +01:00
Brian Gough
18f06a3daf
increase lock timeouts for archiving
2015-09-16 16:09:38 +01:00
Brian Gough
b4ffa7d57e
share the document lock between archiving and packing
2015-09-16 16:03:55 +01:00
Brian Gough
9d39012b49
add error handler to each stage of download pipeline
2015-09-16 16:00:37 +01:00
Brian Gough
d9085a5e5e
add error handler to each stage of upload pipeline
2015-09-16 16:00:25 +01:00
Brian Gough
1c1b1d9595
log the case where there are no entries in the document history
2015-09-16 15:34:30 +01:00
Brian Gough
82d0f4fce8
make unarchive more responsive by downloading documents in parallel
...
unarchive is triggered interactively so we should try to make it
reasonably fast
2015-09-16 15:33:59 +01:00
Brian Gough
dfa0036507
pause stream while writing to mongo
2015-09-16 15:32:36 +01:00
Brian Gough
70200a9cf1
only log document ids, not document content
...
avoid filling the log with large documents
2015-09-16 15:31:43 +01:00
Brian Gough
d3dff28bea
Merge remote-tracking branch 'origin/master' into heukirne-s3-archive
2015-09-15 15:19:43 +01:00
Brian Gough
092f98d3ad
suppress error in normal shutdown case
2015-09-12 11:07:54 +01:00
Shane Kilkelly
eab8b4b6c8
Null safe access of id property, needed as user can be null.
2015-09-11 14:07:06 +01:00
Shane Kilkelly
0ad374556d
Add a comment for clarity.
2015-09-10 16:43:40 +01:00
Shane Kilkelly
8387383cb4
In _summarizeUpdates, allow null users through.
...
A null value represents a deleted or otherwise missing user record.
2015-09-10 14:32:47 +01:00
Shane Kilkelly
810bddb2cb
Log a message when the web api produces a 404 response.
2015-09-10 14:32:35 +01:00
Shane Kilkelly
522786d45e
Produce a null value, rather than crashing when the user info service returns 404.
2015-09-09 15:48:22 +01:00
Henry Oswald
18d817ee0a
added some missing error handling
2015-09-08 16:33:45 +01:00
Henry Oswald
17b0d99a65
rework the archiveDocChangesWithLock function
...
make it a bit more readable for me, struggle to trust indentation
based calls in coffeescript
2015-09-08 16:26:01 +01:00
Henry Oswald
0b3ebcff06
remove if statements checking if s3 is a backend
...
if it's not enabled then it can crash. In prod it should always be there
or not used at all
2015-09-08 16:23:15 +01:00
Henrique Dias
c5a8a249c6
add unarchive acceptance tests
2015-09-03 08:36:32 -03:00
Henrique Dias
da9e7dc7e1
init archive acceptance tests
2015-09-02 18:47:34 -03:00
Henrique Dias
d2b1243701
split MongoAWS files
2015-09-02 15:45:29 -03:00
Henrique Dias
1abcea1a66
add some unit test
2015-08-31 18:13:18 -03:00
Henrique Dias
efff026a79
handle easier propagation
2015-08-25 16:52:28 -03:00
Henrique Dias
f910e63e90
fix null case
2015-08-24 12:22:17 -03:00
Henrique Dias
fcbe4aa925
fix inS3 propagation
2015-08-24 12:19:19 -03:00
Henrique Dias
1ccba422c8
remove unused function
2015-08-24 10:55:27 -03:00
Henrique Dias
98ce03b2f2
replace docs collection to DocstoreHandler
2015-08-24 10:38:31 -03:00
Henrique Dias
04ec45529f
restore updates from S3 when exists
...
fix: avoid rearchiving
2015-08-18 17:11:19 -03:00
Henrique Dias
20c3e15f93
fix bulk insert limit
2015-08-14 19:58:38 -03:00
Henrique Dias
26c8048729
change mongo stream method (still have a bug in bulk insert limit)
2015-08-14 19:19:54 -03:00
Henrique Dias
fd4afb3574
Archive changes, care about: version, expiresAt and Lock
2015-08-14 15:07:16 -03:00
Henrique Dias
6bc9c9010a
handle auto unarchive track changes
2015-08-09 19:52:32 -03:00
Henrique Dias
3bc5380468
handle inS3 flag
2015-08-09 17:50:15 -03:00
Henrique Dias
daa42bcea0
change s3Stream lib
2015-08-09 15:47:47 -03:00
Henrique Dias
bca48ac117
add unarchive doc track from s3
2015-08-06 17:09:36 -03:00
Henrique Dias
438c4f4d0c
using mongoexport for s3 archive
2015-08-06 15:46:44 -03:00
Henrique Dias
028fe2fa03
archive docChanges list to s3
2015-08-06 11:11:43 -03:00
Henrique Dias
ae047ecf76
init s3 feature
2015-08-06 10:00:09 -03:00
Brian Gough
775f5ebbe1
add configurable limit, delay and timeout to /pack via query string
2015-06-05 13:38:47 +01:00
Brian Gough
23d2518ebb
added a /pack endpoint to fork a pack worker
2015-06-04 16:36:56 +01:00