diff --git a/services/web/copybara/.gitignore b/services/web/copybara/.gitignore
new file mode 100644
index 0000000000..20d4953744
--- /dev/null
+++ b/services/web/copybara/.gitignore
@@ -0,0 +1,3 @@
+.ssh
+.ssh/
+.cache/
diff --git a/services/web/copybara/README.md b/services/web/copybara/README.md
new file mode 100644
index 0000000000..3fff1a3e67
--- /dev/null
+++ b/services/web/copybara/README.md
@@ -0,0 +1,166 @@
+# Copybara Overleaf sync
+
+[Copybara](https://github.com/google/copybara) is a utility for syncing one
+git repository with another, while performing modifications, such as removing
+directories. We use this to keep a public OSS mirror of our web repo, but with
+the proprietary modules directory removed.
+
+We also use copybara to import Pull Requests from the public mirror to our private
+repo, while preserving attribution for the original contributor.
+
+
+## Running a sync locally
+
+You will need a copy of the `sharelatex/copybara` container, which can be pulled
+in, or built from the [Copybara project](
+  https://github.com/google/copybara#getting-started-using-copybara):
+
+```bash
+> git clone git@github.com:google/copybara.git
+> cd copybara
+> docker build --rm -t sharelatex/copybara .
+```
+
+There is a `docker-compose.yml` file in this directory which configures
+everything. We mount the copybara cache directory so we don't need to do a
+full git clone each time. Check the file for further instructions on running
+a sync from your local machine or initializing a sync.
+
+The `.ssh` directory in this directory should have a private key with GitHub
+access placed into it and have github.com pre-authorized. Your personal `.ssh` will
+do the job, and changes in the target repo will maintain the original author.
+
+## Initializing a sync
+
+In order to initialize a sync with a new repository we'll instruct copybara to
+start synchronizing from the initial commit:
+
+```yaml
+  copybara:
+    ...
+    environment:
+      ...
+      COPYBARA_OPTIONS: "--init-history"
+```
+
+**Important**: If the repository is not empty the synchronization will start by
+removing all the existing content.
+
+## Fixing a bad state
+
+By default, Copybara expects to find some metadata in the destination repo
+commits which it wrote on the last run. This tells it where to pick up syncing
+any new changes in the source repo. If things get in a bad state, you can provide
+it with an explicit reference to a commit in the source repo to start replaying
+commits from. Add the following to the `docker-compose.yml` config:
+
+```yaml
+  copybara:
+    ...
+    environment:
+      ...
+      COPYBARA_OPTIONS: "--last-rev=COMMIT_SHA_FROM_SOURCE_REPO"
+```
+
+If the destination repo gets out of sync in some way, reset its master branch
+to a point when things were in a good state, and then do a re-sync as above,
+but with `--last-rev` set to the corresponding good commit in the source repo.
+
+## Running a sync in CI
+
+The same `sharelatex/copybara` image and copybara config files are used by
+Jenkins to perform the sync at the end of a successful CI build of master. See
+the `Jenkinsfile` in the top-level directory for this.
+
+
+## Importing a PR from public to private
+
+We can import a public PR using the `importPullRequest` workflow.
+
+
+### Setup
+
+#### 1: Get a GitHub API key
+
+You need a GitHub API key to manipulate pull requests via the GitHub API.
+
+- Open https://github.com/settings/tokens
+- Create a new token with the `repo` scope turned on
+- Open `~/.git-credentials` and add a line like this: `https://user%40example.com:@github.com`
+  - Note that the `@` in the email address is encoded as `%40`
+
+This `~/.git-credentials` file will be mounted into the copybara container by
+docker-compose.
+
+
+### Running the import job
+
+Run copybara with `docker-compose run`:
+
+```
+docker-compose run \
+  -e COPYBARA_WORKFLOW=importPullRequest \
+  -e COPYBARA_SOURCEREF= \
+  -e COPYBARA_OPTIONS='--github-destination-pr-branch ' \
+  copybara copybara
+```
+
+Set `COPYBARA_SOURCEREF` to the number of the public pull request.
+
+Set the value of `--github-destination-pr-branch` to a suitable name for the new private branch, for example 'import-pr-xyz'.
+(Note: there's no `=` between the flag and the branch name.)
+
+This will create a new PR on the private repo with the content from the public PR.
+When this private PR is eventually merged and synced back to the public repo, the
+original public PR will close automatically, and the changes will be attributed to
+the original committer.
+
+### Merging the PR in the private repo
+
+In order to maintain correct attribution it's **important to squash the changes**; otherwise attribution might not be reflected properly.
+
+
+### Attaching the PR to a particular parent commit
+
+The copybara process will usually figure out the appropriate place to begin the new PR from, but if you want
+to specify the parent commit explicitly, you can set the following flag in `COPYBARA_OPTIONS`:
+
+```
+  --change_request_parent=
+```
+
+
+### Errors
+
+There are a few things that can go wrong with this invocation:
+
+
+#### Wrong owner or permissions on `.ssh/config`
+
+Use `docker-compose run copybara bash -c 'chown -R root:root /root/.ssh'` to fix the ownership
+of the mounted ssh key.
+
+You may then need to reclaim ownership on the host by running `sudo chown -R $USER:$USER ~/.ssh`
+
+
+#### Can't enter ssh passphrase
+
+Copybara can't handle ssh keys with passphrases; it just hangs at the prompt to enter the passphrase.
+We can solve this by mounting our `ssh-agent` socket into the container.
+
+Add the following options to the `docker-compose` invocation:
+
+- `--volume $SSH_AUTH_SOCK:/ssh-agent`
+- `-e SSH_AUTH_SOCK=/ssh-agent`
+
+For example:
+
+```
+docker-compose run \
+  --volume $SSH_AUTH_SOCK:/ssh-agent \
+  -e SSH_AUTH_SOCK=/ssh-agent \
+  -e COPYBARA_WORKFLOW=importPullRequest \
+  -e COPYBARA_SOURCEREF= \
+  -e COPYBARA_OPTIONS='--github-destination-pr-branch ' \
+  copybara copybara
+```
diff --git a/services/web/copybara/community/copy.bara.sky b/services/web/copybara/community/copy.bara.sky
new file mode 100644
index 0000000000..fae874dc58
--- /dev/null
+++ b/services/web/copybara/community/copy.bara.sky
@@ -0,0 +1,91 @@
+privateRepo = "git@github.com:overleaf/web-internal.git"
+publicRepo = "git@github.com:overleaf/web.git"
+
+everythingExceptPrivateFiles = glob(
+  ["**"],
+  exclude = [
+    "modules/**",
+    "app/views/external/**",
+    "public/brand/**",
+    "copybara/**",
+    "config/settings.webpack.js",
+    "config/settings.overrides.saas.js",
+    "config/settings.overrides.server-pro.js",
+    "scripts/fix_account_linkage/**",
+    "scripts/translations/uploadNonEnglish.js",
+    "scripts/user-export/**",
+    ".ssh/**",
+    ".github/dependabot.yml",
+    "cloudbuild.yaml",
+    ".sentryclirc.enc"
+  ]
+) + glob([
+  "modules/launchpad/**",
+  "modules/user-activate/**",
+  "modules/server-ce-scripts/**",
+  "modules/modules-*.js"
+])
+
+core.workflow(
+  name = "default",
+  origin = git.origin(
+    url = privateRepo,
+    ref = "master"
+  ),
+  destination = git.destination(
+    url = publicRepo,
+    fetch = "master",
+    push = "master"
+  ),
+  # Exclude proprietary code and non-local build scripts
+  origin_files = everythingExceptPrivateFiles,
+  mode="ITERATIVE",
+  migrate_noop_changes=True,
+  authoring = authoring.pass_thru("Copybot "),
+  transformations = [
+    metadata.restore_author(label='ORIGINAL_AUTHOR', search_all_changes=True)
+  ]
+)
+
+
+# ---- Import a PR from public repo to private ----
+
+titleFormatString = "[Imported] ${GITHUB_PR_TITLE}, (#${GITHUB_PR_NUMBER})"
+bodyFormatString = """
+Imported from public PR [${GITHUB_PR_NUMBER}](https://github.com/overleaf/web/pull/${GITHUB_PR_NUMBER}).
+Original PR opened by ${GITHUB_PR_USER}
+
+----
+
+${GITHUB_PR_BODY}
+"""
+
+core.workflow(
+  name = "importPullRequest",
+  origin = git.github_pr_origin(
+    url = publicRepo
+  ),
+  destination = git.github_pr_destination(
+    url = privateRepo,
+    integrates = [],
+    pr_branch = "import_public_pr_${CONTEXT_REFERENCE}",
+    title = titleFormatString,
+    body = bodyFormatString
+  ),
+  mode = "CHANGE_REQUEST",
+  set_rev_id = False,
+
+  origin_files = glob(["**"]),
+
+  # Same as origin_files in the default workflow;
+  # without this, a PR will delete the excluded files from
+  # the private repo.
+  destination_files = everythingExceptPrivateFiles,
+
+  authoring = authoring.pass_thru("Overleaf CopyBot "),
+  transformations = [
+    metadata.save_author(),
+    metadata.expose_label("COPYBARA_INTEGRATE_REVIEW"),
+    metadata.expose_label("GITHUB_PR_NUMBER", new_name ="Closes", separator=" #"),
+  ],
+)
diff --git a/services/web/copybara/docker-compose.yml b/services/web/copybara/docker-compose.yml
new file mode 100644
index 0000000000..1a8568a6f6
--- /dev/null
+++ b/services/web/copybara/docker-compose.yml
@@ -0,0 +1,28 @@
+version: '2'
+
+services:
+  copybara:
+    image: sharelatex/copybara:2019-08.01
+    volumes:
+      # Mount the host ssh folder
+      - ~/.ssh/id_rsa:/root/.ssh/id_rsa
+      - ~/.ssh/id_rsa.pub:/root/.ssh/id_rsa.pub
+      - ~/.ssh/known_hosts:/root/.ssh/known_hosts
+      # Mount this directory to the place the base image expects
+      - .:/usr/src/app
+      # Mount the cache
+      - ./.cache/:/root/copybara/cache/
+      # Mount the host's git configuration
+      - $HOME/.gitconfig:/root/.gitconfig
+      # Mount the host's git credentials, required for migrating PRs via the GitHub API
+      - $HOME/.git-credentials:/root/.git-credentials
+    environment:
+      COPYBARA_CONFIG: "./community/copy.bara.sky"
+
+      # Uncomment this to force copybara to start syncing from a certain commit
+      # COPYBARA_OPTIONS: "--last-rev=67edeed2c2d8c1d478c9a65d19020a301174cc8e"
+
+      # Uncomment this to force copybara to bootstrap a sync in a new repository
+      # COPYBARA_OPTIONS: "--init-history"
+
+    command: copybara
diff --git a/services/web/scripts/add_feature_override.js b/services/web/scripts/add_feature_override.js
index a14b852a1b..a2fb9d5c95 100644
--- a/services/web/scripts/add_feature_override.js
+++ b/services/web/scripts/add_feature_override.js
@@ -56,6 +56,7 @@ function _validateUserIdList(userIds) {
 }
 
 async function _handleUser(userId) {
+  console.log('updating user', userId)
   const user = await UserGetter.promises.getUser(userId, {
     features: 1,
     featuresOverrides: 1,
@@ -99,7 +100,6 @@ async function _handleUser(userId) {
       )
     }
   }
-
   if (!COMMIT) {
     // not saving features; nothing else to do
     return
@@ -167,10 +167,15 @@ async function processUsers(userIds) {
   console.log(`---Starting to process ${userIds.length} users---`)
 
   const limit = pLimit(CONCURRENCY)
-  await Promise.all(
+  const results = await Promise.allSettled(
     userIds.map(userId => limit(() => _handleUser(ObjectId(userId))))
   )
-
+  results.forEach((result, idx) => {
+    if (result.status !== 'fulfilled') {
+      console.log(userIds[idx], 'failed', result.reason)
+      processLogger.failed.push(userIds[idx])
+    }
+  })
   processLogger.printSummary()
   process.exit()
 }
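
The last hunk of `add_feature_override.js` replaces `Promise.all` with `Promise.allSettled`, so a single failing user no longer aborts the whole batch and failed ids can be collected afterwards. The following is a minimal standalone sketch of that pattern, not the script itself. `createLimiter` is a hypothetical hand-rolled stand-in for `pLimit`, `handleUser` is a hypothetical handler that fails for ids starting with `bad`, and the returned `failed` array plays the role of `processLogger.failed`.

```javascript
// Hypothetical stand-in for pLimit: runs at most `concurrency` tasks at once.
function createLimiter(concurrency) {
  let active = 0
  const queue = []
  const next = () => {
    if (active >= concurrency || queue.length === 0) return
    active++
    const { task, resolve, reject } = queue.shift()
    // Promise.resolve().then(task) also captures synchronous throws.
    Promise.resolve()
      .then(task)
      .then(resolve, reject)
      .finally(() => {
        active--
        next()
      })
  }
  return task =>
    new Promise((resolve, reject) => {
      queue.push({ task, resolve, reject })
      next()
    })
}

// Hypothetical per-user handler: rejects for ids starting with "bad".
async function handleUser(userId) {
  if (userId.startsWith('bad')) {
    throw new Error(`cannot update ${userId}`)
  }
  return userId
}

// Processes every id and returns the ids whose handler rejected.
async function processUsers(userIds, concurrency = 2) {
  const limit = createLimiter(concurrency)
  const results = await Promise.allSettled(
    userIds.map(userId => limit(() => handleUser(userId)))
  )
  const failed = []
  results.forEach((result, idx) => {
    // allSettled keeps input order, so results[idx] belongs to userIds[idx].
    if (result.status !== 'fulfilled') {
      failed.push(userIds[idx])
    }
  })
  return failed
}
```

With `Promise.all`, the first rejection rejects the combined promise and the caller never sees the other outcomes. `Promise.allSettled` always resolves with one result per input, in input order, which is why indexing `userIds[idx]` from the results loop is safe.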