Merge pull request #5468 from overleaf/bg-fix-feature-override-script

fix feature override script

GitOrigin-RevId: f123843e1ed40b90f55a32d687a2ade4a5b44a05
Brian Gough 2021-10-18 11:34:52 +01:00 committed by Copybot
parent 89dfcaf528
commit acc9a1ace1
5 changed files with 296 additions and 3 deletions

3
services/web/copybara/.gitignore vendored Normal file

@ -0,0 +1,3 @@
.ssh
.ssh/
.cache/

@ -0,0 +1,166 @@
# Copybara Overleaf sync
[Copybara](https://github.com/google/copybara) is a utility for syncing one
git repository with another, while performing modifications, such as removing
directories. We use this to keep a public OSS mirror of our web repo, but with
the proprietary modules directory removed.
We also use copybara to import Pull Requests from the public mirror to our private
repo, while preserving attribution for the original contributor.
## Running a sync locally
You will need a copy of the `sharelatex/copybara` container, which can be pulled
in, or built from the [Copybara project](
https://github.com/google/copybara#getting-started-using-copybara):
```bash
> git clone git@github.com:google/copybara.git
> cd copybara
> docker build --rm -t sharelatex/copybara .
```
There is a `docker-compose.yml` file in this directory which configures
everything. We mount out the copybara cache directory so we don't need to do a
full git clone each time. Check the file for further instructions on running
a sync from your local machine or initialising a sync.
The `.ssh` directory in this directory should contain a private key with GitHub
access and have github.com pre-authorized in `known_hosts`. Your personal `.ssh` will
do the job, and changes in the target repo will maintain the original author.
## Initializing a sync
In order to initialize a sync with a new repository we'll instruct copybara to
start synchronizing from the initial commit:
```yaml
copybara:
...
environment:
...
COPYBARA_OPTIONS: "--init-history"
```
**Important**: If the repository is not empty the synchronization will start by
removing all the existing content.
## Fixing a bad state
By default, Copybara expects to find some metadata in the destination repo
commits which it wrote on the last run. This tells it where to pick up syncing
any new changes in the source repo. If things get into a bad state, you can provide
it with an explicit reference to a commit in the source repo to start replaying
commits from. Add the following to the `docker-compose.yml` config:
```yaml
copybara:
...
environment:
...
COPYBARA_OPTIONS: "--last-rev=COMMIT_SHA_FROM_SOURCE_REPO"
```
If the destination repo gets out of sync in some way, reset its master branch
to a point when things were in a good state, and then do a re-sync as above,
but with the last-rev set to the corresponding good commit in the source repo.
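
To find the source commit that corresponds to a known-good destination commit, you can read the metadata back out of the destination commit message. The following is a minimal sketch, assuming Copybara's default label name `GitOrigin-RevId` is in use; the helper name `origin_rev_from_message` and the `GOOD_COMMIT` placeholder are hypothetical and not part of this repo.

```bash
# Hypothetical helper: read a commit message on stdin and print the
# source-repo SHA recorded on its GitOrigin-RevId line, if there is one.
# Assumes the default Copybara label name; prints nothing if it is absent.
origin_rev_from_message() {
  sed -n 's/^GitOrigin-RevId: *//p' | tail -n 1
}

# Usage sketch, run inside a clone of the destination repo
# (GOOD_COMMIT is a placeholder for the last known-good destination commit):
#   git log -1 --format=%B GOOD_COMMIT | origin_rev_from_message
```

The printed SHA is a candidate value for `--last-rev`; check it against the source repo history before re-syncing.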
## Running a sync in CI
The same `sharelatex/copybara` image and copybara config files are used by
Jenkins to perform the sync at the end of a successful CI build of master. See
the `Jenkinsfile` in the top level directory for this.
## Importing a PR from public to private
We can import a public PR using the `importPullRequest` workflow.
### Setup
#### 1: Get a GitHub API key
You need a GitHub API key to manipulate pull requests via the GitHub API.
- Open https://github.com/settings/tokens
- Create a new token with the `repo` scope turned on
- Open `~/.git-credentials` and add a line like this: `https://user%40example.com:<TOKEN>@github.com`
- Note that the `@` in the email address is encoded as `%40`
This `~/.git-credentials` file will be mounted into the copybara container by
docker-compose.
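
As an illustration of the encoding rule above, here is a minimal sketch that builds the credentials line from an email address and a token. The function name `git_credentials_line` is hypothetical, and it only handles the `@` character described above; other special characters in a username or token would need their own percent-encoding.

```bash
# Hypothetical helper: print a ~/.git-credentials line for github.com.
# $1 is the GitHub username or email, $2 is the API token.
git_credentials_line() {
  user=$1
  token=$2
  # Encode "@" as "%40" so it is not confused with the "@" before the host.
  encoded_user=$(printf '%s' "$user" | sed 's/@/%40/g')
  printf 'https://%s:%s@github.com\n' "$encoded_user" "$token"
}

# Usage sketch (prints the line; append it to ~/.git-credentials yourself):
#   git_credentials_line 'user@example.com' '<TOKEN>'
```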
### Running the import job
Run copybara with `docker-compose run`:
```
docker-compose run \
-e COPYBARA_WORKFLOW=importPullRequest \
-e COPYBARA_SOURCEREF=<PR_NUMBER> \
-e COPYBARA_OPTIONS='--github-destination-pr-branch <BRANCH_NAME>' \
copybara copybara
```
Change `<PR_NUMBER>` to the number of the public pull request.
Change `<BRANCH_NAME>` to a suitable name for the new private branch, for example `import-pr-xyz`.
(Note that there is no `=` between the flag and the branch name.)
This will create a new PR on the private repo with the content from the public PR.
When this private PR is eventually merged and synced back to the public repo, the
original public PR will close automatically, and the changes will be attributed to
the original committer.
### Merging the PR in the private repo
In order to maintain correct attribution it's **important to squash the changes**, otherwise attribution might not be reflected properly.
### Attaching the PR to a particular parent commit
The copybara process will usually figure out the appropriate place to begin the new PR from, but if you want
to specify the parent commit explicitly, you can set the following flag in `COPYBARA_OPTIONS`:
```
--change_request_parent=<COMMIT_ID>
```
### Errors
There are a few things that can go wrong with this invocation:
#### Wrong owner or permissions on `.ssh/config`
Use `docker-compose run copybara bash -c 'chown -R root:root /root/.ssh'` to fix the ownership
of the mounted ssh key.
You may then need to reclaim ownership on the host later by running `sudo chown -R $USER:$USER ~/.ssh`.
#### Can't enter ssh passphrase
Copybara can't handle ssh keys with passphrases; it just hangs at the prompt to enter the passphrase.
We can solve this by mounting our `ssh-agent` socket into the container.
Add the following options to the `docker-compose` invocation:
- `--volume $SSH_AUTH_SOCK:/ssh-agent`
- `-e SSH_AUTH_SOCK=/ssh-agent`
For example:
```
docker-compose run \
--volume $SSH_AUTH_SOCK:/ssh-agent \
-e SSH_AUTH_SOCK=/ssh-agent \
-e COPYBARA_WORKFLOW=importPullRequest \
-e COPYBARA_SOURCEREF=<PR_NUMBER> \
-e COPYBARA_OPTIONS='--github-destination-pr-branch <BRANCH_NAME>' \
copybara copybara
```
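
Mounting the socket only helps if an agent is actually running on the host. The following is a minimal pre-flight sketch you could run before the `docker-compose` command; the function name `agent_socket_ok` is hypothetical, and it only checks that `SSH_AUTH_SOCK` points at an existing socket, not that the agent holds the right key.

```bash
# Hypothetical pre-flight check: succeed only if the given path is
# non-empty and refers to an existing Unix socket.
agent_socket_ok() {
  [ -n "${1:-}" ] && [ -S "$1" ]
}

if agent_socket_ok "${SSH_AUTH_SOCK:-}"; then
  echo "ssh-agent socket found at $SSH_AUTH_SOCK"
else
  echo "SSH_AUTH_SOCK is unset or not a socket; start ssh-agent and add your key first" >&2
fi
```

If the socket is present, `ssh-add -l` lists the keys the agent currently holds.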

@ -0,0 +1,91 @@
privateRepo = "git@github.com:overleaf/web-internal.git"
publicRepo = "git@github.com:overleaf/web.git"
everythingExceptPrivateFiles = glob(
["**"],
exclude = [
"modules/**",
"app/views/external/**",
"public/brand/**",
"copybara/**",
"config/settings.webpack.js",
"config/settings.overrides.saas.js",
"config/settings.overrides.server-pro.js",
"scripts/fix_account_linkage/**",
"scripts/translations/uploadNonEnglish.js",
"scripts/user-export/**",
".ssh/**",
".github/dependabot.yml",
"cloudbuild.yaml",
".sentryclirc.enc"
]
) + glob([
"modules/launchpad/**",
"modules/user-activate/**",
"modules/server-ce-scripts/**",
"modules/modules-*.js"
])
core.workflow(
name = "default",
origin = git.origin(
url = privateRepo,
ref = "master"
),
destination = git.destination(
url = publicRepo,
fetch = "master",
push = "master"
),
# Exclude proprietary code and non-local build scripts
origin_files = everythingExceptPrivateFiles,
mode="ITERATIVE",
migrate_noop_changes=True,
authoring = authoring.pass_thru("Copybot <copybot@overleaf.com>"),
transformations = [
metadata.restore_author(label='ORIGINAL_AUTHOR', search_all_changes=True)
]
)
# ---- Import a PR from public repo to private ----
titleFormatString = "[Imported] ${GITHUB_PR_TITLE}, (#${GITHUB_PR_NUMBER})"
bodyFormatString = """
Imported from public PR [${GITHUB_PR_NUMBER}](https://github.com/overleaf/web/pull/${GITHUB_PR_NUMBER}).
Original PR opened by ${GITHUB_PR_USER}
----
${GITHUB_PR_BODY}
"""
core.workflow(
name = "importPullRequest",
origin = git.github_pr_origin(
url = publicRepo
),
destination = git.github_pr_destination(
url = privateRepo,
integrates = [],
pr_branch = "import_public_pr_${CONTEXT_REFERENCE}",
title = titleFormatString,
body = bodyFormatString
),
mode = "CHANGE_REQUEST",
set_rev_id = False,
origin_files = glob(["**"]),
# Same as origin_files in the default workflow,
# without this here a PR will delete the excluded files from
# the private repo.
destination_files = everythingExceptPrivateFiles,
authoring = authoring.pass_thru("Overleaf CopyBot <copybot@overleaf.com>"),
transformations = [
metadata.save_author(),
metadata.expose_label("COPYBARA_INTEGRATE_REVIEW"),
metadata.expose_label("GITHUB_PR_NUMBER", new_name ="Closes", separator=" #"),
],
)

@ -0,0 +1,28 @@
version: '2'
services:
copybara:
image: sharelatex/copybara:2019-08.01
volumes:
# Mount the host ssh folder
- ~/.ssh/id_rsa:/root/.ssh/id_rsa
- ~/.ssh/id_rsa.pub:/root/.ssh/id_rsa.pub
- ~/.ssh/known_hosts:/root/.ssh/known_hosts
# Mount this directory to the place the base image expects
- .:/usr/src/app
# Mount the cache
- ./.cache/:/root/copybara/cache/
# Mount the hosts git configuration
- $HOME/.gitconfig:/root/.gitconfig
# Mount the hosts git credentials, required for migrating PRs via the github API
- $HOME/.git-credentials:/root/.git-credentials
environment:
COPYBARA_CONFIG: "./community/copy.bara.sky"
# Uncomment this to force copybara to start syncing from a certain commit
# COPYBARA_OPTIONS: "--last-rev=67edeed2c2d8c1d478c9a65d19020a301174cc8e"
# Uncomment this to force copybara to bootstrap a sync in a new repository
# COPYBARA_OPTIONS: "--init-history"
command: copybara

@ -56,6 +56,7 @@ function _validateUserIdList(userIds) {
}
async function _handleUser(userId) {
console.log('updating user', userId)
const user = await UserGetter.promises.getUser(userId, {
features: 1,
featuresOverrides: 1,
@ -99,7 +100,6 @@ async function _handleUser(userId) {
)
}
}
if (!COMMIT) {
// not saving features; nothing else to do
return
@ -167,10 +167,15 @@ async function processUsers(userIds) {
console.log(`---Starting to process ${userIds.length} users---`)
const limit = pLimit(CONCURRENCY)
- await Promise.all(
+ const results = await Promise.allSettled(
userIds.map(userId => limit(() => _handleUser(ObjectId(userId))))
)
+ results.forEach((result, idx) => {
+ if (result.status !== 'fulfilled') {
+ console.log(userIds[idx], 'failed', result.reason)
+ processLogger.failed.push(userIds[idx])
+ }
+ })
processLogger.printSummary()
process.exit()
}