Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core: multi SCM support #7420

Draft
wants to merge 8 commits into
base: main
Choose a base branch
from
Draft

core: multi SCM support #7420

wants to merge 8 commits into from

Conversation

grouville
Copy link
Member

@grouville grouville commented May 20, 2024

First working draft of multi SCM support.

Currently only works on public VCS repo, as well as public vanity URL relying on the go-get vanity URL logic from Go.

TODO

  • More detailed history of https://github.com/grouville/go-vcs when merging
  • Improve error messages: on init, install, call, functions -> add ++++ integration tests
  • Add vanity URL test by changing dagger.io/dagger redirect (side PR) -> https://github.com/dagger/dagger.io/pull/3730
  • Add vcsToBuildKitRef logic in core/schema/git` to handle same ref format for dirs (follow-up PR), and remove test from this one
  • Extend moduleSource type to handle SSH_AUTH_SOCK (ongoing) for private module handling (potentially follow-up PR)

@grouville grouville force-pushed the multi-scm branch 6 times, most recently from f8a7116 to 1dca458 Compare May 28, 2024 09:39
@grouville grouville force-pushed the multi-scm branch 4 times, most recently from a9aa139 to a176391 Compare May 28, 2024 14:51
grouville added 8 commits May 28, 2024 17:10
Extend the originToPath implementation to handle Git URLs without scheme and always trim the .git on the path.

This is a no-op for current implementation, but it allows us to extract the root of a Git URL in a Go compatible format

Signed-off-by: grouville <guillaume@dagger.io>
This commit adds a forked / improved version of Google's RepoRootForImportPath logic. Most of this code is originally part of the go/internal section of Google's codebase: https://github.com/golang/go/blob/16d3040a84be821d801b75bd1a3d8ab4cc89ee36/src/cmd/go/internal/vcs/vcs.go#L1281.

However, Justin found a version of this library exposed by Google without all the internal dependencies. It is now deprecated as unmaintained, but the core logic: the dynamic resolving coupled with the regexes are still relevant and enough for our use-case: https://pkg.go.dev/golang.org/x/tools/go/vcs#RepoRootForImportPath.

We had to fork the library in order to upgrade the go version used, bump the dependencies and more importantly fix a few unported changes:
1. We remove unsupported VCS, as BuildKit only support Git based VCS
2. We remove the deprecated mercurial check from bitbucket vcs options (golang/go@5b1b80b) leading to an unability to parse the root of a repo from a bitbucket private ref

However, in the context of Dagger, the important part is to be able to split the root of the repo from its subdir. I think it is ok as it will fail at the gitdns.Git level (to be verified)

3. Update the regexes and the tests for jazz, apache, git.sr.ht SCM (following current official package's logic)
4. Remove the deprecation notices, as this package does not aim to be 100% compliant with upstream but just handle the split between the root of a repo and the subpath from its URL + rely on io.ReadAll instead of ioutil

Signed-off-by: grouville <guillaume@dagger.io>
Correctly isolate subpath on all VCS:
- when querying the vcs with go-get, Go's repoRootForImportPath sometimes return a ref with `.git` inside

Signed-off-by: grouville <guillaume@dagger.io>
Go's repoRootForImportPath was not following our golintci conventions

This commit fixes them or ignores them when it does not impact the pertinence not the security:
- exec.Command() is hidden with nolint:gosec as
- naked return are hidden with nolint:nakedret, as this is a pattern used on the official library from Google. Instead of refactoring the function, ignore the error
- use %w in fmt.Errorf to wrap errors
- use ReplaceAll instead of Replace(..., -1)

Signed-off-by: grouville <guillaume@dagger.io>
Previously, local / git refs were estimated using such heuristic:
- no version specified AND ref starting with "github.com"

However, as we are now extending the logic to other CI providers, this
needs to be updated.

The current logic is extended to stat if the dir is present ; or if the
ref starts with "/", ".", "./".

This shall enable handling subdirectories without local path prefixes
nor absolute path.

However, in order to keep the current logic in place, make sure that the
sourceRootRelPath starts with "./", in the resolveFromCaller func. This enables
 us to keep the rel path correct whithout changing the logic

Signed-off-by: Guillaume de Rouville <guillaume@dagger.io>
parseRefString continues to fallbacks on local remote when not able to check if
it is a git remote. This allows us to avoid changing a lot of the core
design decisions such as: do we create a directory on init when the
source points to an inexistant dir ?

Signed-off-by: Guillaume de Rouville <guillaume@dagger.io>
Current ref library taken from Google does not allow path traversal in the refs extracted with regexes.

Due to our internal handling of those edge cases, it is easier to  extend the matching patterns to handle it and not error.

Signed-off-by: Guillaume de Rouville <guillaume@dagger.io>
…efString

- Fixes a typo on the testOriginPath function leading to a CI break
- Better comment on the new parseRefString logic

Signed-off-by: Guillaume de Rouville <guillaume@dagger.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant