@gquere
Created July 15, 2020 09:26
#!/usr/bin/env python3
import argparse
import json
import os
import re
from getpass import getpass
from urllib.parse import urlparse

import requests
import urllib3


# SUPPRESS WARNINGS ############################################################
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)


# CLONE ########################################################################
def do_clone_ssh(repo_list):
    for repo in repo_list:
        parsed_git_url = urlparse(repo)
        directory = OUTPUT_DIR + re.sub(r'\.git$', '', parsed_git_url.path)
        os.system('git clone \'{}\' \'{}\''.format(repo, directory))


def do_clone_http(repo_list):
    for repo in repo_list:
        parsed_http_url = urlparse(repo)
        directory = OUTPUT_DIR + re.sub(r'\.git$', '', repo.split('/scm/')[1])
        if args.user and args.password:
            # urlparse results are immutable and expose no credential setters,
            # so rebuild the netloc by hand
            parsed_http_url = parsed_http_url._replace(netloc=args.user + ':' + args.password + '@' + parsed_http_url.netloc)
        os.system('git clone \'{}\' \'{}\''.format(parsed_http_url.geturl(), directory))


def do_clone(repo_list, method):
    if method == 'HTTP':
        do_clone_http(repo_list)
    if method == 'SSH':
        do_clone_ssh(repo_list)


# API ##########################################################################
def api_get_repo_list(session, url, method):
    repo_list = []

    r = session.get(url + '/rest/api/1.0/projects?limit=10000')
    projects = json.loads(r.text)
    for project in projects['values']:
        r = session.get(url + '/rest/api/1.0/projects/' + project['key'] + '/repos?limit=10000')
        repos = json.loads(r.text)
        for repo in repos['values']:
            clone_options = repo['links']['clone']
            for clone_option in clone_options:
                if clone_option['name'] == method.lower():
                    repo_list.append(clone_option['href'])

    return repo_list


# MAIN #########################################################################
parser = argparse.ArgumentParser()
parser.add_argument('url', type=str)
parser.add_argument('-u', '--user', type=str)
parser.add_argument('-p', '--password', type=str)
parser.add_argument('-o', '--output-dir', type=str, required=True)
parser.add_argument('-m', '--method', type=str, default='HTTP', help='Cloning method: HTTP or SSH (default: HTTP)')
args = parser.parse_args()

s = requests.Session()
s.verify = False
if args.user:
    if not args.password:
        args.password = getpass('password: ')
    s.auth = (args.user, args.password)
    args.password = args.password.replace('@', '%40')    # escape for the HTTP clone URL later on

OUTPUT_DIR = args.output_dir + '/'

repo_list = api_get_repo_list(s, args.url.rstrip('/'), args.method)
print('[+] Cloning {} repositories...'.format(len(repo_list)))
do_clone(repo_list, args.method)
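One fragile spot worth noting: the script escapes only `@` in the password (`replace('@', '%40')`) before embedding credentials in the HTTP clone URL, so other reserved characters such as `:` or `/` would still break the URL. A minimal sketch of a more robust approach using the standard library's `urllib.parse.quote` (the helper name is illustrative, not part of the original script):

```python
from urllib.parse import quote

def escape_credentials(user, password):
    """Percent-encode credentials for safe embedding in an HTTP clone URL.

    quote() with safe='' encodes every reserved character, not just '@'.
    """
    return quote(user, safe=''), quote(password, safe='')

user, password = escape_credentials('alice', 'p@ss:w/rd')
print(user)      # alice
print(password)  # p%40ss%3Aw%2Frd
```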
@jensim
jensim commented Dec 5, 2021

Cool stuff 😀 I too wanted a clone-all feature for Bitbucket, so I made the bitbucket_server_cli, mostly to learn Rust at that point..

I think it's great that you're writing your own; there's a ton to learn from integrating with APIs 👍🏻
One thing that might or might not be a problem for someone using this is pagination. I had real-world cases where I missed repos because the server's maximum limit was 100 (not 10000, as your code requests): I had set my request's limit parameter to 9999 but only got 100 results back.

Hope this script does all you need it to do, and that you learn by writing it 🌞
Best regards
Jensim
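The pagination concern above can be handled by walking Bitbucket's paged responses (`isLastPage`, `nextPageStart`) instead of relying on one oversized `limit`. A minimal sketch of the loop, assuming the documented paged-response shape; `fake_fetch` stands in for `session.get` against a real server and caps pages at two items to mimic a server-side limit:

```python
def get_all_values(fetch_page, limit=100):
    """Collect 'values' from every page of a Bitbucket-style paged API.

    fetch_page(start, limit) must return a dict shaped like Bitbucket's
    paged responses: {'values': [...], 'isLastPage': bool, 'nextPageStart': int}.
    """
    values = []
    start = 0
    while True:
        page = fetch_page(start, limit)
        values.extend(page['values'])
        if page.get('isLastPage', True):
            return values
        start = page['nextPageStart']

# Fake server that returns at most 2 items per page, whatever limit is asked for
DATA = ['repo-{}'.format(i) for i in range(5)]

def fake_fetch(start, limit):
    chunk = DATA[start:start + 2]
    is_last = start + 2 >= len(DATA)
    return {'values': chunk, 'isLastPage': is_last,
            'nextPageStart': None if is_last else start + 2}

print(get_all_values(fake_fetch))  # all five repos despite the page cap
```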

@gquere
Author

gquere commented Dec 6, 2021

Hello,
Thanks for the feedback. Though are you sure it's a setting for Bitbucket, and not for GitLab (which indeed has a default limit of 100, see https://gist.github.com/gquere/ec75dfeefe725a87aada0a09d30962b6)?

@jensim
jensim commented Dec 6, 2021

The limit parameter indicates how many results to return per page. APIs default to returning 25 if the limit is left unspecified.
https://docs.atlassian.com/bitbucket-server/rest/7.18.1/bitbucket-rest.html

Can't see any mention of an override here:
https://docs.atlassian.com/bitbucket-server/rest/7.18.1/bitbucket-rest.html#idp151
But I haven't checked the server config, so it might be overridable from the run configuration..?

Might also be that the APIs are more permissive than the docs 😄
But I seem to remember that the Bitbucket Server APIs return the actual limit used in the response body. What does that indicate?
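Since the paged response body echoes the limit the server actually applied, a script could warn when that differs from what was requested, surfacing a silent cap. A hedged sketch (the helper name and the sample body are illustrative; the `limit`/`size`/`isLastPage` fields follow Bitbucket's documented paged-response shape):

```python
import json

def check_effective_limit(response_text, requested_limit):
    """Warn when the server silently capped the page size below the request."""
    body = json.loads(response_text)
    effective = body.get('limit', requested_limit)
    if effective < requested_limit:
        print('[!] requested limit {} capped to {}'.format(requested_limit, effective))
    return effective

# Illustrative body; a server capping at 100 would answer something like this
sample = '{"size": 100, "limit": 100, "isLastPage": false, "values": [], "nextPageStart": 100}'
check_effective_limit(sample, 10000)  # prints the warning, returns 100
```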
