Rclone

This guide explains how to set up and use rclone to sync data between HPC clusters and Google Drive.

Rclone can be used with a wide variety of cloud services, including Box, but this example deals specifically with Google Drive. To complete this configuration you also need rclone installed on a device that has a web browser.

Full documentation on using rclone can be found here: https://rclone.org/

 

You can use rclone with either your personal Google account or your ISU Google account. Note that the ISU Google account no longer provides unlimited space.

To use rclone with the ISU Google account, you must have a CyMail account and have accessed it at least once to initialize it in the Google cloud.

To set up rclone, log in to a cluster DTN node (condodtn, novadtn) with ssh and then run "rclone config".
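
The login and configuration step might look like the following sketch. The function name and its two arguments are placeholders: use your own NetID and the host name of the DTN node as given in your cluster's documentation. It also assumes that rclone is on the default PATH of the DTN node.

```shell
# configure_rclone_on_dtn is a hypothetical helper name.
# Arguments: your NetID, then the DTN node's host name at your site.
configure_rclone_on_dtn() {
    local netid="$1" dtn_host="$2"
    # -t allocates a terminal so the interactive dialog works over ssh.
    ssh -t "${netid}@${dtn_host}" rclone config
}
```

Calling configure_rclone_on_dtn with your NetID and a DTN host name opens the same dialog you would get by logging in first and typing "rclone config" by hand.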

Then run through the setup dialog as shown below.

$ rclone config

2019/02/12 16:28:28 NOTICE: Config file "/home/netid/.config/rclone/rclone.conf" not found - using defaults

No remotes found - make a new one

n) New remote

s) Set configuration password

q) Quit config

n/s/q> n

The "name" can be whatever you like. You will need to type it in every rclone command, so you might want to keep it short and memorable.

name> gdrive

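At the time this documentation was written, Google Drive was option 18 in the list below. The number can differ between rclone versions, so check the list that rclone prints.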

Type of storage to configure.

Enter a string value. Press Enter for the default ("").

Choose a number from below, or type in your own value

 1 / 1Fichier
   \ (fichier)
 2 / Akamai NetStorage
   \ (netstorage)
...
16 / FTP
   \ (ftp)
17 / Google Cloud Storage (this is not Google Drive)
   \ (google cloud storage)
18 / Google Drive
   \ (drive)
19 / Google Photos
   \ (google photos)
20 / HTTP
   \ (http)
...
49 / premiumize.me
   \ (premiumizeme)
50 / seafile
   \ (seafile)

Storage> 18

The next 2 parameters should be left blank

Google Application Client Id

Leave blank normally.

Enter a string value. Press Enter for the default ("").

client_id> 

Google Application Client Secret

Leave blank normally.

Enter a string value. Press Enter for the default ("").

client_secret> 

 

Choose scope 1 (full access) if you want to write to the drive.

Scope that rclone should use when requesting access from drive.

Enter a string value. Press Enter for the default ("").

Choose a number from below, or type in your own value

 1 / Full access all files, excluding Application Data Folder.

   \ "drive"

 2 / Read-only access to file metadata and file contents.

   \ "drive.readonly"

   / Access to files created by rclone only.

 3 | These are visible in the drive website.

   | File authorization is revoked when the user deauthorizes the app.

   \ "drive.file"

   / Allows read and write access to the Application Data folder.

 4 | This is not visible in the drive website.

   \ "drive.appfolder"

   / Allows read-only access to file metadata but

 5 | does not allow any access to read or download file content.

   \ "drive.metadata.readonly"

scope> 1

Leave the next parameter blank

Service Account Credentials JSON file path 

Leave blank normally.

Needed only if you want use SA instead of interactive login.

Enter a string value. Press Enter for the default ("").

service_account_file> 

Enter n for advanced config

Edit advanced config? (y/n)

y) Yes

n) No

y/n> n

Enter n so that you can authenticate from your local machine, since the cluster node has no web browser.

Use web browser to automatically authenticate rclone with remote?
 * Say Y if the machine running rclone has a web browser you can use
 * Say N if running rclone on a (remote) machine without web browser access
If not sure try Y. If Y failed, try N.

y) Yes (default)
n) No
y/n> n

At this point, open a terminal on your local device and run the rclone authorize command that the cluster session printed, in order to link the account:

rclone authorize "drive" "eyJRzX23wZSI6IzRynaPZlIn3"

This should open a web browser showing the Google sign-in page. Enter your ISU NetID as an email address to link to your ISU Google Drive.

You should then get your usual ISU Okta SSO login.

After you have signed in successfully, you will see a screen asking you to allow rclone access to your Google account.

Click Allow.

A config token should now be printed in your local computer's terminal. Copy the token and paste it at the config_token prompt in the remote terminal to link rclone on the cluster.

config_token> rgsgs9gsrgsgreX06Zesfaevaefa7vsdgsrsrsrvsvrvsr1rYffKRJmdjkadsIU4ERjkDkjdafkj

Do not configure this as a team drive

Configure this as a team drive?

y) Yes

n) No

y/n> n

You will now get a summary. Type y if everything is OK.

--------------------

[gdrive]

type = drive

scope = drive

token = {"access_token":"asdfadsfasdfaeaefaseaefaefaefaeJ6asdfaefaewfaaefaevaeaefaewfawefawefeffvvrvrvrvr","token_type":"Bearer","refresh_token":"1asdfasdfasfaefaefaefafaafaefawesfaefaewf","expiry":"2019-02-12T17:31:49.265667681-06:00"}

--------------------

y) Yes this is OK

e) Edit this remote

d) Delete this remote

y/e/d> y

You will be returned to the config screen. Enter q to quit:

Current remotes:

 

Name                 Type

====                 ====

gdrive               drive

 

e) Edit existing remote

n) New remote

d) Delete remote

r) Rename remote

c) Copy remote

s) Set configuration password

q) Quit config

e/n/d/r/c/s/q> q
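
Before transferring data, it can be worth confirming that the new remote works. The following is a minimal sketch, assuming the remote is named gdrive as in the dialog above. The function name check_remote is only illustrative.

```shell
# check_remote is a hypothetical helper name; "gdrive" is the remote
# name chosen in the dialog above.
check_remote() {
    local remote="${1:-gdrive}"

    # Show where rclone stores its configuration file.
    rclone config file

    # List the configured remotes and confirm ours is among them.
    if ! rclone listremotes | grep -Fx "${remote}:" >/dev/null; then
        echo "remote '${remote}' is not configured" >&2
        return 1
    fi

    # Ask Google Drive for quota information. This fails if the
    # token is missing or is not accepted.
    rclone about "${remote}:"
}
```

Running check_remote gdrive should print the path of the configuration file followed by the quota figures for the account.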

 

You can now use the remote as follows.

List the directories in the top level of your drive:

rclone lsd gdrive:

List all the files in your drive:

rclone ls gdrive:

Copy a local directory to a drive directory called backup:

rclone copy /home/source gdrive:backup
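
A few variations on the copy command may also be useful. This is a sketch only: the function names are illustrative, the remote is assumed to be called gdrive, and the paths and folder names you pass in are your own.

```shell
# Hypothetical wrappers around rclone; adjust the remote name (gdrive)
# to match your own setup.

# Preview what would be copied without transferring anything.
preview_backup() {
    rclone copy --dry-run "$1" "gdrive:$2"
}

# Copy a local directory to Drive, showing live progress.
run_backup() {
    rclone copy --progress "$1" "gdrive:$2"
}

# Copy a Drive folder back to a local directory.
restore_backup() {
    rclone copy --progress "gdrive:$1" "$2"
}

# Compare local files against the copy on Drive.
verify_backup() {
    rclone check "$1" "gdrive:$2"
}
```

For example, preview_backup /home/source backup followed by run_backup /home/source backup performs the copy shown above with a preview first. Note that rclone copy never deletes files on the destination, whereas rclone sync does, so it is safest to try sync with --dry-run before running it for real.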

 

Limitations

Google Drive has rate limiting, which restricts rclone to transferring only about 2 files per second. Individual files may transfer much faster, at hundreds of MBytes/s, but lots of small files can take a long time.

Google Drive also limits us to 750 GB per person per day. If you exceed 750 GB, you may be banned until the following day. This is problematic for large transfers or transfers of many files. If you are planning to transfer more than 750 GB, use the "--bwlimit 8.6M" option, which throttles rclone to 8.6 MiB/s (without the M suffix, rclone reads the value as KiB/s). This should keep the transfer to roughly 750 GB per day, so arbitrarily large transfers can run over several days.

Google Drive supports single files up to 5 TB. There are also some limits on filenames. The "Advanced Rclone" instructions show a possible way of dealing with these limitations.
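
The arithmetic behind the bandwidth limit, and one way to apply it, can be sketched as follows. This is an illustration under assumptions, not a tested recipe: the function names are made up, the remote is assumed to be called gdrive, and the --drive-stop-on-upload-limit flag is assumed to be available in your rclone version. Bundling small files with tar is one common way to reduce the file count and is not necessarily the approach taken in the "Advanced Rclone" instructions.

```shell
# Hypothetical helpers; the names and default values are illustrative.

# Print the sustained rate, in MiB/s, that moves the given number of
# decimal gigabytes in 24 hours (86400 seconds).
daily_cap_to_mib_per_s() {
    awk -v gb="${1:-750}" 'BEGIN { printf "%.2f\n", gb * 1e9 / 86400 / 1048576 }'
}

# Copy with a bandwidth cap (third argument, default 8M).
# --drive-stop-on-upload-limit makes rclone stop instead of retrying
# once Google reports that the daily upload limit has been reached.
throttled_copy() {
    rclone copy --bwlimit "${3:-8M}" --drive-stop-on-upload-limit "$1" "gdrive:$2"
}

# Bundle a directory of many small files into a single archive so that
# only one file has to be uploaded.
bundle_dir() {
    tar -czf "$2" -C "$(dirname "$1")" "$(basename "$1")"
}
```

For example, daily_cap_to_mib_per_s 750 prints 8.28. Because rclone's M suffix is documented as MiB/s, a value such as 8M leaves a small margin if the daily limit is counted in decimal gigabytes.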

More Commands

For full information on the rclone commands and their syntax see here: https://rclone.org/

Group Accounts

We recommend using group role accounts to store data long term so that your entire research group can access it in the future. Google Drive does have a "team drive" feature, but it is not currently enabled for Iowa State accounts.

Transferring ownership of files

If you store your data in an individual account, you may need to transfer your files to another user or your major professor when you leave ISU.

These are the Google instructions on doing so:

from https://support.google.com/drive/answer/2494892?hl=en

Transfer file ownership

You’re the owner by default for files that you create in Docs, Sheets, and Slides, or upload into Drive. But, you can transfer ownership of your Google files (Docs, Sheets, and Slides) and folders to anyone you'd like, as long as that person has a Google Account.

Note: If you use Google apps through work or school, you can't transfer ownership to or receive ownership from someone else who is outside of your domain.

How to change owners

You can change who owns a file or folder in Drive.

  1. Go to Drive or a Docs, Sheets, or Slides home screen.
  2. Open the sharing box:
    • In Drive: Select the file or folder and click the share icon at the top.
    • In a Docs, Sheets, or Slides home screen: Open the file and click Share in the top-right corner of the file.
  3. If the new owner already has edit access, skip to Step 4. Otherwise, follow these steps:
    1. Type the email address of the new owner in the "Invite people" field.
    2. Click Share & save.
  4. Click Advanced in the bottom-right corner of the sharing box.
  5. Click the drop-down menu next to the name of the person you want to own the file or folder.
  6. Select Is owner.
  7. Click Done.

You'll have access to the file as an editor after you transfer ownership.

Things to consider before you transfer ownership

  • The things you’ll no longer be able to do once you transfer file ownership include:
    • Remove others from the file
    • Share with as many people as you like
    • Change visibility options
    • Allow your collaborators to change access privileges for others
    • Permanently delete something from Google Drive. After it’s deleted, no one can access it, including those it was shared with.
  • When you transfer ownership of a folder from yourself to another person, the new owner of the folder becomes an editor of the files in that folder. The original owners of the files remain the owners, and if the original owner deletes a file, it'll be removed from the folder.
  • If your current Google Account is being deleted, transfer ownership of your files, folders, and Google files to another active account. Once the original account is deleted, you won’t be able to recover any of your files or folders from it.