Copying data to and from Sciama

Transferring Files To and From Sciama

From a user perspective there are two data areas on Sciama: your home area, which should not exceed 10 GB (although this limit is not currently enforced), and a project area, which can hold several terabytes. The project area is accessed under /mnt/lustre/. An area corresponding to your account name will be created upon request. It should be stressed that NO DATA ON SCIAMA IS BACKED UP. The size of the project data area will be monitored.
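Because the 10 GB home guideline is not enforced, it can help to check your own usage from time to time. The following is a minimal sketch, not an official Sciama tool: the function names dir_usage_gb and check_home_quota are hypothetical, and the only command used is the standard du utility.

```shell
# dir_usage_gb DIR: print the space used by DIR in whole gigabytes (rounded down).
dir_usage_gb() {
    # du -sk prints "<kilobytes><TAB><path>"; keep only the number
    kb=$(du -sk "$1" 2>/dev/null | cut -f1)
    echo $(( ${kb:-0} / 1024 / 1024 ))
}

# check_home_quota [DIR] [LIMIT_GB]: warn when DIR (default: $HOME)
# reaches LIMIT_GB (default: 10, the guideline quoted above).
check_home_quota() {
    dir=${1:-$HOME}
    limit_gb=${2:-10}
    used_gb=$(dir_usage_gb "$dir")
    if [ "$used_gb" -ge "$limit_gb" ]; then
        echo "WARNING: $dir uses about ${used_gb} GB (guideline: ${limit_gb} GB)"
    else
        echo "OK: $dir uses under ${limit_gb} GB"
    fi
}
```

Running check_home_quota with no arguments checks your home area against 10 GB. Note that du can take a while on directories with many files.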

**Old data may be deleted without warning.**

 

There are a few methods for transferring data to and from Sciama from the command line:

SCP:

For single file transfer. The SCP syntax is as follows. You can either push the data:-
scp source_file username@login.sciama.icg.port.ac.uk:~/destination
or pull the data:-
scp username@login.sciama.icg.port.ac.uk:source_file  destination

SFTP:

For transferring small directories. The SFTP syntax is as follows (add -r to put or get to copy a directory recursively):-
sftp user@login.sciama.icg.port.ac.uk
put local-path
get remote-path
exit

Rsync:

Rsync can resume at the last file if a transfer is interrupted. Because it can only restart at the beginning of a whole file, it’s best to send a number of smaller files rather than a single large file. Again you can push or pull the data. The Rsync push syntax is:-
rsync -ravP -e ssh source user@login.sciama.icg.port.ac.uk:~/destination
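The pull direction uses the same options with source and destination swapped. As a sketch only: the wrapper name sciama_sync is hypothetical and the paths in the comments are placeholders. The -n option (dry run) is a standard rsync flag that lists what would be transferred without copying anything.

```shell
# sciama_sync SRC DEST [extra rsync options...]
# Thin wrapper around the rsync invocation shown above.
# -a already implies -r, so -r is left out here.
sciama_sync() {
    src=$1
    dest=$2
    shift 2
    rsync -avP -e ssh "$@" "$src" "$dest"
}

# Pull from Sciama (user and both paths are placeholders):
#   sciama_sync user@login.sciama.icg.port.ac.uk:~/source ./destination
# Preview the same transfer first with a dry run:
#   sciama_sync user@login.sciama.icg.port.ac.uk:~/source ./destination -n
```

If a transfer is interrupted, re-running the same command skips files that already arrived complete.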

Rclone:

Rclone works much like rsync but can be configured for many different cloud storage providers.
Load the cpython/2.7.16 or cpython/3.8.3 module to use rclone.
On first use you will need to configure your cloud storage: run ‘rclone config’ and follow the documentation at https://rclone.org/docs/ (see https://rclone.org/drive/ for Google Drive setup).
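Once a remote has been configured, a typical session might look like the sketch below. The remote name gdrive, the function name rclone_upload and the paths are all hypothetical; use the remote name you chose during ‘rclone config’. The subcommands config, lsd and copy, and the -P progress flag, are standard rclone.

```shell
# rclone_upload LOCAL_DIR REMOTE:PATH
# Copy a local directory to a configured cloud remote, showing progress.
rclone_upload() {
    local_dir=$1
    remote_path=$2
    module load cpython/3.8.3      # one of the two modules named above
    rclone copy -P "$local_dir" "$remote_path"
}

# One-time setup and a quick check that the remote works:
#   rclone config
#   rclone lsd gdrive:
# Upload a project directory (placeholder paths):
#   rclone_upload /mnt/lustre/username/results gdrive:sciama-results
```

rclone copy skips files that are already identical at the destination, so the same command can be re-run after an interruption.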

See also Rclone Knowledge Article

Gdcp – Google Drive copy

Load the cpython/2.7.16 module. Note that gdcp is not compatible with Python 3. Currently gdcp only works from login5 – login8, so you will need to ssh to one of those nodes.

The first time you run gdcp a new directory ~/.gdcp/ will be created and you’ll be greeted with the following instructions for obtaining your own OAuth2 client ID.

- Visit https://console.developers.google.com/
- Create a new project and select it
- Under 'APIs' make sure the 'Drive API' is turned on
- Under 'Credentials' create a new OAuth client ID
  Choose 'Installed -> Other' for application type
- Click 'Download JSON' to download the secrets file
- Copy the secrets file to ~/.gdcp/client_secrets.json

Once you’ve created ~/.gdcp/client_secrets.json, run gdcp again to authorize access to your Google Drive. You should see something like

Go to the following link in your browser:

followed by a long URL. Visiting this URL in a browser where you’re already logged into your Google account will yield a verification code. Copy and paste this code into the terminal to complete the authentication process. A new file, ~/.gdcp/credentials.json, will be created which grants access to your Google Drive files. If you want to use gdcp on a different computer without going through authentication again, just copy ~/.gdcp/ to the new computer.

Commands: gdcp {list, upload, download, mount}

upload

Recursively upload files or folders. For example, to upload a local folder into a Google Drive parent folder with ID 0Bxt5Ia3JxzdHfkJDeUxCQ3RyaWp:

$ gdcp upload -p 0Bxt5Ia3JxzdHfkJDeUxCQ3RyaWp ./subfolder
subfolder/
subfolder/subfile.txt
  100.00% 209715200 44.08MB/s 4.76s MD5...OK
Uploaded 2 file(s) and folder(s)

download

Recursively download files or folders. For example, to download a Google Drive folder with ID 0Bxt5Ia3JxzdHfkJDeUxCQ3RyaWp:

$ gdcp download -i 0Bxt5Ia3JxzdHfkJDeUxCQ3RyaWp .
./foo/subfolder/
./foo/subfolder/subfile.txt
  100.00% 209715200 50.32MB/s 4.17s MD5...OK
./foo/bar.txt
  100.00% 7 0.00MB/s 0.49s MD5...OK
Downloaded 4 file(s) and folder(s)
 

Globus

Globus is the preferred method for transferring very large amounts of data between institutes. You will need a Globus account, which is free: go to globus.org, log in with your @port.ac.uk account and use the University Of Portsmouth ICG Endpoint.

You will need to use the iccguest account; ask the icg-computing team for the password. Transfer your data to our Globus machine, globusconnect-app-01.iso.port.ac.uk, using scp or sftp as above.
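Staging a file on the Globus machine with scp could look like the following sketch. The function name globus_stage is hypothetical, and the destination directory is an assumption: this page does not say where data should be placed, so by default the file goes to the iccguest home directory. Check the right location with the icg-computing team.

```shell
# globus_stage FILE [REMOTE_DIR]
# Copy FILE to the Globus machine using the shared iccguest account.
# REMOTE_DIR is a placeholder; "." means the account's home directory.
globus_stage() {
    file=$1
    remote_dir=${2:-.}
    scp "$file" "iccguest@globusconnect-app-01.iso.port.ac.uk:${remote_dir}"
}

# Example (placeholder file name); scp will prompt for the password:
#   globus_stage results.tar.gz
```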

 

 
