Remote Data Server Setup
In this section, you will see how to set up your remote data server to store DataLad-managed datasets.
A regular user usually does not have root/admin privileges on the server, which prevents them from installing the required software via apt-get.
In this case, there are two user-level installation options, depending on whether your remote data server has internet access:
Internet access: Installation with Conda
No internet access: Installation of standalone git-annex
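If you are not sure which case applies, a quick check run on the remote server can help. The following is only a sketch: can_reach is a hypothetical helper, it assumes curl is available on the server, and the URL (the host serving Conda channels) is just one possible probe target.

```shell
# Hypothetical helper: succeeds if the given URL answers within 5 seconds.
# Assumes curl is installed; if it is not, the check simply fails.
can_reach() {
    curl --silent --head --max-time 5 --output /dev/null "$1"
}

if can_reach "https://conda.anaconda.org"; then
    echo "Internet access: installation with Conda"
else
    echo "No internet access: installation of standalone git-annex"
fi
```

A failed check may also mean that only this particular host is blocked, so treat the result as a hint rather than a definitive answer.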
Installation with Conda
Among the options described in the DataLad handbook, installation with Conda is the most convenient user-level installation, but it requires internet access from the remote data server.
In this situation, with Conda or Miniconda installed (if not, please check Installation of Miniconda 3 for instructions), the DataLad package can be installed from the conda-forge channel as follows:
$ conda install -c conda-forge datalad
In general, all software dependencies of DataLad (including git-annex) are automatically installed too.
Note
This approach has the advantage that any dataset can then be managed with DataLad directly on the remote server.
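After the installation, it can be useful to confirm that both tools are visible from your shell. The snippet below is a minimal sketch; check_tool is a hypothetical helper name, and it only tests whether the commands are found on PATH, not whether they work correctly.

```shell
# Report whether a command is available on the current PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: NOT found on PATH"
        return 1
    fi
}

# "|| true" keeps going so that both tools are always reported.
check_tool datalad || true
check_tool git-annex || true
```

If one of them is reported as not found, make sure the Conda environment in which DataLad was installed is activated.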
Installation of standalone git-annex
The remote data server might not be connected to the internet for security reasons, in which case it is impossible to install DataLad via conda or pip. In that case, one can still use the Linux standalone distribution of git-annex. The installation consists of the following steps:
Download the Linux standalone build of git-annex from the official website to your local machine: git-annex-standalone-amd64.tar.gz.
Create a folder, called for instance Softwares, in your home directory on the remote server with the mkdir command via ssh:
$ ssh user@stockage.server.ch \
    "mkdir -p /home/user/Softwares"
Important
Put the command mkdir -p /home/user/Softwares inside double quotes ("") so that it is passed as a whole to ssh and executed on the remote server.
Copy the downloaded archive to the created folder on the remote server. This can be achieved with the scp command:
$ scp /local/path/to/git-annex-standalone-amd64.tar.gz \
    user@stockage.server.ch:/home/user/Softwares/git-annex-standalone-amd64.tar.gz
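Extract the content of the archive to a folder git-annex-standalone with the tar command, then remove the archive, all via ssh. The target folder must exist before tar -C can extract into it, hence the leading mkdir -p:
$ ssh user@stockage.server.ch \
    "mkdir -p /home/user/Softwares/git-annex-standalone && tar xzvf /home/user/Softwares/git-annex-standalone-amd64.tar.gz -C /home/user/Softwares/git-annex-standalone && rm /home/user/Softwares/git-annex-standalone-amd64.tar.gz"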
Important
The whole sequence of commands chained with && MUST be put inside double quotes ("") so that every command in it is passed to ssh and executed on the remote server, not on your local machine.
Connect to the remote data server via ssh:
$ ssh user@stockage.server.ch
Then, open the ~/.bashrc file with a text editor, for instance vim ($ vim ~/.bashrc), and add the following lines to update the PATH and LD_LIBRARY_PATH environment variables:
export LD_LIBRARY_PATH="/home/user/Softwares/git-annex-standalone/bin:$LD_LIBRARY_PATH"
export PATH="/home/user/Softwares/git-annex-standalone:$PATH"
This finalizes the installation of the standalone git-annex binaries and libraries.
Tip
In vim, the i key enters insert mode. When you are done, press the Esc key and then type :wq to tell vim to save your changes (w) and quit (q).
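Because ~/.bashrc is read each time a new interactive shell starts, a nested shell prepends the same folder to PATH again. As a minimal sketch, the PATH line could be guarded so that the folder is added only once. Here prepend_path is a hypothetical helper (not part of git-annex), and the folder is the example location used in this section.

```shell
# Prepend a directory to PATH only if it is not already listed there.
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present: leave PATH untouched
        *) PATH="$1:$PATH" ;;     # otherwise put it in front
    esac
}

prepend_path "/home/user/Softwares/git-annex-standalone"
export PATH
```

The surrounding colons in ":$PATH:" make the match work for the first and last entries as well. After opening a new shell, command -v git-annex should print a path if git-annex is found there.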
Note
With this approach, only git-annex is installed on the remote server, so it is not possible to manage DataLad datasets with DataLad directly there. To do so, the dataset has to be installed on a host machine where DataLad is available.
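On such a host machine, installing a dataset stored on the remote server could look like the sketch below. The dataset location and the clone_dataset helper are hypothetical and only serve as an illustration; the sketch also assumes that the dataset on the server is reachable over ssh.

```shell
# Hypothetical dataset location on the remote data server.
REMOTE_DATASET="ssh://user@stockage.server.ch/home/user/datasets/my-dataset"

# Clone a dataset with DataLad; $1 is the source URL, $2 the local folder.
clone_dataset() {
    if ! command -v datalad >/dev/null 2>&1; then
        echo "datalad not found: run this where DataLad is installed" >&2
        return 1
    fi
    datalad clone "$1" "$2"
}

# Usage (on the host machine):
#   clone_dataset "$REMOTE_DATASET" my-dataset
```

The guard only makes the failure explicit when the command is run by mistake on a machine without DataLad, such as the remote server itself.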