Remote Data Server Setup
This section shows how to set up a remote data server to store DataLad-managed datasets.
A regular user usually has no root/admin privileges on the server and therefore cannot install DataLad and its dependencies with apt-get.
In that case, two user-level installation options exist, depending on whether the remote data server can reach the internet:
Internet access: Installation with Conda
No internet access: Installation of standalone git-annex
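To find out which of the two cases applies, one can probe from the server whether the conda-forge channel host is reachable. The sketch below is only an illustration: the function names have_internet and pick_install_route are hypothetical, and probing https://conda.anaconda.org with curl is an assumption that may not match your network setup (a proxy, for example, would change the picture).

```shell
#!/bin/sh
# Hypothetical helpers, not part of DataLad or git-annex.

# Assumption: an HTTPS HEAD request to the conda channel host that succeeds
# within 5 seconds means the server can reach the internet.
have_internet() {
    curl --silent --head --max-time 5 https://conda.anaconda.org >/dev/null 2>&1
}

# Print the name of the installation route that applies to this server.
pick_install_route() {
    if have_internet; then
        echo "conda"
    else
        echo "standalone"
    fi
}
```

Running pick_install_route on the server prints conda when the probe succeeds and standalone otherwise.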
Installation with Conda
Among the options described in the DataLad handbook, installation with Conda is the most convenient user-level installation, but it requires internet access from the remote data server.
With Conda or Miniconda installed (if not, see Installation of Miniconda 3 for instructions), the DataLad package can be installed from the conda-forge channel as follows:
$ conda install -c conda-forge datalad
In general, all software dependencies of DataLad (including git-annex) are automatically installed too.
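Whether the dependencies really ended up on the PATH can be checked with a few lines of shell. This is a minimal sketch: missing_tools is a hypothetical helper name, and the list of tools to check (datalad, git, git-annex) is an assumption about what a working setup needs.

```shell
#!/bin/sh
# Hypothetical helper: print every command from the argument list that
# cannot be found on the current PATH, one name per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}
```

For example, missing_tools datalad git git-annex prints nothing when all three commands are available in the active Conda environment, and prints the name of each command that is not.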
Note
The advantage of this approach is that any dataset can then be managed directly on the remote server with DataLad.
Installation of standalone git-annex
For security reasons, the remote data server might not be connected to the internet, which makes it impossible to install DataLad via conda or pip. In that case, one can still use the Linux standalone distribution of git-annex. The installation consists of the following steps:
Download the Linux standalone build of git-annex, git-annex-standalone-amd64.tar.gz, from the official website to your local machine.
Create a folder, called for instance Softwares, in your /home directory with the mkdir command via ssh:
$ ssh user@stockage.server.ch \
    "mkdir -p /home/user/Softwares"
Important
Put the command mkdir -p /home/user/Softwares inside "" so that it is passed as a single string and executed on the remote server via ssh.
Copy the downloaded archive to the created folder on the remote server. This can be achieved with the scp command:
$ scp /local/path/to/git-annex-standalone-amd64.tar.gz \
    user@stockage.server.ch:/home/user/Softwares/git-annex-standalone-amd64.tar.gz
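Extract the content of the archive to a folder git-annex-standalone with the tar command and then remove the archive, via ssh:
$ ssh user@stockage.server.ch \
    "mkdir -p /home/user/Softwares/git-annex-standalone && \
     tar xzvf /home/user/Softwares/git-annex-standalone-amd64.tar.gz -C /home/user/Softwares/git-annex-standalone --strip-components=1 && \
     rm /home/user/Softwares/git-annex-standalone-amd64.tar.gz"
The target of -C must already exist and is given as an absolute path here, because a relative path would be resolved against the home directory of the ssh session. The option --strip-components=1 assumes that the archive unpacks into a single top-level directory (git-annex.linux in the official builds); it drops that level so that the git-annex launcher ends up directly in git-annex-standalone, matching the PATH set below.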
Important
The sequence of commands chained with && MUST be put inside "" so that the whole sequence is passed to and executed on the remote server via ssh; without the quotes, everything after the first && would run on your local machine.
Connect to the remote data server via ssh:
$ ssh user@stockage.server.ch
Then, open the ~/.bashrc file with a text editor such as vim ($ vim ~/.bashrc) and add the following lines to update the PATH and LD_LIBRARY_PATH environment variables:
export LD_LIBRARY_PATH="/home/user/Softwares/git-annex-standalone/bin:$LD_LIBRARY_PATH"
export PATH="/home/user/Softwares/git-annex-standalone:$PATH"
This finalizes the installation of the standalone git-annex binaries and libraries. The new settings take effect in the next interactive shell or after running source ~/.bashrc.
Tip
In vim, the key i enters insert mode. When you are done, press the Esc key and then type :wq to tell vim to save your changes (w) and quit (q).
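As an alternative to editing the file by hand, the two export lines can be appended from the command line. The sketch below is one possible way to do this; append_once is a hypothetical helper that adds a line only if the file does not already contain it, so running it twice does not duplicate the entries.

```shell
#!/bin/sh
# Hypothetical helper: append LINE to FILE unless an identical line exists.
# grep -q: quiet, -x: match whole lines only, -F: treat LINE as a fixed string.
append_once() {
    file=$1
    line=$2
    touch "$file"
    grep -qxF -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

On the server, it could be called as append_once ~/.bashrc 'export PATH="/home/user/Softwares/git-annex-standalone:$PATH"'. The single quotes keep $PATH unexpanded, so that it is evaluated each time ~/.bashrc is read rather than once at the time of the call.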
Note
With this approach, only git-annex is installed on the remote server, so DataLad datasets cannot be managed with DataLad directly there. Doing so requires installing the dataset on a host machine where DataLad is available.
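The file transfer and extraction steps of the standalone installation can also be collected in one small script run from the local machine. The following is a sketch under stated assumptions rather than a tested installer: the host, paths, and archive name are the placeholder values used above, run and install_standalone are hypothetical helper names, and DRY_RUN defaults to 1 so that the commands are only printed. It also assumes that the archive has a single top-level directory, which --strip-components=1 removes.

```shell
#!/bin/sh
# Placeholder values taken from the steps above; adapt them to your setup.
REMOTE="user@stockage.server.ch"
DEST="/home/user/Softwares"
ARCHIVE="git-annex-standalone-amd64.tar.gz"
LOCAL_ARCHIVE="/local/path/to/$ARCHIVE"

# Print the command when DRY_RUN is 1 (the default), execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Create the target folder, copy the archive, then extract and delete it.
# Each remote command is a single quoted string so that ssh runs all of it
# on the server, including the part after &&.
install_standalone() {
    run ssh "$REMOTE" "mkdir -p $DEST/git-annex-standalone" &&
    run scp "$LOCAL_ARCHIVE" "$REMOTE:$DEST/$ARCHIVE" &&
    run ssh "$REMOTE" "tar xzf $DEST/$ARCHIVE -C $DEST/git-annex-standalone --strip-components=1 && rm $DEST/$ARCHIVE"
}
```

Calling install_standalone prints the three commands for review; setting DRY_RUN=0 beforehand executes them. The dry-run output omits the quotes around the remote command strings, so it is meant for inspection and not for copy and paste.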