Commandline Usage

Important

Before using NeuroDataPub, make sure that git-annex, at a minimum, is installed on the remote data server. Please see Remote Data Server Setup for instructions.

Note also that NeuroDataPub takes as its main input the path to your dataset, which by default is expected to comply with the Brain Imaging Data Structure (BIDS) format. If your dataset is in BIDS format, always check that it is valid before using NeuroDataPub, with either the free online BIDS Validator or its standalone version. See BIDS standard for more information about BIDS. If adopting the BIDS format does not make sense for your dataset, NeuroDataPub can also handle, since v0.4, datasets that are not in BIDS format with the --is_not_bids option.

Commandline Arguments

Command-line argument parser of NeuroDataPub (v0.4)

usage: neurodatapub [-h] --mode {all,create-only,publish-only} --dataset_dir
                    DATASET_DIR [--is_not_bids] --datalad_dir DATALAD_DIR
                    --github_sibling_config GITHUB_SIBLING_CONFIG
                    (--git_annex_ssh_special_sibling_config GIT_ANNEX_SSH_SPECIAL_SIBLING_CONFIG | --osf_sibling_config OSF_SIBLING_CONFIG)
                    [--gui] [--generate_script] [-v]

Named Arguments

--mode

Possible choices: all, create-only, publish-only

Mode in which neurodatapub is run: "create-only" creates the Datalad dataset only, "publish-only" publishes the Datalad dataset only, "all" creates and publishes the Datalad dataset.

--dataset_dir

The directory with the input dataset formatted according to the BIDS standard.

--is_not_bids

Specify if the directory with the input dataset is not formatted according to the BIDS standard.

Default: False

--datalad_dir

The local directory where the Datalad dataset should be.

--github_sibling_config

Path to a JSON file containing configuration parameters for the GitHub dataset repository sibling.

--git_annex_ssh_special_sibling_config

Path to a JSON file containing configuration parameters for the git-annex SSH special remote dataset sibling.

--osf_sibling_config

Path to a JSON file containing configuration parameters for the git-annex OSF special remote dataset sibling.

--gui

Run NeuroDataPub in GUI mode.

Default: False

--generate_script

Dry run that generates a bash script called neurodatapub_DD-MM-YYYY_hh:mm:ss.sh in the code/ folder of the input dataset that records all commands for later execution.

Default: False

-v, --version

show program’s version number and exit

Sibling configuration files

Git-annex special remote sibling configuration file

The Git-annex special remote sibling configuration file specified by the input flag --git_annex_ssh_special_sibling_config adopts the following JSON schema:

{
    "remote_ssh_login": "user",
    "remote_ssh_url": "ssh://neurodatapub.server.org",
    "remote_sibling_dir": "/remote/path/of/dataset/sibling/.git"
}
where:
  • "remote_ssh_login" (mandatory): user’s login to the remote

  • "remote_ssh_url" (mandatory): SSH-URL of the remote in the form “ssh://…”

  • "remote_sibling_dir" (mandatory): Remote .git/ directory of the sibling dataset
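
The three mandatory keys above can be checked before NeuroDataPub is run. The following is a minimal Python sketch and not part of NeuroDataPub: the helper name check_ssh_sibling_config and the specific checks are assumptions made for illustration.

```python
import json

# Keys listed as mandatory in the schema above.
REQUIRED_SSH_KEYS = ("remote_ssh_login", "remote_ssh_url", "remote_sibling_dir")


def check_ssh_sibling_config(text):
    """Parse the JSON text and return a list of problems (empty if none found).

    Hypothetical helper for illustration, not part of NeuroDataPub.
    """
    config = json.loads(text)
    problems = [f"missing key: {key}" for key in REQUIRED_SSH_KEYS if not config.get(key)]
    url = config.get("remote_ssh_url", "")
    if url and not url.startswith("ssh://"):
        problems.append("remote_ssh_url should start with ssh://")
    return problems


example = """
{
    "remote_ssh_login": "user",
    "remote_ssh_url": "ssh://neurodatapub.server.org",
    "remote_sibling_dir": "/remote/path/of/dataset/sibling/.git"
}
"""
print(check_ssh_sibling_config(example))  # prints []
```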

GitHub sibling configuration file

The GitHub sibling configuration file specified by the input flag --github_sibling_config adopts the following JSON schema:

{
    "github_login": "GitHubUserName",
    "github_email": "GitHubUserEmail",
    "github_organization": "NCCR-SYNAPSY",
    "github_token": "Personal github authentication token",
    "github_repo_name": "DatasetName"
}
where:
  • "github_login" (mandatory): user’s login to GitHub.

  • "github_email" (mandatory): user’s email associated with GitHub account.

  • "github_organization" (mandatory): GitHub organization the GitHub account has access to.

  • "github_token" (mandatory): user’s GitHub authentication token. Please see the “Creating a personal access token” GitHub documentation for more details on how to get one. Also make sure that the write:org and read:org options are enabled.

  • "github_repo_name" (mandatory): Dataset repository name on GitHub.
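
Because this file contains an authentication token, it may be preferable to generate it rather than keep a hand-edited copy around. The sketch below is one possible approach and not part of NeuroDataPub: the helper name write_github_sibling_config and the GITHUB_TOKEN environment variable are assumptions made for illustration, and the file is written to a temporary folder only to keep the example self-contained.

```python
import json
import os
import stat
import tempfile


def write_github_sibling_config(path, login, email, organization, repo_name, token):
    """Write the five mandatory keys to `path` and return them as a dict.

    Hypothetical helper for illustration, not part of NeuroDataPub.
    """
    config = {
        "github_login": login,
        "github_email": email,
        "github_organization": organization,
        "github_token": token,
        "github_repo_name": repo_name,
    }
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(config, handle, indent=4)
    # The file holds a secret, so restrict it to the current user (POSIX systems).
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    return config


# GITHUB_TOKEN is a hypothetical environment variable name chosen for this sketch.
token = os.environ.get("GITHUB_TOKEN", "placeholder-token")
with tempfile.TemporaryDirectory() as folder:
    config_path = os.path.join(folder, "github_sibling_config.json")
    written = write_github_sibling_config(
        config_path, "GitHubUserName", "GitHubUserEmail", "NCCR-SYNAPSY", "DatasetName", token
    )
    with open(config_path, encoding="utf-8") as handle:
        reloaded = json.load(handle)
```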

OSF sibling configuration file

The OSF sibling configuration file specified by the input flag --osf_sibling_config adopts the following JSON schema:

{
    "osf_dataset_title": "DatasetName",
    "osf_token": "Personal OSF authentication token"
}
where:
  • "osf_dataset_title" (mandatory): Dataset title on OSF.

  • "osf_token" (mandatory): user’s OSF authentication token. To make a Personal Access Token, please go to the relevant OSF settings page and create one. If you do not have an OSF account yet, you will need to create one beforehand.
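
Strict JSON parsers reject small slips such as a trailing comma after the last key, so it can help to parse the file once before handing it to NeuroDataPub. A minimal Python sketch follows; the helper name load_osf_sibling_config is an assumption made for illustration and is not part of NeuroDataPub.

```python
import json


def load_osf_sibling_config(text):
    """Parse OSF sibling configuration text and check the two mandatory keys.

    Hypothetical helper for illustration, not part of NeuroDataPub.
    """
    try:
        config = json.loads(text)
    except json.JSONDecodeError as error:
        raise ValueError(f"invalid JSON: {error.msg} (line {error.lineno})") from error
    for key in ("osf_dataset_title", "osf_token"):
        if not config.get(key):
            raise ValueError(f"missing mandatory key: {key}")
    return config


valid = '{"osf_dataset_title": "DatasetName", "osf_token": "Personal OSF authentication token"}'
print(load_osf_sibling_config(valid)["osf_dataset_title"])  # prints DatasetName
```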

Running neurodatapub

The neurodatapub command-line interface can be run in the “create-only”, “publish-only”, and “all” modes with the --mode option flag (as described in Commandline Arguments). For example, an invocation of the interface to create and publish a dataset (“all” mode) to an SSH sibling would be as follows:

$ neurodatapub --mode "all" \
     --dataset_dir '/local/path/to/input/bids/dataset' \
     --datalad_dir  '/local/path/to/output/datalad/dataset' \
     --git_annex_ssh_special_sibling_config '/local/path/to/special_annex_sibling_config.json' \
     --github_sibling_config '/local/path/to/github_sibling_config.json'

Note

When you use the command-line interface directly, you need to provide the JSON files that describe the configuration of the GitHub and special remote dataset siblings: --github_sibling_config for the GitHub sibling, and either --git_annex_ssh_special_sibling_config or --osf_sibling_config for the special remote sibling.
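
When the command is assembled from a script, the choice between the two special remote options shown in the usage message can be enforced before anything is executed. The following Python sketch only builds the argument list; the helper name build_neurodatapub_command and the OSF configuration path are assumptions made for illustration, while the flags are those listed in Commandline Arguments.

```python
import shlex


def build_neurodatapub_command(mode, dataset_dir, datalad_dir, github_config,
                               ssh_config=None, osf_config=None,
                               is_not_bids=False, generate_script=False):
    """Return the neurodatapub argument list for the flags in the usage message.

    Hypothetical helper for illustration, not part of NeuroDataPub.
    """
    if mode not in ("all", "create-only", "publish-only"):
        raise ValueError(f"unknown mode: {mode}")
    # The usage message presents the two special remote options as alternatives.
    if (ssh_config is None) == (osf_config is None):
        raise ValueError("provide exactly one of ssh_config or osf_config")
    command = [
        "neurodatapub",
        "--mode", mode,
        "--dataset_dir", dataset_dir,
        "--datalad_dir", datalad_dir,
        "--github_sibling_config", github_config,
    ]
    if ssh_config is not None:
        command += ["--git_annex_ssh_special_sibling_config", ssh_config]
    else:
        command += ["--osf_sibling_config", osf_config]
    if is_not_bids:
        command.append("--is_not_bids")
    if generate_script:
        command.append("--generate_script")
    return command


command = build_neurodatapub_command(
    mode="all",
    dataset_dir="/local/path/to/input/bids/dataset",
    datalad_dir="/local/path/to/output/datalad/dataset",
    github_config="/local/path/to/github_sibling_config.json",
    osf_config="/local/path/to/osf_sibling_config.json",  # illustrative path
)
print(shlex.join(command))
```

The resulting list can be passed to subprocess.run(command, check=True) once the paths point to real files.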

Need more control?

Since v0.4, NeuroDataPub can be run with the --generate_script option to give more control to more advanced users familiar with the Linux shell:

$ neurodatapub --mode "all" \
     --generate_script \
     --dataset_dir '/local/path/to/input/bids/dataset' \
     --datalad_dir  '/local/path/to/output/datalad/dataset' \
     --git_annex_ssh_special_sibling_config '/local/path/to/special_annex_sibling_config.json' \
     --github_sibling_config '/local/path/to/github_sibling_config.json'

Using this option, NeuroDataPub runs in a “dry-run” mode and only creates a Linux shell script, called neurodatapub_%d-%m-%Y_%H-%M-%S.sh, in the code/ directory of your input dataset. This script records all the underlying commands so that they can be reviewed and executed later. If the code/ folder does not exist yet, it is created automatically.
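
The %d-%m-%Y_%H-%M-%S placeholders read as strftime-style date and time fields. Assuming that interpretation, the expected location of a generated script can be sketched in Python as follows; the helper name expected_script_path and the example timestamp are illustrative only, and the actual file name is determined by NeuroDataPub at run time.

```python
from datetime import datetime
from pathlib import Path

# File name pattern quoted in the paragraph above, read as strftime fields (assumption).
SCRIPT_NAME_PATTERN = "neurodatapub_%d-%m-%Y_%H-%M-%S.sh"


def expected_script_path(dataset_dir, when):
    """Return the path the generated script would have for a given timestamp."""
    return Path(dataset_dir) / "code" / when.strftime(SCRIPT_NAME_PATTERN)


path = expected_script_path(
    "/local/path/to/input/bids/dataset",
    datetime(2022, 3, 14, 9, 5, 30),  # arbitrary example timestamp
)
print(path.name)  # prints neurodatapub_14-03-2022_09-05-30.sh
```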

Support, bugs and new feature requests

All bugs, concerns and enhancement requests for this software are managed on GitHub and can be submitted at https://github.com/NCCR-SYNAPSY/neurodatapub/issues.