The steps to create a GitHub repository are outlined below.
If you don’t already have a GitHub account, sign up here (it is free!)
Navigate to the Hakai Institute organization page on GitHub
Under the ‘Repositories’ tab, select ‘New repository’ in the top right corner
For a template, select ‘HakaiInstitute/hakai-dataset-repository-template’ (see Figure 1)
Give your repository a name and decide whether you want it to be publicly accessible or private.
Figure 1. Selecting the hakai-dataset-repository-template
This template populates your repository with a useful organizational structure and with files that are strongly recommended for inclusion in your data package, such as a data dictionary, readme file, reference citation, methods section and resources. Update these files before you release your data package.
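For readers who prefer scripting to the web interface, the steps above can also be sketched against the GitHub REST API, which offers an endpoint for generating a repository from a template. The following is a minimal sketch, not an official Hakai workflow: the function name, the repository name "my-dataset", the default owner and the token value are all assumptions for illustration, and the request is only built, never sent.

```python
import json
import urllib.request

# Template named in the steps above.
TEMPLATE = "HakaiInstitute/hakai-dataset-repository-template"


def build_generate_request(name, token, owner="HakaiInstitute", private=True):
    """Build (but do not send) a request that creates a repository from the template."""
    # `owner` is assumed to be the Hakai organization; `private` mirrors the
    # public/private choice made in the web form.
    payload = {"owner": owner, "name": name, "private": private}
    return urllib.request.Request(
        url=f"https://api.github.com/repos/{TEMPLATE}/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


if __name__ == "__main__":
    # "my-dataset" and the token are placeholders; nothing is sent to GitHub here.
    request = build_generate_request("my-dataset", token="YOUR_TOKEN")
    print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would actually create the repository, which requires a personal access token with permission to create repositories in the organization.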
How to add data files using the GitHub website
You may clone a repository to your local machine to make edits, then commit and push your changes. If you are unfamiliar with that workflow, you can simply use the GitHub web interface to upload, delete and edit files (Figure 2). To delete a file, click on the file name (left hand side of Figure 2) to open a page that previews the file. On that page, click the button with three dots (...) and choose to delete the file.
Figure 2. A GitHub repository will always include the ‘Add files’ button, shown at the top middle of this image, which allows you to upload files. You can also edit plain text files such as your data dictionary, readme file or changelog.
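For those who want to try the local workflow mentioned above instead of the web interface, the sequence of git commands can be sketched as follows. This is an illustrative sketch only: the helper names, the repository URL and the file path are hypothetical, and the script prints the commands rather than running them.

```python
import subprocess


def local_edit_steps(repo_url, files, message):
    """Return (command, working_directory) pairs for a clone/add/commit/push cycle."""
    # Directory that `git clone` creates by default: last URL segment without ".git".
    repo_dir = repo_url.rstrip("/").rsplit("/", 1)[-1]
    if repo_dir.endswith(".git"):
        repo_dir = repo_dir[:-4]
    return [
        (["git", "clone", repo_url], None),
        (["git", "add", *files], repo_dir),
        (["git", "commit", "-m", message], repo_dir),
        (["git", "push"], repo_dir),
    ]


def run_steps(steps):
    """Run each command in order and stop at the first failure."""
    for command, cwd in steps:
        subprocess.run(command, cwd=cwd, check=True)


if __name__ == "__main__":
    steps = local_edit_steps(
        "https://github.com/HakaiInstitute/my-dataset.git",  # hypothetical repository
        ["data/my_data.csv"],  # hypothetical file path inside the clone
        "Add data file",
    )
    # Print the commands instead of executing them.
    for command, cwd in steps:
        print(cwd or ".", "$", " ".join(command))
```

In practice you would run the clone step first, copy or edit your files inside the cloned folder, and only then run the add, commit and push steps.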
How to release versions on GitHub
When you have finalized your data package and updated all the relevant documentation, you can release this version of your repository. A ‘release’ takes a snapshot of your repository and makes it easier for data users to navigate and access different versions of your data package. Releases can be found on the right hand side of your repository page. Select ‘Create a new release’. On the next page you can give your release a title and a description. Under ‘Choose a tag’, you will by default have to create a new tag when publishing your version. The release tag must match the version element in your metadata record.
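The requirement that the release tag match the metadata version can be checked before publishing. The sketch below is a hypothetical helper, not part of any Hakai tooling: it assumes that "match" means exact string equality, and it returns the fields (`tag_name`, `name`, `body`) that the GitHub REST API accepts when creating a release.

```python
def release_payload(tag, title, description, metadata_version):
    """Return release fields, refusing a tag that differs from the metadata version."""
    # Assumption: the tag and the metadata version must be identical strings.
    if tag != metadata_version:
        raise ValueError(
            f"Release tag {tag!r} does not match metadata version {metadata_version!r}"
        )
    return {"tag_name": tag, "name": title, "body": description}


if __name__ == "__main__":
    # All values below are placeholders for illustration.
    payload = release_payload(
        tag="v1.1.0",
        title="Version 1.1.0",
        description="Describe what changed since the previous release.",
        metadata_version="v1.1.0",
    )
    print(payload["tag_name"])
```

If your metadata record stores the version without a prefix (for example 1.1.0 rather than v1.1.0), use the same form for the tag so the two stay identical.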
Once you have released your new version, you can find and access the latest version release and older releases under the ‘Releases’ tab (see e.g. JSP Releases). Releasing versions of a dataset when there are significant changes is good practice regardless of whether the repository is public or private. The most important future collaborator is often you, and leaving a history of versions and descriptions of what changed between versions will make your life easier if you need to go back to an older version.
It is strongly recommended to document any changes between versions in a changelog file. A recommended format for a changelog file can be found here. Additionally, include the recommended citation specific to each version in a readme file. The readme file can be used to capture additional metadata that is not included in the Metadata Entry Form but that data users might need to interpret or use the data. It can also be used to explain the structure of the zipped data package and help users navigate the contents.
If a new release is made, you must update the metadata record with a new version number using the metadata intake form.
When filling out a metadata record and including links to data in the Download and Resources section of the metadata form, please provide a link to the ‘Releases’ page (e.g. Hakai Juvenile Salmon Program Releases) so that users who click the link can navigate to their desired version and click the ‘Source code (zip)’ link to download its contents.
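The links described above follow a predictable pattern on GitHub, so they can be assembled from the organization, repository name and tag. The sketch below uses hypothetical function names and a placeholder repository ("my-dataset"); substitute your own repository and release tag.

```python
from urllib.parse import quote


def releases_page_url(owner, repo):
    """URL of the 'Releases' page to enter in the metadata form."""
    return f"https://github.com/{owner}/{repo}/releases"


def source_zip_url(owner, repo, tag):
    """URL behind the 'Source code (zip)' link for a given release tag."""
    return f"https://github.com/{owner}/{repo}/archive/refs/tags/{quote(tag)}.zip"


if __name__ == "__main__":
    # "my-dataset" and "v1.1.0" are placeholders for illustration.
    print(releases_page_url("HakaiInstitute", "my-dataset"))
    print(source_zip_url("HakaiInstitute", "my-dataset", "v1.1.0"))
```

Linking to the ‘Releases’ page rather than to a single zip file keeps the metadata record valid when later versions are published.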