Other Tutorials

Some of the content that we produce is not as detailed as our full workshops but has a wider scope than the content included in our Coding Tips suggestions. This information can broadly be defined as “tutorials”, though their depth and scope can vary significantly depending on the topic. As with our Coding Tips, this page is under constant development, so please post an issue if you have an idea for a tutorial that you’d like us to create.

Using the googledrive R Package

The googledrive R package lets R users interact directly with files on Google Drive. This can be extremely useful because it lets all members of a team share the same source data file(s) and ensures that updates to “living” documents reach every group member the next time they run their R script. This package is technically part of the Tidyverse but is not loaded by running library(tidyverse).

Because this package requires access to an R user’s Google Drive, you must “authenticate” the googledrive package. This essentially tells Google that it is okay if an R package uses your credentials to access and (potentially) modify your Drive content. There are only a few steps to this process, so follow along with the tutorial below and we’ll get you ready to integrate Google Drive into your code workflows using the googledrive package in no time!

Prerequisites

To follow along with this tutorial you will need to take the following steps:

Feel free to skip any steps that you have already completed!

Authorize googledrive

In order to connect R with Google Drive, we’ll need to authorize googledrive to act on our behalf. This only needs to be done once (per computer), so follow along and you’ll be building Google Drive into your workflows in no time!

First, install the googledrive and httpuv R packages. googledrive is the package this tutorial is about, while httpuv makes the following authentication steps a little easier than they are with googledrive alone. Be sure to load the googledrive package after you install it!

# Install packages
install.packages(c("googledrive", "httpuv"))

# Load googledrive
library(googledrive)

Once you’ve installed the packages we can begin the authentication in R using the drive_auth function in the googledrive package.

googledrive::drive_auth(email = "enter your gmail here!")

If this is your first time using googledrive, drive_auth will kick you to a new tab of your browser (see below for a screen grab of that screen) where you can pick which Gmail you’d like to connect to R.

Click the Gmail you want to use and you will get a second screen where Google tells you that “Tidyverse API” wants access to your Google Account. This message is followed by three checkboxes; the first two are grayed out but the third is unchecked.

NOTE

This next bit is vitally important so carefully read and follow the next instruction!

In this screen, you must check the unchecked box to be able to use the googledrive R package. If you do not check this box all attempts to use googledrive functions will get an error that says “insufficient permissions”.

While granting access to “see, edit, create, and delete all of your Google Drive files” sounds like a significant security risk, those powers are actually why you’re using the googledrive package in the first place! You want to be able to download existing Drive files, change them in R on your computer, and then put them back in Google Drive, which is exactly what is meant by “see, edit, create, and delete”.

Also, this power only applies to the computer you’re currently working on! Granting access on your work computer allows only that computer to access your Drive files. So don’t worry that you are giving the whole world access to your Drive. That access is protected by the same failsafes that you use when you let your computer remember a password to a website you frequent.

After you’ve checked the authorization box, scroll down and click the “Continue” button.

This should result in a plain text page that tells you to close this window and return to R. If you see this message you are ready to use the googledrive package!

Problems with Authorization

If you have tried to use drive_auth and did not check the box indicated above, you need to make the googledrive package ask you again. Running drive_auth again will not (annoyingly) return you to the place it sent you the first time. However, if you run the following code chunk it should give you another chance to check the needed box.

The gargle R package referenced below is required for interacting with Google Application Programming Interfaces (APIs). This package does the heavy lifting of secure password and token management and is necessary for the googledrive authentication chunk below.

googledrive::drive_auth(
  email = gargle::gargle_oauth_email(),
  path = NULL,
  scopes = "https://www.googleapis.com/auth/drive",
  cache = gargle::gargle_oauth_cache(),
  use_oob = gargle::gargle_oob_default(),
  token = NULL)

Unfortunately, to use the googledrive package you must check the box that empowers the package to function as designed. If you’re uncomfortable giving the googledrive package that much power, you will need to pivot your workflow away from using Google Drive directly. However, NCEAS does offer access to an internal server called “Aurora” where data can be securely saved and shared among group members without special authentication like what googledrive requires. Reach out to our team if this seems like a more attractive option for your working group and we can offer training on how to use this powerful tool!

Find and Download Files

Now that you’ve authorized the googledrive package, you can start downloading the Google Drive files you need through R! Let’s say that you want to download a csv file from a folder or shared drive. You can save the URL of that folder/shared drive to a variable.

The googledrive package makes it straightforward to access Drive folders and files with the as_id function. This function allows the full link to a file or folder to serve as a direct connection to that file/folder. Most of the other googledrive functions will require a URL that is wrapped with as_id in this way. You would replace “your url here” with the actual link but make sure it is in quotation marks.

drive_url <- googledrive::as_id("your url here")

To list all the contents of this folder, we can use the drive_ls function. You will get a dataframe-like object of the files back as the output. An example is shown below in the screenshot. Here, this Google Drive folder contains 4 csv files: ingredients.csv, favorite_soups.csv, favorite_fruits.csv, and favorite_desserts.csv.

drive_folder <- googledrive::drive_ls(path = drive_url)
drive_folder

If it has been a while since you’ve used googledrive, it will prompt you to refresh your token. Simply enter the number that corresponds to the correct Google Drive account.

If you only want to list files of a certain type, you can specify this in the type argument. Let’s say that the folder contains a bunch of csv files but you only want to download the one named “favorite_desserts.csv”. In that case, you can also put a matching string in the pattern argument in order to filter down to 1 file.

drive_folder <- googledrive::drive_ls(path = drive_url,
                                      type = "csv", 
                                      pattern = "favorite_desserts")
drive_folder

Once we’ve narrowed down to the file we want, we can download it using drive_download. This function takes the file identifier as an argument, which we can supply with drive_folder$id.

googledrive::drive_download(file = drive_folder$id)

This will automatically download the file to our working directory. If you want, you can specify a different path to download to. Just put the new file path into the path argument, replacing the “your path here”, but keep the quotation marks.

googledrive::drive_download(file = drive_folder$id, 
                            path = "your path here")

If you’ve downloaded the file before, and you want to overwrite it, there’s a handy overwrite argument that you can set to TRUE. Note that the default is FALSE.

googledrive::drive_download(file = drive_folder$id, 
                            path = "your path here",
                            overwrite = TRUE)

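If there are multiple files in the Drive folder and you want to download them all, you can use a loop like so. Note that the path argument needs a different file path for each file; otherwise every download would overwrite the previous one. Here “your folder here” stands for an existing folder on your computer.

# For each file:
for(focal_file in drive_folder$name){
  
  # Find the file identifier for that file
  file_id <- subset(drive_folder, name == focal_file)

  # Download that file into the folder, keeping its Drive file name
  googledrive::drive_download(file = file_id$id, 
                              path = file.path("your folder here", focal_file),
                              overwrite = TRUE)
}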

Building a Website with Quarto

Quarto is a new tool developed by Posit (the company formerly named RStudio, not the RStudio program) for creating documents and products (e.g., books, websites, etc.) from markdown files. It includes a visual editor that offers a more ‘what you see is what you get’ experience and allows users to insert headings and embed figures via intuitively labeled buttons rather than through somewhat arcane HTML text or symbols. While Quarto is still relatively new, it is rapidly gathering a following due to the aforementioned visual editor and the ease with which Quarto documents and websites can be created.

Prerequisites

To follow along with this tutorial you will need to take the following steps:

Feel free to skip any steps that you have already completed!

Create a Quarto Website R Project

To begin, click the “Project” button in the top right of your RStudio session.

In the resulting dialogue, click the “New Directory” option.

From the list of options for project templates, select “Quarto Website”.

Pick a title and check the “Create a git repository” checkbox. Short but descriptive titles are most effective. Once that is done, click “Create Project” in the bottom right of the window.

After a few seconds, RStudio should refresh with a Quarto document (such documents have the file extension “.qmd”) and a “_quarto.yml” file open.

Part of Quarto’s central philosophy is that all of the formatting of individual .qmd files in a project is governed by the settings in a single .yml file. In an R Markdown project some of the global settings are set in .yml but other settings are handled within each .Rmd file. This centralization is a key innovation in streamlining projects and is one reason for Quarto’s quick popularity.

Preparing Project for Web Deployment

To prepare your project for web deployment via GitHub Pages, we have three quick steps that we must first complete.

First, in the “_quarto.yml” file, add output-dir: docs as a subheading beneath the project: heading. Make sure that its indentation is the same as that of the type: website line. The new line can go either above or below that line.

Second, in the “Terminal” pane run touch .nojekyll. This creates a file called “.nojekyll” that is necessary for hosting your website via GitHub Pages.

Third, in the “Terminal” pane run quarto render. This processes the template .qmd files you currently have in the repository and prepares them to become actual web pages.

Once you’ve done these three things you can move on to creating a GitHub repository so that we can take the steps needed for GitHub to host your website!
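If you'd like to double check the three preparation steps before moving on, the following is a minimal sketch of a sanity check you could paste into the Terminal. The function name check_pages_ready is hypothetical (it is not part of Quarto or git), and the check assumes that your project still contains the template's index.qmd so that quarto render produces a "docs/index.html" file.

```shell
# Hypothetical helper: report whether a Quarto project folder looks ready
# for GitHub Pages. Pass the project folder (defaults to the current one).
check_pages_ready() {
  project_dir="${1:-.}"
  result=0

  # Step 1: "_quarto.yml" should contain an indented "output-dir: docs" line
  if grep -Eq '^[[:space:]]+output-dir:[[:space:]]*docs[[:space:]]*$' \
    "$project_dir/_quarto.yml" 2>/dev/null; then
    echo "ok: output-dir is docs"
  else
    echo "missing: output-dir: docs in _quarto.yml"
    result=1
  fi

  # Step 2: the ".nojekyll" file should exist
  if [ -f "$project_dir/.nojekyll" ]; then
    echo "ok: .nojekyll exists"
  else
    echo "missing: .nojekyll"
    result=1
  fi

  # Step 3: "quarto render" should have written pages into "docs"
  # (assumes the template's index.qmd is still part of the project)
  if [ -f "$project_dir/docs/index.html" ]; then
    echo "ok: docs/index.html exists"
  else
    echo "missing: docs/index.html (run quarto render)"
    result=1
  fi

  return "$result"
}
```

After pasting the function, run check_pages_ready from the top level of your project. Any line that starts with "missing" points to the step that should be repeated.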

Make a New GitHub Repository

From your GitHub “Repositories” tab, click the green “New” button.

Add a title to your repository and add a description. Once you’ve added these two things, scroll down and click the green “Create repository” button.

Be sure that you do not add a README, do not add a .gitignore, and do not add a license. Adding any of these three will cause a merge conflict when we link the project that you just created with the GitHub repository that you are in the process of creating.

After a few seconds you should be placed on your new repository’s landing page which will look like the below image because there isn’t anything in your repository (yet).

Copy the repository link shown in the field on that page and go back to your RStudio session.

Adding your Project to GitHub

The following steps include a sequence of command line operations that will be relayed in code chunks below. Unless otherwise stated, all of the following code should be run in “Terminal”.

If you didn’t check the “Create a git repository” checkbox while creating the R project, you’ll need to do that via the command line now. If you did check that box, you should skip this step!

# Start a git repository on the "main" branch
git init -b main

Stage all of the files in your project to the git repository. This includes the .yml file, all .qmd files and all of their rendered versions created when you ran quarto render earlier. This code is equivalent to checking the box for the files in the “Git” pane of RStudio.

# Stage all files
git add .

Once everything has been staged, you now must commit those staged files with a message.

# Commit all files with the message in quotes
git commit -m "Initial commit"

Now that your project files have been committed, you need to tell your computer where you will be pushing to and pulling from. Paste the link you copied at the end of the “Make a New GitHub Repository” section into the code shown in the chunk below (instead of GITHUB_URL) and run it.

# Tell your computer which GitHub repository to connect to
git remote add origin GITHUB_URL

Verify that URL before continuing.

# Confirm that URL worked
git remote -v

Finally, push your committed changes to the repository that you set as the remote in the preceding two steps.

# Push all of the content to the main branch
git push -u origin main

Now, go back to GitHub and refresh the page to see your project content safe and sound in your new GitHub repository!

Deploy Website via GitHub

In order to get your new website actually on the web, we’ll need to tell GitHub that we want our website to be accessible at a .github.io URL.

To do this, click the “Settings” tab (it has a gear icon). You may be prompted to re-enter your GitHub password; do so and you can proceed.

In the resulting page, look towards the bottom of the left sidebar of settings categories and click the “Pages” option. This is at the very bottom of the sidebar in the screen capture below but is towards the middle of all of the settings categories GitHub offers you.

Scroll down to the middle of this page and where it says “Branch” click the dropdown menu that says “None” by default.

Select “main” from the dropdown.

This opens up a new dropdown menu where you can select which folder in your repository contains your website’s content (it defaults to “/ (root)”). Because we specified output-dir: docs in the .yml file earlier we can select “/docs” from the dropdown menu.

Once you’ve told GitHub that you want a website generated from the “docs” folder on the main branch, click the “Save” button.

From this moment your website has begun being deployed by GitHub! You can check the status of the building process by navigating to the “Actions” tab of your repository.

Select the “pages build and deployment” workflow in the list of workflows on the bottom righthand side of the page.

This shows you GitHub’s building and deployment process as a flowchart. While it is working on each step there will be an amber circle next to the name of that sub-task. When a sub-task is completed, the amber circle becomes a green circle with a check mark.

When the three steps are complete the amber clock symbol next to the “pages build and deployment” action will turn into a larger green circle with a check mark. This is GitHub’s way of telling you that your website is live and accessible to anyone on the internet.

You can now visit your website at its dedicated URL. This URL can be found by returning to the “Settings” tab and then scrolling through the sidebar to the “Pages” section.

Alternatively, the website for your repository always uses the following composition: https://<repository owner>.github.io/<repository name>/

If we visit that link, we can see that our website is live!
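The URL composition described above can be sketched as a small Terminal helper that turns a repository link into the matching website address. The function name pages_url is hypothetical, and the sketch assumes a regular project repository hosted on github.com whose link looks like the one you copied when creating the repository.

```shell
# Hypothetical helper: turn a GitHub repository link into the matching
# GitHub Pages address. Assumes a regular project repository on github.com.
pages_url() {
  repo_link="$1"

  # Drop a trailing slash and a trailing ".git" if present
  repo_link="${repo_link%/}"
  repo_link="${repo_link%.git}"

  # Drop everything up to "github.com/" (HTTPS links)
  # or up to "github.com:" (SSH links)
  owner_and_repo="${repo_link#*github.com/}"
  owner_and_repo="${owner_and_repo#*github.com:}"

  # Split "owner/repository" into its two parts
  owner="${owner_and_repo%%/*}"
  repo="${owner_and_repo#*/}"

  echo "https://${owner}.github.io/${repo}/"
}
```

For example, running pages_url with the link stored in your remote (which you can print with git remote -v) should print the address where your website lives.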

GitHub Housekeeping

We recommend a quick housekeeping step now to make it easier to find this URL in the future. Copy the URL from the Pages setting area and return to the “Code” tab of the repository.

Once there, click the small gear icon to the right of the “About” header.

In the resulting window, paste the copied URL into the “Website” field. Once you’ve pasted it in, click the green “Save changes” button.

This places the link to your deployed website in an intuitive, easy-to-find location both for interested third parties and yourself in the future.

Adding Website Content

Now that you have a live website you can build whatever you’d like! Given the wide range of possibilities, we’ll only cover how to add a new page but the same process applies to any edit to the living webpage.

To add a new page create a new Quarto document. You can do this by going to the “File” menu, entering the “New File” options, and selecting “Quarto Document…”

As with an R Markdown file, this will open a new window that lets you enter a title and author as well as decide what format you want to render files to along with some other settings options. You only need to click the “Create” button in the bottom right of this dialogue (though you can definitely play with the other options and text boxes as you desire).

After a moment, a new .qmd file will open in Quarto’s visual editor. For the purposes of this tutorial, you only need to add a title in the top of the file but for a real website you can add whatever content sparks joy for you!

Save that file into your project folder. Its name can be anything but be sure that you remember what you name it!

Add the name of the new Quarto document to the .yml file in the website navbar area (in this example the file is called “more-stuff.qmd”).
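Forgetting to add a new page to the .yml file is an easy mistake to make, so here is a rough sketch of a check you could run in the Terminal. The function name list_unlisted_pages is hypothetical. It is written for bash, only looks at .qmd files in the top level of the project folder, and does a plain text search of “_quarto.yml”, so treat its output as a hint rather than a guarantee.

```shell
# Hypothetical helper: print every .qmd file in a project folder whose name
# does not appear anywhere in "_quarto.yml".
list_unlisted_pages() {
  project_dir="${1:-.}"

  for page in "$project_dir"/*.qmd; do
    # Skip the unexpanded pattern when the folder has no .qmd files
    [ -e "$page" ] || continue

    page_name="$(basename "$page")"

    # -F treats the file name as plain text rather than as a pattern
    if ! grep -Fq "$page_name" "$project_dir/_quarto.yml"; then
      echo "$page_name"
    fi
  done
}
```

Running list_unlisted_pages from the top level of your project prints nothing when every page is mentioned in the .yml file. In the example from this tutorial it would print more-stuff.qmd until that file is added to the navbar area.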

Once you’ve added the file to the fundamental architecture of your website, you need to tell Quarto to re-build the part of the website that GitHub looks for when it deploys. To do this run quarto render in the Terminal.

If you want to preview your changes, run quarto preview in the Terminal and a new browser window will be displayed showing your current website content. This preview continues until you click the red stop sign icon in RStudio so be sure to end it when you’re done with the preview!

Regardless, once you’ve run either quarto render or quarto preview you need to stage and commit all changed files indicated in the Git pane of RStudio. As a reminder, to stage files you check the box next to them. To commit staged files, type an informative message and press the “Commit” button in the right side of the window. Finally, press the “Push” button so that your commit is sent to GitHub.
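If you prefer the Terminal over the Git pane, the same update cycle can be sketched as a single helper. The function name publish_changes is hypothetical. It assumes you run it from the top level of your project and that you have already pushed once with git push -u origin main, so that a plain git push knows where to send your commits.

```shell
# Hypothetical helper: re-render the website, then stage, commit, and push
# every change. Each step only runs if the previous step succeeded.
publish_changes() {
  commit_message="$1"

  quarto render &&
    git add . &&
    git commit -m "$commit_message" &&
    git push
}
```

You could then run something like publish_changes "Add more-stuff page" each time you finish a round of edits. Chaining the steps with && means a failed render stops the process before anything is committed.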

Switch back to GitHub and you’ll see an amber dot next to the commit hash just beneath and to the left of the green “Code” button.

When the amber dot turns into a green check mark that means that your edits to your website are now included in the live version of your site!

When you visit your website you may need to refresh the page for your edits to appear but all new visitors will see the updated content when they load the page.

Supplementary Information

Quarto is developing at a rapid pace so quality of life changes and new functionalities are introduced relatively frequently. Additionally, Quarto supports user-created “extensions” that can be downloaded in a given project and then used (similar to the way user-developed R packages can be shared) so if you want to do something that Quarto itself doesn’t support, chances are you’ll be able to find an extension that handles it.

Quarto’s documentation of website creation and formatting is extremely thorough and is a great resource as you become more comfortable with your new website. We hope this tutorial was useful to you and welcome constructively critical feedback! Please post an issue with any thoughts for improvement that you have.