This is Jing, a data analyst with a great passion for data science and big data technology. This blog will show you how to connect your Jupyter Notebook to BigQuery by using a service account.
Before we start, I assume you already have Jupyter Notebook installed on your computer and have a project on Google Cloud Platform with data saved in BigQuery.
For this blog, I created a project called “helloworld”, with a dataset called “example_dataset” in BigQuery. I am going to show you how to get the data from BigQuery on Google Cloud into a Jupyter Notebook on my local computer, a Mac to be more specific.
Step 0: Enable the BigQuery API on the Google Cloud Platform Console
Search for the BigQuery API using the search function on the Google Cloud Platform console and enable it. After you have enabled it, the console should show the BigQuery API as enabled for your project.
Step 1: Get a service account key for BigQuery in a JSON file
The JSON file containing the service account key is used to authenticate the Jupyter Notebook on your local computer, which will access BigQuery later on.
Go to Google Cloud Platform -> APIs & Services -> Credentials
Create a service account key via Create Credentials -> Service account key
Under Select account -> New service account, give your new service account a name; here we name it “bq-jupyternotebook”. To limit the permissions of this service account, we choose “BigQuery Admin” as its role. We want full access to BigQuery, but only to BigQuery, not to other Google products in this project. For detailed information about service accounts, please refer to Service Accounts in Google Cloud.
Click on “Create”, and you get the key for this service account as a JSON file. Put this JSON file in a folder that you created for your project. In this blog, we put it in a “key” folder under a folder called “helloworld” and renamed it “helloworld-key.json”.
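Before moving on, it can help to confirm that the downloaded file really is a service account key. The sketch below is one possible way to do that in Python. It assumes the key file contains the fields Google normally writes into service account keys (“type”, “project_id”, “client_email”); the function name describe_key and the path in the example call are only illustrative and not part of any Google library.

```python
import json
from pathlib import Path


def describe_key(key_path):
    """Load a service account key file and return a few non-secret fields."""
    data = json.loads(Path(key_path).read_text())
    # Service account key files carry "type": "service_account".
    if data.get("type") != "service_account":
        raise ValueError(f"{key_path} does not look like a service account key")
    # Deliberately leave out "private_key" so the secret is never displayed.
    return {
        "project_id": data.get("project_id"),
        "client_email": data.get("client_email"),
    }


# Example call (hypothetical path, adjust it to where you saved the key):
# describe_key("helloworld/key/helloworld-key.json")
```

The returned “client_email” should contain the service account name you chose, which is a quick way to check that you are holding the right key.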
Step 2: Install the Google BigQuery API client library for Python on your computer
The Google BigQuery API client library for Python includes the functions you need to connect your Jupyter Notebook to BigQuery.
More information is available under Google BigQuery API client library. Open your terminal and type in the command below:
pip install google-cloud-bigquery
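To check that the installation worked, you can ask Python whether it can find the module. This is only a small sketch using the standard library; the helper name is_installed is made up for this example. The module name google.cloud.bigquery is the import name of the package installed above.

```python
import importlib.util


def is_installed(module_name):
    """Return True if Python can locate the given module, False otherwise."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (for example "google") is missing.
        return False


if is_installed("google.cloud.bigquery"):
    print("google-cloud-bigquery is available")
else:
    print("google-cloud-bigquery was not found, run the pip command again")
```

If Jupyter uses a different Python environment than your terminal, run this check in a notebook cell, since that is the environment that matters.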
Step 3: Set the environment variable GOOGLE_APPLICATION_CREDENTIALS for your Python script
Set the environment variable “GOOGLE_APPLICATION_CREDENTIALS” to point your Notebook to the service account key that you created in Step 1. This variable lets Google know where to find your authentication credentials. Run this every time before you start your Jupyter Notebook, and start the Notebook from the same terminal, because a variable set this way only lasts for that terminal session.
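Type the export command in your terminal. On a Mac it has the form export GOOGLE_APPLICATION_CREDENTIALS="<full path to helloworld-key.json>", where the part in angle brackets is a placeholder for your own path.
Be careful with the path to the service account key JSON file: you need to specify the whole (absolute) path, not a path relative to the folder you are currently in.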
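If you prefer to stay inside the Notebook, the same variable can also be set from Python before the BigQuery client is created. The sketch below is one way to do it, not the only one; the function name use_key_file and the path in the example call are assumptions for illustration. It turns whatever path you pass into a full path first, which avoids the relative-path mistake mentioned above.

```python
import os
from pathlib import Path


def use_key_file(key_path):
    """Point GOOGLE_APPLICATION_CREDENTIALS at key_path and return the full path."""
    # expanduser() handles "~"; resolve() turns a relative path into an absolute one.
    full_path = Path(key_path).expanduser().resolve()
    if not full_path.is_file():
        raise FileNotFoundError(f"No key file at {full_path}")
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = str(full_path)
    return str(full_path)


# Example call (hypothetical location of the key from Step 1):
# use_key_file("~/helloworld/key/helloworld-key.json")
```

Setting the variable this way only affects the current Python process, so it has to run in the Notebook before any BigQuery code.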
Step 4: Open your Jupyter Notebook, let’s get data from BigQuery
Create a new Jupyter Notebook in your project folder, and look at the example code to see how it works. For more examples, see Python Client for BigQuery.
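A minimal sketch of what such a notebook cell could look like is shown here. Several things in it are assumptions: the table name “example_table” is hypothetical (only the project and dataset names come from this blog), the project ID is assumed to be the same as the project name “helloworld”, and the helper names build_query and fetch_rows are made up for this example. Converting the result with to_dataframe() needs pandas to be installed, and recent versions of the client library may also ask for the db-dtypes package.

```python
def build_query(project, dataset, table, limit=10):
    """Return a standard SQL query that selects a few rows from one table."""
    return f"SELECT * FROM `{project}.{dataset}.{table}` LIMIT {int(limit)}"


def fetch_rows(sql, key_path=None):
    """Run the query in BigQuery and return the result as a pandas DataFrame."""
    # Imported here so build_query() works even without the library installed.
    from google.cloud import bigquery

    if key_path:
        # Alternative to Step 3: hand the key file to the client directly.
        client = bigquery.Client.from_service_account_json(key_path)
    else:
        # Uses the key that GOOGLE_APPLICATION_CREDENTIALS points to.
        client = bigquery.Client()
    return client.query(sql).to_dataframe()


# "example_table" is a placeholder, replace it with a table in your dataset.
sql = build_query("helloworld", "example_dataset", "example_table", limit=5)
print(sql)

# Uncomment once the key and environment variable are in place:
# df = fetch_rows(sql)
# df.head()
```

Keeping the query text in its own small function makes it easy to look at the SQL before anything is sent to BigQuery.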
Congratulations! We are done here!
Now you should have successfully connected your Jupyter Notebook to BigQuery and gotten some data! Enjoy your data analysis and have fun!
This is not something amazing, but hopefully it will make your work a little bit easier! This is Jing, thanks for reading my blog! Feel free to comment and ask questions! I will get back to you whenever I can.