To write and run our Python programs, we need an IDE (Integrated Development Environment), similar to what Visual Studio provides for C/C++ and Eclipse for Java. Keeping the requirements of beginners in mind, engineers developed an easy-to-use IDE, Jupyter notebook, to run Python code line by line or cell by cell. We will primarily use this IDE to develop real-life applications in Machine Learning and Data Science.
So this blog familiarizes learners with its installation and primary usage.
Let’s start understanding what exactly it is.
The Jupyter notebook is an open-source IDE that provides interactive Data Science and Machine Learning environments. It can be used to create Jupyter documents with embedded live code, making it very handy for educational and presentation purposes. Embedded live code means we can interact with this code and show the effect of changes in real time. We can also embed various visualizations in the document.
It provides a web-based interactive computational environment where we can write text, code, or documents. The name “Jupyter” comes primarily from the languages it supports: Julia, Python, and R.
We can install the Jupyter notebook using the pip command for different OS (Windows, Mac, and Linux) like this:
pip install jupyter
Alternatively, we only need to install Anaconda on our systems, and Jupyter notebook comes preinstalled with it.
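To confirm that the installation worked, a quick check from Python can help. The snippet below is a minimal sketch: the helper name `jupyter_status` is our own (hypothetical), and it assumes that the install provides a package named `notebook` and a `jupyter` command on the PATH.

```python
import importlib.util
import shutil
from importlib import metadata


def jupyter_status():
    """Report whether the notebook package and the jupyter command are available."""
    # find_spec returns None when the package cannot be found.
    has_package = importlib.util.find_spec("notebook") is not None
    # shutil.which returns None when the command is not on PATH.
    has_command = shutil.which("jupyter") is not None
    try:
        version = metadata.version("notebook")
    except metadata.PackageNotFoundError:
        version = None  # package metadata not found
    return {"package": has_package, "command": has_command, "version": version}


print(jupyter_status())
```

If `package` or `command` comes back as `False`, the installation most likely did not complete in the Python environment we are currently using.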
Now that we have installed it, let’s learn about the essential functions it supports. We will begin by starting the notebook server and renaming a notebook, and then continue with the available ‘menu’ options.
To work with commands and code in Jupyter notebook, we need to know how it works. So let’s first create a notebook, and then we will write our simple programs in it. For creating a notebook, the first step is starting a Jupyter server. Jupyter provides a web-based interactive computational environment. Hence we need a server-client mechanism, where the server will provide the backend support from the respective OS, and the client will be the frontend for the web application.
We want our web application (frontend) to point at some root folder, so we first go to that particular folder location, open the terminal, and then use the following command:
jupyter notebook
It will start Jupyter by opening our default browser with the URL: http://127.0.0.1:8888/tree . Our screen will look as shown below:
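If the browser does not open, we can check whether anything is listening at that address. The following is a small sketch using only the standard library; the function name `is_listening` is hypothetical, and the host and port are taken from the URL above. It only tells us that some program accepts connections on that port, not that the program is Jupyter.

```python
import socket


def is_listening(host="127.0.0.1", port=8888, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        # create_connection raises OSError if nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


print("Server reachable:", is_listening())
```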
Please note that we have just started running a notebook server, not an exact notebook, so let’s see how to create one.
On the right side of the image above, we can find the “New” button in between the “Upload” and “refresh” buttons. Once we click on the ‘New’ button, it will ask to start the notebook as a Python2 kernel or Python3 kernel. If our system has both Python2 and Python3 installed, we can choose as per our project’s needs. Let’s select Python3 for now. A new tab will open in the browser with a clean new notebook, as shown below.
The default name for this notebook file would be “Untitled” which can be confusing. So let’s change the name of this notebook.
The image above shows that the filename (written beside the Jupyter logo) is ‘Untitled1’. We can click on this name which will open a dialogue box that can be used to rename the file, as shown below:
We gave the name “Jupyter intro” to the file. This will create a blank notebook file which we will use to run Python code. Let’s see how we can do that.
In a new notebook, we can see an empty cell labelled In [ ]: where we can write Python code. The language is the one we chose while creating the notebook, i.e., Python3 or Python2. For example:
In [ ]: print ("Hello World")
After writing the code, we can run the cell by clicking the Run button in the row of buttons at the top. We can also press Shift + Enter to run the cell. The notebook file will look as follows:
Do note that the left side of each cell has square brackets. These brackets are empty for a new cell. As we continue to run the cells, they will be filled with a number that indicates the order in which they were executed.
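The numbering can be surprising when cells are run out of order. The sketch below is a toy model of that behavior, not how Jupyter is implemented; the class `ToyNotebook` and its methods are hypothetical names used only for illustration.

```python
class ToyNotebook:
    """Toy model of execution counters; not Jupyter's actual implementation."""

    def __init__(self, sources):
        # Each cell stores its source and an execution count (None = never run).
        self.cells = [{"source": s, "count": None} for s in sources]
        self.counter = 0
        self.namespace = {}  # shared state, like the variables held by a kernel

    def run(self, index):
        cell = self.cells[index]
        exec(cell["source"], self.namespace)
        self.counter += 1
        cell["count"] = self.counter

    def labels(self):
        # Cells that were never run keep empty brackets.
        return [
            "In [%s]" % (c["count"] if c["count"] is not None else " ")
            for c in self.cells
        ]


nb = ToyNotebook(["x = 2", "y = 10", "z = 0"])
nb.run(1)  # run the second cell first
nb.run(0)  # then the first cell
print(nb.labels())  # ['In [2]', 'In [1]', 'In [ ]']
```

The numbers follow the order of execution, not the position of the cell in the notebook, and the third cell keeps empty brackets because it was never run.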
Let’s learn about the main options present in the “menu” in Jupyter, which are essential for us to use notebooks efficiently.
Various options are present in the menu of Jupyter. Let’s start with the File option:
jupyter nbconvert <input notebook> --to <output format>
Please note that nbconvert uses Jinja, a templating engine for Python, and relies on Pandoc and TeX for converting to some of the output formats. So to make sure that nbconvert works well, we need to ensure that the corresponding dependencies are installed.
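The same conversion can be scripted from Python by calling the command shown above. This is a minimal sketch: the helper names `nbconvert_command` and `convert` are hypothetical, the file name assumes the notebook we renamed earlier was saved as `Jupyter intro.ipynb`, and `html` is used as an example output format.

```python
import shlex
import subprocess


def nbconvert_command(notebook_path, output_format):
    """Build the arguments for: jupyter nbconvert <input notebook> --to <output format>."""
    return ["jupyter", "nbconvert", notebook_path, "--to", output_format]


def convert(notebook_path, output_format="html"):
    # check=True raises CalledProcessError if nbconvert exits with an error,
    # for example when a dependency for the chosen format is missing.
    subprocess.run(nbconvert_command(notebook_path, output_format), check=True)


# Show the command that would be run, without running it.
print(shlex.join(nbconvert_command("Jupyter intro.ipynb", "html")))
```

Calling `convert("Jupyter intro.ipynb")` would then run the conversion, provided the file exists in the current folder.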
Let us look into the Kernel option of the menu in the next section.
A notebook kernel executes the code written in the notebook. Whenever we run a program, it consumes RAM. This RAM is freed once the kernel is shut down.
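To get a feel for this, we can measure how much memory a variable holds while it is alive. The sketch below uses the standard `tracemalloc` module; the list size is arbitrary, and `tracemalloc` only tracks memory allocated by Python itself, so the numbers are an approximation of what the operating system reports.

```python
import tracemalloc

tracemalloc.start()

data = [float(i) for i in range(1_000_000)]  # hold a large object in memory
used_with_data, _ = tracemalloc.get_traced_memory()

del data  # drop the only reference so Python can reclaim the memory
used_after_del, _ = tracemalloc.get_traced_memory()

tracemalloc.stop()

print(f"with data: {used_with_data / 1e6:.1f} MB")
print(f"after del: {used_after_del / 1e6:.1f} MB")
```

In a notebook, variables like `data` stay in the kernel's memory across cells until we delete them or shut the kernel down, which is why long-running notebooks can use a lot of RAM.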
Let us analyze the various options available in the kernel that are helpful.
We have looked into two options for the menu in detail, and these are the frequently used ones. Let’s look at what other options are present on the menu.
Now let’s discuss the toolbar and its various icons frequently used while working with Jupyter notebook.
Let’s list the options available from left to right in the toolbar, just below the menu bar. These icons are frequently used while working with the notebook.
Let us look into tabs present on the home page and their uses.
There are two tabs present on the home page of the server (where we started the server): Clusters and Running. The Running tab lets us view all the running notebooks. It helps us to save the files before closing the server. Usually, this saving is unnecessary as the notebook is autosaved at frequent intervals. It is also helpful for freeing RAM by shutting down notebooks we no longer use.
Let us look into different cell types in the next section.
We have three types of cells: Code, Markdown, and Raw NBConvert.
Markdown and Code are the most commonly used types of cells. Let’s see Markdown in more detail.
The operations below are performed when the cell type is Markdown. Let's see them one by one, with their corresponding results shown in the final image at the end.
So far, we have learned the basic utilities, but some extra functionality that makes the notebook easier to use is still missing, such as a tool that gives us a table of contents. We use extensions for this purpose. Let us learn more about extensions in the next section.
Jupyter supports four types of extensions:
The most commonly used extension is the IPython kernel for notebooks. If extensions are installed, they appear in an extensions tab beside the Clusters tab. If this tab is available, we can click on it and choose a suitable extension.
If this tab is unavailable, the extension is either not installed on our computer or not enabled. In most cases, it can be installed using pip. If pip cannot be used, the following command needs to be used.
jupyter nbextension install extension_name
Installing an extension does not mean that it can be used right away. It also needs to be enabled, which we do with the following command.
jupyter nbextension enable extension_name
A standard extension in Jupyter is ‘Table of Contents’, which, when used, helps us view the structure of our notebook as a table of contents. If needed, hyperlinks can also be added to the table of contents.
Jupyter notebook is one of the most widely used IDEs for teaching Machine Learning and Data Science applications. Keeping that in mind, in this blog, we discussed the basics of using the Jupyter notebook, which will be used in our course to execute the Python code for ML and Data Science projects. We hope you find the article informative and enjoyable.