Data Science 💜 Notebooks
Hey 👋🏻👋🏻 Today we're going to talk about Deepnote, a data science notebook in the browser that aims to make data science practitioners' lives easier!!
We are all aware of how important notebooks are in the lives of data scientists and analysts. Most of you know what they are; for everyone else, it's a great concept in itself: a web application that lets you edit and run code with inline visualisations. They are also called computational notebooks. Jupyter Notebooks, Azure Notebooks, Google Colab, and Databricks are quite popular with data scientists and analysts.
According to an analysis of GitHub, more than 2.5 million public Jupyter notebooks were shared as of September 2018, up from about 200,000 counted in 2015!!
Even though the use and popularity of notebooks have risen over the years, data scientists and analysts run into pain points as their tasks grow more complex. A mixed-methods study based on observations and interviews (the CHI '20 paper linked in the references) identified the top pain points as: setup, managing code, archival, sharing and collaborating, and reproducing and reusing.
Here’s where 💙 Deepnote notebooks bring in their magic !
What is Deepnote?
Deepnote is a cloud-based notebook with all the basic Jupyter functionality plus real-time collaboration. It's built on top of Jupyter, so it has all the features you'd expect: it's Jupyter-compatible, supports Python, R and Julia, and runs in the cloud. Awesome, that's what we need!!
Deepnote Characteristics:
- Collaborative
- Reproducibility
- Seamless integration
- IDE-like code intelligence
- Code manageability
Let’s look at each one of them in detail 🔎
🏹 Collaborative
Real-time collaboration is seamless in Deepnote, and you can manage your collaborators. All you need is the Deepnote notebook URL, which can be shared with people via their email addresses. The owner of the notebook can give each of them the desired rights on that notebook. This way people or teams can collaborate on the same notebook 🙌
We can see teammates working on the notebook and making changes to the code in real time, and we can communicate with each other via comments!! We can set notifications on comments as well. It's amazing how this makes data science feel interactive and fun.
🏹 Reproducibility
The interface is intuitive for non-technical users and flexible for everyone else. Since it's easy to define dependencies, write clean code, and collaborate, we are able to build reproducible notebooks.
Deepnote is working on features such as prompting you to move installations into requirements.txt and offering embedded code reviews via comments.
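To make the requirements.txt idea concrete, here is a minimal sketch in plain Python of what "moving installations into requirements.txt" amounts to: listing the packages installed in the current environment with pinned versions. This is not Deepnote's feature or its API, just an illustration; the helper names `pinned_requirements` and `write_requirements` are made up for this post.

```python
from importlib.metadata import distributions


def pinned_requirements(names=None):
    """Return sorted 'name==version' lines for installed distributions.

    If `names` is given, only those distributions are included
    (the comparison is case-insensitive).
    """
    wanted = {n.lower() for n in names} if names is not None else None
    lines = set()
    for dist in distributions():
        name = dist.metadata.get("Name")
        if name is None:
            continue  # skip distributions with broken metadata
        if wanted is not None and name.lower() not in wanted:
            continue
        lines.add(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)


def write_requirements(path, names=None):
    """Write the pinned lines to `path` and return how many were written."""
    lines = pinned_requirements(names)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(lines) + ("\n" if lines else ""))
    return len(lines)
```

Calling `write_requirements("requirements.txt", names=["pandas", "numpy"])` would pin only those two packages, if they are installed. Anyone who opens the notebook later can then install the same versions instead of guessing.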
🏹 Seamless integration
Let's look at integrations in Deepnote. The integration tab is quite powerful. Just have your credentials ready, enter them in the prompt box, and the connection is set up. Notice that no code is required to build connections to external services. Moreover, the credentials are encrypted and protected, so we don't need to worry about them at all. 😌
For data imports, small datasets can be dragged and dropped as local files. For large datasets, we can go to the integration tab and use a shared dataset to hold the large files, then connect to it whenever required. This is great when we are sharing the notebook with others, as they will be able to access that data too.
If your data resides in another service such as AWS or MongoDB, Deepnote can connect to it from the integration tab just by entering your credentials; it's as simple as it can be. Deepnote also offers GPU access. 🤖
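For contrast, here is a rough sketch of what keeping credentials out of notebook code looks like when you do it by hand: read them from environment variables and build the connection string from those. The variable names (`MY_MONGO_USER`, `MY_MONGO_PASSWORD`, `MY_MONGO_HOST`) and the helper `mongo_uri_from_env` are hypothetical, chosen only for this example; they are not names that Deepnote defines.

```python
import os
from urllib.parse import quote_plus


def mongo_uri_from_env(env=None, prefix="MY_MONGO_"):
    """Build a MongoDB connection URI from environment variables.

    The variable names (<prefix>USER, <prefix>PASSWORD, <prefix>HOST) are
    hypothetical; use whatever names your platform actually provides.
    """
    env = os.environ if env is None else env
    missing = [prefix + key for key in ("USER", "PASSWORD", "HOST")
               if prefix + key not in env]
    if missing:
        raise KeyError(f"missing environment variables: {missing}")
    # Percent-encode the username and password so characters such as
    # '@' or ':' do not break the URI.
    user = quote_plus(env[prefix + "USER"])
    password = quote_plus(env[prefix + "PASSWORD"])
    host = env[prefix + "HOST"]
    return f"mongodb://{user}:{password}@{host}"
```

The point of the pattern is that the secret never appears in a cell, so sharing or publishing the notebook does not leak it.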
🏹 IDE-like code intelligence
In Deepnote notebooks we have code intelligence similar to what IDEs offer. While we are working in a notebook, it can show us the available functions. Just hover over a function and it will display a snippet of its documentation. If we want to see the definition of that function, Deepnote has got you covered: just Command+click on it and the definition is displayed.
Deepnote also offers an easily accessible command palette. We can use it to save the time we would otherwise spend figuring out basic notebook shortcuts. Just press Command+P to get the full list of commands.
🏹 Notebook code manageability
We all know the hassle of keeping track of changes in a data science project, and Deepnote takes care of this as well. From deleted cells to changes made within a cell, Deepnote keeps snapshots of all activity in the project. It has automatic snapshots and also manual ones. What a lifesaver!!
Deepnote is also working on version control❗ That feature would be the best thing that could happen to notebooks.
🔰 Some more interesting and productive features 🔰
⚡We know how important it is to explore the variables in a dataset. For instance, to get a variable's distribution or extract its top 10 values, we normally need to write a few lines of code. Deepnote makes this easier with its variable explorer: when you load data in a cell, you get the option of seeing the raw data output along with a data table showing each variable's distribution, null values, and unique values. This feature is a real productivity boost for data science practitioners.
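For a sense of the "few lines of code" this replaces, here is a small standard-library sketch that computes the same kind of summary by hand: null count, unique count, and most common values per column. The helper names and the toy rows are made up for illustration. With pandas you would more likely reach for `df.isna().sum()`, `df.nunique()`, and `value_counts().head(10)`.

```python
from collections import Counter


def summarize_column(values, top_n=10):
    """Summarize one column: row count, nulls, unique values, top values."""
    present = [v for v in values if v is not None]
    counts = Counter(present)
    return {
        "rows": len(values),
        "nulls": len(values) - len(present),
        "unique": len(counts),
        "top": counts.most_common(top_n),
    }


def summarize_table(rows, top_n=10):
    """Summarize every column of a table given as a list of dicts."""
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    return {
        col: summarize_column([row.get(col) for row in rows], top_n)
        for col in columns
    }


# Toy data, invented for this example only.
rows = [
    {"city": "Pune", "score": 7},
    {"city": "Pune", "score": None},
    {"city": "Oslo", "score": 9},
    {"city": None, "score": 7},
]
summary = summarize_table(rows)
for column, stats in summary.items():
    print(column, stats)
```

It works, but it is exactly the kind of boilerplate you end up retyping in every project, which is why having it built into the notebook is handy.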
⚡You can publish a notebook easily using the publish button, and with one click on the republish button you can update the published notebook. The published version looks crisp and neat. It has an article-like layout where you can jump to different parts of the notebook through the table of contents in the left panel.
If someone likes your analysis, they can duplicate it and also give it a like.
⚡Get a browser notification 🔔 when a cell finishes executing. We can also check how long a cell or a group of cells took to execute.
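In a notebook without built-in timing, you would measure this yourself. Here is a minimal sketch using only the standard library; the `timed` helper and the `timings` dictionary are names invented for this example and have nothing to do with how Deepnote measures execution time.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Record the wall-clock duration of the wrapped block in `results`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Store the elapsed seconds even if the block raised an error.
        results[label] = time.perf_counter() - start


timings = {}
with timed("load", timings):
    total = sum(range(100_000))  # stand-in for a slow cell

print(f"load took {timings['load']:.4f} s")
```

Having the duration shown next to every cell automatically saves you from wrapping code like this.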
Friends, go give it a shot!! 👈 Let me know your thoughts 💭
References ➼https://web.eecs.utk.edu/~azh/pubs/Chattopadhyay2020CHI_NotebookPainpoints.pdf
🎗Stay up to date with new features and functionality in Deepnote by following ⏩ Deepnote Medium | Deepnote publishing