These Jupyter Notebook extensions make the data scientist’s life easier
Data scientists spend most of their time visualizing data, preprocessing it, and tuning models based on the results. These are the hardest parts of the process, because a good model comes only from doing all three carefully. Here are 10 useful Jupyter Notebook extensions to help with these steps.
1. Qgrid
Qgrid is a Jupyter Notebook widget that uses SlickGrid to render pandas DataFrames inside a notebook. It lets you explore your DataFrames with intuitive scrolling, sorting, and filtering controls, and edit them by double-clicking cells.
pip install qgrid  # Installing with pip
conda install qgrid  # Installing with conda
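As a minimal sketch of typical Qgrid usage, the snippet below wraps a small DataFrame in a grid widget. The sample records and the helper name `show_in_qgrid` are made up for illustration; `qgrid.show_grid` and `get_changed_df` are the library calls. The third-party imports sit inside the helper, and the widget only renders when the helper is called from a notebook cell.

```python
# Made-up sample records; the column names are purely illustrative.
records = [
    {"item": "apple", "price": 1.2, "in_stock": True},
    {"item": "pear", "price": 0.9, "in_stock": False},
    {"item": "plum", "price": 2.5, "in_stock": True},
]


def show_in_qgrid(rows):
    """Hypothetical helper: build a DataFrame from rows and wrap it in a qgrid widget."""
    import pandas as pd
    import qgrid

    df = pd.DataFrame(rows)
    # show_grid returns a widget; displaying it in a cell renders the grid.
    return qgrid.show_grid(df, show_toolbar=True)


# In a notebook cell:
# widget = show_in_qgrid(records)
# widget                             # renders the sortable, filterable grid
# edited = widget.get_changed_df()   # DataFrame reflecting edits, sorting, and filtering
```

Calling `get_changed_df()` is how edits made in the grid get back into regular pandas code.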
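2. itables
itables renders pandas Series and DataFrames as interactive DataTables in Jupyter Notebook.
pip install itables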
Activate interactive mode for all series and data frames like this:
from itables import init_notebook_mode
init_notebook_mode(all_interactive=True)
import world_bank_data as wb
df = wb.get_countries()
df
3. Jupyter DataTables
Data scientists and many developers work with dataframes every day to make sense of data before processing it. The usual workflow is to display a dataframe, look at its schema, and then plot a few graphs to see how the data is distributed, get a clearer picture of the table, and perhaps spot something new in it.
But what if these distribution plots were part of a standard data frame and we could quickly search the table with minimal effort? What if this view were the default view?
pip install jupyter-datatables
How to use the extension?
from jupyter_datatables import init_datatables_mode
init_datatables_mode()
4. ipyvolume
ipyvolume brings 3D plotting for Python to Jupyter Notebook, built on IPython widgets and WebGL.
Today ipyvolume can:
- Do (multi) volume rendering.
- Render scatter plots (up to ~1 million glyphs).
- Draw quiver plots (like a scatter plot, but with an arrow pointing in a particular direction).
- Do lasso selections with the mouse.
- Render in stereo for virtual reality with Google Cardboard.
- Animate in d3 style, for example when the x coordinates or the color of a scatter plot change.
- Play animations or sequences: every property of a scatter or quiver plot can be a list of arrays, which can represent time snapshots.
pip install ipyvolume  # Installing with pip
conda install -c conda-forge ipyvolume  # Installing with conda
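To give a feel for the scatter plots mentioned above, here is a minimal sketch that draws a random point cloud with `ipv.quickscatter`. The helper names `random_cloud` and `show_scatter` and the data are made up for illustration; the plot itself only appears when `show_scatter` is called from a notebook cell.

```python
import random


def random_cloud(n, seed=0):
    """Return three lists (x, y, z) of n Gaussian-distributed coordinates."""
    rng = random.Random(seed)
    return tuple([rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(3))


def show_scatter(x, y, z):
    """Hypothetical helper: draw the points as a 3D scatter plot with ipyvolume."""
    import numpy as np
    import ipyvolume as ipv

    # quickscatter creates a figure, adds the scatter, and returns a widget to display.
    return ipv.quickscatter(np.asarray(x), np.asarray(y), np.asarray(z),
                            size=1, marker="sphere")


x, y, z = random_cloud(1000)

# In a notebook cell:
# show_scatter(x, y, z)
```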
5. bqplot
bqplot is a 2D visualization system for Jupyter, based on the constructs of the Grammar of Graphics. It aims to provide:
- A complete 2D visualization framework with Python APIs.
- A robust API to add custom interactions (pan, zoom, select, etc.).
bqplot provides two APIs:
- Users can build custom visualizations using an internal object model inspired by the constructs of the Grammar of Graphics (figure, marks, axes, scales) and enrich them with the library's interaction layer.
- Alternatively, you can use a context-based API similar to Matplotlib’s pyplot, which provides sensible defaults for most parameters.
pip install bqplot  # Installing with pip
conda install -c conda-forge bqplot  # Installing with conda
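The pyplot-style API is the quicker of the two to try. The following is a minimal sketch, not a full tour: the helper name `sine_figure` and the sample data are made up for illustration, and the figure only renders when the helper is called from a notebook cell.

```python
import math

# Sample data: one period of a sine wave, computed with the standard library.
xs = [i * 2 * math.pi / 99 for i in range(100)]
ys = [math.sin(v) for v in xs]


def sine_figure(x, y):
    """Hypothetical helper: plot y against x through bqplot's pyplot-style API."""
    import numpy as np
    from bqplot import pyplot as plt

    fig = plt.figure(title="Sine wave")
    plt.plot(np.asarray(x), np.asarray(y))
    return fig  # displaying the figure in a cell renders the interactive plot


# In a notebook cell:
# sine_figure(xs, ys)
```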
6. livelossplot
Don’t train deep learning models blindly! Watch every epoch of your training!
livelossplot provides a live training loss plot in Jupyter Notebook for Keras, PyTorch, and other frameworks.
pip install livelossplot
How to use the extension?
from livelossplot import PlotLossesKeras
model.fit(X_train, Y_train,
          epochs=10,
          validation_data=(X_test, Y_test),
          callbacks=[PlotLossesKeras()],
          verbose=0)
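For a hand-written training loop, such as a typical PyTorch one, livelossplot also offers the generic `PlotLosses` class. The following is a minimal sketch, assuming livelossplot 0.5 or later (older releases used `draw()` instead of `send()`); `fake_metrics` is a made-up stand-in for a real training step, and `train_with_live_plot` is a hypothetical helper name.

```python
import math


def fake_metrics(epoch):
    """Made-up stand-in for one epoch of training: returns decaying losses."""
    loss = math.exp(-0.3 * epoch)
    return {"loss": loss, "val_loss": loss + 0.1}


def train_with_live_plot(epochs=10):
    """Hypothetical loop: log metrics to livelossplot after every epoch."""
    from livelossplot import PlotLosses

    plotlosses = PlotLosses()
    for epoch in range(epochs):
        plotlosses.update(fake_metrics(epoch))  # record this epoch's metrics
        plotlosses.send()                       # redraw the live chart


# In a notebook cell:
# train_with_live_plot()
```

In a real loop, the dictionary passed to `update` would hold the metrics computed for that epoch.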
7. TensorWatch
TensorWatch is a debugging and visualization tool from Microsoft Research, designed for data science, deep learning, and reinforcement learning. It works in Jupyter Notebook, showing real-time visualizations of your machine learning training and performing several other key analysis tasks for your models and data.
pip install tensorwatch
8. Polyaxon
Polyaxon is a platform for building, training, and monitoring large-scale deep learning applications. It addresses the reproducibility, automation, and scalability of machine learning applications. Polyaxon can be deployed in any data center or on any cloud provider, or be hosted and managed by Polyaxon, and it supports all major deep learning frameworks, such as TensorFlow, MXNet, Caffe, and Torch.
pip install -U polyaxon
9. handcalcs
handcalcs is a library for automatically rendering Python calculation code in LaTeX, in a format that mimics a calculation written by hand: a symbolic formula, followed by numeric substitutions, and then the result.
pip install handcalcs
10. jupyternotify
jupyternotify provides the %%notify cell magic, which uses browser push notifications to tell you when a potentially long-running cell has finished. Use cases include training machine learning models, grid searches, and Spark computations. With %%notify you can switch to other work and get notified the moment your cell finishes executing.
pip install jupyternotify
We hope you find these extensions useful. If you have any useful extensions in mind that were not included in this collection – share them in the comments!