27 Jupyter Notebook Tips, Tricks, and Shortcuts


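Jupyter Notebook, formerly known as the IPython Notebook, is a flexible tool that brings code, images, comments, formulas, and plots together in one place so that an analysis stays readable.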

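Jupyter is highly extensible, supports many programming languages, and is easy to host on your own computer or on almost any server: all you need is SSH or HTTP access. Best of all, it is completely free.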


Figure: the Jupyter interface

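By default, Jupyter Notebook uses the Python kernel, which is why it was originally called IPython Notebook. Jupyter Notebook is a product of Project Jupyter. The name Jupyter is built from the three languages it was designed to serve (Julia, Python, and R) and is also a play on the planet Jupiter. This article presents 27 tips and tricks for getting more out of Jupyter.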

◆ ◆ ◆

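1. Keyboard Shortcuts

As any power user knows, keyboard shortcuts save a lot of time. Jupyter keeps a list of them under the top menu: Help > Keyboard Shortcuts. It is worth checking this list each time you update Jupyter, because new shortcuts are added regularly. Another option is to open the command palette with Cmd + Shift + P (Ctrl + Shift + P on Linux and Windows). This dialog lets you run any command by name, which is especially useful when you don't know the shortcut for an action or the action has no shortcut at all. It works much like Spotlight search on a Mac, and once you start using it you will not want to go back.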


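A few of my favorites:

Esc + F: find and replace in your code, ignoring outputs.
Esc + O: toggle the output of a cell.
Select multiple cells:
- Shift + J or Shift + Down selects the next cell.
- Shift + K or Shift + Up selects the previous cell.
- Once cells are selected, you can delete, copy, cut, paste, or run them as a batch. This is handy when you need to move part of a notebook.
Shift + M: merge the selected cells.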


◆ ◆ ◆

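2. Pretty Display of Variables

The first part is widely known: if the last line of a cell is a variable name or an unassigned expression, Jupyter displays its value without a print statement. This is especially useful with Pandas DataFrames, because the output is rendered as a neat table.

What is less well known is that you can change the kernel option ast_node_interactivity so that Jupyter displays every variable or expression that stands on its own line, letting you see the results of several statements at once.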

In [1]: from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"
In [2]: from pydataset import data
quakes = data('quakes')
quakes.head()
quakes.tail()
Out[2]:
         lat    long  depth  mag  stations
1     -20.42  181.62    562  4.8        41
2     -20.62  181.03    650  4.2        15
3     -26.00  184.10     42  5.4        43
4     -17.97  181.66    626  4.1        19
5     -20.42  181.96    649  4.0        11
Out[2]:
         lat    long  depth  mag  stations
996   -25.93  179.54    470  4.4        22
997   -12.28  167.06    248  4.7        35
998   -20.13  184.20    244  4.5        34
999   -17.40  187.80     40  4.5        14
1000  -21.59  170.56    165  6.0       119

If you want this behavior in every Jupyter environment (Notebook and Console), create the file ~/.ipython/profile_default/ipython_config.py with the following lines:

c = get_config()
# Run all nodes interactively
c.InteractiveShell.ast_node_interactivity = "all"
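
To make the effect of the "all" setting concrete, here is a minimal conceptual sketch in plain Python. It is not IPython's actual implementation, and the helper name run_cell_show_all is hypothetical. It parses a cell's source with the standard ast module and collects the value of every top-level expression statement instead of only the last one.

```python
import ast


def run_cell_show_all(source, namespace=None):
    """Run a cell and return the value of every top-level expression."""
    namespace = {} if namespace is None else namespace
    shown = []
    for node in ast.parse(source).body:
        if isinstance(node, ast.Expr):
            # A bare expression on its own line: evaluate it and keep the value.
            code = compile(ast.Expression(body=node.value), "<cell>", "eval")
            shown.append(eval(code, namespace))
        else:
            # Any other statement (assignment, import, ...): just execute it.
            code = compile(ast.Module(body=[node], type_ignores=[]), "<cell>", "exec")
            exec(code, namespace)
    return shown


results = run_cell_show_all("x = 2\nx + 1\nx * 10\n")
print(results)  # [3, 20]
```

With the default setting, only the last value (20 here) would be displayed.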

◆ ◆ ◆

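3. Easy Links to Documentation

Under the Help menu you will find links to the online documentation for common libraries, including NumPy, Pandas, SciPy, and Matplotlib.

In addition, you can prefix a library, method, or variable with ? to open the help text on its syntax.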

In [3]: ?str.replace

Docstring:
S.replace(old, new[, count]) -> str

Return a copy of S with all occurrences of substring
old replaced by new. If the optional argument count is
given, only the first count occurrences are replaced.
Type: method_descriptor
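
Outside a notebook, the docstring portion of that help text can be retrieved with the standard inspect module. A minimal sketch (the variable name doc is only for illustration):

```python
import inspect

# Fetch the docstring of str.replace, cleaned of indentation.
doc = inspect.getdoc(str.replace)
print(doc)
```

The exact wording of the docstring depends on the Python version in use.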

◆ ◆ ◆

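4. Plotting in Notebooks

There are several options for plotting in a notebook:

- matplotlib (the de facto standard) (https://matplotlib.org/), activated with %matplotlib inline; a tutorial is available at (https://www.dataquest.io/blog/matplotlib-tutorial/).
- %matplotlib notebook provides interactivity but can be a little slow, because rendering is done on the server side.
- mpld3 (https://github.com/mpld3/mpld3) provides an alternative renderer (using d3) for matplotlib code. It is quite nice, though incomplete.
- bokeh (https://bokeh.pydata.org/en/latest/) is a better option for building interactive plots.
- plot.ly (https://plot.ly/) can generate very nice plots. The hosted service has paid plans, although the plotly Python library itself is open source.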


◆ ◆ ◆

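5. Jupyter Magic Commands

The %matplotlib inline command mentioned above is one of Jupyter's magic commands.

The documentation for the magic commands (https://ipython.readthedocs.io/en/stable/interactive/magics.html) is well worth reading and will certainly be helpful. Below are a few of my favorites: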

◆ ◆ ◆

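6. Jupyter Magic - %env: Set Environment Variables

You can manage a notebook's environment variables without restarting the Jupyter server process. Some libraries (such as theano) use environment variables to control their behavior, and %env is the most convenient way to set them.

In [55]: # Running %env without any arguments
# lists all environment variables

# The line below sets the environment
# variable OMP_NUM_THREADS
%env OMP_NUM_THREADS=4

env: OMP_NUM_THREADS=4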
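
%env writes to the environment of the kernel process, so later cells can read the value through os.environ. A minimal plain-Python sketch of the same round trip (num_threads is an illustrative name):

```python
import os

# Plain-Python equivalent of: %env OMP_NUM_THREADS=4
os.environ["OMP_NUM_THREADS"] = "4"

# Environment values are always strings, so convert before using as a number.
num_threads = int(os.environ.get("OMP_NUM_THREADS", "1"))
print(num_threads)  # 4
```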

◆ ◆ ◆

7. Jupyter Magic - %run: Execute Python Code

%run can execute Python code from .py files; that much is well known. Less well known is that it can also execute other Jupyter notebooks, which is quite useful.

Note: using %run is not the same as importing a Python module.

In [56]: # this will execute and show the output from
# all code cells of the specified notebook
%run ./two-histograms.ipynb
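
One way to see the difference from import: for a .py file, %run executes the script as the main program (so an if __name__ == "__main__": block runs) and does so again on every call, whereas import runs a module once and then reuses the cached copy. The standard-library runpy module behaves similarly and can illustrate this outside a notebook. A minimal sketch; the file name demo_script.py and its contents are made up for the example.

```python
import os
import runpy
import tempfile

# Contents of a tiny throwaway script (hypothetical, for illustration only).
SCRIPT = (
    'greeting = "hello"\n'
    'if __name__ == "__main__":\n'
    '    ran_as_main = True\n'
)

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "demo_script.py")
    with open(path, "w") as handle:
        handle.write(SCRIPT)
    # Execute the file as the main program and collect its global variables.
    namespace = runpy.run_path(path, run_name="__main__")

print(namespace["greeting"], namespace.get("ran_as_main"))  # hello True
```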


◆ ◆ ◆

8. Jupyter Magic - %load: Insert Code from an External Script

This replaces the contents of the current cell with an external script. The source can be a file on your computer or a URL.

In [ ]: # Before Running
%load ./hello_world.py

In [61]: # After Running
# %load ./hello_world.py
if __name__ == "__main__":
    print("Hello World!")

Hello World!

◆ ◆ ◆

9. Jupyter Magic - %store: Pass Variables Between Notebooks

The %store command lets you pass variables between two different notebooks.

In [62]: data = 'this is the string I want to pass to different notebook'
%store data
del data # This has deleted the variable

Stored 'data' (str)

Now, in a new notebook...

In [1]: %store -r data
print(data)

this is the string I want to pass to different notebook
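
Conceptually, %store keeps a pickled copy of the variable on disk so that another notebook can read it back. The sketch below imitates that idea with the standard pickle module. The helper names store_variable and restore_variable and the .pkl file layout are hypothetical and are not part of IPython.

```python
import os
import pickle
import tempfile


def store_variable(folder, name, value):
    """Pickle a value to <folder>/<name>.pkl (illustrative layout)."""
    with open(os.path.join(folder, name + ".pkl"), "wb") as handle:
        pickle.dump(value, handle)


def restore_variable(folder, name):
    """Load a value previously written by store_variable."""
    with open(os.path.join(folder, name + ".pkl"), "rb") as handle:
        return pickle.load(handle)


with tempfile.TemporaryDirectory() as shared_folder:
    data = 'this is the string I want to pass to different notebook'
    store_variable(shared_folder, "data", data)
    del data  # the in-memory variable is gone, the stored copy remains
    restored = restore_variable(shared_folder, "data")

print(restored)
```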

◆ ◆ ◆

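10. Jupyter Magic - %who: List All Global Variables

With no arguments, %who lists all variables in the global scope. Passing a type such as str lists only the variables of that type.

In [1]: one = "for the money"
two = "for the show"
three = "to get ready now go cat go"
%who str

one three two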
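
The filtering that %who performs can be approximated in plain Python by inspecting a namespace dictionary. This is only a sketch of the idea, not IPython's code; the function name who and the sample variables dictionary are illustrative.

```python
def who(namespace, kind=None):
    """Return sorted names of public variables, optionally of one type."""
    return sorted(
        name
        for name, value in namespace.items()
        if not name.startswith("_") and (kind is None or isinstance(value, kind))
    )


variables = {
    "one": "for the money",
    "two": "for the show",
    "three": "to get ready now go cat go",
    "count": 3,
}
string_names = who(variables, str)
print(string_names)  # ['one', 'three', 'two']
```

In a live session you could pass globals() instead of the sample dictionary.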

◆ ◆ ◆

11. Jupyter Magic - Timing

There are two magic commands that are useful for timing: %%time and %timeit. They are especially handy when you have slow code and want to find out where the problem lies.

Note the difference between the two.

%%time reports timing information for a single run of the code in the cell.

In [4]: %%time
import time
for _ in range(1000):
    time.sleep(0.01) # sleep for 0.01 seconds

CPU times: user 21.5 ms, sys: 14.8 ms, total: 36.3 ms
Wall time: 11.6 s

%timeit uses Python's timeit module, which runs a statement many times (100,000 loops in the example below) and reports the best result of 3 repeated runs.

In [3]: import numpy
%timeit numpy.random.normal(size=100)

The slowest run took 7.29 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 5.5 µs per loop
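
The same kind of measurement can be made without the magic by calling the standard timeit module directly. A minimal sketch; the helper name best_time_per_loop and the chosen loop counts are assumptions for the example, not what %timeit uses internally.

```python
import timeit


def best_time_per_loop(stmt, setup="pass", number=1000, repeat=3):
    """Time `stmt` in `repeat` batches of `number` loops; return the best per-loop time."""
    totals = timeit.repeat(stmt, setup=setup, number=number, repeat=repeat)
    return min(totals) / number


per_loop = best_time_per_loop("sum(range(100))")
print(f"{number_label} loops, best of 3: {per_loop * 1e6:.2f} µs per loop"
      if (number_label := 1000) else "")
```

Taking the minimum of the repeats, rather than the mean, reduces the influence of other processes competing for the CPU.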

◆ ◆ ◆

12. Jupyter Magic - %%writefile and %pycat: Export the Contents of a Cell / Show the Contents of an External Script

The %%writefile magic saves the contents of a cell to an external file. %pycat does the opposite: it shows you the syntax-highlighted contents of an external file (in a pop-up pane).

In [7]: %%writefile pythoncode.py
import numpy
def append_if_not_exists(arr, x):
    if x not in arr:
        arr.append(x)

def some_useless_slow_function():
    arr = list()
    for i in range(10000):
        x = numpy.random.randint(0, 10000)
        append_if_not_exists(arr, x)

Writing pythoncode.py

In [8]: %pycat pythoncode.py

import numpy
def append_if_not_exists(arr, x):
    if x not in arr:
        arr.append(x)

def some_useless_slow_function():
    arr = list()
    for i in range(10000):
        x = numpy.random.randint(0, 10000)
        append_if_not_exists(arr, x)

◆ ◆ ◆

13. Jupyter Magic - %prun: Show How Much Time Your Program Spent in Each Function

Running %prun followed by a statement (typically a function call) gives you an ordered table showing the time spent in each internal function, the time per call, and the cumulative time.

In [47]: %prun some_useless_slow_function()

26324 function calls in 0.556 seconds

Ordered by: internal time

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 10000    0.527    0.000    0.528    0.000 :2(append_if_not_exists)
 10000    0.022    0.000    0.022    0.000 {method 'randint' of 'mtrand.RandomState' objects}
     1    0.006    0.006    0.556    0.556 :6(some_useless_slow_function)
  6320    0.001    0.000    0.001    0.000 {method 'append' of 'list' objects}
     1    0.000    0.000    0.556    0.556 :1()
     1    0.000    0.000    0.556    0.556 {built-in method exec}
     1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
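
A report of the same shape can be produced outside a notebook with the standard cProfile and pstats modules. The sketch below is an approximation under stated assumptions: it substitutes random for numpy to stay self-contained and uses a smaller loop count (the n parameter is an addition for this example).

```python
import cProfile
import io
import pstats
import random


def append_if_not_exists(arr, x):
    if x not in arr:
        arr.append(x)


def some_useless_slow_function(n=2000):
    arr = []
    for _ in range(n):
        append_if_not_exists(arr, random.randint(0, n - 1))
    return arr


# Profile one call and capture the report as text.
profiler = cProfile.Profile()
profiler.enable()
some_useless_slow_function()
profiler.disable()

buffer = io.StringIO()
# "tottime" sorts by internal time, matching "Ordered by: internal time".
pstats.Stats(profiler, stream=buffer).sort_stats("tottime").print_stats()
report = buffer.getvalue()
print(report)
```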

◆ ◆ ◆

14. Jupyter Magic - Debugging with %pdb

Jupyter has its own interface to The Python Debugger (pdb) (https://docs.python.org/3.5/library/pdb.html), which makes it possible to step inside a function and investigate what went wrong.

The commands available in pdb are listed at (https://docs.python.org/3.5/library/pdb.html#debugger-commands).

In [ ]: %pdb

def pick_and_take():
    picked = numpy.random.randint(0, 1000)
    raise NotImplementedError()

pick_and_take()

Automatic pdb calling has been turned ON
---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
in ()
      5     raise NotImplementedError()
      6
----> 7 pick_and_take()

in pick_and_take()
      3 def pick_and_take():
      4     picked = numpy.random.randint(0, 1000)
----> 5     raise NotImplementedError()
      6
      7 pick_and_take()

NotImplementedError:

> (5)pick_and_take()
      3 def pick_and_take():
      4     picked = numpy.random.randint(0, 1000)
----> 5     raise NotImplementedError()
      6
      7 pick_and_take()

ipdb>
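
What a post-mortem debugger offers is access to the frame in which the exception was raised. The sketch below does the non-interactive part of that by hand with the standard library: it walks the traceback to the innermost frame and reads its local variables. It uses random instead of numpy to stay self-contained, which is a substitution made for this example.

```python
import random


def pick_and_take():
    picked = random.randint(0, 1000)
    raise NotImplementedError()


try:
    pick_and_take()
except NotImplementedError as error:
    tb = error.__traceback__
    # Follow the chain to the innermost frame, where the exception was raised.
    while tb.tb_next is not None:
        tb = tb.tb_next
    failing_function = tb.tb_frame.f_code.co_name
    frame_locals = dict(tb.tb_frame.f_locals)
    # In an interactive session, pdb.post_mortem(tb) would open the
    # debugger on this same traceback instead.

print(failing_function, frame_locals)
```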

◆ ◆ ◆

15. Suppress the Output of the Final Function

Sometimes it is handy to suppress the output of the function on the last line, for example when plotting. To do this, just add a semicolon at the end.

In [4]: %matplotlib inline
from matplotlib import pyplot as plt
import numpy
x = numpy.linspace(0, 1, 1000)**1.5

In [5]: # Here you get the output of the function
plt.hist(x)

Out[5]: (array([ 216., 126., 106., 95., 87., 81., 77., 73., 71., 68.]),
 array([ 0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ]),
 <a list of 10 Patch objects>)


In [6]: # By adding a semicolon at the end, the output is suppressed.
plt.hist(x);


◆ ◆ ◆

16. Run Shell Commands

It is easy to run shell commands from inside a notebook. For example, you can check which datasets are available in your working folder:

In [7]: !ls *.csv

nba_2016.csv titanic.csv pixar_movies.csv whitehouse_employees.csv
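
Where a shell is not available, or the notebook must also run on Windows, the same listing can be obtained in pure Python with the standard glob module. A minimal sketch; the helper name list_csv_files is hypothetical, and the demo creates empty placeholder files in a temporary folder rather than touching real data.

```python
import glob
import os
import tempfile


def list_csv_files(folder):
    """Return the sorted names of the .csv files directly inside `folder`."""
    pattern = os.path.join(folder, "*.csv")
    return sorted(os.path.basename(path) for path in glob.glob(pattern))


with tempfile.TemporaryDirectory() as demo_folder:
    # Create empty placeholder files for the demonstration.
    for name in ("titanic.csv", "nba_2016.csv", "notes.txt"):
        open(os.path.join(demo_folder, name), "w").close()
    csv_files = list_csv_files(demo_folder)

print(csv_files)  # ['nba_2016.csv', 'titanic.csv']
```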

◆ ◆ ◆

17. Write Formulas with LaTeX

When you write LaTeX in a Markdown cell, it is rendered as a formula using MathJax. For example,

$$ P(A \mid B) = \frac{P(B \mid A) \, P(A)}{P(B)} $$

is displayed as a typeset equation.

◆ ◆ ◆

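18. Run Code from a Different Kernel in a Notebook

If you want to, you can combine code from multiple kernels in one notebook.

Just put the Jupyter magic with the name of the kernel at the start of each cell that should use it:

%%bash
%%HTML
%%python2
%%python3
%%ruby
%%perl

In [6]: %%bash
for i in {1..5}
do
    echo "i is $i"
done

i is 1
i is 2
i is 3
i is 4
i is 5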

◆ ◆ ◆

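19. Install Other Kernels for Jupyter

One of the nice features of Jupyter is its ability to run kernels for different languages. As an example, here is how to get an R kernel running.

Easy option: install the R kernel using Anaconda

conda install -c r r-essentials

Less easy option: install the R kernel manually

If you are not using Anaconda, the process is a little more involved. First, you need to install R from CRAN.

Once that is done, start an R console and run the following:

install.packages(c('repr', 'IRdisplay', 'crayon', 'pbdZMQ', 'devtools'))
devtools::install_github('IRkernel/IRkernel')
IRkernel::installspec() # to register the kernel in the current R installation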

◆ ◆ ◆

20. Run R and Python in the Same Notebook

The best way to do this is to install rpy2 (which requires a working R installation). This is easy with pip:

pip install rpy2

You can then use the two languages together, and even pass variables between them:

In [1]: %load_ext rpy2.ipython

In [2]: %R require(ggplot2)

Out[2]: array([1], dtype=int32)

In [3]: import pandas as pd
df = pd.DataFrame({
    'Letter': ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'],
    'X': [4, 3, 5, 2, 1, 7, 7, 5, 9],
    'Y': [0, 4, 3, 6, 7, 10, 11, 9, 13],
    'Z': [1, 2, 3, 1, 2, 3, 1, 2, 3]
})

In [4]: %%R -i df
ggplot(data = df) + geom_point(aes(x = X, y= Y, color = Letter, size = Z))


◆ ◆ ◆

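21. Write Functions in Other Languages

Sometimes numpy is not fast enough and I need to write some faster code.

In principle, you can compile the function into a dynamic library and write Python wrappers for it...

But it is much better when this tedious part is done for you, right?

You can write functions in cython or fortran and call them directly from Python code.

First, you need to install:

!pip install cython fortran-magic

In [ ]: %load_ext Cython

In [ ]: %%cython
def multiply_by_2(float x):
    return 2.0 * x

In [ ]: multiply_by_2(23.)

Personally, I prefer Fortran, which I find very convenient for writing number-crunching functions. More details are available at (https://arogozhnikov.github.io/2015/09/08/SpeedBenchmarks.html).

In [ ]: %load_ext fortranmagic

In [ ]: %%fortran
subroutine compute_fortran(x, y, z)
    real, intent(in) :: x(:), y(:)
    real, intent(out) :: z(size(x, 1))

    z = sin(x + y)

end subroutine compute_fortran

In [ ]: compute_fortran([1, 2, 3], [4, 5, 6])

There are also various just-in-time (JIT) compilation systems that can speed up your Python code. More examples can be found at (https://arogozhnikov.github.io/2015/09/08/SpeedBenchmarks.html).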

◆ ◆ ◆

22. Multicursor Support

Jupyter supports editing with multiple cursors at once, similar to Sublime Text. Simply hold down Alt and drag the mouse.


◆ ◆ ◆

23. Jupyter Contrib Extensions

Jupyter-contrib extensions (https://github.com/ipython-contrib/jupyter_contrib_nbextensions) is a family of extensions that give Jupyter a lot more functionality, including the Jupyter spell-checker and code-formatter.

The following commands install the extensions, along with a menu-based configurator that lets you browse and enable the extensions from the main Jupyter screen.

!pip install https://github.com/ipython-contrib/jupyter_contrib_nbextensions/tarball/master
!pip install jupyter_nbextensions_configurator
!jupyter contrib nbextension install --user
!jupyter nbextensions_configurator enable --user


◆ ◆ ◆

24. Create a Presentation from a Jupyter Notebook

Damian Avila's RISE (https://github.com/damianavila/RISE) lets you create a PowerPoint-style presentation from an existing notebook.
You can install RISE using conda:

conda install -c damianavila82 rise

Or alternatively with pip:

pip install RISE

Then run the following to install and enable the extension:

jupyter-nbextension install rise --py --sys-prefix
jupyter-nbextension enable rise --py --sys-prefix

◆ ◆ ◆

25. The Jupyter Output System

Notebooks are displayed as HTML, and cell output can be HTML as well, so you can return virtually anything: video, audio, or images.

This example scans the folder containing my images and shows thumbnails of the first five:

In [12]: import os
from IPython.display import display, Image
names = [f for f in os.listdir('../images/ml_demonstrations/') if f.endswith('.png')]
for name in names[:5]:
    display(Image('../images/ml_demonstrations/' + name, width=100))


We can create the same list with a bash command, because magics and bash calls return Python variables:

In [10]: names = !ls ../images/ml_demonstrations/*.png
names[:5]

Out[10]: ['../images/ml_demonstrations/colah_embeddings.png',
 '../images/ml_demonstrations/convnetjs.png',
 '../images/ml_demonstrations/decision_tree.png',
 '../images/ml_demonstrations/decision_tree_in_course.png',
 '../images/ml_demonstrations/dream_mnist.png']
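
Your own objects can take part in this output system too: when an object defines a _repr_html_ method, Jupyter uses the returned HTML to display it. A minimal sketch; the class name ImageList is hypothetical, and the two paths are taken from the listing above.

```python
import html


class ImageList:
    """Displays a list of image paths as an HTML bullet list in Jupyter."""

    def __init__(self, paths):
        self.paths = list(paths)

    def _repr_html_(self):
        # Escape each path so special characters cannot break the markup.
        items = "".join(f"<li>{html.escape(path)}</li>" for path in self.paths)
        return f"<ul>{items}</ul>"


listing = ImageList([
    "../images/ml_demonstrations/convnetjs.png",
    "../images/ml_demonstrations/decision_tree.png",
])
markup = listing._repr_html_()
print(markup)
```

Putting listing on the last line of a notebook cell would render the bullet list instead of the default text representation.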

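Originally published: 2016-11-08

This article comes from Big Data Digest (大數據文摘), a partner of the Yunqi Community. For related information, follow the WeChat official account "BigDataDigest".

Last updated: 2017-06-01 12:02:40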
