Saturday, September 24, 2016

Top three features in Bokeh High Charts

Bokeh is an impressive visualization library with a wide audience that ranges from beginners to sophisticated developers. This is the first entry in a series of blog posts about Bokeh. In this first part, I look at how easy Bokeh's high-level charts are to use.

Why these charts are easy

  1. Automatic colors and titles: don't worry about colors, legends, or axis titles; Bokeh chooses them without extra parameters.
  2. Interaction out of the box: the toolbar on the right side can be easily modified.
  3. Aggregation and grouping as chart attributes: the high-level charts accept aggregation and grouping attributes, so you need less data manipulation before charting.
In [22]:
from bokeh.io import output_notebook, show
output_notebook()
Loading BokehJS ...

The data

First, let's load some data about diamonds to try out a few easy one-liner high-level charts. The data, covering about 50,000 diamonds, comes from vincentarelbundock's GitHub repository.

In [23]:
import pandas as pd
diamonds = pd.read_csv('./data/diamonds.csv')
diamonds = diamonds.sample(n=1000)
diamonds.head()
Out[23]:
       Unnamed: 0  carat        cut  color  clarity  depth  table  price     x     y     z
33958       33959   0.23  Very Good      E     VVS1   62.2   59.0    465  3.92  3.96  2.45
18918       18919   1.38      Ideal      F      SI2   62.0   58.0   7767  7.12  7.17  4.43
11647       11648   1.11       Good      I      VS2   63.6   58.0   5054  6.57  6.52  4.16
21608       21609   1.55      Ideal      G      SI2   60.7   55.0   9704  7.54  7.51  4.57
5504         5505   1.11       Good      J      SI1   63.6   58.0   3846  6.48  6.57  4.15

1) Colors autoselection

Our first example is a scatter plot of diamond price against carat weight, with the cut of each diamond represented by color.

I am not passing axis titles or colors as parameters because they are automatically selected by the Bokeh library.

In [24]:
from bokeh.charts import Scatter, Histogram, Bar
p = Scatter(diamonds, color='cut', x='carat', y='price', title='Price of diamonds by carats')
show(p)
Out[24]:

<Bokeh Notebook handle for In[2]>

Now... you don't have to get stuck with the default palette. Bokeh comes with a pre-built list of palettes.

In the example below we have the same chart but in a palette of greens.

In [25]:
from bokeh.palettes import YlGn6
from bokeh.charts import Scatter, Histogram, Bar
p = Scatter(diamonds, color='cut', x='carat', y='price', title='Price of diamonds by carats', palette=YlGn6)
show(p)

<Bokeh Notebook handle for In[2]>
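
Swapping palettes is this easy because a palette is essentially a list of color strings, and coloring by a column amounts to giving each distinct value one entry from that list. The sketch below illustrates the idea in plain Python. It is not Bokeh's internal code: assign_colors is a hypothetical helper, the hex values are an illustrative stand-in for a six-color green palette, and the order in which Bokeh itself hands out colors may differ.

```python
from itertools import cycle

# Illustrative stand-in for a six-color green palette.
# Assumption: these hex values are examples, not read from bokeh.palettes.
GREENS = ["#006837", "#31a354", "#78c679", "#addd8e", "#d9f0a3", "#ffffcc"]


def assign_colors(values, palette):
    """Map each distinct value to a palette color, in order of first appearance.

    Hypothetical helper: wraps around when there are more categories than colors.
    """
    mapping = {}
    colors = cycle(palette)
    for value in values:
        if value not in mapping:
            mapping[value] = next(colors)
    return mapping


# 'cut' values of the five sample rows shown earlier in the post.
cuts = ["Very Good", "Ideal", "Good", "Ideal", "Good"]
cut_colors = assign_colors(cuts, GREENS)
print(cut_colors)
```

With five cut categories and six colors, every cut gets its own shade; a shorter palette would start repeating colors in this sketch.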

2) Interaction out of the box - the toolbar

The toolbar is defined by a list of tool names, passed here as a comma-separated string through the tools attribute. You can also modify the location of the toolbar through the toolbar_location attribute.

To learn more about the toolbar, including the possible choices of tools, see Bokeh's documentation page.

In [26]:
# values='carat' so the bars match the title: total carats per cut (agg defaults to a sum)
p = Bar(diamonds, 'cut', values='carat', title="Sum of carats per diamond cut", color='cut',
        toolbar_location="right", tools='pan,wheel_zoom,undo')
show(p)
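
The tools value above is a single string of comma-separated tool names. As a rough illustration of how such a string maps to a list of names, here is a minimal sketch; parse_tools is a hypothetical helper written for this post, not Bokeh's own parser.

```python
def parse_tools(tools):
    """Split a comma-separated tool string into a clean list of tool names."""
    # Strip stray whitespace around each name and skip empty entries.
    return [name.strip() for name in tools.split(",") if name.strip()]


tool_names = parse_tools("pan,wheel_zoom, undo")
print(tool_names)
```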

3) Grouping and aggregation built-in

The sum of carats shown in the previous chart is not really interesting, and this is where aggregations come in. The agg parameter is used in High Level Charts to pass an aggregation method name. In the chart below I am passing mean, but I could have passed in any of the built-in methods: 'sum', 'mean', 'count', 'nunique', 'median', 'min', and 'max'.

In [27]:
p = Bar(diamonds, 'clarity', values='price', title="Average price per clarity", color = 'clarity', 
        toolbar_location="right", agg='mean')
show(p)
Out[27]:

<Bokeh Notebook handle for In[2]>
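
To make the aggregation step concrete, here is a minimal plain-Python sketch of what an agg name conceptually does: group the rows by the label column, then reduce each group's values with the named method. The aggregate function and the AGGREGATIONS table are hypothetical names of mine, not part of Bokeh, and the data is just the five sample rows from the table earlier in the post.

```python
from collections import defaultdict
from statistics import mean, median

# Aggregation names listed in the post, mapped to plain-Python equivalents.
AGGREGATIONS = {
    "sum": sum,
    "mean": mean,
    "count": len,
    "nunique": lambda values: len(set(values)),
    "median": median,
    "min": min,
    "max": max,
}


def aggregate(rows, label, values, agg="sum"):
    """Group rows by `label` and reduce the `values` field with the named method."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[label]].append(row[values])
    return {key: AGGREGATIONS[agg](group) for key, group in groups.items()}


# Clarity and price of the five sample rows shown earlier.
rows = [
    {"clarity": "VVS1", "price": 465},
    {"clarity": "SI2", "price": 7767},
    {"clarity": "VS2", "price": 5054},
    {"clarity": "SI2", "price": 9704},
    {"clarity": "SI1", "price": 3846},
]

mean_price = aggregate(rows, "clarity", "price", agg="mean")
print(mean_price["SI2"])  # 8735.5, the average of 7767 and 9704
```

In pandas, the same numbers come from diamonds.groupby('clarity')['price'].mean(); the point of the agg attribute is that you don't have to write that step yourself.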

Another nice feature is grouping, which, in tandem with aggregations, can provide further insight into the displayed data. For instance, the chart below again shows the average price per clarity, but now grouped by cut type.

In [28]:
p = Bar(diamonds, 'clarity', values='price', title="Avg price per cut and clarity", color = 'cut', 
        toolbar_location="right", agg='mean', group='cut', legend="top_right")
show(p)
Out[28]:

<Bokeh Notebook handle for In[2]>
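
Conceptually, adding a group splits each clarity into one bar per cut, so the average is taken over every (clarity, cut) pair. The sketch below shows that computation in plain Python, assuming the five sample rows from the table earlier in the post; grouped_mean is a hypothetical helper, not a Bokeh function.

```python
from collections import defaultdict
from statistics import mean


def grouped_mean(rows, label, group, values):
    """Average `values` for every (label, group) pair; one bar per pair."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[(row[label], row[group])].append(row[values])
    return {key: mean(bucket) for key, bucket in buckets.items()}


# Clarity, cut, and price of the five sample rows shown earlier.
rows = [
    {"clarity": "VVS1", "cut": "Very Good", "price": 465},
    {"clarity": "SI2", "cut": "Ideal", "price": 7767},
    {"clarity": "VS2", "cut": "Good", "price": 5054},
    {"clarity": "SI2", "cut": "Ideal", "price": 9704},
    {"clarity": "SI1", "cut": "Good", "price": 3846},
]

bars = grouped_mean(rows, "clarity", "cut", "price")
for (clarity, cut), avg_price in sorted(bars.items()):
    print(clarity, cut, avg_price)
```

The pandas equivalent would be diamonds.groupby(['clarity', 'cut'])['price'].mean().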

In the next blog entry, I'll discuss the creation of complex charts using Bokeh glyphs.