Top three features in Bokeh High Level Charts
Bokeh is an impressive visualization library with a wide audience that ranges from beginners to sophisticated developers.
This is the first part of a series of blog entries about Bokeh. In this first part, I look at how easy it is to use Bokeh's High Level Charts.
Why these charts are easy
- Colors and titles autoselection: don't worry about colors, legends, and axis titles; Bokeh will choose them without extra parameters.
- Interaction out of the box: the toolbar on the right side can be easily modified.
- Aggregation and grouping as chart attributes: the high-level charts accept aggregation and grouping attributes, so you have to do less data manipulation before charting.
from bokeh.io import output_notebook, show
output_notebook()
The data
First, let's load some data about diamonds to try out the one-line high-level charts. This data set, which describes about 50,000 diamonds, comes from vincentarelbundock's GitHub repository.
import pandas as pd
diamonds = pd.read_csv('./data/diamonds.csv')
diamonds = diamonds.sample(n=1000)
diamonds.head()
1) Colors autoselection
In our first example, we see a scatter plot of diamond price against carat weight, with the cut of each diamond represented by color.
I am not passing axis titles or colors as parameters because the Bokeh library selects them automatically.
from bokeh.charts import Scatter, Histogram, Bar
p = Scatter(diamonds, color='cut', x='carat', y='price', title='Price of diamonds by carats')
show(p)
You are not stuck with the default palette, though. Bokeh comes with a pre-built list of palettes.
In the example below we have the same chart but in a palette of greens.
from bokeh.palettes import YlGn6
from bokeh.charts import Scatter, Histogram, Bar
p = Scatter(diamonds, color='cut', x='carat', y='price', title='Price of diamonds by carats', palette=YlGn6)
show(p)
2) Interaction out of the box - the toolbar
The toolbar is defined by a comma-separated list of tool names passed through the tools attribute. You can also change the location of the toolbar through the toolbar_location attribute.
To learn more about the toolbar, including the possible choices of tools, see Bokeh's documentation page.
p = Bar(diamonds, 'cut', values='carat', title="Sum of carats per diamond cut", color='cut',
        toolbar_location="right", tools='pan,wheel_zoom,undo')
show(p)
3) Grouping and aggregation built-in
The sum of carats shown in the previous chart is not really interesting, and this is where aggregations come in. The agg parameter is used in High Level Charts to pass an aggregation method name. In the chart below I am passing mean, but I could have passed in any of the built-in methods: 'sum', 'mean', 'count', 'nunique', 'median', 'min', and 'max'.
p = Bar(diamonds, 'clarity', values='price', title="Average price per clarity", color = 'clarity',
toolbar_location="right", agg='mean')
show(p)
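To make the aggregation concrete, here is a minimal sketch of roughly what agg='mean' saves you from doing by hand in pandas. The small `sample` frame and its values are made up for illustration; they are not taken from the diamonds data.

```python
import pandas as pd

# Tiny made-up stand-in for the diamonds data (values are illustrative only).
sample = pd.DataFrame({
    "clarity": ["SI1", "SI1", "VS2", "VS2", "VS2"],
    "price": [1000, 3000, 2000, 4000, 6000],
})

# Manual counterpart of agg='mean': one average price per clarity category.
mean_price = sample.groupby("clarity")["price"].mean()
print(mean_price)
```

Swapping .mean() for .sum(), .median(), or .nunique() gives the manual counterpart of the other aggregation names listed above.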
Another nice feature is grouping, which, in tandem with aggregations, can provide further insight into the displayed data. For instance, in the chart below we are again showing the average price per clarity, but now grouped by cut type.
p = Bar(diamonds, 'clarity', values='price', title="Avg price per cut and clarity", color = 'cut',
toolbar_location="right", agg='mean', group='cut', legend="top_right")
show(p)
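For comparison, the grouped chart corresponds roughly to a two-key groupby in pandas. This is only a sketch on a made-up `sample` frame (the values are illustrative, not from the diamonds data), and it says nothing about how the chart computes its bars internally.

```python
import pandas as pd

# Tiny made-up stand-in for the diamonds data (values are illustrative only).
sample = pd.DataFrame({
    "clarity": ["SI1", "SI1", "SI1", "VS2", "VS2", "VS2"],
    "cut": ["Ideal", "Ideal", "Premium", "Ideal", "Premium", "Premium"],
    "price": [1000, 3000, 5000, 2000, 4000, 6000],
})

# Average price for every (clarity, cut) pair, then pivot the cuts into
# columns: each row is a clarity, each column is one bar within that group.
table = sample.groupby(["clarity", "cut"])["price"].mean().unstack("cut")
print(table)
```

Each row of `table` matches one cluster of bars on the x axis, and each column matches one legend entry.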
In the next blog entry, I'll discuss the creation of complex charts using Bokeh glyphs.