Visualising results¶
One thing missing in a lot of corpus linguistic tools is the ability to produce high-quality visualisations of corpus data. corpkit
uses the corpkit.interrogation.Interrogation.visualise
method to do this.
Note
Most of the keyword arguments from Pandas’ plot method are available. See their documentation for more information.
Basics¶
visualise()
is a method of all corpkit.interrogation.Interrogation
objects. If you use from corpkit import *, it is also monkey-patched to Pandas objects.
Note
If you’re using a Jupyter Notebook, make sure you use %matplotlib inline
or %matplotlib notebook
to set the appropriate backend.
A common workflow is to interrogate a corpus, relative results, and visualise:
>>> from corpkit import *
>>> corpus = Corpus('data/P-parsed', load_saved=True)
>>> counts = corpus.interrogate({T: r'MD < __'})
>>> reldat = counts.edit('%', SELF)
>>> reldat.visualise('Modals', kind='line', num_to_plot=ALL).show()
### the visualise method can also attach to the df:
>>> reldat.results.visualise(...).show()
The current behaviour of visualise()
is to return the pyplot
module. This allows you to edit figures further before showing them. Therefore, there are two ways to show the figure:
>>> data.visualise().show()
>>> plt = data.visualise()
>>> plt.show()
Plot type¶
The visualise method allows line
, bar
, horizontal bar (barh
), area
, and pie
charts. Those with seaborn can also use 'heatmap'
(docs). Just pass in the type as a string with the kind
keyword argument. Arguments such as robust=True
can then be used.
>>> data.visualise(kind='heatmap', robust=True, figsize=(4,12),
... x_label='Subcorpus', y_label='Event').show()
Stacked area/line plots can be made with stacked=True
. You can also use filled=True
to attempt to make all values sum to 100. Cumulative plotting can be done with cumulative=True
. Below is an area plot beside an area plot where filled=True
. Both use the vidiris
colour scheme.
Plot style¶
You can select from a number of styles, such as ggplot
, fivethirtyeight
, bmh
, and classic
. If you have seaborn installed (and you should), then you can also select from seaborn styles (seaborn-paper
, seaborn-dark
, etc.).
Figure and font size¶
You can pass in a tuple of (width, height)
to control the size of the figure. You can also pass an integer as fontsize
.
Title and labels¶
You can label your plot with title, x_label and y_label:
>>> data.visualise('Modals', x_label='Subcorpus', y_label='Relative frequency')
Subplots¶
subplots=True
makes a separate plot for every entry in the data. If using it, you’ll probably also want to use layout=(rows,columns)
to specify how you’d like the plots arranged.
>>> data.visualise(subplots=True, layout=(2,3)).show()
TeX¶
If you have LaTeX installed, you can use tex=True
to render text with LaTeX. By default, visualise()
tries to use LaTeX if it can.
Legend¶
You can turn the legend off with legend=False
. Legend placement can be controlled with legend_pos
, which can be:
Margin | Figure | Margin | |
---|---|---|---|
outside upper left | upper left | upper right | outside upper right |
outside center left | center left | center right | outside center right |
outside lower left | lower left | lower right | outside lower right |
The default value, 'best'
, tries to find the best place automatically (without leaving the figure boundaries).
If you pass in draggable=True
, you should be able to drag the legend around the figure.
Colours¶
You can use the colours
keyword argument to pass in:
- A colour name recognised by matplotlib
- A hex colour string
- A colourmap object
There is an extra argument, black_and_white
, which can be set to True
to make greyscale plots. Unlike colours
, it also updates line styles.
Saving figures¶
To save a figure to a project’s images directory, you can use the save
argument. output_format='png'/'pdf'
can be used to change the file format.
>>> data.visualise(save='name', output_format='png')
Other options¶
There are a number of further keyword arguments for customising figures:
A number of these and other options for customising figures are also described in the corpkit.interrogation.Interrogation.visualise
method documentation.