University of Wisconsin–Madison
SSCC Statistics Notes

Category: Stata

Stata with Jupyter Lab in the SSCC

We have several methods of combining text and Stata code available in the SSCC.

  • As of Stata 15, Stata has built in support for dynamic markdown.
  • R/RStudio has support for Stata in markdown documents, via the knit() function.
  • Jupyter Lab can support a mix of Stata and markdown.

The Jupyter options require some set up by you, the user, plus one step by our system administrators (not yet implemented, 11 Dec 2018, so ask the Help Desk). The RStudio and Stata options are easier to use if you install an additional package as well. None of these is all that difficult to set up on Winstat.

The main benefit of using a Jupyter notebook is that you can edit and run your code live. If you are thinking your way through a problem, coding and writing simultaneously, this will be a very convenient workflow. In Stata or in RStudio, you need to compile a document to see all the text, code, and output in the same place.

The main disadvantage of using a Jupyter notebook (on Winstat) is that each cell is a separate Stata session (until our system administrators complete one extra step). So at the moment this is really only going to be useful for fairly simple Stata coding tasks.

(Jupyter can be set up to run separate cells through one continuous Stata session, but this has not been implemented on the SSCC Winstats or in the labs.)

One-time Set Up

You will first need to install IPyStata. This is done outside of Jupyter, from a command prompt. On Winstat, open the Anaconda command prompt (from the Start Menu) and type the command

pip install --user ipystata

This may additionally tell you to install the `msgpack` module. Next, open a Jupyter notebook in order to run some Python commands. This, too, is a one-time set up.

import ipystata
from ipystata.config import config_stata
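# raw string (r'...') so the backslashes in the Windows path are taken literally
config_stata(r'C:\Program Files (x86)\Stata15\StataSE-64.exe', force_batch=True)
# force_batch=True is needed for now: each cell runs as a separate Stata batch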
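
If the configuration step fails, it can help to check the two prerequisites separately: that Python can find the ipystata package, and that the Stata executable exists at the path you gave. The following is a minimal sketch of such a check, not part of ipystata; the helper name `check_setup` is made up for illustration, and the path is simply the one used above.

```python
import importlib.util
from pathlib import Path

# Path copied from the configuration step above; adjust it if Stata is
# installed somewhere else on your machine.
STATA_EXE = r'C:\Program Files (x86)\Stata15\StataSE-64.exe'


def check_setup(stata_exe=STATA_EXE):
    """Return a dict reporting whether each set-up prerequisite is met.

    check_setup is a hypothetical helper, not an ipystata function.
    """
    return {
        # True when Python can locate the ipystata package
        'ipystata_installed': importlib.util.find_spec('ipystata') is not None,
        # True when a file exists at the given Stata path
        'stata_found': Path(stata_exe).is_file(),
    }


if __name__ == "__main__":
    for item, ok in check_setup().items():
        print(f"{item}: {'ok' if ok else 'MISSING'}")
```

If `ipystata_installed` is reported as missing, repeat the pip step; if `stata_found` is missing, correct the path before calling config_stata.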

Finally, Stata Notebooks

Having done the set up steps above, you are ready to write Jupyter notebooks that run Stata code.

When you open a new notebook, you open it with the Python kernel. Your first step is then to invoke ipystata in a preliminary cell.

import ipystata

After this, you include %%stata at the beginning of any cell that has Stata code.

%%stata
display "Hello, printed from Stata"

Hello, printed from Stata

%%stata
sysuse auto
regress mpg price

(1978 Automobile Data)

      Source |       SS           df       MS      Number of obs   =        74
-------------+----------------------------------   F(1, 72)        =     20.26
       Model |  536.541807         1  536.541807   Prob > F        =    0.0000
    Residual |  1906.91765        72  26.4849674   R-squared       =    0.2196
-------------+----------------------------------   Adj R-squared   =    0.2087
       Total |  2443.45946        73  33.4720474   Root MSE        =    5.1464

------------------------------------------------------------------------------
         mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       price |  -.0009192   .0002042    -4.50   0.000    -.0013263   -.0005121
       _cons |   26.96417   1.393952    19.34   0.000     24.18538    29.74297
------------------------------------------------------------------------------

However, keep in mind that each cell is run as a separate Stata batch.

%%stata
regress mpg weight

no variables defined
r(111);

end of do-file
r(111);

What this second example really needs is to load the data again in the same cell:

%%stata
sysuse auto
regress mpg weight

(1978 Automobile Data)

      Source |       SS           df       MS      Number of obs   =        74
-------------+----------------------------------   F(1, 72)        =    134.62
       Model |   1591.9902         1   1591.9902   Prob > F        =    0.0000
    Residual |  851.469256        72  11.8259619   R-squared       =    0.6515
-------------+----------------------------------   Adj R-squared   =    0.6467
       Total |  2443.45946        73  33.4720474   Root MSE        =    3.4389

------------------------------------------------------------------------------
         mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      weight |  -.0060087   .0005179   -11.60   0.000    -.0070411   -.0049763
       _cons |   39.44028   1.614003    24.44   0.000     36.22283    42.65774
------------------------------------------------------------------------------
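
Because no data survives from one cell to the next, every Stata cell has to repeat whatever set-up it depends on. One way to keep that repetition in a single place is to assemble the cell text in Python. This is only a sketch: `stata_cell` and `PREAMBLE` are hypothetical names introduced here, not part of ipystata.

```python
def stata_cell(*blocks):
    """Join blocks of Stata code into the text of one self-contained cell."""
    lines = []
    for block in blocks:
        # Drop blank lines and surrounding whitespace so the blocks join cleanly
        lines.extend(line.strip() for line in block.splitlines() if line.strip())
    return "\n".join(lines) + "\n"


# Set-up commands that every cell needs, since each cell is a separate batch
PREAMBLE = "sysuse auto"

code = stata_cell(PREAMBLE, "regress mpg weight")
print(code)
```

In a notebook, the resulting text could then be handed to the Stata magic with IPython's `get_ipython().run_cell_magic('stata', '', code)`. That last step is an assumption about how ipystata handles an empty option line and has not been checked on Winstat.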

Tables in Stata Markdown

Four Pandoc table types

Pandoc recognizes markdown tables written in four different styles, and it
recognizes all of these table styles by default.
These are:

  • pipe tables
  • simple tables
  • multi-line tables
  • grid tables

See the Pandoc manual for a detailed
description of these table types.

Stata’s (and WordPress’) markdown renderer only recognizes pipe tables.

Pipe Tables

Demonstration of pipe table syntax, from the Pandoc manual.

The markdown is written:

| Right | Left | Default | Center |
|------:|:-----|---------|:------:|
|   12  |  12  |    12   |    12  |
|  123  |  123 |   123   |   123  |
|    1  |    1 |     1   |     1  |

which is rendered as:

  Right  Left  Default  Center
     12  12    12         12
    123  123   123       123
      1  1     1          1
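
Since pipe tables are the one style that Stata and WordPress both render, it can be handy to generate them from data rather than typing them by hand. The function below is a minimal sketch; `pipe_table` and its alignment codes ('r', 'l', 'c', or '' for default) are made up for this illustration. The columns of the output are not padded to line up, which pipe table syntax does not require.

```python
def pipe_table(headers, rows, align=None):
    """Return a markdown pipe table as a string.

    align holds one code per column: 'r' (right), 'l' (left),
    'c' (center), or '' (default alignment).
    """
    align = align or [''] * len(headers)
    # Separator cells: the colons mark the alignment of each column
    rule = {'r': '---:', 'l': ':---', 'c': ':---:', '': '---'}
    lines = [
        '| ' + ' | '.join(str(h) for h in headers) + ' |',
        '|' + '|'.join(rule[a] for a in align) + '|',
    ]
    for row in rows:
        lines.append('| ' + ' | '.join(str(cell) for cell in row) + ' |')
    return '\n'.join(lines)


print(pipe_table(
    ['Right', 'Left', 'Default', 'Center'],
    [[12, 12, 12, 12], [123, 123, 123, 123], [1, 1, 1, 1]],
    align=['r', 'l', '', 'c'],
))
```

The printed text can be pasted into a markdown document as is.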

Simple Tables

Simple tables are not rendered in WordPress or Stata. They would be
written like this:

  Right     Left     Center     Default
-------     ------ ----------   -------
     12     12        12            12
    123     123       123          123
      1     1          1             1

Table:  Demonstration of simple table syntax.

and, because the syntax is not recognized, come out as unformatted text:

Right Left Center Default
——- —— ———- ——-
12 12 12 12
123 123 123 123
1 1 1 1

Table: Demonstration of simple table syntax.

Multi-line Tables

Multi-line tables are not rendered in WordPress or Stata. They would be
written like this:

-------------------------------------------------------------
 Centered   Default           Right Left
  Header    Aligned         Aligned Aligned
----------- ------- --------------- -------------------------
   First    row                12.0 Example of a row that
                                    spans multiple lines.

  Second    row                 5.0 Here's another one. Note
                                    the blank line between
                                    rows.
-------------------------------------------------------------

Table: Here's the caption. It, too, may span
multiple lines.

Because the syntax is not recognized, this comes out as unformatted text:


Centered Default Right Left
Header Aligned Aligned Aligned


First row 12.0 Example of a row that
spans multiple lines.

Second row 5.0 Here’s another one. Note
the blank line between

rows.

Table: Here’s the caption. It, too, may span
multiple lines.

Grid Tables

Grid tables are not rendered in WordPress or Stata. They would be
written like this:

:Sample grid table.

+---------------+---------------+--------------------+
| Fruit         | Price         | Advantages         |
+===============+===============+====================+
| Bananas       | $1.34         | - built-in wrapper |
|               |               | - bright color     |
+---------------+---------------+--------------------+
| Oranges       | $2.10         | - cures scurvy     |
|               |               | - tasty            |
+---------------+---------------+--------------------+

Because the syntax is not recognized, this comes out as unformatted text:

:Sample grid table.

+—————+—————+——————–+
| Fruit | Price | Advantages |
+===============+===============+====================+
| Bananas | $1.34 | – built-in wrapper |
| | | – bright color |
+—————+—————+——————–+
| Oranges | $2.10 | – cures scurvy |
| | | – tasty |
+—————+—————+——————–+

Easter Eggs

Although this is not really documented in Stata, several commands can produce
pipe table output.

  • _coef_table
  • estimates table
  • tabulate
  • tabstat

Just add a markdown option to each command!

  • mata: matrixmarkdown
  • mata: datamarkdown