Display mean and median test results in Stata

Sometimes we may want to produce the following table to compare the mean and median of two groups:

First of all, please refer to this post to see Stata commands to test equality of mean and median.

However, it is time-consuming to glean numbers from the output of these Stata commands and place them in a table. Worse, you have to repeat the tedious process every time you update your sample.
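The numbers in the printed output do not have to be copied by hand: Stata's test commands leave their results behind in r(), and that is what makes automating the table possible. Here is a minimal sketch, assuming Stata's bundled auto dataset and using foreign as a stand-in group variable (neither comes from this post):

```stata
* Illustration only: auto.dta and the variables price/foreign are assumptions.
sysuse auto, clear

* Two-sample t-test of mean price across the two levels of foreign
ttest price, by(foreign)
return list                          // shows r(mu_1), r(mu_2), r(t), r(p), ...
scalar diff = r(mu_1) - r(mu_2)      // difference in means, taken from r()
scalar p_t  = r(p)                   // two-sided p-value of the t-test

* Wilcoxon rank-sum test on the same variable
ranksum price, by(foreign)
scalar p_z = 2 * normal(-abs(r(z)))  // two-sided p-value from the z statistic

display "Diff in means = " %9.3f diff "   p(t) = " %6.3f p_t "   p(z) = " %6.3f p_z
```

The rank-sum p-value is computed from r(z) rather than read from a stored p-value, so the sketch should not depend on which Stata release is installed.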

I wrote Stata code to streamline the process. The code differs between unpaired (i.e., unmatched) data and paired data.

Unpaired data

The example above uses unpaired data: suspect firm-years and other firm-years are not matched 1-to-1 or 1-to-m. One use of unpaired data is the first step of Heckman’s two-step procedure, in which the two groups of observations (those that will be selected into the second step and those that will not) are stacked vertically in the dataset. The following code is for unpaired data. You only need to modify the first two lines to suit your data. The code generates a table in Stata’s output window like this:

You can then select the output, right-click, choose “Copy as table”, and paste it into Excel for a quick edit. The code uses a two-sample t-test for the mean and the Wilcoxon rank-sum test for the median.
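For readers who want to see the general shape of such a loop, here is a minimal sketch for stacked (unpaired) data. It is not the original code from this post: the auto dataset, the variable list, and the group variable foreign are all illustrative assumptions, and the layout of the table is my own.

```stata
* Sketch only, not the author's original code. Run it from a do-file.
* Assumes Stata's bundled auto dataset; edit the two locals for your data.
sysuse auto, clear
local vars  price mpg weight         // variables to compare
local group foreign                  // 0/1 group indicator (stacked data)

display _newline %-12s "Variable" %11s "Mean0" %11s "Mean1" %9s "p(t)" ///
    %11s "Median0" %11s "Median1" %9s "p(z)"
foreach v of local vars {
    * Medians of each group
    quietly summarize `v' if `group' == 0, detail
    local med0 = r(p50)
    quietly summarize `v' if `group' == 1, detail
    local med1 = r(p50)
    * Two-sample t-test: r(mu_1) belongs to the lower-valued group
    quietly ttest `v', by(`group')
    local m0 = r(mu_1)
    local m1 = r(mu_2)
    local pt = r(p)
    * Wilcoxon rank-sum test; two-sided p-value from the z statistic
    quietly ranksum `v', by(`group')
    local pz = 2 * normal(-abs(r(z)))
    display %-12s "`v'" %11.3f (`m0') %11.3f (`m1') %9.3f (`pt') ///
        %11.3f (`med0') %11.3f (`med1') %9.3f (`pz')
}
```

Each row is printed with a single display call, so the output can be selected and copied as a table in the same way as described above.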

Paired data

Paired data typically arise when a matched control group is identified for a treatment group. For example, you may identify a matched firm-year for each event firm-year based on a set of characteristics (same industry, same year, similar size and book-to-market), or identify a matched firm for every event firm based on the closest propensity score (i.e., propensity score matching).

The following table is an example that compares the mean and median of two matched groups: restating firms and non-restating firms. Each restating firm is matched with a non-restating firm.

Because of this matching relationship, every event firm and its control firm are placed in the same row of the dataset. In other words, event firms and control firms are aligned horizontally. The following code is for paired data. You only need to modify the first two lines to suit your data. You must specify the same number of variables, in matching order, in the first two lines: the first variable in the first line is paired with the first variable in the second line, and so on. The code generates a table in Stata’s output window like this:

The code uses a paired t-test for the mean and the Wilcoxon signed-rank test for the median.
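The same idea can be sketched for paired data, where the two variable lists are walked in parallel. Again, this is not the original code from this post: the data are simulated, and every variable name (size_event, size_control, btm_event, btm_control) is hypothetical.

```stata
* Sketch only, not the author's original code. Run it from a do-file.
* Simulated data stand in for a matched sample: one row per matched pair.
clear
set seed 12345
set obs 200
generate size_event   = rnormal(6, 1)
generate size_control = size_event + rnormal(0, 0.5)
generate btm_event    = rnormal(0.6, 0.2)
generate btm_control  = rnormal(0.5, 0.2)

local event   size_event   btm_event      // event-firm variables
local control size_control btm_control    // matched controls, same order
local n : word count `event'

display _newline %-14s "Variable" %10s "MeanE" %10s "MeanC" %9s "p(t)" ///
    %10s "MedianE" %10s "MedianC" %9s "p(z)"
forvalues i = 1/`n' {
    local e : word `i' of `event'
    local c : word `i' of `control'
    * Medians (the simulated data have no missing values; with real data,
    * restrict both to rows where the pair is complete)
    quietly summarize `e', detail
    local mede = r(p50)
    quietly summarize `c', detail
    local medc = r(p50)
    * Paired t-test for the means
    quietly ttest `e' == `c'
    local me = r(mu_1)
    local mc = r(mu_2)
    local pt = r(p)
    * Wilcoxon signed-rank test; two-sided p-value from the z statistic
    quietly signrank `e' = `c'
    local pz = 2 * normal(-abs(r(z)))
    display %-14s "`e'" %10.3f (`me') %10.3f (`mc') %9.3f (`pt') ///
        %10.3f (`mede') %10.3f (`medc') %9.3f (`pz')
}
```

Pairing the lists by position is what makes the ordering requirement above matter: if the two locals are not in the same order, the tests run on mismatched variables without any error message.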


9 Responses to Display mean and median test results in Stata

  1. HP says:

    Thank you so much for sharing this program.

  2. Victor says:

    You can also output the results directly to Excel by using putexcel.
    Just change the display to something like putexcel A1 = "Mean", where A1 is a cell in Excel.

    Some instructions:
    https://blog.stata.com/2017/01/24/creating-excel-tables-with-putexcel-part-2-macro-picture-matrix-and-formula-expressions/

    • Kai Chen says:

      Thanks for your comments. However, this type of test involves extracting returned scalars from MULTIPLE commands and presenting them in a single table. putexcel works great if you only need returned scalars from a SINGLE command. But I am afraid it cannot pick up results from multiple commands and export them to an Excel file.

  3. Pedro Coelho says:

    Thank you very much for the command.

    Is there a way to include significance stars in the differences (like the example from Zang, at the beginning of the post)?

  4. George says:

    Hi Kai: sorry, I tested your code in Stata 14 but it produced empty results. Does your code work for only 4 variables?

    • Kai Chen says:

      Hi George, it works for >4 variables. But you have to organize your data in certain ways and let the code know the required group variable. For example, unpaired data should be stacked by group and paired data should not.

  5. Vincent says:

    Hi Kai:
    Is it possible to modify your code to include quantile regression for medians? According to the Stata Journal, qreg is best to use for differences in medians. https://journals.sagepub.com/doi/pdf/10.1177/1536867X1201200202

  6. George says:

    Thanks Kai: how can we export your results into Excel format?
