EDGAR index files in Stata dataset (from 1993 Q1 to March 2, 2017)

The SEC makes all EDGAR filings publicly available, so we can download every 10-K, 10-Q, and 8-K filed since 1993. However, the SEC does not make this a matter of a few mouse clicks (I guess in order to reduce server load and prevent abuse). To download EDGAR filings, we first have to download the EDGAR index files, which give the full path of each 10-K, 10-Q, 8-K, etc. We cannot download any filing without this full path. See technical details here.
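To make the index-then-path idea concrete, here is a minimal Python sketch. It is not the code I used to build the Stata files. It assumes the pipe-delimited layout of the quarterly master.idx file (CIK|Company Name|Form Type|Date Filed|Filename) and that a filing's full path is the Archives root plus the Filename field. The helper names (index_url, parse_master_index, Filing) and the two sample rows are made up for illustration, and nothing is actually downloaded.

```python
from typing import Iterator, NamedTuple, Sequence

# Root under which EDGAR serves both index files and filings.
ARCHIVES_ROOT = "https://www.sec.gov/Archives/"


class Filing(NamedTuple):
    cik: str
    company: str
    form_type: str
    date_filed: str
    url: str


def index_url(year: int, quarter: int) -> str:
    """Build the URL of the quarterly master index (hypothetical helper)."""
    return f"{ARCHIVES_ROOT}edgar/full-index/{year}/QTR{quarter}/master.idx"


def parse_master_index(
    text: str, form_types: Sequence[str] = ("10-K", "10-Q", "8-K")
) -> Iterator[Filing]:
    """Yield one Filing per data row whose form type is in form_types."""
    for line in text.splitlines():
        parts = line.strip().split("|")
        # Data rows have exactly five fields and start with a numeric CIK;
        # this skips the description header, column names, and dashed rule.
        if len(parts) != 5 or not parts[0].isdigit():
            continue
        cik, company, form_type, date_filed, filename = parts
        if form_type in form_types:
            yield Filing(cik, company, form_type, date_filed, ARCHIVES_ROOT + filename)


# Made-up sample in the assumed master.idx layout (not real filings).
SAMPLE = """\
Description:           Master Index of EDGAR Dissemination Feed
CIK|Company Name|Form Type|Date Filed|Filename
--------------------------------------------------------------------------------
1234567|EXAMPLE CORP|10-K|2017-03-01|edgar/data/1234567/0001234567-17-000001.txt
7654321|SAMPLE HOLDINGS INC|4|2017-03-02|edgar/data/7654321/0007654321-17-000002.txt
"""

filings = list(parse_master_index(SAMPLE))
for filing in filings:
    print(filing.form_type, filing.date_filed, filing.url)
```

In real use you would fetch the text at index_url(year, quarter) for each quarter and pass it to parse_master_index. Keep in mind that the SEC asks automated clients to identify themselves and to limit their request rate.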

I downloaded all EDGAR index files and converted them into Stata datasets. You can download them here: Stata format (1993–2000); Stata format (2001–2005); Stata format (2006–2010); Stata format (2011–2015); Stata format (2016–2019/03/16).

If you want to know how I did this, please read my other blog post here.


16 Responses to EDGAR index files in Stata dataset (from 1993 Q1 to March 2, 2017)

  1. Christian says:

    Hello Kai!

    I have one question: is the dataset (“SAS dataset and now the file size is about 400M (unzipped file size is 19G)”) still available somewhere?

    By the way, I love your site, it is perfect. Finally someone who also cares about Stata.

    Best Regards from Germany

    Ph.D. Student

  2. Christian says:


  3. Christian says:

    Thank you!!!!!!!

  4. Emma says:

    The dataset is not recognized by Stata. Can you please confirm if it still works? Thanks

  5. Han says:

    Thanks Kai. This is really helpful. I truly appreciate your sharing.

  6. Yuri says:

    Hi Kai,

    The link is broken. Could you please repair it?

    Thanks a lot

  7. Xiaoli Feng says:

    Hi Kai, I have used the Python code to download 10-Q files from the SEC website. However, I find that some of the files are out of order, and the reason is that the SEC website uses html instead of txt format for the 10-K file. Have you noticed this? Maybe you can e-mail me if you have time to discuss it.

    Thank you for providing so many useful documents!

  8. Huimin (Amy) Chen says:

    Hi Kai,

    I have visited your website a few times and been really amazed by your work and sharing spirit!!
    I am cleaning up the SEC EDGAR server log data, which captures website visitors’ IP addresses, request timestamps, etc., and I wonder if there is any program to condense the data. Loughran & McDonald have cleaned the data but didn’t share the code. https://sraf.nd.edu/data/edgar-server-log/

    Thanks a ton in advance!

  9. Jon says:

    Thank you so much for this post, incredible stuff!

    The link to the blog post explaining how you obtain the URLs is broken, any chance you could fix it?

    Thank you again for this amazing site.

  10. bruno says:

    How do I turn these files into the actual datasets?
    I tried to download piece by piece but the link is broken!
