Category Archives: Python

The art of regular expression

Regular expression is a powerful tool to do text search. It is the foundation of a lot of textual analysis research, though today’s textual analysis in computer science has gone far beyond text search. Regular expression operations are programming language … Continue reading

Posted in Python | Leave a comment

Use Python to download lawsuit data from Stanford Law School’s Securities Class Action Clearinghouse

[Update on 2019-07-07] I am grateful to Shiyu Chen, my research assistant, who did a very good job on not only web scraping the top-level table, but also extracting from the case summary page additional information (link to case summary … Continue reading

Posted in Python | 1 Comment

Use Python to extract URLs to HTML-format SEC filings on EDGAR

[Update on 2019-08-07] From time to time, some readers informed that the first-part code seemingly stopped at certain quarters. I don’t know the exact reason (perhaps a server-side issue). I never encountered the issue. I would suggest that you just … Continue reading

Posted in Python | 26 Comments

Use Python to download data from the DTCC’s Swap Data Repository

I helped my friend to download data from the DTCC’s Swap Data Repository. I am not familiar with the data and just use this as a programming practice. This article gives an introduction to the origin of the data: http://www.dtcc.com/news/2013/january/03/swap-data-repository-real-time The … Continue reading

Posted in Data, Python | Leave a comment

Use Python to download TXT-format SEC filings on EDGAR (Part II)

[Update on 2019-07-31] This post, together with its sibling post “Part I“, has been my most-viewed post since I created this website. However, the landscape of 10-K/Q filings has changed dramatically over the past decade, and the text-format filings are … Continue reading

Posted in Data, Python | 59 Comments

Use Python to extract Intelligence Indexing fields in Factiva articles

First of all, I acknowledge that I benefit a lot from Neal Caren’s blog post Cleaning up LexisNexis Files. Thanks Neal. Factiva (as well as LexisNexis Academic) is a comprehensive repository of newspapers, magazines, and other news articles. I first … Continue reading

Posted in Python | 14 Comments

Use Python to calculate the tone of financial articles

[Update on 2019-03-01] I completely rewrite the Python program. The updates include: I include two domain-specific dictionaries: Loughran and McDonald’s and Henry’s dictionaries, and you can choose which dictionary to use. I add negation check as suggested by Loughran and … Continue reading

Posted in Python | 9 Comments

Use Python to download TXT-format SEC filings on EDGAR (Part I)

[Update on 2019-07-31] This post, together with its sibling post “Part II“, has been my most-viewed post since I created this website. However, the landscape of 10-K/Q filings has changed dramatically over the past decade, and the text-format filings are extremely … Continue reading

Posted in Data, Python | 63 Comments