Use Python to download TXT-format SEC filings on EDGAR (Part II)

[Update on 2019-07-31] This post and its sibling post “Part I” have been my most-viewed posts since I created this website. However, the landscape of 10-K/Q filings has changed dramatically over the past decade, and the text-format filings are extremely unfriendly to researchers nowadays. I would suggest directing our research efforts to HTML-format filings with the help of BeautifulSoup. The other post deserves more attention.

[Update on 2017-03-03] The SEC closed the FTP server permanently on December 30, 2016 and switched to a more secure transmission protocol, HTTPS. Since then I have received several requests to update the script. Here is the new code for Part II.

[Original Post] As I said in the post entitled “Part I”, we have to complete two steps to download SEC filings from EDGAR:

  1. Find paths to raw text filings;
  2. Select what we want and bulk download from EDGAR using paths we have obtained in the first step.

“Part I” elaborates the first step. This post shares the Python code for the second step.

In the first step, I save the index files in a SQLite database as well as a Stata dataset. The index database includes all types of filings (e.g., 10-K and 10-Q). Select from the database the filing types that you want and export your selection into a CSV file, say “sample.csv”. To use the following Python code, the CSV file must be formatted as follows (this example selects all 10-Ks of Apple Inc). Please note: both the SQLite and Stata datasets contain an index column, and you have to delete that index column when exporting your selection into a CSV file.

Then we can let Python complete the bulk download task:
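
Below is a minimal sketch of such a downloader (not the original script). It assumes the exported CSV has a header row and the columns cik, company name, form type, filing date, and the filing path from the index file, and it uses the third-party requests package:

```python
import csv
import os
import time

import requests

with open("sample.csv", newline="") as f:
    reader = csv.reader(f)
    next(reader)  # skip the header row (remove this line if your CSV has none)
    for line in reader:
        # assumed column order: cik, company name, form type, date, path
        url = "https://www.sec.gov/Archives/" + line[4]
        saveas = "-".join([line[0], line[2], line[3]]) + ".txt"
        if os.path.exists(saveas):          # skip filings already downloaded
            continue
        r = requests.get(url, headers={"User-Agent": "Your Name your@email.com"})
        with open(saveas, "wb") as out:
            out.write(r.content)
        time.sleep(0.2)                     # be gentle with EDGAR
```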

The code does not manage the file directories for “sample.csv” or for the output raw text filings; please modify the paths yourself. The line saveas = '-'.join([line[0], line[2], line[3]]) names each output filing; the current pattern is cik-formtype-filingdate.txt. Rearrange these elements to suit your needs (thanks to Eva for letting me know about a previous error here).

Posted in Data, Python | 59 Comments

Use Python to extract Intelligence Indexing fields in Factiva articles

First of all, I acknowledge that I have benefited a lot from Neal Caren’s blog post Cleaning up LexisNexis Files. Thanks, Neal.

Factiva (as well as LexisNexis Academic) is a comprehensive repository of newspapers, magazines, and other news articles. I first describe the data elements of a Factiva news article. Then I explain the steps to extract those data elements and write them into a more machine-readable table using Python.

Data Elements in Factiva Article

Each news article in Factiva, no matter what it looks like, contains a number of data elements. In Factiva’s terminology, those data elements are called Intelligence Indexing Fields. The following table lists the label and name of each data element (or field), along with what it contains:

Field Label | Field Name | What It Contains
HD | Headline | Headline
CR | Credit Information | Credit Information (Example: Associated Press)
WC | Word Count | Number of words in document
PD | Publication Date | Publication Date
ET | Publication Time | Publication Time
SN | Source Name | Source Name
SC | Source Code | Source Code
ED | Edition | Edition of publication (Example: Final)
PG | Page | Page on which article appeared (Note: Page-One Story is a Dow Jones Intelligent Indexing™ term)
LA | Language | Language in which the document is written
CY | Copyright | Copyright
LP | Lead Paragraph | First two paragraphs of an article
TD | Text | Text following the lead paragraphs
CT | Contact | Contact name to obtain additional information
RF | Reference | Notes associated with a document
CO | Dow Jones Ticker Symbol | Dow Jones Ticker Symbol
IN | Industry Code | Dow Jones Intelligent Indexing™ Industry Code
NS | Subject Code | Dow Jones Intelligent Indexing™ Subject Code
RE | Region Code | Dow Jones Intelligent Indexing™ Region Code
IPC | Information Provider Code | Information Provider Code
IPD | Information Provider Descriptors | Information Provider Descriptors
PUB | Publisher Name | Publisher of information
AN | Accession Number | Unique Factiva.com identification number assigned to each document

Please note that not every news article contains all those data elements, and that the table may not list all data elements used by Factiva (Factiva may make updates). Depending on which display option you select when downloading news articles from Factiva, you may not be able to see certain data elements. But they are there and used by Factiva to organize and structure its proprietary news article data.

How to Extract Data Elements in Factiva Article

[Figure: flow diagram of the three-step extraction process]

You can follow the three steps outlined in the diagram above to extract the data elements of news articles for further processing (e.g., calculating the tone of the full text, represented by the LP and TD elements together, or grouping articles by news subject, i.e., by the NS element). I explain the steps one by one below.

Step 1: Download Articles from Factiva in RTF Format

Downloading a large number of news articles from Factiva is a lot of pain: it is technically difficult to download articles in an automated fashion, and you can only download 100 articles at a time, subject to a word-count limit of 180,000 words per batch. As a result, gathering tens of thousands of news articles requires a lot of tedious work. While I can do nothing about either issue in this post, I can say a bit more about them.

Firstly, you may see people discuss methods for automatic downloading (a so-called “web scraping” technique; see here). However, this requires more hacking now that Factiva has introduced CAPTCHA to determine whether or not the user is a human. Even if you are not familiar with the term “CAPTCHA”, you have surely encountered the situation where you are asked to type the characters or numbers shown in an image before you can download a file or go to the next webpage. That is CAPTCHA. Both Factiva and LexisNexis Academic have introduced CAPTCHA to prohibit robotic downloading. CAPTCHA is not unbeatable, but defeating it requires advanced techniques.

Secondly, the Factiva licence expressly prohibits data mining, yet it does not define clearly what constitutes data mining. I was informed that downloading a large number of articles in a short period of time would be red-flagged as data mining. But the threshold speed set by Factiva is low, and any trained and adept person can beat it easily. If you are red-flagged by Factiva, things could get ugly. So do not download too fast, even if this slows down your research.

Let’s get back to the topic. When you manually download news articles from Factiva, the most important thing is to select the right display option. Please select the third one, Full Article/Report plus Indexing, as indicated by the following screenshot:

[Screenshot: Factiva display options, with “Full Article/Report plus Indexing” selected]

Then you have to download the articles in RTF – Article Format, as indicated by the following screenshot:

[Screenshot: Factiva download options, with “RTF – Article Format” selected]

After the download is completed, you will get an RTF document. If you open it, you will see that the news articles look like this:

[Screenshot: news articles as they appear in the downloaded RTF document]

The next step is to convert RTF to plain TXT, because Python can process TXT documents more easily. After Python finishes its job, the final product will be a table: each row represents a news article, and each column is a data element.

Step 2: Convert RTF to TXT

This can surely be done in Python, but so far I have not written a program for it; I will fill this “hole” when I have time. For my research, I simply take advantage of TextEdit, the default text editor shipped with macOS: select Format – Make Plain Text from the menu bar, then save the document in TXT format. You can automate this using Automator in macOS.
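
If you do want to do the conversion in Python, one possible route is the third-party striprtf package (an assumption on my part, not something used in the original post):

```python
from pathlib import Path

from striprtf.striprtf import rtf_to_text  # pip install striprtf

# Convert every RTF document in the current folder to a plain TXT document.
for rtf_file in Path(".").glob("*.rtf"):
    plain = rtf_to_text(rtf_file.read_text(errors="ignore"))
    rtf_file.with_suffix(".txt").write_text(plain)
```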

Step 3: Extract Data Elements and Save to a Table

This is where Python does the dirty work. To run the program correctly, save it in the same directory as the plain TXT documents created in Step 2 before you run it. The program will:

  1. Read in each TXT document;
  2. Extract data elements of each article and write them to an SQLite database;
  3. Export data to a CSV file for easy processing in other software such as Stata.

I introduce an intermediate step that writes the data to an SQLite database simply because this facilitates further manipulation of the news article data in Python. Of course, you can write the data directly to a CSV file.
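
As a rough illustration of what such a program can look like (not the exact program), the sketch below assumes that in the converted TXT files each field starts on a new line with its label followed by its content, that AN closes each article, and it writes straight to a CSV file for brevity:

```python
import csv
import re
from pathlib import Path

FIELDS = ["HD", "CR", "WC", "PD", "ET", "SN", "SC", "ED", "PG", "LA", "CY",
          "LP", "TD", "CT", "RF", "CO", "IN", "NS", "RE", "IPC", "IPD",
          "PUB", "AN"]
# a line that begins with a field label, e.g. "HD Some headline ..."
label_re = re.compile(r"^(" + "|".join(FIELDS) + r")\b\s*(.*)$")

def parse_file(path):
    articles, current, field = [], {}, None
    for line in path.read_text(errors="ignore").splitlines():
        m = label_re.match(line)
        if m:
            field = m.group(1)
            current[field] = m.group(2).strip()
            if field == "AN":              # AN is assumed to close an article
                articles.append(current)
                current, field = {}, None
        elif field:                        # continuation of the current field
            current[field] += " " + line.strip()
    return articles

rows = [a for p in Path(".").glob("*.txt") for a in parse_file(p)]
with open("factiva.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
```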

Posted in Python | 15 Comments

A loop of cross-sectional regressions for calculating abnormal accruals in Stata

I write a loop of cross-sectional regressions for calculating abnormal accruals. This program can easily be modified to use the Jones, modified Jones, or Dechow and Dichev model.

I add detailed comments in the program to help you prepare the input file.
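
For readers who work in Python rather than Stata, the same idea can be sketched roughly as follows; the DataFrame and column names (sic2, fyear, tacc, inv_lag_at, d_rev, ppe, all scaled by lagged total assets) are hypothetical, and this is not the Stata program itself:

```python
import pandas as pd
import statsmodels.api as sm

def jones_residuals(group: pd.DataFrame) -> pd.Series:
    # one cross-sectional Jones-model regression; the residual is the
    # abnormal accrual for each firm-year in the industry-year group
    X = sm.add_constant(group[["inv_lag_at", "d_rev", "ppe"]])
    return sm.OLS(group["tacc"], X, missing="drop").fit().resid

def abnormal_accruals(df: pd.DataFrame) -> pd.Series:
    # loop over 2-digit SIC industry-years, mirroring the Stata loop
    return df.groupby(["sic2", "fyear"], group_keys=False).apply(jones_residuals)
```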

Posted in Stata | 7 Comments

The impact of WRDS transition to the new WRDS Cloud server

WRDS has quietly started the transition from the old server to the new Cloud server. This move makes a lot of support documentation on the WRDS website outdated and misleading. That is why I think WRDS should direct its resources toward continuously updating tutorials and manuals and providing more ready-to-use research macros and applications, instead of wasting money on website cosmetics as it did recently.

Now, among the support documentation about accessing WRDS, only the following two documents are up to date:

The WRDS Cloud Manual
PC-SAS on the WRDS Cloud

All other documentation contains outdated information and may cause confusion and unexpected problems.

In its support documentation, WRDS refers to the old server as either WRDS Unix Server or WRDS Interactive Server (wrds3). The new server is called WRDS Cloud.

The address of the old server: wrds.wharton.upenn.edu 4016
The address of the new server: wrds-cloud.wharton.upenn.edu 4016

They are DIFFERENT! Users who access WRDS using SSH or PC-SAS will be affected by this transition.

PC-SAS users are familiar with the following statements:

PC-SAS users were able to use one of the eight SASTEMP directories on the server to store sizeable data files temporarily, and to upload/download data files to/from their home directory (/home/yourinstitution/youraccountname, with a 750 MB space limit). In addition, if you used SSH to log onto the old server, you would see the same home directory as with PC-SAS. As a result, if you uploaded a data file to your home directory via an easy-to-use SSH file transfer (FTP-like) app, you could locate the file in your home directory during PC-SAS connections.

Now this has changed. PC-SAS now (since August 25, 2015) connects through the WRDS Cloud instead of the older Interactive Server (wrds3), EVEN IF YOU STILL SPECIFY %let wrds = wrds.wharton.upenn.edu 4016;. The consequences of this change are:

  • You can no longer use the eight SASTEMP directories through PC-SAS. Instead, you can use a larger directory for your temporary data (500 GB shared by your institution), located at /scratch/yourinstitution. You can still access the eight SASTEMP directories if you log onto the old server via SSH.
  • The WRDS Cloud gives you a new home directory, though its path remains /home/yourinstitution/youraccount (with a new 10 GB space limit). So if you use SSH to log onto the old server (as many users probably do if they are unaware of the server transition), you cannot see the files that you create in your home directory during PC-SAS connections.

These two consequences may cause confusion for users who access WRDS via both PC-SAS and SSH interchangeably. They may ask: “why can’t I use the temporary directory any more?” or “where are my files?”

To avoid any possible problem, users should use the new WRDS Cloud server consistently with either SSH or PC-SAS from now on. This means whenever you access WRDS, always use the new server address.

If you use PC-SAS, use the following statements:

If you use SSH, use the following command:
ssh youraccountname@wrds-cloud.wharton.upenn.edu

With the new WRDS Cloud server, you use a new command to run your SAS program in the background from the SSH command line:
qsas yourprogram.sas

You can run multiple SAS programs concurrently this way (up to 5 concurrent jobs). If you prefer to run your SAS programs sequentially, you need to write a SAS wrapper script and submit it as a batch job. You can find details here.

You can use qstat to browse your currently running jobs and get the job ID. If you change your mind and want to terminate a job, you can type:
qdel yourjobid

WRDS is going to phase out the old server. The new WRDS Cloud is supposed to be more computationally powerful. Plus, the new WRDS server offers users a larger home directory and temporary directory. Therefore, it is time for users to migrate to the new WRDS Cloud server.

Posted in Learning Resources, SAS | 2 Comments

Rolling-window computation in SAS and Stata

SASers often find proc expand plus transformout very useful for rolling-window (or moving-window) computation. Stataers may wonder if there is a counterpart in Stata. The answer is “yes”: the command in Stata is rolling. See the manual below:

http://www.stata.com/manuals13/tsrolling.pdf

The benefits of using rolling in Stata come from two facts:

  • Stata is superior to SAS in dealing with time-series or panel data. After a single-line command to define time-series or panel data (tsset), Stata handles gaps in the time series intelligently and automatically. In contrast, SAS users have to check for gaps manually; 90% of the SAS code using rolling-window transformations in accounting research does not include such a gap check, which may lead to incorrect inferences.
  • In Stata, rolling can be combined with any other command, such as regress, so rolling-window computation in Stata is more flexible.

However, proc expand plus transformout in SAS is insanely faster than rolling in Stata (by “insanely faster”, I mean perhaps millions of times faster). This is truly a deal breaker for Stata.

Therefore, the best solution for rolling-window computation is to use Stata to do the gap check and filling (tsfill) first, and then use SAS to do the lightning-fast rolling-window computation.

Posted in Learning Resources, SAS, Stata | Leave a comment

SAS macro for event study and beta

There are two macros on the List of WRDS Research Macros, EVTSTUDY and BETA, that are probably used often.

I like the first one, written by Denys Glushkov; Denys’ code is always elegant. I don’t like the second one because I believe it contains non-trivial mistakes and does a lot of unnecessary calculation.

Since an event study and a beta calculation are just two sides of the same coin, I wrote the following macro to output both event study results (e.g., CAR) and beta. My macro borrows heavily from Denys’ code but differs in the following ways:

  1. I add beta to the final output. This is the main difference.
  2. Denys uses CRSP.DSIY to generate the trading calendar and market returns. I cannot see why he uses this dataset; the trouble is that not every institution subscribes to it. Thus, I use the more accessible dataset CRSP.DSI instead (thanks to Michael Shen for bringing this to my attention).
  3. I improve efficiency in generating related trading dates at the security-event level.
  4. I correct several errors in Denys’ macro: (a) his macro does not sort the input dataset by permno and event date, leading to a fatal error later on; and (b) I correct a few dataset/variable references.
  5. Denys’ macro switches off warning and error messages, which is inconvenient for debugging. I change this setting.

All changes are commented with /* CHANGE HERE */. I compared the results (CAR and beta) from my macro with those from a commercial package, EVENTUS (with the help of a friend who has an EVENTUS license), and the accuracy of my macro is assured (note: EVENTUS does not apply delisting returns by default).

Update: WRDS rolled out the event study web inquiry (so-called Event Study by WRDS). I recently checked the accuracy of that product. To my surprise, the accuracy is unsatisfactory, if not terrible.

 

Posted in SAS | 1 Comment

Use Python to calculate the tone of financial articles

[Update on 2019-03-01] I have completely rewritten the Python program. The updates include:

  • I include two domain-specific dictionaries, Loughran and McDonald’s and Henry’s, and you can choose which one to use.
  • I add a negation check as suggested by Loughran and McDonald (2011). That is, any negate word (e.g., isn’t, not, never) occurring within the three words preceding a positive word will flip that positive word into a negative one (see the sketch below). The negation check applies only to positive words because Loughran and McDonald (2011) suggest that double negation (i.e., a negate word preceding a negative word) is not common. I expand their negate word list, though, since theirs seems incomplete. In my sample of 90,000+ press releases, the negation check finds that 5.7% of press releases have positive word(s) preceded by a negate word.
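
For intuition, here is a minimal sketch of the negation check; the word lists are tiny illustrative stand-ins, not the actual Loughran-McDonald or negate word lists:

```python
import re

POSITIVE = {"gain", "improve", "strong"}          # illustrative only
NEGATIVE = {"loss", "decline", "weak"}            # illustrative only
NEGATE = {"no", "not", "never", "isn't", "wasn't", "cannot"}

def count_tone(article: str):
    words = re.findall(r"[a-z']+", article.lower())   # alphabetic tokens only
    pos = neg = 0
    for i, w in enumerate(words):
        if w in POSITIVE:
            # flip the positive word if a negate word appears within the
            # three preceding words (Loughran and McDonald 2011)
            if any(x in NEGATE for x in words[max(0, i - 3):i]):
                neg += 1
            else:
                pos += 1
        elif w in NEGATIVE:
            neg += 1
    return len(words), pos, neg

print(count_tone("Sales did not improve, and the decline was strong."))  # (9, 1, 2)
```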

Please note:

  • The Python program first transforms an article into a bag of words in their original order. Different research questions may define “word” differently; for example, some only consider alphabetic words (i.e., they remove all numbers in an article). I use this definition in the following Python program, but you may want to change it to suit your research question. In addition, there are many nuances in splitting sentences into words; the splitting method in the following Python program is simple but, of course, imperfect.
  • To use the Python program, you have to know how to assign the full text of an article to the variable article (using a loop) and how to output the results into a database-like file (SQLite or CSV).

I acknowledge the work done by C.J. Hutto (see his work at GitHub).

[Original Post] I found two internet resources for this task (thanks to both authors):

The first solution is far more efficient than the second, but the second is more straightforward. The first also requires knowledge of PostgreSQL and R besides Python. I borrow from both resources and write the Python code below.

Please note, to use the Python code, you have to know how to assign the full text of an article of interest to the variable text, and how to output the total word count and the counts of positive/negative words in text.

In the first part of the code, I read the dictionary (the word list) into a Python dictionary variable. The word list used here is expected to be a .txt file in the following format:

For accounting and finance research, a commonly used positive/negative word list was developed by Bill McDonald. See his website.

In the second part of the code, I create regular expressions that find occurrences of positive/negative words. The last few lines of code count the positive/negative words in the text.
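
A minimal sketch of this two-part approach is shown below; the word-list format (one “word,category” pair per line) and the inline sample are assumptions for illustration, so replace them with the actual word-list file:

```python
import io
import re

# inline stand-in for the word-list .txt file
wordlist = io.StringIO("strong,positive\ngrowth,positive\nweak,negative\nloss,negative\n")

# Part 1: read the word list into a Python dictionary
dictionary = {"positive": [], "negative": []}
for line in wordlist:
    word, category = line.strip().lower().split(",")
    dictionary[category].append(re.escape(word))

# Part 2: one regular expression per category, matching whole words only
patterns = {cat: re.compile(r"\b(?:" + "|".join(words) + r")\b")
            for cat, words in dictionary.items()}

text = "Revenue growth was strong, but the outlook remains weak.".lower()
total_words = len(re.findall(r"\b[a-z']+\b", text))
pos_count = len(patterns["positive"].findall(text))
neg_count = len(patterns["negative"].findall(text))
print(total_words, pos_count, neg_count)   # 9 2 1
```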

Posted in Python | 14 Comments

How to remove duplicate GVKEY-DATADATE when using Compustat Annual (FUNDA) and Quarterly (FUNDQ)?

The annual data (FUNDA) are easy to deal with; we just need to apply the following conditions:

indfmt=="INDL" & datafmt=="STD" & popsrc=="D" & consol=="C"

If we have converted FUNDA to Stata format, the uniqueness of GVKEY-DATADATE can be verified using the following Stata command:

duplicates report gvkey datadate if indfmt=="INDL" & datafmt=="STD" & popsrc=="D" & consol=="C"

This command will return “no duplicates”.

The quarterly data (FUNDQ) are a little more complicated. First of all, applying the same conditions won’t work: in fact, 99.7% of observations in FUNDQ already satisfy these conditions, yet duplicate GVKEY-DATADATE pairs still exist. The root cause of the duplicates is a firm changing its fiscal year end. I use the following example as an illustration:

Variable definitions: FYEARQ – fiscal year; FQTR – fiscal quarter; FYR – fiscal year-end month; DATACQTR – calendar quarter; DATAFQTR – fiscal quarter; ATQ – total assets; NIQ – quarterly net income; NIY – year-to-date net income.

In this example, duplicates exist for three DATADATEs: 2010-03-31, 2010-06-30, and 2010-09-30. The data suggest that on March 31, 2010, the firm changed its fiscal year end from March 31 to December 31 (i.e., FYR changed from 3 to 12). As a result, 2010-03-31 appears twice in FUNDQ: one record is fiscal 2009Q4 (based on the old fiscal year end) and the other is 2010Q1 (based on the new fiscal year end). FUNDQ also reports additional duplicates for the subsequent two quarters (I don’t know why). Next, if we compare NIQ and NIY as highlighted in the red rectangle, the observation for fiscal 2009Q4 shows NIY > NIQ, which makes sense because NIY is the four-quarter sum while NIQ is single-quarter net income. In contrast, the observation for fiscal 2010Q1 shows NIQ = NIY, because both are single-quarter net income in this case.

So, what’s the best strategy to remove duplicate GVKEY-DATADATE pairs?

Before we answer this question, let’s take a closer look at the duplicate GVKEY-DATADATE pairs in FUNDQ, which reveals that 99.8% of GVKEY-DATADATE pairs in FUNDQ are unique as of December 5, 2017. This suggests that no matter how we deal with the duplicates, even if we simply delete all of them, our results probably won’t change in a noticeable way.

That said, if we want to remove duplicates in a more careful way, COMPUSTAT gives the following clue:

In the definition of DATAFQTR, COMPUSTAT notes that,

Note: Companies that undergo a fiscal-year change may have multiple records with the same datadate. Compustat delivers those multiple records with the same datadate but each record relates to a different fiscal year-end period.

Rule: Select records from the co_idesind data group where datafqtr is not null, to view as fiscal data.

Unfortunately, I find that the suggested rule is not the best strategy, because COMPUSTAT seems to set DATAFQTR to missing or non-missing in an inconsistent way. In my opinion, the best strategy is to retain the GVKEY-DATADATE record that reflects the most recent change of fiscal year end. This means, in the above example, we should delete the following observations:

    • DATADATE = 2010-03-31 and FYR = 3
    • DATADATE = 2010-06-30 and FYR = 3
    • DATADATE = 2010-09-30 and FYR = 3

Suppose we have converted the FUNDQ raw file to Stata format; the following Stata code implements the above strategy and saves the results in FUNDQ_NODUP. By the way, this is the cleverest Stata code I’ve ever written 🙂 The code also creates a new variable FYQ (or CYQ) that represents the fiscal (or calendar) year and quarter. It removes not only duplicate GVKEY-DATADATE pairs but also duplicate GVKEY-FYQ (or GVKEY-CYQ) pairs, which then allows us to use the tsset command and do lag and change calculations in Stata, e.g., to get beginning-of-quarter total assets or to calculate seasonal changes in sales. Stata really shines in lag and change calculations for panel data, a superb advantage over SAS.
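
For readers who work in Python, the same strategy can be sketched in pandas as follows (this is not the Stata code, and it assumes each firm changes its fiscal year end at most once within the sample window); the toy data mirror the example above:

```python
import pandas as pd

fundq = pd.DataFrame({
    "gvkey":    ["001234"] * 8,
    "datadate": pd.to_datetime(["2009-12-31", "2010-03-31", "2010-03-31",
                                "2010-06-30", "2010-06-30", "2010-09-30",
                                "2010-09-30", "2010-12-31"]),
    "fyr":      [3, 3, 12, 3, 12, 3, 12, 12],
}).sort_values(["gvkey", "datadate"])

# The FYR reported at each firm's latest DATADATE is the post-change year end.
latest_fyr = fundq.groupby("gvkey")["fyr"].transform("last")

# Keep non-duplicated records plus, among duplicates, the record whose FYR
# reflects the most recent change of fiscal year end.
dup = fundq.duplicated(["gvkey", "datadate"], keep=False)
fundq_nodup = fundq[~dup | (fundq["fyr"] == latest_fyr)]
print(fundq_nodup)
```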

Please note: I also agree with one reader’s comment that “(how to remove duplicates) depends on what you need”. For example, in one of my projects, I want to look at the three-day CAR around the earnings announcement date (RDQ) and use total assets as the deflator in my regression. As a result, when duplicate GVKEY-DATADATE pairs occur, the one with non-missing RDQ and ATQ is preferred if I want to retain as many observations as possible. The following Stata commands serve the purpose well:
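
In pandas, that preference could be expressed roughly as follows; the column names and the tie-breaking rule are assumptions:

```python
import pandas as pd

def dedup_prefer_nonmissing(fundq: pd.DataFrame) -> pd.DataFrame:
    # Among duplicated gvkey-datadate pairs, keep the record with the fewest
    # missing values in rdq and atq; ties keep the first record after sorting.
    n_missing = fundq[["rdq", "atq"]].isna().sum(axis=1)
    return (fundq.assign(_nmiss=n_missing)
                 .sort_values(["gvkey", "datadate", "_nmiss"])
                 .drop_duplicates(["gvkey", "datadate"], keep="first")
                 .drop(columns="_nmiss"))
```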

 

Posted in Data, Stata | 18 Comments

If beginning year and ending year are known, how to fill in years in between?

Question:

Suppose two companies, such as A and B, are connected for a range of years. Right now the data structure is the following:

Company 1 | Company 2 | Starting Year | Ending Year
A | B | 2000 | 2006
A | C | 1998 | 2003
C | D | 1995 | 1997

I want to find a way to generate:

Company 1 | Company 2 | Year
A | B | 2000
A | B | 2001
… | … | …
A | B | 2006
A | C | 1998
… | … | …
A | C | 2003
C | D | 1995
… | … | …
C | D | 1997

Answer:
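
A pandas sketch of one way to do this (the original answer is written in SAS):

```python
import pandas as pd

df = pd.DataFrame({
    "company1":   ["A", "A", "C"],
    "company2":   ["B", "C", "D"],
    "start_year": [2000, 1998, 1995],
    "end_year":   [2006, 2003, 1997],
})

# build the list of years covered by each row, then expand to one row per year
df["year"] = [list(range(s, e + 1)) for s, e in zip(df["start_year"], df["end_year"])]
out = (df.explode("year")
         .drop(columns=["start_year", "end_year"])
         .reset_index(drop=True))
print(out)
```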

 

Posted in SAS | 3 Comments

SAS macro to get analysts EPS consensus for a given fiscal period end (DATADATE) by a selected date (DATE)

Update: this macro contains an error! I will update it later.

I wrote this macro to compute analysts’ quarterly EPS consensus on a selected date (DATE) for a given fiscal quarter end (DATADATE). The macro can easily be modified for other types of estimates (e.g., annual EPS).

This macro currently extracts unadjusted quarterly EPS estimates (current or next quarter) issued within a specified window (e.g., WINDOW = 60 days) before the selected date (DATE), and computes the consensus as both the median and the mean. The macro also extracts the unadjusted actual EPS and puts the estimates and the actual on the same per-share basis. Please see Overview of Thomson-Reuters IBES for why this is the preferred method.

The output includes the mean and median of analysts’ EPS forecasts, the number of analyst forecasts, and the actual EPS. Please note the actual EPS is the “street” EPS from I/B/E/S and may differ from the reported EPS in Compustat (see the I/B/E/S FAQ). With these outputs, you can compute earnings surprises easily.
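
For intuition, the consensus logic can be sketched in pandas roughly as follows (this is not the SAS macro); the column names follow the I/B/E/S unadjusted detail file, and date and window correspond to the macro's DATE and WINDOW parameters:

```python
import pandas as pd

def consensus(det: pd.DataFrame, ibes_ticker, fpedats, date, window=60):
    # unadjusted detail estimates for the fiscal period end, issued within
    # `window` days before the selected date
    d = det[(det["ticker"] == ibes_ticker)
            & (det["fpedats"] == fpedats)
            & (det["anndats"] <= date)
            & (det["anndats"] > date - pd.Timedelta(days=window))]
    # keep each analyst's most recent estimate before the selected date
    d = d.sort_values("anndats").groupby("analys", as_index=False).last()
    return pd.Series({"numest": len(d),
                      "meanest": d["value"].mean(),
                      "medest": d["value"].median()})
```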

This macro is modified from the Post-Earnings Announcement Drift program on WRDS. I thank the author, Denys Glushkov, who is my favorite SAS programmer.

I looked into the I/B/E/S database more closely and noted a few facts:

  • Since 1998, only 1.3% of analysts’ forecasts have been reported on the primary basis (vs. 74.4% before 1998), so the variable basis is no longer important;
  • During the period from the issuance date of an individual quarterly EPS forecast to the announcement date of the actual EPS, a stock split or reverse stock split occurs in only 0.6% of observations. Thus, putting the estimate and the actual on the same per-share basis matters little if you specify a short window, e.g., 60 days;
  • anndats in the I/B/E/S Actuals Detail file refers to the earnings announcement date. It is supposed to be the same as rdq in Compustat Quarterly; however, this is not true for 20% of observations. WRDS suggests that rdq is more accurate (see here).

 

Posted in SAS | Comments Off on SAS macro to get analysts EPS consensus for a given fiscal period end (DATADATE) by a selected date (DATE)