Lawsuit data from Stanford Law School’s Securities Class Action Clearinghouse

Posted on July 15, 2018 by Kai Chen

The Python script in the original post has been removed as its use violates the Terms of Service of the data provider.

Stanford Law School’s Securities Class Action Clearinghouse is always happy to share the data (subject to a Non-Disclosure Agreement) with academic researchers for non-commercial research or analysis. If you have any data needs, please contact their SCAC Content Manager at scac@law.stanford.edu.

This entry was posted in Python and tagged Python.

14 Responses to Lawsuit data from Stanford Law School’s Securities Class Action Clearinghouse

Griffin Geng says:

February 23, 2019 at 12:35 am

Awesome! Thanks for sharing!

Tigran says:

October 2, 2019 at 4:47 pm

I was about to go through building a scraper for this from scratch… you saved me so much time! This is great!

Tianhua says:

March 9, 2021 at 7:05 pm

Hi Dr. Chen,
Thanks so much for this code. I got stuck using it because the Securities Class Action Clearinghouse requires a login to get the full data. I tried the “mechanize” package to log in, but it doesn’t work. Do you have any ideas about how to get access to the website?

- Yannick says:
  
  May 22, 2021 at 5:28 pm
  
  Hi, do you have a solution to this problem?
  
  Kind regards,
  Yannick
  
- Kai Chen says:
  
  January 8, 2022 at 12:04 pm
  
  See the update on Jan 8, 2022.
  
Pengyuan li says:

March 21, 2021 at 9:32 am

Added error handling in the get_class_period function to avoid the issue when a case’s status is currently Active.

def get_class_period(soup):
    # Locate the section that holds the class period dates
    section = soup.find("section", id="fic")
    try:
        text = section.find_all("div", class_="span4")
        start_date = text[4].get_text()
        end_date = text[5].get_text()
    except (AttributeError, IndexError):
        # Section missing (AttributeError) or too few divs (IndexError)
        start_date = 'null'
        end_date = 'null'
    return start_date, end_date

- Md Enayet Hossain says:
  
  April 11, 2022 at 12:47 am
  
  Thanks for the correction. But this only solves the error issue. It does not return the class period for any lawsuits. Any idea how I can get the class period and access the lawsuit files? The HTML does not even show the contents beyond the case summary.
  
Yuchen says:

March 31, 2021 at 10:20 pm

How do you parse the settlement value?

Mengxi Chen says:

November 3, 2021 at 9:37 am

Hi Dr. Chen and Shiyu! Thank you so much for sharing this! I appreciate it!

Elisha Yu says:

May 4, 2022 at 3:52 pm

Thank you so much, Dr. Chen!

Just a small note: you need to set the Chrome default to maximize the window, or add this before line 18:
driver.maximize_window()

Marcelo Ortiz says:

July 22, 2022 at 6:00 am

Thanks for sharing the files and code, very useful!

Lin says:

November 5, 2022 at 5:43 am

Dr. Chen,

This is awesome. Thank you for your generous sharing!

carlos rivas says:

January 17, 2023 at 9:11 pm

hi kai,

thanks again for making your code available.
i am also trying to scrape the legal documents/pdfs.
this code works to download pdf files from other urls but not the pdfs from the stanford class action clearinghouse (the only difference i can see is that a login is required, but i am already logged in by the time we reach this code):

import requests

# file_url = "https://www.bu.edu/econ/files/2014/08/DLS1.pdf"
file_url1 = 'http://securities.stanford.edu/filings-documents/1080/IBMC00108070/2023113_f01c_23CV00332.pdf'
r = requests.get(file_url1, stream=True)

# write the response body to disk in 1 KB chunks
with open("C:/Users/inter/OneDrive/Desktop/securities_class_action_docs/test.pdf", "wb") as file:
    for block in r.iter_content(chunk_size=1024):
        if block:
            file.write(block)

- carlos rivas says:
  
  January 26, 2023 at 2:50 am
  
  i meant to ask you. how do you think i can download the pdfs into google drive as pdfs and not htmls?
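
One possible explanation for the symptom described in this thread (the saved file turns out to be HTML rather than a PDF) is that the server returned a web page, such as a login form, instead of the document. That is an assumption, not something confirmed by the thread. Below is a minimal sketch of how one might detect this before saving: PDF files start with the bytes `%PDF-`, so the first bytes of the response can be checked. The names `PDF_MAGIC`, `looks_like_pdf`, and `save_if_pdf` are hypothetical helpers introduced for illustration, and the demo uses in-memory byte chunks rather than a real download.

```python
import os
import tempfile
from typing import Iterable

PDF_MAGIC = b"%PDF-"  # PDF files begin with this marker


def looks_like_pdf(first_bytes: bytes) -> bool:
    """Return True if the data starts with the PDF header marker."""
    return first_bytes.startswith(PDF_MAGIC)


def save_if_pdf(chunks: Iterable[bytes], path: str) -> bool:
    """Write chunks to path only if the stream starts with the PDF marker.

    Returns True if the file was written, False otherwise.
    """
    iterator = iter(chunks)
    # Collect enough leading bytes to inspect the header.
    head = b""
    for block in iterator:
        head += block
        if len(head) >= len(PDF_MAGIC):
            break
    if not looks_like_pdf(head):
        # Probably an HTML page (for example a login form), not a PDF.
        return False
    with open(path, "wb") as file:
        file.write(head)
        # Write whatever remains in the stream, skipping empty chunks.
        for block in iterator:
            if block:
                file.write(block)
    return True


if __name__ == "__main__":
    # Illustrative in-memory data standing in for a downloaded response.
    html_chunks = [b"<!DOCTYPE html>", b"<html>...</html>"]
    pdf_chunks = [b"%PDF-1.4\n", b"...rest of file..."]
    with tempfile.TemporaryDirectory() as tmp:
        print(save_if_pdf(html_chunks, os.path.join(tmp, "a.pdf")))  # False
        print(save_if_pdf(pdf_chunks, os.path.join(tmp, "b.pdf")))   # True
```

With the code from the comment above, the chunks would come from `r.iter_content(chunk_size=1024)`. A return value of False suggests that the response was not a PDF, in which case printing `r.headers.get("Content-Type")` may help show what was actually returned.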
  