I am currently trying to analyze the presidential campaign contributions made by employees of large US companies for my
thesis ( I am trying to link contributions made by all emplyees to stock returns on election day). I have a few problems
concerning linking the data.
I have a table with the names of the companies, their ticker, CEO name, etc, which looks like this:
Ticker Company Name Market Value Industry CEO Name
ADSK Autodesk 342 19 Carol A. Bartz
AEE Ameren 265 24 Gary L. Rainwater
AES AES 242 24 Paul T. Hanrahan
AET Aetna 115 14 John Wallis Rowe
I also have a table (with over 740,000 records) which contains the campaign contributions to certain candidate
committees, as follows:
Committee Id Contributor Occupation Amount
C00010603 Sant, Roger W Mr. AES Corporation/Chairman 2000
C00383653 Saint, Alexia AES Corparation/Business Analyst 1000
C00383653 Jayaratne, Josh AES Consulting LLC/Consultant 200
C00383653 Crane, Eric C Autodesk Inc./Attorney 1000
C00003418 Cohen, Sarah Ms. A.T. & T./Sales 250
C00010603 lucas, jack AT&T/software engineer 250
C00010603 Ludolph, Charles Consultant 200
C00010603 Lucas, Freddy self/Dad 999
C00010603 Luckhardt, Jack C Dr. Retired 300
C00010603 Luckhardt, John C 500
C00010603 Marsh, Kris Ms. Aspen Valley Medical/Nonprofit Exec 250
C00010603 Marsh, Dave self employed/audio producer 500
C00010603 Marsh, Leslie RMACInc./Insurance Executive 250
The two links I would like to make are:
1. Using the company name from the first table(and linking with the Occupation field in the second table), find the total
donations made by employees of a company to a specific committee,
2. Using the CEO name (and linking it with Contributor field of the second table) find out what the contribution is of the
CEO of each company to a specific Committee.
The problem with the first link is that the company name in the second table is part of the general field "Occupation" which
also contains the job title, such as "AT&T/software engineer" so that I cannot make a direct link with the "AT&T" company
name from the first table. Is their a possible way to link one table with another using wildcards or is their possible
another solution (looking within a field for a certain string)?
There is also the problem that different names are used, for example AT&T and A.T. & T. or AES, AES Corp and AES Corporation.
I think the solution to this is to try and make a table with all possible versions of a companies name, or would their be a
simple way to automate this as well?
The problem with the second link is that the names of contributors aren't written in a standard form, which also means a
direct relationship cannot be made. Any ideas on this?
As you can read, it is quite a complex problem, and as I have no programming skills I am finding it hard to link this data. I
would appreciate any input on this, as I cannot manually go through 740,000 records to look at the amounts that employees of
all S&P 500 companies have conributed.
Thanks in advance for your help!