
Note: This article is the first in a series on recent research conducted by Walton College students.

Social media has ingrained itself in our society. Politicians and voters can now engage with each other like never before on websites like Twitter and Facebook. Historically, voters have had to take politicians at their word and rely on journalists to report on their legislative actions. Now, the public has a near-complete record of what politicians say to their followers online and how they act legislatively. Furthermore, we can analyze the similarities and differences between their words and their political actions, potentially holding policy makers to an even higher level of accountability.
 
University of Arkansas seniors Sai Elagandula and Griffin Fulton noticed how politicians were using Twitter to campaign and communicate with the American public throughout recent election cycles. Elagandula and Fulton wanted to analyze how senators’ tweets stacked up against their legislative decisions. To do so, they trained a classification model to track the deviation between political ideology predicted from tweets and ideology based on legislative actions.
 
They used a political ideology tracker published by GovTrack, a website that enables users to track bills and members of Congress, as a basis for comparison. From Jan. 3, 2019, through Jan. 3, 2020, GovTrack tracked 100 U.S. senators and assigned each of them a score from most politically right to least according to their legislative behavior. Topping GovTrack’s list for most politically right was Marsha Blackburn (R – TN) with a score of 1.00; at the other end of the spectrum was Bernie Sanders (I – VT) with a score of 0.00.
Building the Right Data Set
To identify the deviations between online discourse and legislative action, Elagandula and Fulton trained their model to categorize senators’ tweets as Democratic or Republican. As the standard for partisan affiliation, they used tweets from the official Twitter accounts of the Democratic and Republican parties, as well as those of the DNC and GOP chairs. They employed snscrape, an open-source social network scraper, to collect tweets from the official party accounts for the data set.
 
The DNC data set contained 6,752 tweets and the GOP set contained 6,166 tweets. Elagandula and Fulton labeled each tweet Democratic or Republican accordingly. They then used BERT (Bidirectional Encoder Representations from Transformers), a transformer-based machine learning technique for language processing pre-training developed by Google, to turn tweets into encoded values. A percentage of the words in the tweets was replaced with a masked token, and BERT attempted to predict these words. Elagandula and Fulton then sought to minimize the loss function by fine-tuning weights and biases. In the end, their model achieved an accuracy of 93%.
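
The masking step described above can be illustrated with a short sketch. This is a simplified illustration, not the students' code: the function name `mask_tokens`, the example sentence and the whitespace tokenization are assumptions made here for clarity. BERT itself works on subword tokens, and its published masking procedure also sometimes substitutes a random token or leaves the chosen token unchanged.

```python
import random

MASK_TOKEN = "[MASK]"


def mask_tokens(text, mask_fraction=0.15, seed=0):
    """Replace roughly mask_fraction of the words in text with MASK_TOKEN.

    Returns the masked text and a dict mapping each masked position to the
    original word, which is what a model would be asked to predict.
    Splitting on whitespace is a simplification (hypothetical helper).
    """
    words = text.split()
    if not words:
        return text, {}
    rng = random.Random(seed)  # fixed seed keeps the example repeatable
    n_to_mask = max(1, round(len(words) * mask_fraction))
    positions = sorted(rng.sample(range(len(words)), n_to_mask))
    answers = {i: words[i] for i in positions}
    masked = [MASK_TOKEN if i in answers else w for i, w in enumerate(words)]
    return " ".join(masked), answers


# Made-up sentence, not taken from the data set.
example = "We will fight for lower taxes and secure borders for every family"
masked_text, hidden_words = mask_tokens(example, mask_fraction=0.2, seed=1)
print(masked_text)
print(hidden_words)
```

The model sees only the masked text and is scored on how well it recovers the hidden words, which is the loss that training then tries to minimize.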
 
They applied the model to tweets that senators posted during the 117th Congress, from the start of the legislative session to April 20, 2022, and tracked how many were categorized as the opposite party. Only 86 senators are ranked, rather than the full 100, because tweets were collected from senators in the current Congress (117th) while the GovTrack rankings were made at the end of the previous Congress (116th). Senators who did not appear on both lists were thus omitted. Using these statistics, they then calculated a match rate to derive their own ranked list of predicted senator political ideology based on tweets. Elagandula and Fulton then ran a regression analysis to see how well Twitter characteristics, margin of victory and years in office explain senators’ match rate.
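
As a rough sketch of the match-rate idea, the snippet below computes the share of a senator's tweets that a classifier assigned to the senator's own party, then orders senators by the share of tweets labeled Republican. The article does not give the exact formula or ranking rule, so both are assumptions here, as are the function names and the made-up senators and labels.

```python
def match_rate(predicted_labels, party):
    """Share of tweets whose predicted label equals the given party ("D" or "R")."""
    if not predicted_labels:
        raise ValueError("need at least one classified tweet")
    matches = sum(1 for label in predicted_labels if label == party)
    return matches / len(predicted_labels)


def rank_most_right_first(predictions_by_senator):
    """Order senators by the share of their tweets classified Republican.

    Assumption: this stands in for the 'predicted ideology' ranking; the
    students' actual ranking rule is not described in the article.
    """
    shares = {
        name: match_rate(labels, "R")
        for name, labels in predictions_by_senator.items()
    }
    return sorted(shares, key=lambda name: shares[name], reverse=True)


# Made-up classifier output for three hypothetical senators.
predictions = {
    "Senator A (R)": ["R", "R", "R", "D"],
    "Senator B (D)": ["D", "D", "R", "D", "D"],
    "Senator C (D)": ["D", "R", "R", "D"],
}
parties = {"Senator A (R)": "R", "Senator B (D)": "D", "Senator C (D)": "D"}

for name in rank_most_right_first(predictions):
    own_party_rate = match_rate(predictions[name], parties[name])
    print(f"{name}: match rate {own_party_rate:.2f}")
```

A senator whose tweets are often labeled as the opposite party gets a lower match rate and moves toward the middle of the tweet-based ranking.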
 
Fulton was especially intrigued by their regression analysis. The political ideology match rate almost always correlated with the explanatory variables they had chosen: each politician’s average number of tweets per week, the number of followers on their account, their margin of victory in the most recent election and the amount of time they had spent in office. They were fascinated by the factors they discovered contributing to the perceived discrepancy between what politicians tweet and the legislative actions they take.
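
To make the regression step concrete, here is a minimal sketch that fits a linear model of match rate on the four explanatory variables named above. The article does not say which kind of regression was used, so ordinary least squares is an assumption, and the function `fit_ols`, the data and the coefficients are all invented for illustration.

```python
def fit_ols(rows, targets):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    rows: one list of explanatory values per senator; an intercept
    column of ones is added here. Returns [intercept, b1, b2, ...].
    """
    X = [[1.0] + [float(v) for v in row] for row in rows]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("explanatory variables are collinear")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Synthetic rows: tweets per week, followers (millions),
# margin of victory (points), years in office. Not real senators.
features = [
    [10, 0.5, 5, 2],
    [25, 1.2, 12, 8],
    [40, 3.0, 3, 14],
    [15, 0.2, 20, 20],
    [60, 8.0, 9, 6],
    [5, 0.1, 30, 30],
    [33, 2.5, 15, 11],
]
# Invented coefficients used only to generate the synthetic match rates.
true_coefficients = [0.90, -0.001, 0.004, 0.002, -0.001]
match_rates = [
    true_coefficients[0]
    + sum(c * v for c, v in zip(true_coefficients[1:], row))
    for row in features
]

estimated = fit_ols(features, match_rates)
names = ["intercept", "tweets_per_week", "followers_millions",
         "margin_of_victory", "years_in_office"]
for name, value in zip(names, estimated):
    print(f"{name}: {value:+.4f}")
```

Because the synthetic match rates are generated from a known linear rule, the fit should recover the invented coefficients; with real data the estimates would come with noise and would need standard errors before drawing conclusions.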
 
When Elagandula and Fulton compared the list they created with GovTrack’s rankings, they were surprised by some of the senators’ placements. Some politicians were right on the money. For example, Senator Blackburn’s tweets were almost 100% Republican, and Ted Cruz, who ranked No. 4 for most-right legislative behavior, rose to No. 3 for his predicted ideology based on tweets. What is remarkable, however, is how Senator Sanders’ ranking climbed from No. 86, the most left-leaning position for political ideology based on legislation, to No. 61 based on his tweets. According to Elagandula, this is largely due to his discourse on Twitter about domestic policy issues, which are typically discussed by Republican senators and traditionally resonate with the right.
Trust the Tweets
This research may be encouraging for younger people. Roughly 37% of those who identify as Democrats on Twitter are between the ages of 18 and 29, and 22% of Republican users also fall in that demographic, according to a 2020 Pew Research Center study. As the world moves online, these numbers are growing, and political campaigning has largely transitioned to targeted ads on social media. Twitter is currently the main platform policy makers are using to communicate with voters, Elagandula and Fulton said, which is why the public should be wary of what they read online when it pertains to politics.
 
By tracking and recording what politicians say on Twitter, the public can better hold policy makers accountable. The discrepancy between what politicians say and do can now be explained, or at least monitored, by research using machine learning models. Elagandula and Fulton’s project shows how accurate a text classification model can be, especially when compared with GovTrack’s political ideology chart. However, the breadth of political affiliation in the U.S. limited what a two-party model could capture.
 
Although Elagandula and Fulton had a short time frame and a narrow scope, they encourage others to further train their model to identify ideological subgroups. Politicians and their tweets could be considered alt-left or alt-right, but the current model, which used the official party accounts as a foundation for categorizing tweets as right or left, would not be able to identify them as such. The open-source online community has proven useful, and many machine learning tools are free on the internet, according to Elagandula and Fulton. All of the tools they used to create their model were available for free online, through sources like GitHub and tools like snscrape. The seniors’ project and methods can be viewed on their own GitHub portfolio. The accessibility of tools such as these speaks to the promise of continuing their research or creating new projects.
 
If researchers monitor the words and actions of politicians, transparency, accountability and trust between the government and the public can all increase. Elagandula and Fulton encourage Twitter users to take what they see on the platform with a grain of salt, and they believe their project can be a helpful tool for deciding whom to trust and how to vote.


 
  
  
 