Democratic Debate #1: Television Coverage
The GDELT Project and the Internet Archive are partnering to better understand which soundbites and speakers are dominating the political discourse on television. In particular, we are working to translate the social media concepts of "memes" and "going viral" to the television world. We use the Internet Archive's Television News Archive, which monitors major American and international television stations in realtime and holds an archive of more than 735,000 television shows since 2009. We scan all monitored television programming for audio fingerprints of each soundbite of selected major political speeches and identify all excerpts of those soundbites across television news shows in the following days. The tool we use, audfprint, developed by the Laboratory for the Recognition and Organization of Speech and Audio at Columbia University, scans the audio track of each show, so it does not depend on closed captioning, which is extremely noisy and entirely absent from many foreign language broadcasts. The tool is extremely sensitive, able to detect brief excerpts even when a commentator or other sound effects are dubbed over them.
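The matching idea can be illustrated with a toy example. This post does not describe audfprint's internals, so the sketch below is not its implementation. It only shows the general offset-voting approach used by landmark-style audio fingerprinting: hashes shared between a show and a soundbite vote for a time offset, and a strong consensus on one offset suggests the soundbite was excerpted there. The function name, the integer hash values, and the frame indices are all hypothetical.

```python
from collections import Counter


def best_alignment(reference, excerpt):
    """Return (offset, votes) for the most common time offset between shared hashes.

    Both arguments are lists of (hash_value, frame_index) pairs.
    Returns (None, 0) when the two recordings share no hashes.
    """
    # Index the reference by hash so each excerpt hash can be looked up quickly.
    ref_times = {}
    for hash_value, frame in reference:
        ref_times.setdefault(hash_value, []).append(frame)

    # Every shared hash votes for the offset (reference frame - excerpt frame).
    votes = Counter()
    for hash_value, frame in excerpt:
        for ref_frame in ref_times.get(hash_value, []):
            votes[ref_frame - frame] += 1

    if not votes:
        return None, 0
    return votes.most_common(1)[0]


# Made-up fingerprints: the clip's hashes 330, 412 and 518 all occur
# 7 frames later in the show; hash 999 stands in for overdubbed noise.
show = [(101, 0), (205, 3), (330, 7), (412, 12), (518, 15), (623, 19)]
clip = [(330, 0), (999, 2), (412, 5), (518, 8)]

print(best_alignment(show, clip))  # prints (7, 3)
```

Because unrelated hashes rarely agree on a single offset, a scheme like this can tolerate some hashes being destroyed by overdubbed commentary, which is consistent with the sensitivity described above.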
Today we are excited to unveil our latest application: the Democratic Presidential Primary Debate, held at 8:30PM EDT on October 13, 2015 at the Wynn Las Vegas hotel in Las Vegas, Nevada. The entire transcript of the debate was hand-segmented into soundbites, and all television news programming monitored by the Internet Archive for 24 hours following the debate was scanned for any excerpt of those soundbites. The results are displayed below. Browse the entire transcript and click on any passage to see how many times and where it was excerpted. Click on the video icon to the left of each passage, or on the list of shows mentioning the excerpt in the bottom right, to view a brief video clip of the soundbite. These numbers reflect only those television shows monitored by the Internet Archive, which represent a small set of television stations in the United States. Thus, these numbers are far from exhaustive in terms of measuring the total reach of the debate, but they offer a powerful glimpse into which pieces of the debate resonated and where.
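The counting step described above can be sketched as follows. This is a minimal illustration, not the project's actual pipeline: the record layout (soundbite id, station, air time), the soundbite ids, the station names, and the window start time are all hypothetical, and the text does not say exactly when the 24-hour window begins.

```python
from collections import Counter
from datetime import datetime, timedelta


def count_airings(matches, window_start, window=timedelta(hours=24)):
    """Count detected excerpts per soundbite inside [window_start, window_start + window)."""
    window_end = window_start + window
    counts = Counter()
    for soundbite_id, _station, aired_at in matches:
        if window_start <= aired_at < window_end:
            counts[soundbite_id] += 1
    return counts


# Hypothetical detections: (soundbite id, station, air time).
matches = [
    ("sb-001", "STATION_A", datetime(2015, 10, 13, 23, 45)),
    ("sb-001", "STATION_B", datetime(2015, 10, 14, 7, 10)),
    ("sb-002", "STATION_A", datetime(2015, 10, 14, 9, 0)),
    ("sb-001", "STATION_A", datetime(2015, 10, 15, 8, 0)),  # after the window closes
]

# Assumed start of the monitoring window (not stated in the text).
window_start = datetime(2015, 10, 13, 23, 0)

counts = count_airings(matches, window_start)
for soundbite_id, n in counts.most_common():
    print(soundbite_id, n)
```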
What you are seeing here is a first glimpse of a whole new way of exploring television: powerful computer algorithms become a new lens on the Internet Archive's massive archive of television news, making it possible for the first time to track what's "going viral" on television. Quite literally, this project took a full political debate, broke it into soundbites, and scanned the following 24 hours of monitored television news programming for any excerpt of any of those soundbites. Imagine the future possibilities for tracking how soundbites move between social and mainstream media, and the future ability to apply these techniques to explore soundbites in online video!
Visualizing the Debate
The final results of this analysis are available through the interface below. By default the entire debate transcript is shown, but you can use the search box below to narrow it to only those soundbites that contain a particular keyword or that were aired on a particular station or show. The timeline below shows how many times each soundbite was broadcast. As you scroll through the transcript, the top-most paragraph will automatically highlight in yellow and the corresponding time period will highlight in the timeline below, while the sidebar to the right of the transcript will display key statistics about that passage, along with a list of links to view previews of every identified mention of that soundbite on a news show. The timeline allows you to zoom into any section to see it more clearly: click anywhere in the middle of the graph on the white background (not the bottom of the graph) and drag with your mouse to highlight a section of the timeline. A "reset zoom" button will appear at the top right of the timeline display to zoom back out to the original view. While zoomed in, you can hold down the shift key on your keyboard and click and drag to pan the timeline forward or backward.
By default the entire debate transcript is displayed below. You can use the options below to filter to only a subset of the debate, such as only those lines appearing on a particular television station or show, or only those lines containing a certain keyword/phrase or spoken by a particular person. Only lines matching all of your criteria below are included.
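The "matching all of your criteria" behavior is a logical AND across whichever filters are set. A minimal sketch of that logic follows. It assumes a hypothetical record layout (speaker, text, and the set of stations that aired the line) and does not reflect the interface's actual code. Apart from the Sanders line quoted from the transcript, the sample lines and station names are placeholders.

```python
def line_matches(line, speaker=None, keyword=None, station=None):
    """Return True only if the line satisfies every criterion that was supplied."""
    if speaker is not None and line["speaker"].lower() != speaker.lower():
        return False
    if keyword is not None and keyword.lower() not in line["text"].lower():
        return False
    if station is not None and station not in line["stations"]:
        return False
    return True


# Hypothetical records; only the first text is taken from the transcript.
transcript = [
    {
        "speaker": "SANDERS",
        "text": "the American people are sick and tired of hearing about your damn e-mails.",
        "stations": {"STATION_A", "STATION_B"},
    },
    {
        "speaker": "SPEAKER_B",
        "text": "Placeholder line that also mentions e-mails.",
        "stations": {"STATION_B"},
    },
    {
        "speaker": "SANDERS",
        "text": "Placeholder line about something else.",
        "stations": set(),
    },
]

# All three criteria must hold at once, so only the first record survives.
selected = [
    line for line in transcript
    if line_matches(line, speaker="sanders", keyword="e-mails", station="STATION_B")
]
print(len(selected))  # prints 1
```

Leaving a criterion unset (`None`) means it places no restriction, which matches the default view of the full transcript.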
VIEW BY-STATION BREAKDOWN
(Displays a grid of pie charts, one per television network, each showing the percentage of matching soundbites on that network that came from each candidate.)
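The per-network percentages behind such a breakdown could be computed along these lines. This is a sketch under assumptions, not the site's own code: the input is taken to be one (station, speaker) pair per detected airing, and the station names, the second speaker label, and the counts are made up for illustration.

```python
from collections import Counter, defaultdict


def station_breakdown(airings):
    """Map each station to {speaker: percentage of that station's matched soundbites}."""
    counts = defaultdict(Counter)
    for station, speaker in airings:
        counts[station][speaker] += 1

    breakdown = {}
    for station, per_speaker in counts.items():
        total = sum(per_speaker.values())
        breakdown[station] = {
            speaker: 100.0 * n / total for speaker, n in per_speaker.items()
        }
    return breakdown


# Hypothetical airings: one (station, speaker) pair per detected excerpt.
airings = [
    ("STATION_A", "SANDERS"),
    ("STATION_A", "SANDERS"),
    ("STATION_A", "CANDIDATE_B"),
    ("STATION_A", "CANDIDATE_B"),
    ("STATION_B", "SANDERS"),
]

for station, shares in sorted(station_breakdown(airings).items()):
    print(station, shares)
```

Each station's percentages sum to 100, so every pie chart is normalized to that network's own total rather than to the overall number of airings.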
(SANDERS) Let me say -- let me say something that may not be great politics. But I think the secretary is right, and that is that the American people are sick and tired of hearing about your damn e-mails. (APPLAUSE)