VoxGov | eDesiderata

CRL Status:

Expired Offer


Overview

VoxGov, first launched for subscription access in January 2014, is a unique “discovery platform” which aggregates a broad range of official and ephemeral information resources issued by individual representatives and organizations from all branches of the U.S. Federal Government, and links that content to publicly accessible government documentation. It is an online subscription-based service primarily used for Political Science and Public Policy research.

VoxGov's 62M documents come from approximately 8,500 sources located across more than 600,000 website locations, including approximately 45,000 unique government web locations as well as social media sites of government representatives and agencies. The provider performs continuous daily indexing and caches web content, including “information that is no longer retrievable.” VoxGov asserts that many governmental press releases, as well as social media posts, have “never before been aggregated, much less collected in real-time.”

Provider

voxgov

May 31, 2024 4:27pm

Details

Subject Areas

Social Sciences

Resource Types

Archival materials

Correspondence

Government publications

Major Languages

CRL Review

Collection Content

First launched in January 2014, VoxGov is a discovery platform that aggregates a broad range of official and ephemeral information resources issued by individual representatives and organizations from all branches of the U.S. Federal Government, and links that content to publicly accessible government documentation. VoxGov harvests content in "real-time" through continuous indexing and caches web content for discovery, retaining information that may no longer be online.

According to the provider, the "Elections" section of the platform was heavily used during the 2016 Presidential election. The provider has confirmed that the VoxGov Election tools will be available during the 2020 Presidential Primaries and the 2020 Presidential election.

Content Sources

As of 2020, VoxGov represents that it aggregates and indexes information from over 8,500 sources located across more than 600,000 website locations, including approximately 45,000 unique government web locations as well as social media sites of government representatives and agencies. According to the provider, content is updated 24 hours a day, 7 days a week.

VoxGov harvests content from all three branches of the Federal government. For legislative content, VoxGov indexes official congressional representative websites and "release sites" (press releases, speeches, testimony, articles and editorial statements), as well as commentary posted by their offices on over 2,000 social media sites (Facebook, Twitter, and YouTube). Executive content includes release sites (news, announcements, reports) and social media from the executive office, cabinet, departments and agencies, such as the Departments of State and Defense and U.S. embassies. VoxGov also indexes judicial and regulatory information disseminated by various other federal government bodies, as well as autonomous bodies such as the Voice of America and U.S. political parties.

Additionally, the service indexes official documents such as the Congressional Record, Congressional Research Service reports, the Federal Register, legislation (bills, amendments, resolutions, etc.), and congressional documents (including testimony transcripts and committee reports). VoxGov has estimated that these primary source documents, already widely available elsewhere, represent less than 10% of VoxGov indexed content (2.2 million document files out of an estimated 30 million documents).¹ Results from these sources are reported in line with the other collected material, enabling users to track commentary and official responses to issues during all phases of the political, legislative and regulatory process. VoxGov links out to some of these sources on external platforms, including open access content on congress.gov. Federal Register content is gathered from GPO.gov, with public inspection documents from FederalRegister.gov.

Representing the published output of tens of thousands of U.S. Federal Government web locations, the resulting database yields an enormous amount of information. As of February 2019, the platform had indexed more than five million official releases and over 31 million social media posts. VoxGov indicates that more than 20,000 files are added daily.² Content ranges widely in scope, from official statements by congressional representatives on legislation to commentary and "retweets" posted on Twitter. VoxGov applies rigorous indexing to each entry to aid filtering and navigation by source, topic, and content type. There is some redundancy in the content, as official releases are disseminated and reposted via other channels or as representatives post content to multiple sites. VoxGov does not capture outlinks to other sites.

Chronological Coverage

While the VoxGov platform was first released in 2014, it includes social media posts gathered since at least 2002. VoxGov has also indexed historical documentation from official sites dating back to the 19th century, with the majority of the content falling between 1993 and present day. As new source entities and resource types are added, the developers seek to “include all associated available archives so they remain complete.”

Given VoxGov’s broad aims for harvesting content, the researcher may be left with the interesting methodological question of discerning what portion of congressional online and social media content the database will come to represent. The inherent difficulty in defining the entire U.S. federal government “domain” on the web has often been cited by authorities on preservation like James A. Jacobs in his March 2014 report Born-Digital U.S. Federal Government Information: Preservation and Access. This difficulty complicates the task of interpreting the term frequency data used in the interface. VoxGov represents that it has employed several librarians to identify and map federal agencies in detail.

Delivery

The proprietary platform developed for VoxGov combines complex features with a fairly intuitive display. The simple keyword search box prompts suggested terms and phrases for users.

The advanced search page is intuitive. It supports Boolean operators and allows many of the database’s filter elements to be applied before searching, including content type, source, source type, demographics and political parties. There is a “my voxgov” login tool allowing the user to save search strings and otherwise track various subjects or congressional figures in the news. Users can "follow" individual sources so as to quickly access a source results page.

Search results can yield an overwhelming amount of content, but a range of filters are offered to control the display of results, including filtering by document types such as press releases, congressional documents, or social media. Filtering by additional keywords, personal names, organizations (government agencies and NGOs), and place names is also available. VoxGov applies these 244 unique publishing points to simplify user discovery of content.

A unique feature is the ability to filter results by the source entity’s party affiliation, gender, race, and other demographic criteria. Results are extensively indexed, with named entities (people, places, and agencies) and keywords linked to related search suggestions. Search results can be easily sifted through using the "search within" function as well.

The VoxGov platform provides a summary results page for each search event, with content feeds, a timeline and a graph visualization. The "graph" feature can be displayed or hidden at the top of the user interface. One graph depicts a breakdown of the top sources of a search event, another represents the chronological scope of the search, and a word cloud represents the significant keywords of the search event.

VoxGov employs the “Lucene scoring model” plus some additional proprietary ranking methods to determine term relevance. Feedback on relevance and term frequency is provided in relation to the overall occurrence of a term in the source documents collected (noting the number of source documents searched), and additionally on the occurrence of the term in relation to the various indexed “entities”. The frequency of results is also presented in a graphic display, in the form of charts. Charts can be modified to display results filtered by political party or government agencies, and also graphically indicate the percentage of results within sources.
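To make the idea of term-frequency-based relevance more concrete, the following is a minimal sketch of a TF-IDF score loosely modeled on Lucene's classic similarity. It is an illustration only: VoxGov's actual ranking includes proprietary methods that are not documented here, and the function names, the whitespace tokenizer, and the sample documents are all assumptions made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace; a stand-in for a real text analyzer."""
    return text.lower().split()


def score(query, docs):
    """Score each document against the query with a simplified TF-IDF.

    tf  = sqrt(number of times the term occurs in the document)
    idf = 1 + ln((N + 1) / (df + 1)), where N is the number of documents
          and df is the number of documents containing the term
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = 0.0
        for term in tokenize(query):
            if counts[term]:
                idf = 1.0 + math.log((n_docs + 1) / (doc_freq[term] + 1))
                total += math.sqrt(counts[term]) * idf
        scores.append(total)
    return scores


# Hypothetical sample documents, not taken from VoxGov.
docs = [
    "senate budget vote delayed",
    "budget budget hearing on the budget",
    "embassy statement on trade",
]
scores = score("budget", docs)
```

In this sketch the second document scores highest because the query term occurs most often in it, and the third scores zero because the term is absent. Rare terms receive a larger idf weight than common ones, which is why the size and makeup of the indexed collection affect how frequency figures should be read.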

VoxGov is faced with the challenging problem of providing reliable publication date metadata for materials sourced from websites. They estimate that 27% of current resources do not have a publication date associated with the “starting URL page.” In that situation dates are extrapolated from document texts, and in particularly challenging cases from other contextual information such as dated resources citing the resource in question. They indicate a date accuracy rate of 99.95%. Some cases may cause confusion, such as the publication dates of supplemental and related documentation (where the original documentation was published much earlier but was included as part of a recent report).
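The fallback order described above (page metadata first, then dates found in the document text, then dates of resources that cite the document) can be sketched as follows. This is a hedged illustration, not VoxGov's implementation: the function name `infer_publication_date`, its parameters, the two recognized date formats, and the choice of the earliest citing date as an estimate are all assumptions.

```python
import re
from datetime import date

MONTH_NAMES = [
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
]
# Matches dates such as "March 5, 2019".
LONG_DATE = re.compile(r"\b(" + "|".join(MONTH_NAMES) + r") (\d{1,2}), (\d{4})\b")
# Matches dates such as "2018-11-06".
ISO_DATE = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")


def _safe_date(year, month, day):
    """Build a date, returning None for impossible values such as February 31."""
    try:
        return date(year, month, day)
    except ValueError:
        return None


def infer_publication_date(metadata_date, body_text, citing_dates=()):
    """Return (date or None, method) using a metadata > text > citations fallback."""
    if metadata_date is not None:
        return metadata_date, "metadata"

    match = LONG_DATE.search(body_text)
    if match:
        month = MONTH_NAMES.index(match.group(1)) + 1
        found = _safe_date(int(match.group(3)), month, int(match.group(2)))
        if found:
            return found, "body_text"

    match = ISO_DATE.search(body_text)
    if match:
        found = _safe_date(*(int(g) for g in match.groups()))
        if found:
            return found, "body_text"

    if citing_dates:
        # A resource cannot be newer than the earliest dated resource citing it,
        # so that date serves as a rough estimate (an assumption of this sketch).
        return min(citing_dates), "citing_resources"

    return None, "unknown"
```

Because the sketch takes the first date it finds in the text, it would show the same kind of confusion noted above: an older supplemental document embedded in a recent report could be assigned either date depending on which appears first.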

Training videos are available via https://www.voxgov.com/About/TrainingVideos but are not discernibly linked from the site.

Data downloading is available to users. As stated in the "Terms" section below, the researcher must comply with certain criteria, complete a data application, and sign a scholarly use license agreement in order to receive exported datasets.

Terms

Subscription access was previously provided by East View Information Resources. In 2018, VoxGov reported it was moving to a direct sales model.

The 2019 VoxGov General University License is intended to ensure that use is restricted to research, education and non-commercial purposes by current students and faculty at subscribing academic institutions. Institutions enter into the license agreement directly with VoxGov.

The License restricts excessive downloading of content by users. Automated searching or downloading, by use of scripted searches, robots, spiders, crawlers, or otherwise, is prohibited except with the prior written permission of VoxGov. Upon request, datasets representing substantial portions of the VoxGov product will be provided to users, subject to the terms and conditions of a data export agreement. According to VoxGov's "Data Download FAQs" the researcher must: comply with certain criteria, complete a data application, and sign a scholarly use license agreement in order to receive exported datasets.

Strengths and Weaknesses

The VoxGov database is a powerful tool to both view and analyze ephemeral information scattered in various government-produced media outlets. Though it launched in 2014, developers had been collecting content for more than ten years before that. The sophisticated and elegant interface offers powerful filtering and analysis of results, which may require continued refinement as the universe of new and archived content expands exponentially over time. The developers report that the indexing and search engine selected for development of VoxGov has solid potential for “indexing scalability.”

When VoxGov was launched in 2014 it provided access to 8.2 million documents. Today the site has almost 46.4 million documents, an increase by a factor of approximately 5.7. According to the provider, VoxGov continues to grow in terms of its current and archived content.
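The growth factor can be checked directly from the two rounded document counts reported above. In this short sketch the variable names are chosen for illustration only.

```python
# Document counts as reported in the review (rounded figures).
launch_docs = 8.2e6    # at launch in 2014
current_docs = 46.4e6  # at the time of the review

growth_factor = current_docs / launch_docs
print(f"Growth factor: {growth_factor:.2f}")  # prints "Growth factor: 5.66"
```

Because both inputs are rounded, the result is only good to about one decimal place.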

The price of VoxGov, like many databases produced for the financial and public policy markets, will represent a significant investment for some libraries. On the other hand, it will be of particular use to researchers in the fields of journalism, public administration, government, law, and politics, who are often stymied by the bewildering array of U.S. Federal government information sources. Its aggregation of ephemeral information will also be very useful for public policy and political science researchers, who will be led to related legislative and executive agency documentation in other government document databases.

Additional Reviews in Other Sources

"VoxGov," ccAdvisor, November 2017. http://ccadvisor.org/review/10.5260/CCA.199456 (accessed October 2, 2019).

"Internet Resources: June Edition" Choice360, June 2017. http://www.choice360.org/blog/internet-resources-june-edition (accessed October 2, 2019).

Endnotes

¹ "East View Presents VoxGov," http://www.eastview.com/online/voxgov (accessed February 19, 2019).

² "VoxGov," https://www.voxgov.com/
