First launched in January 2014, VoxGov is a discovery platform that aggregates a broad range of official and ephemeral information resources issued by individual representatives and organizations from all branches of the U.S. Federal Government, and links that content to publicly accessible government documentation. VoxGov harvests content in "real-time" through continuous indexing and caches web content for discovery, retaining information that may no longer be online.
As of 2019, VoxGov represents that it aggregates and indexes information from over 14,000 government web locations, including official government sites as well as social media sites of government representatives and agencies.
VoxGov harvests content from all three branches of the Federal Government. For legislative content, VoxGov indexes official congressional representative websites and "release sites" (press releases, speeches, testimony, articles and editorial statements), as well as commentary posted by their offices on over 2,000 social media sites (Facebook, Twitter, and YouTube). Executive content includes release sites (news, announcements, reports) and social media from the executive office, cabinet, departments and agencies, such as the Departments of State and Defense and U.S. embassies. VoxGov also indexes judicial and regulatory information disseminated by various other federal government bodies, as well as autonomous bodies such as the Voice of America and U.S. political parties.
Additionally, the service indexes official documents such as the Congressional Record, Congressional Research Service reports, the Federal Register, legislation (bills, amendments, resolutions, etc.), and congressional documents (including testimony transcripts and committee reports). VoxGov has estimated that these primary source documents, which are already widely available elsewhere, represent less than 10% of VoxGov-indexed content (2.2 million document files out of an estimated 30 million documents).1 Results from these sources are reported in line with the other collected material, enabling users to track commentary and official responses to issues during all phases of the political, legislative and regulatory process. VoxGov links out to some of these sources on external platforms, including open access content on congress.gov. Federal Register content is gathered from GPO.gov, with public inspection documents from FederalRegister.gov. As of February 2019, the platform has indexed more than five million official releases and over 31 million social media posts.
Representing the published output of more than 14,000 U.S. Federal Government web locations, the resulting database yields an enormous amount of information. VoxGov indicates that more than 20,000 files are added daily.2 Content ranges widely in scope, from official statements by congressional representatives on legislation to commentary and "retweets" posted on Twitter. VoxGov applies rigorous indexing to each entry to aid filtering and navigation by source, topic, and content type. Some content is redundant, since official releases are disseminated and reposted via other channels and representatives post the same content to multiple sites. VoxGov does not capture outlinks to other sites.
Although the VoxGov platform was first released in 2014, it includes social media posts gathered since at least 2002. VoxGov has also indexed historical documentation from official sites dating back to the 19th century, with the majority of the content falling between 1993 and the present day. As new source entities and resource types are added, the developers seek to “include all associated available archives so they remain complete.”
Given VoxGov’s broad aims for harvesting content, the researcher may be left with the interesting methodological question of discerning what portion of congressional online and social media content the database will come to represent. The inherent difficulty in defining the entire U.S. federal government “domain” on the web has often been cited by authorities on preservation, such as James A. Jacobs in his March 2014 report Born-Digital U.S. Federal Government Information: Preservation and Access. This difficulty complicates the task of interpreting the term frequency data utilized in the interface. VoxGov represents that it has employed several librarians to identify and map federal agencies in detail.