Gale Cengage designed Nineteenth Century Collections Online (NCCO) to be an extensive database with multiple content types, covering most regions of the world. The collection was released in twelve modules (called “Archives” by Gale) between 2012 and 2014, and totals nearly 18 million pages. Ninety source libraries provided content from over 180 primary source collections. Gale commissioned conservation work for a substantial portion of the original source materials, while around 60% of the content was sourced from existing microfilm sets.
NCCO encompasses a broad range of content sources including monographs, newspapers, manuscripts, photographs, maps, ephemera, and statistical data. The collection modules are topically oriented and intended to integrate, to varying degrees, sources from world regions including the Americas (North, Central, and South), Africa, East and South Asia, and the Middle East, as well as Europe. Most of the content in the initial four modules is in English, although the later modules expanded coverage to other languages, mostly European.
Topics of the twelve “Archives” are:
British politics and society (2012)
c. 1.7 M pages. Newspapers, periodicals, correspondence, pamphlets, etc.
Asia and the West: diplomacy and cultural exchange (2012)
c. 2 M pages. Ministerial and consular papers; foreign missions.
British theatre, music and literature: high and popular culture (2012)
c. 1 M pages. Annotated programs, compositions, correspondence, financial records.
European literature, 1790-1840: the Corvey collection (2012)
c. 5.2 M pages. Sourced from microfiche and scans produced by Belser Wissenschaftlicher Dienst; more than 9,500 titles from a specialized European literature collection; languages evenly distributed among German, English, and French.
History of science, technology, and medicine (2013)
c. 5 M pages. Includes American medical periodicals; periodicals on biology, botany, chemistry, zoology, etc.; and monographs on the theory of evolution, electricity, color theory, engineering, etc.
Photography: the world through the lens (2013)
c. 2 M pages and images. Documents photography used as historical record, reproduction, art form, and scientific methodology.
Women: transnational networks (2013)
c. 1.5 M pages. The intersection of gender and class through the era of the suffrage movement. Includes diaries and personal papers, periodicals, and biographical entries.
Europe and Africa: commerce, Christianity, civilization, and conquest (2013)
c. 1.5 M pages. Consular and colonial office documents from the U.S., Britain, Germany, etc.; colonial-era African newspapers; exploration narratives; missionary documents.
Science, Technology, and Medicine: 1780-1925, Part II (2014)
c. 3.2 M pages. Includes natural history monographs from the Huntington Library and monographs and journals from the Brill microfilm collection "Academies of Science Publications". (See the Appendix for the full list of Brill Academies of Science Publications.)
Children's Literature and Childhood (2014)
More than 3 M pages. Largely from the Baldwin Library Collection of Historical Childhood Literature at the University of Florida, including microfilmed and hardcopy sources from the 19th and early 20th centuries. Also includes monographs from the British Library.
Mapping the World (2014)
Extensive British sources, including maps and atlases from the British Ministry of Defense and the War Office, as well as travel journal manuscripts recorded by British travelers to South Asia and the Middle East. Includes maps serving a variety of functions, such as maritime navigation charts, ordnance maps, and gazetteers. All scanned from original sources in color.
Religion and the Periodical: Point of View and Perspective (2014)
c. 1.4 M pages. British and U.S. content featuring monographs, periodicals, and manuscripts from nineteenth-century religious and social reform movements and their leaders. Described by the publisher as "the intersection of religion and society . . . providing an awareness of the influence religion had in shaping culture [and] essential documentary evidence that explores the corresponding influence social forces had on religious attitudes."
Microform and original sources
Gale projected that existing microfilm collections would account for approximately 65% of the content in modules 1-4, with 35% scanned directly from original sources. (See Appendix: NCCO 1-4 Microform Sources.) Two significant sources of original source material were the British Library and The National Archives. Gale also targeted previously uncataloged material from the British courts.
Gale originally indicated that published microform sources would make up a smaller share of modules 5-8 than of the first four, with about 30% of modules 5-8 to come from Gale microfilm collections. In fact, the overall average film source content for completed modules 5-8, as reported by Gale to CRL, turned out to be around 68%. Some of the modules have a much higher proportion from film than others, as noted below. (See Appendix: NCCO 5-8 Microform Sets and Other Collection Sources.)
* History of science, technology, and medicine
Sources identified include: American Medical Periodicals, 1797-1900 from the National Library of Medicine; Scientific & Technical Periodicals from the Royal Society of London’s Catalogue of Scientific Papers, 1800-1900 from the National Academy of Natural Sciences in Philadelphia (as well as the Academy’s Minutes and Correspondence, 1812-1924); journal titles from the aggregated content in Landmarks of Science from Readex (about a 15% overlap). (See Appendix: NCCO 5 Journal Titles and Holdings from Microform; see also Appendix: American Medical Periodicals Digital Collections Overlap Analysis.) Original content: rare scientific monographs from the Burndy Library of the Dibner Institute, acquired by The Huntington Library in 2006. Overall estimated proportion of microfilm source content is 47%.
* Photography: the world through the lens
Microform content: History of Photography from Gale’s Primary Source Media microform collections. Original content includes: Records of the Copyright Office of the Stationers' Company; British Colonial Office Photographic Collection; British Admiralty Office Photographs; Photographs from the Wellcome Library for the History of Medicine. Overall estimated proportion of microfilm source content is 87.6%. (See Appendix: NCCO 6 Photography Source Collections.)
* Women: transnational networks
Microform content: History of Women from Gale’s Primary Source Media microform collections. Overall estimated proportion of microfilm source content is 99.9%. (See Appendix: NCCO 7 Title List for Monographs, Periodicals, and Manuscripts.)
* Europe and Africa
Overall estimated proportion of microfilm source content is 37%. (See Appendix: NCCO 8 Title List for Monographs, Periodicals, and Manuscripts.)
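The relationship between the four per-module figures above and the reported overall average of around 68% can be checked with a little arithmetic. The sketch below is illustrative only: the variable names are hypothetical, the page counts are the approximate figures from the module list earlier in this document, and the source does not say how Gale computed its average. The page-weighted figure is shown purely as a point of comparison.

```python
# Estimated microfilm share (%) for modules 5-8, as quoted above.
film_share = {
    "History of science, technology, and medicine": 47.0,
    "Photography: the world through the lens": 87.6,
    "Women: transnational networks": 99.9,
    "Europe and Africa": 37.0,
}

# Approximate module sizes in millions of pages, taken from the module
# list earlier in this document. These are rough figures (assumption).
pages_m = {
    "History of science, technology, and medicine": 5.0,
    "Photography: the world through the lens": 2.0,
    "Women: transnational networks": 1.5,
    "Europe and Africa": 1.5,
}

# Simple (unweighted) mean of the four module percentages.
unweighted = sum(film_share.values()) / len(film_share)

# Mean weighted by approximate page count, for comparison only.
weighted = sum(film_share[m] * pages_m[m] for m in film_share) / sum(pages_m.values())

print(f"Unweighted mean:    {unweighted:.1f}%")
print(f"Page-weighted mean: {weighted:.1f}%")
```

The unweighted mean of the four percentages comes to roughly 68%, which matches the figure reported to CRL. Weighting by the approximate page counts gives a lower value, since the largest module in this group has a below-average share of film-sourced content.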
Of modules 9-12, three contain significant portions from existing microfilm collections, while the maps module includes a large amount of original manuscript and gazetteer content from the British Library.
Overlap with other digital collections
There are questions about both comprehensiveness and duplication in NCCO, since many key documents and monographs are already available elsewhere. Gale has stated that a key selection criterion is avoidance of overlap with existing digitized 19th-century materials, especially the content scanned for the Google Books project and available in HathiTrust. Particular attention is paid to material having multiple potential sources.
On the other hand, Gale notes that collections of published documents in NCCO, such as the scientific journals, have not been selected through individual title checking against licensed or open access digital sources. For instance, at least seven of the American medical journals in module 5 can also be found in American Periodicals from the Center for Research Libraries (APCRL), distributed by ProQuest. In a sampling of 20% of the science and technology journals from that module, CRL found 50% overlap with existing digital collections, including licensed content and some open access sources.
Note that Gale has indicated plans to eventually integrate its previously digitized 19th-century content, which it expects will require re-indexing prior to mounting on the newly developed Artemis research platform.
Q & A with Gale
In 2011, prior to the initial launch of modules 1-4, questions about the product under development were submitted to Gale, with the following responses:
Q: What are the anticipated sizes of the collection modules, and the estimated overall size of the completed database?
A: “The plan for NCCO allows for a total of approximately eight to twelve million pages across four Archives per year, for the next three years, with an overall program size in excess of thirty million pages. It is not anticipated that each Archive will be equal to the next in size or document count, but rather that the total number of pages/documents will reflect the quantity of content needed to fit the research needs (and budgets) of the subject areas which the Archive is meant to support.” Gale originally indicated that additional material could be developed after the initial twelve modules, depending on market response.
Q: What percentage of the overall digital collection will be European content? What percentage will be from North America?
A: “This cannot be determined at this point. This is an evolving program and the direction it takes, guided by our Advisory Board and ad hoc experts in the subject areas on which Gale will focus, will determine where Gale looks for sources and content. For example, if Gale focuses on photography [for one Archive] as currently planned, sources would include the U.S., Germany, France, and Japan—but MAY also include institutions outside of those geographies with significant collections. Gale will announce each library partner as agreements are signed. Gale intends to react to changing research needs and an ever-changing digital landscape, and so will not seek to define the program so far into the future as to be unable to be certain that we are bringing unique materials of value to researchers and students . . . at any given time.”
Q: What percentage of the documents will be unpublished/archival documents?
A: “At present, Gale is looking outside of monographs and newspapers in a more intensive way than at the corpus of published materials; manuscripts and ephemera will play a large role in the program going forward. However, it is impossible to say, at this point, what the percentages may be.”
Q: Will the item-level metadata be exposed in various web scale discovery tools?
A: “At present, Gale is working through how we are working with discovery tools and services per our Gale Digital Collections, however progress is being made and we anticipate a program to roll-out hopefully in 2013. Per exposed metadata in general, it is viewable and available in most viewing options of the NCCO platform itself, and the underlying scanned XML content is downloadable.”
Q: What metadata schema (or schemas) will be used for the archival materials?
A: “This will be very similar to our current collections structure.” Note that Gale uses metadata authority sources developed in-house.