Portico

    Overview

    Portico preserves scholarly literature published in electronic form and ensures that these materials remain accessible to future scholars, researchers, and students. Portico is part of ITHAKA, a not-for-profit organization with a mission to help the academic community use digital technologies to preserve the scholarly record and to advance research and teaching in sustainable ways. Content is received from participating publishers, migrated to standard formats, and preserved. Access is provided to eligible libraries when a trigger event occurs. Portico is a division of Ithaka Harbors, Inc. Portico was certified by CRL in 2010 as a trustworthy digital repository.

     

    Type of Organization:
    Provider Role(s):
    Prior Names: Electronic-Archiving Initiative
    Parent Organization: Ithaka
    Year established: 2002
    Still in Operation: Yes
    Main Address

    100 Campus Drive, Suite 100
    18th Floor
    Princeton, NJ 08540
    United States

    Mission Statement

    Portico is part of ITHAKA, a not-for-profit organization that encourages the academic community to use digital technologies to preserve the scholarly record and to advance research and teaching in sustainable ways. ITHAKA's mission statement is, "ITHAKA is a not-for-profit organization that works with the global higher education community to advance and preserve knowledge and improve teaching and learning through the use of digital technologies. In two decades, we have launched three of the most transformative and widely used services in higher education: JSTOR, Portico, and ITHAKA S+R – and recently our strategic alliance with Artstor has allowed us to further enhance our mission by facilitating access to its services for researchers, teachers, and students worldwide." - https://www.ithaka.org/content/our-mission, accessed April 18, 2017

    At the time of the last CRL audit Portico did not have a mission statement, but instead stated its purpose as, "Portico is among the largest community-supported digital archives in the world. Working with libraries and publishers, we preserve e-journals, e-books, and other electronic scholarly content to ensure researchers and students will have access to it in the future." (accessed at http://www.portico.org/digital-preservation/about-us in September 2015). This statement has since been removed from the Portico site. This is due to the merging of Portico with the ITHAKA organization. However, it should be pointed out that 

     

    History

    2002: JSTOR initiates a project known as the Electronic-Archiving (e-archive) Initiative, the precursor to Portico. 

    2005: Portico launched by JSTOR and ITHAKA, with a grant of $1.5 million from The Andrew W. Mellon Foundation.

    2007: Portico fulfills its first ejournal post-cancellation access request [2].

    2009: JSTOR and ITHAKA merge to form a new organization called ITHAKA, which provides three services: JSTOR, Portico, and ITHAKA S+R.

    2009: Portico initiates D-Collections service for publishers, with commitment of content by Gale and Adam Matthew Digital.

    2009: Portico ingests the first eBooks into the archive as part of an aggregated e-journal and e-book service.

    2010: CRL certifies Portico as a trustworthy repository of e-journal content.

    2011: Portico offers a separate eBook preservation service.

    2012: Portico begins to work with the British Library to create workflows to export standardized journal content from Portico to the British Library [3].

    2013: Portico reaches milestone of preserving 25 million journal articles.

    2014: Alfred P. Sloan Foundation grant awarded to Portico, The Data Conservancy, and IEEE to design and prototype a data curation infrastructure that connects published research and associated data sets for the long-term benefit of researchers worldwide.

    2015: EBSCO commits to preserving its archive databases with Portico under the Portico D-Collections program.

    2017: Portico targets titles from small publishers (those who publish 10 titles or fewer) in response to the 2CUL report.

    2018: Portico and the Koninklijke Bibliotheek (KB), the National Library of the Netherlands, finalize an agreement to place an online replica of the Portico archive at the KB.

     

    Financial Information

    Portico is a division of ITHAKA Harbors Inc., a non-profit corporation with 2014 reported revenues of approximately $86 million. Ithaka's 2014 Form 990 reported Portico's revenue as $5,723,000 and expenses as $5,479,670 [4].

    Portico has three main sources of income: ITHAKA, grants, and additional support from library and publisher subscriptions. Portico’s goal is to be supported through a tiered subscription model. For e-journal publishers the annual financial contribution is based on a publisher's total revenues, including print and electronic subscriptions, licensing, and advertising. Current yearly rates for publishers are between $250 and $81,960. For libraries, the subscription is based on the total library materials expenditures (LME). Portico uses the definition of LME provided by the Association of Research Libraries (ARL) to determine payment. Rates for most libraries range from $1,500 to $25,462 annually.

    Subscriptions from libraries and publishers continue to grow. In June 2019, Portico had over 1,024 library members worldwide, including libraries on every continent but Africa. Publisher subscriptions have also increased, with 600 publishers archiving content in Portico. However, some publishers' content is included as part of an aggregator's collection, and so it cannot be concluded that all listed publishers are Portico subscribers.

    In 2008 Portico added another type of content with a new business model, D-Collections. Publishers pay all the costs to house the D-Collections within Portico. Access to these collections is limited to the publisher’s previous customers. This new business model lets Portico extend preservation services to new types of content while gaining additional revenue without impacting its library subscribers. Publishers buy access to Portico's tools and services and extend these to their customers. This business model gives the customers long-term access to their content without further cost to them.

    Grant funding, another source of income, does not sustain regular Portico activities, but is usually earmarked for development of a particular facet of Portico services. This is a sound financial practice because grant funding is unreliable. Portico's grants have come from a variety of funders, including the Library of Congress, the Institute of Museum and Library Services (IMLS), the National Endowment for the Humanities (NEH), and the Alfred P. Sloan Foundation. The Library of Congress contributed a startup grant of $3 million [5]. Other grants have come for the development of specific repository tools. Portico was one of three partners included in a grant project of the Library of Congress National Digital Information Infrastructure and Preservation Program (NDIIPP). The project was to develop JHOVE2 (JHOVE and JHOVE2 are repository tools for validating content). The tools Portico has participated in creating have been shared throughout the repository community and improve repository processes.

    Some additional income is earned through Portico consultations with national libraries. Portico provides data services to the British Library: it normalizes e-journal content, which helps the British Library meet UK government regulations for legal deposit of non-print publications in the UK web domain. Portico also offers preservation and access consulting services to other national libraries.

    Governance: Board / Owners / Parent organization

    Portico operates under the corporate governance of Ithaka Harbors Inc., a non-profit corporation with a Board of Directors that includes individuals from the academic, publishing, library, and business communities. Portico operations are guided by an advisory committee consisting of individuals from the library and publishing communities [2].

    ITHAKA is responsible for Portico's Human Resources, financial control, legal counsel, information technology, and tool building for subscribers. Portico's staff focuses on two areas: content management and customer services. Content management includes the activities of acquisition, ingestion, and dissemination of content. Customer service includes outreach to current and prospective subscribers and communication with content owners.

    Planning effectively is an important aspect of repository administration. Portico holds planning sessions with senior staff twice a year. The Portico Leadership Group (PLG), made up of Portico staff and invited guests, holds a strategic planning session over two days each summer and fall. These meetings set goals and objectives for Portico departments and staff. The time Portico sets aside for bringing leadership staff together has been an effective mechanism for setting future activities and monitoring change within the enterprise.

    To ensure Portico's stakeholders are included in decision making, an advisory committee of librarians and publishers provides guidance on Portico’s policies and digital preservation practices. The committee is evenly split, with six librarians and six publishers serving.

    Technical Information

    Portico was certified by CRL in 2010 as a trustworthy digital repository.

    Workflow [6]

    A major part of Portico's workflow effort involves verifying and normalizing source files for ejournals and ebooks to a common archival format. Portico’s policy is to accept what the publisher has available, regardless of its file formats or structure. This policy is important because publishers have little incentive to devote resources to post-processing published content, and a requirement to normalize content would slow publisher submissions. Publishers submit new content in any form, and Portico verifies the content and then creates an archival information unit (AIU) that is preserved. The file format Portico uses is commonly known as the NLM DTD. According to Portico, in 2016 about 17% of submitting publishers used one of the two NLM DTD formats. "Almost 30 publishers submitted a total of about 450,000 articles in versions 2.x of the Archiving and Interchange DTD, and a similar number of publishers submitted about 300,000 articles in versions 2.x of the Publishing DTD." [7] For ejournals Portico uses a version of the NLM DTD known as the Journal Article Tag Suite (JATS) DTD, and for ebooks the BITS DTD.

    Delivery to Portico is usually accomplished through FTP pull or push from the publisher. Publisher content submitted to Portico is never uniform and must be normalized before ingestion into the archive. For ebooks and ejournals, each publisher's source files are converted using Portico-generated scripts. These scripts are publisher specific (and sometimes there are several for the same publisher). Once files are converted to Portico standards they are zipped into an archival information package (AIP) for storage. Archival objects are usually submitted in either TIFF or PDF format, along with publisher full-text or bibliographic (header-only) metadata in an XML file. Additional supplemental files (including spreadsheets, executable files, audio and video, etc.) may be included in the AIP and are in formats commonly associated with the information they hold.
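
    The packaging step described above can be illustrated with a minimal sketch. It is not Portico's actual tooling: the function name build_aip, the manifest.json file, and the choice of SHA-256 digests are all assumptions made for illustration. The sketch zips a set of already-converted files together with a manifest that records a checksum for each file.

```python
import hashlib
import io
import json
import zipfile


def build_aip(files: dict[str, bytes]) -> bytes:
    """Zip the given files plus a JSON manifest of SHA-256 digests.

    Hypothetical layout: the name 'manifest.json' and its structure are
    illustrative assumptions, not Portico's real package format.
    """
    # Record one digest per file so the package can be checked later.
    manifest = {
        name: hashlib.sha256(data).hexdigest()
        for name, data in sorted(files.items())
    }
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, data in sorted(files.items()):
            archive.writestr(name, data)
        archive.writestr("manifest.json", json.dumps(manifest, indent=2))
    return buffer.getvalue()


# Example: a page image rendition plus publisher XML metadata.
package = build_aip({"article.pdf": b"%PDF-1.4", "article.xml": b"<article/>"})
```

    Keeping the manifest inside the package means the checksums travel with the content, which is one common way to support later integrity checks.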

    Once ebooks and ejournals are normalized, the content is housed in Portico's long-term archival management system. The system includes format migration as needed. Portico chooses to use commonly available hardware and software, such as that sold by Oracle, to build its technology infrastructure. It commits to updating the system's technology as it evolves. The system's hardware is housed at the Princeton University campus data center. Portico staff have run several large-scale tests of their systems and data, including fixity verifications and a web penetration test. System problems are captured and stored using JIRA issue-tracking software.
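
    A fixity verification of the kind mentioned above generally means recomputing a checksum for each stored object and comparing it with the value recorded at ingest. The sketch below shows the idea only; the function names and the use of SHA-256 are assumptions, and the source does not describe how Portico implements its checks.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_fixity(stored: dict[str, bytes], recorded: dict[str, str]) -> list[str]:
    """Return the sorted names of objects that fail the fixity check.

    An object fails if it is missing from storage or if its current
    digest differs from the digest recorded at ingest.
    """
    failures = []
    for name, expected in recorded.items():
        data = stored.get(name)
        if data is None or sha256_hex(data) != expected:
            failures.append(name)
    return sorted(failures)


# Example: digests recorded at ingest, then one object is altered.
recorded = {"a.xml": sha256_hex(b"<article/>"), "a.pdf": sha256_hex(b"%PDF-1.4")}
stored = {"a.xml": b"<article/>", "a.pdf": b"%PDF-1.5"}
print(verify_fixity(stored, recorded))
```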

    Digitized Primary Source Collections, known as D-collections, are archived by Portico but have different workflows, policies, and services than other content. D-collections are not normalized by Portico but are instead saved in the form in which the publisher sends them to Portico. If triggered, D-collections content will be hosted by the publisher. D-Collections are available only to subscribers of the content, and so all costs are borne by the publishers of these collections. D-Collections are not part of the content available for auditing through the Portico audit tool.

    In 2015 Portico began to focus on preserving the content produced by small publishers. This concern for preserving the "long tail" of academic publications was highlighted by the 2CUL Project, which found that many small publishers' digital content was not archived. Portico has begun to create new services and workflows designed to encourage small publishers to work with Portico. Among these services is a new plugin designed for archiving publications that use the Open Journal Systems (OJS) publishing platform. Portico staff work actively with small publishers to ease submission through flexibility and alternative solutions that address the restrictions on file formats and accompanying metadata. Portico is also experimenting with crawling publisher websites and harvesting using DOIs. Portico also notes that many small publishers' titles are already being preserved in Portico through the membership of aggregators and larger publishers who provide access to their content and are members of Portico.

    At the 2019 iPRES conference Portico presented on a six-month project targeting the development of automated management and remediation tools for analysis of problematic content. The solutions developed are intended to help reduce cost drivers for digital preservation.

    Access

    Content

    The Portico collection consists of e-journals, e-books, and d-collections (digital collections). Portico aims to preserve the scholarly record. With this goal, it is likely they will continue to expand the types of content they will preserve. In April 2017, Portico’s website listed approximately 26,866 committed e-journal titles, 793,783 committed e-book titles, and 174 d-collections. Existing Portico subscribers may choose to subscribe to either the e-journals or e-books collections, or both when they renew their Portico agreement. This subscription model allows libraries the flexibility to limit spending, should they be interested in narrowing their preservation goals. D-collections have a different business model, as the publishers are paying all the storage costs and, should a trigger event occur, the publisher's subscribers will be the only ones with access.

    D-collections started in 2008 [8] and encompass digitized historical collections. They are predominantly primary source content, such as newspapers and correspondence. Many of these digital collections originated as microfilm sets, such as "19th Century U.S. Newspapers." Currently three publishers, Adam Matthew Digital, EBSCO, and Gale, a part of Cengage Learning, contribute content. Libraries do not pay to subscribe to D-collection content, which is supported solely by the individual publishers that have committed their collections to the archive. If a trigger event requires access to a D-collection, the access is limited to a publisher’s previous customers. Portico also preserves the CrossRef Metadata Database with its D-collections.

    There are currently 600 publishers contributing e-journals to Portico. Subject areas for e-books have a scholarly focus, with science, law, medicine, and the liberal arts all strongly represented. Some of the individual titles share a series or a society affiliation with one another. Also included in the collection are four e-reference books. Forty-two percent of the e-journals within the repository have perpetual content-access designations. As of  there has been only one triggered ebook, The New Partridge Dictionary of Slang and Unconventional English.

    Portico began preserving e-books as a service for publishers in 2008 [11]. In 2011 Portico began offering Post Cancellation Access (PCA) services to libraries for ebooks. Currently it holds publications from 131 publishers. Approximately one third of the ebooks preserved by Portico are not available to libraries through Portico's PCA platform. Portico's ebooks covered under the PCA services will only trigger if they are unavailable on any database platform. Most of the journal titles available in Portico originate in the United States and Northern Europe. Languages are predominantly Western European, with English being the major language.

    Ebooks and ejournals are sold both as individual titles and through aggregated collections or packages. They are sold both directly by publishers and through intermediary distributors, or aggregators, who group titles from many publishers on a common platform. Portico preserves content from e-journal aggregator relationships (e.g., BioOne, Project Muse) but does not currently have any commercial eBook aggregator relationships. This may be largely due to the complicated rights issues associated with eBooks [12].

    Portico is a dark archive. Access to content within the repository is not available to subscribers unless a trigger event occurs.

    Portico's conditions for trigger events are:
    1. A publisher ceases operations and no entity purchases and makes its titles accessible
    2. A publisher ceases to publish a title and it is not offered by another entity
    3. Back issues are removed from a publisher's site and are not available elsewhere
    4. Catastrophic and sustained failure disables the publisher's delivery platform
    5. A publisher opts to rely upon Portico to meet perpetual access obligations
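
    The five conditions above amount to a simple rule: a title is triggered if any one of them holds. The following sketch encodes that reading. The class and field names are hypothetical labels for the listed conditions, and the sketch does not represent how Portico actually evaluates or records trigger events.

```python
from dataclasses import dataclass


@dataclass
class TitleStatus:
    """Hypothetical flags, one per trigger condition listed above."""
    publisher_ceased_no_successor: bool = False        # condition 1
    title_ceased_not_offered_elsewhere: bool = False   # condition 2
    back_issues_removed_unavailable: bool = False      # condition 3
    sustained_platform_failure: bool = False           # condition 4
    publisher_relies_on_portico: bool = False          # condition 5


def triggered_conditions(status: TitleStatus) -> list[int]:
    """Return the numbers (1-5) of the conditions that currently hold."""
    flags = [
        status.publisher_ceased_no_successor,
        status.title_ceased_not_offered_elsewhere,
        status.back_issues_removed_unavailable,
        status.sustained_platform_failure,
        status.publisher_relies_on_portico,
    ]
    return [number for number, flag in enumerate(flags, start=1) if flag]


def is_triggered(status: TitleStatus) -> bool:
    """A title is treated as triggered if at least one condition holds."""
    return bool(triggered_conditions(status))
```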

    In November 2007 the first trigger event occurred when the journal Graft: Organ and Cell Transplantation (SAGE Publications) was removed from SAGE’s online platform. Portico made the title available to its library participants through the Portico website on December 31, 2007. Currently, there are 130 titles available through post-cancellation access on the Portico website. The content for the volumes and issues that have been triggered is available to all Portico participants, regardless of whether the institution previously subscribed to the title or not. Of the triggered titles, 25 are available to Portico members only; the rest are available to any web user through open access. Portico works with the linking services and publishers to ensure that information about triggered content is available in their knowledge bases.

    Should a trigger event occur for a D-Collection publisher, libraries having subscribed to or purchased the collection from the publisher will be eligible for continued access to the subscribed/purchased content through Portico. (The publisher provides Portico a list of eligible libraries annually for each of its products.) The eligible library would then pay Portico an as-yet-undetermined fee for that access on an annual basis. Portico accepts D-Collections content for newspaper collections in the aggregator's own proprietary formats, which can include zoning and other location metadata. The extent to which the zoning and markup will be functional in the platform adopted by Portico to expose the archived content, should a trigger event occur, is not clear.

    Portico provides a web database for subscribers who want to audit the archive content. This tool is restricted to access for one to four users from a subscribing institution. Users have access to all content and metadata within the repository. The verification tool is not meant to be used as a substitute for regular library delivery channels such as ILL or document delivery. It is not the same access platform that Portico provides to deliver triggered content. The subscriber is given access to a replica of the repository. This is important because providing access to the actual archived files would put the content at risk, violating Portico's dark archive status.

    Portico helps libraries and other knowledge institutions by connecting patrons to licensed content without additional work in their local catalog. It uses tools such as OpenURL links and works with linking services and vendors such as CrossRef, CUFTS, EBSCO, ExLibris, OCLC Openly Informatics, SerialsSolutions, and TDnet to route library catalogs and other knowledge bases to the Portico delivery platform.

    Portico also provides a downloadable Excel spreadsheet of all titles and issues, which is a useful tool for those who wish to analyze the titles and holdings. Subscribing libraries can receive a customized list comparing Portico holdings against their own.

    Sources

    1. Center for Research Libraries. Portico Audit Report 2010. Center for Research Libraries, 1 Jan. 2010. Web. 12 Apr. 2017. Link.

    2. University of Illinois at Chicago. Library. Portico Archive Supports First Trigger Event. ULIB Library Information Bulletin. University of Illinois at Chicago Library, 29 Nov. 2007. Web. 12 Apr. 2017. Link.

    3. Research Information. The British Library and Portico work together on e-journal deposit infrastructure. Research Information. Europa Science Ltd, 27 June 2013. Web. 11 Apr. 2017. Link

    4. ITHAKA HARBORS INC. Form 990. New York: ITHAKA HARBORS, 2014. Link.

    5. Library of Congress. October 5, 2005. "Library of Congress Announces Award of $3 Million to Portico, a Nonprofit Electronic Archiving Service."  Library of Congress, 05 Oct. 2005. Web. 04 Oct. 2017. Link.

    6. Sheila Morrissey et al., “Portico: A Case Study in the Use of XML for the Long-Term Preservation of Digital Artifacts” (paper presented at International Symposium on XML for the Long Haul: Issues in the Long-term Preservation of XML, Montréal, Canada, August 2, 2010). In Proceedings of the International Symposium on XML for the Long Haul: Issues in the Long-term Preservation of XML. Balisage Series on Markup Technologies, vol. 6 (2010). Accessed February 28, 2011, doi:10.4242/Balisage.

    7. Library of Congress. Sustainability of Digital Formats: Planning for Library of Congress Collections. Digital Formats Web site. Library of Congress, 11 Mar. 2017. Web. 12 Apr. 2017. Link

    8. "Gale and Portico Enter into an Agreement To Preserve Gale Digital Collections," Portico, December 2, 2009. Accessed February 2, 2011, Link

    9. Kirchhoff, Amy. "EBooks: The Preservation Challenge." Against the Grain 23.4 (2014): 32-34. http://www.against-the-grain.com/. Web. 11 Apr. 2017. Link.

    10. Wittenberg, Kate, Stephanie Orphan and Amy Kirchhoff (2016, February 24). Preserving the "LongTail" of Elusive Publishers. CRL Webinar, Online.

    11. Kirchhoff, A. (2016, May 25). Portico Comparison Tools for Collection Management. CRL Webinar, Online.

    12. Orphan, Stephanie and Amy Kirchhoff (2017, April 5). Portico e-Book Preservation – Progress Made, Lessons Learned, and Future Directions. CRL Webinar, Online.
