129,112
Oct 24, 2018
Recordings made available under the Music Modernization Act. A reasonable search has been conducted to determine that these items are not commercially available. Read more on Internet Archive's blog. Please be aware that subsequent uses, such as reproduction, distribution, display or public performance, may not be permitted under US copyright. Works may already be in the public domain, in which case these restrictions do not apply.
Topics: unlocked, 108h
This item may not be used for commercial purposes. Whilst it is believed these recordings are in the public domain, no warranties are given as to copyright ownership. Please contact us at [email protected] if you believe this content should be removed. Performer: Richard Tauber. Writer: Purcell; Parr-Davies. Tenor; With Orchestra; As Sung in Film "The Lisbon Story"; Sung in English. Digitized at 78 revolutions per minute. Four styli were used to transfer this record. They are 3.5mil...
Topics: 78rpm, Popular Music
38
Oct 1, 2018
movies
eye 38
favorite 0
comment 0
Slide shows for the Buddhist Digital Resource Center collection
Topic: BDRC
36
Oct 1, 2018
movies
eye 36
favorite 0
comment 0
Slide shows from the Afghan Media Resource Center collection.
Topic: amrc
Images from both days of the Decentralized Web Summit 2018, August 1-2, 2018.
Topic: dweb
Images taken by Alexis Rossi of the Decentralized Web Summit 2018 Builders' Day on July 31, 2018.
Topic: dweb
Builders Day! July 31, 2018. Decentralized Web Summit 2018, Internet Archive. Photographs by Brad Shirakawa courtesy of the Internet Archive. Images from the Decentralized Web Summit 2018 Builders' Day at Internet Archive. Breakfast & Registration. Introduction & Welcome. Speakers: Wendy Hanamura, Director of Partnerships, Internet Archive. "Vision for 2018: what do we hope to accomplish?" Speakers: Brewster Kahle, Founder, Internet Archive. "Meeting New Allies" Speakers:...
Topic: dweb, decentralized, technology, internet archive, web, builders day
1,418
Jul 27, 2018
Published findings from academics, scholars and researchers.
Topics: journals, articles, scholarly communications
52
Jul 27, 2018
by
Frontiers Media S.A.
Frontiers in Psychology is the largest journal in its field, publishing rigorously peer-reviewed research across the psychological sciences, from clinical research to cognitive science, from perception to consciousness, from imaging studies to human factors, and from animal cognition to social psychology.
Topics: Philosophy, Psychology, Religion: Psychology
55
Jul 27, 2018
by
Public Library of Science
PLOS Biology features works of exceptional significance, originality, and relevance in all areas of biological science, from molecules to ecosystems, including works at the interface of other disciplines, such as chemistry, medicine, and mathematics. We also welcome data-driven meta-research articles that evaluate and aim to improve the standards of research in the life sciences and beyond. Our audience is the international scientific community as well as educators, policy makers, patient...
Topic: Biology
Rough sketch for how to incorporate journal article discovery into the archive.org site framework.
Topics: journals, scholarly communications, articles, design
107
Jul 25, 2018
Design documentation and resources that are restricted for internal use only.
Topics: IA, design
273.2M
Jul 6, 2018
Images contributed by Internet Archive users and community members. These images are available for free download. Please select a Creative Commons License during upload so that others will know what they may (or may not) do with your images.
Topic: images
637
Jun 22, 2018
This is an experimental collection of DVDs that is not yet available for public release.
Topic: dvd
395
Jun 21, 2018
NQMC is an initiative of the Agency for Healthcare Research and Quality (AHRQ), U.S. Department of Health and Human Services. It is a database and Web site for information on specific evidence-based health care quality measures and measure sets. NQMC is sponsored by AHRQ to promote widespread access to quality measures by the health care community and other interested individuals. The NQMC mission is to provide practitioners, health care providers, health plans,...
Topics: evidence-based health care, healthcare quality measures, government documents
12,132
Jun 21, 2018
NGC is an initiative of the Agency for Healthcare Research and Quality (AHRQ), U.S. Department of Health and Human Services. NGC was originally created by AHRQ in partnership with the American Medical Association and the American Association of Health Plans (now America’s Health Insurance Plans [AHIP]). The NGC mission is to provide physicians and other health care professionals, health care providers, health plans, integrated delivery systems, purchasers and others...
Topics: healthcare, clinical practice guidelines, government documents
694
Apr 25, 2018
Hot Record Society was an American jazz record label, founded in 1937 for the purposes of reissuing out-of-print early hot jazz music. It was founded by Steve Smith. The advisory board included John Hammond, Marshall Stearns, Charles Edward Smith, Wilder Hobson, Bill Russell, Charles Delaunay, Hugues Panassié, and Sinclair Traill. The company initially issued out-of-print works, especially from the ARC and Decca catalogs, and collected biographical and discographical information. In 1938 it...
Topics: 78rpm, label
72
Mar 23, 2018
Media items from the Sikh Institute at Berkeley.
Topic: Sikh Institute at Berkeley
This walk through describes how to upload files to the Internet Archive.
Topics: help center, uploading, internet archive
536
Mar 15, 2018
Help videos and other files intended for use in the Internet Archive help center and documentation.
Topics: help center, documentation, how to
157
Feb 12, 2018
Books uploaded by the University of Illinois, Urbana-Champaign Disability Resources and Educational Services for the use of people with print disabilities.
Topics: UIUC, DRES, print disabled, low vision
87
Feb 9, 2018
by
Afghan Media Resource Center
data
eye 87
favorite 0
comment 0
Metadata files for video, photos, and audio recordings from the Afghan Media Resource Center (AMRC).
Topics: AMRC, Afghan Media Resource Center
How to choose Internet Archive as your Amazon Smile charity
(1 review)
Topics: internet archive, amazon smile, charity, donations, how to
37,205
Oct 9, 2017
by
Alexis Rossi
movies
eye 37,205
favorite 0
comment 0
Internet Archive 2017 Celebration presentation about 78rpm digitization. Photo is of D'Anna Alexander with her mother and grandmother. Her grandmother collected the discs available at https://archive.org/details/argumedo-hug. The music sample is from https://archive.org/details/78_low-bridge-everybody-down_billy-murray-allen_gbia0015535b
Topics: 78rpms, IA Presentation
317,471
Oct 6, 2017
The Tina Argumedo and Lucrecia Hug Collection of 78s contains discs collected in Argentina beginning in the mid-1930s. It comprises primarily tango music, with boleros, sambas, mambo and other dance music. The collection was donated by D'Anna Alexander, Michael and Daniel Alexander, and Débora Simcovich. Photo: D'Anna Alexander (center) with her mother (right) and grandmother (left). https://archive.org/details/d_anna_2017 contains details of the physical collection.
Topics: 78rpm, argentina, argentine tango
Source: 78
1
Sep 18, 2017
The San Francisco Prime Timers Leftover Erotica collection is an illuminating snapshot of the heyday of gay male adult films on VHS tape. Spanning from the late 80’s through the early 2000’s, these 1500 commercially produced (mostly gay male) erotic titles are the donated and unsold remains from the San Francisco chapter of the Prime Timers gay seniors group’s fundraising video sales. This collection was highlighted for active preservation for several reasons. It offers a unique view into...
395
Jul 28, 2017
SFMOMA is dedicated to making the art for our time a vital and meaningful part of public life. For that reason we assemble unparalleled collections, create exhilarating exhibitions, and develop engaging public programs. In all of these endeavors, we are guided by our enduring commitment to fostering creativity and embracing new ways of seeing the world.
Topic: sfmoma
11,378
Jul 11, 2017
Digital and digitized music albums from Internet record collectors, spanning from LPs to web releases.
Topics: records, LPs, CDs, music
224,449
Jun 13, 2017
A collection of texts related to the history of music and audio recordings.
Topics: 78rpm, audio, music, history
1.9M
Jun 12, 2017
by
Sex With Timaree
Dr. Timaree Schmit has a doctorate in Human Sexuality Education from Widener University. She conducts fascinating interviews and answers listener questions as part of her lifelong search for rational, sex-positive, empirically-based knowledge about sexuality. You can subscribe to the podcast on iTunes.
Topics: sex, sexuality, sex-positive, podcast
173,132
Jun 12, 2017
by
Movies for the Blind
Movies for the Blind raises the profile of audio description/video description/described video by entertaining people, regardless of how much they can see. Doing a small part to demonstrate that accessibility isn't about addressing a niche, but just including as many people as possible, giving everyone a chance to enjoy - or get annoyed by - the same stuff.
Topics: blind, podcast, audio description, accessibility, movies
We want to digitize 4 million books and make them publicly available by working with libraries everywhere. Our newest partner, BookMooch.com, is already sending donations our way - so let's get started! As we work toward winning the #100andChange grant from the MacArthur Foundation, we are looking for more partners who would like to make the world's knowledge easier to access.
Topics: 100andChange, BookMooch, book digitization
732,468
Mar 10, 2017
by
UC Berkeley
Videos from the UC Berkeley Webcast project that were available on Youtube.
Topics: berkeley, educational videos
76
Feb 7, 2017
texts
eye 76
favorite 0
comment 0
UPCs MSUL comparison
Topic: acdc
139
Jan 26, 2017
by
U.S. Department of Labor Women's Bureau
texts
eye 139
favorite 0
comment 0
U.S. Department of Labor Women's Bureau library of digital resources pulled from https://www.dol.gov/wb/resources/ on January 25th, 2017. These documents were contained in a zip file at https://www.dol.gov/wb/resources/DOL_WB_E-%20Library.zip
Topic: U.S. Department of Labor Women's Bureau
296
Jan 23, 2017
by
Donald Trump
texts
eye 296
favorite 0
comment 0
Presidential proclamation declaring that January 20, 2017 is a National Day of Patriotic Devotion. Text of proclamation: NATIONAL DAY OF PATRIOTIC DEVOTION 9570 - - - - - - - BY THE PRESIDENT OF THE UNITED STATES OF AMERICA A PROCLAMATION A new national pride stirs the American soul and inspires the American heart. We are one people, united by a common destiny and a shared purpose. Freedom is the birthright of all Americans, and to preserve that freedom we must maintain faith in our sacred...
Topics: Donald Trump, National Day of Patriotic Devotion
226
Dec 24, 2016
by
Alexis Rossi
texts
eye 226
favorite 3
comment 0
Recipes from MEGA Cookie Smackdown 2016
Topics: mega cookie smackdown, baking, cookies, recipes
17.3M
Oct 6, 2016
Newest uploads! Auto-78-twitter. Through the Great 78 Project the Internet Archive has begun to digitize 78rpm discs for preservation, research, and discovery with the help of George Blood, L.P. 78s were mostly made from shellac, i.e., beetle resin, and were the brittle predecessors to the LP (microgroove) era. @great78project for uploads as they happen. Turntable used for 78rpm digitization of four simultaneous recordings with different needles. The...
Topics: 78rpm, digitization
Source: 78
6.1M
Oct 6, 2016
78rpm shellac discs donated from the Batavia Public Library Thorpe Collection to the Archive of Contemporary Music and digitized by George Blood, LP for the Internet Archive. Turntable used for 78rpm digitization of four simultaneous recordings with different needles. https://archive.org/details/BTB-10202016 contains details of the physical materials.
Topics: 78rpm, thorpe, ARC, Archive of Contemporary Music
Source: 78
534
Jul 30, 2016
by
Alexis Rossi
movies
eye 534
favorite 0
comment 0
Fast motion video of Alexis Rossi assembling a piece of geometric body jewelry. Square neck ring purchased at H&M on sale for $2. Concentric brass circles purchased at Joanne's. Large jump rings and chain from my stash. Music: Sidewalk Shade - slower by Kevin MacLeod (incompetech.com), Licensed under Creative Commons: By Attribution 3.0 License
Topics: jewelry, body jewelry, beads, beading, jewelry making, metalwork, DIY
43,296
Jul 29, 2016
by
BBC World Service
Radio programs archived from BBC World Service. Items in this collection are restricted.
Topics: radio, BBC World Service
2.2M
Jul 28, 2016
Radio programs recorded from Internet and over the air. Some items in this collection have access restrictions.
Topic: radio
2
Jul 6, 2016
Books with Beautiful Illustrations and Graphics
Topic: listmania
This video describes how to upload a high quality scanned book to archive.org.
Topics: how to, archive.org
Map of Carlson Ave Physical Archive Modules, reviewed by Brewster Kahle Nov 23, 2015. More on the Physical Archive.
Topics: carlson, containers, physical archive
1,983
Oct 8, 2015
image
eye 1,983
favorite 0
comment 0
How to add a license to an item uploaded to archive.org
Topics: upload, license
184
Sep 30, 2015
by
Alexis Rossi
data
eye 184
favorite 0
comment 0
Presentation for IASA
Topic: presentation
194
Aug 3, 2015
Crawls of RSS feeds contributed by the public.
202
Jul 23, 2015
by
Alexis Rossi
texts
eye 202
favorite 0
comment 0
Universal Access presentation for OSCON, July 2015.
Topics: presentation, OSCON
240
Jun 16, 2015
by
Alexis Rossi
data
eye 240
favorite 1
comment 0
First draft of slides for Nerd Nite presentation
Topic: IA
145
May 27, 2015
by
Alexis Rossi
data
eye 145
favorite 1
comment 0
"Building Libraries Together" presentation for meeting in Ecuador, July 2015.
Topic: arossi
A video tour of the navigation of the new archive.org site.
Topics: v2, navigation
A video tour of search tools on the new archive.org site.
Topics: v2, search
Video tour of how to download files on the new archive.org site.
Topics: v2, download
A video tour of views (downloads) on the new archive.org site.
Topics: v2, views, downloads
A video tour of how to use favorites (bookmarks) on the new archive.org site.
Topics: v2, bookmarks, favorites
4.4M
Apr 20, 2015
Tour videos for the new version of archive.org. Read about it on our blog.
Tour video of new archive.org site, completed April 20, 2015.
(2 reviews)
Topic: v2
118.6M
Jan 9, 2015
Web wide crawl number 16. The seed list for Wide00016 was made from the join of the top 1 million domains from CISCO and the top 1 million domains from Alexa.
76.4M
Jan 9, 2015
The seeds for this crawl came from: 251 million domains that had at least one link from a different domain in the Wayback Machine, across all time; ~300 million domains that we had in the Wayback, across all time; and 55,945,067 domains from https://archive.org/details/wide00016. This crawl was run with a Heritrix setting of "maxHops=0" (URLs including their embeds). The WARC files associated with this crawl are not currently available to the general public.
42.9M
Jan 9, 2015
The seed for this crawl was a list of every host in the Wayback Machine. This crawl was run at level 1 (URLs including their embeds, plus the URLs of all outbound links including their embeds). The WARC files associated with this crawl are not currently available to the general public.
334M
Jan 9, 2015
The seed for this crawl was a list of every host in the Wayback Machine. This crawl was run at level 1 (URLs including their embeds, plus the URLs of all outbound links including their embeds). The WARC files associated with this crawl are not currently available to the general public.
288.1M
Jan 9, 2015
The seed for this crawl was a list of every host in the Wayback Machine. This crawl was run at level 1 (URLs including their embeds, plus the URLs of all outbound links including their embeds). The WARC files associated with this crawl are not currently available to the general public.
230.4M
Jan 9, 2015
280.9M
Jan 9, 2015
421.7M
Jan 9, 2015
Web wide crawl with initial seedlist and crawler configuration from January 2015.
480.5M
Dec 17, 2014
The seed for this crawl was a list of every host in the Wayback Machine. This crawl was run at level 1 (URLs including their embeds, plus the URLs of all outbound links including their embeds). The WARC files associated with this crawl are not currently available to the general public.
161
Dec 17, 2014
Survey crawl of domains started December 2014. This data is currently not publicly accessible.
1,142
Dec 2, 2014
data
eye 1,142
favorite 0
comment 0
holiday letter
Topic: Internet Archive
1M
Nov 24, 2014
IDs of tweets that mention Ferguson, Missouri between August 10th and August 27th, 2014, subsequent to the death of Michael Brown. Tweets collected by Ed Summers. He subsequently extracted the URLs from these tweets, and they were crawled by the Internet Archive. Please read Summers's article at inkdroid.org, with an update here, for more information. Photo: "Memorial to Michael Brown" by Jamelle Bouie
232
Nov 22, 2014
texts
eye 232
favorite 0
comment 0
247
Nov 22, 2014
texts
eye 247
favorite 0
comment 0
251
Nov 21, 2014
texts
eye 251
favorite 0
comment 0
673
Nov 13, 2014
texts
eye 673
favorite 0
comment 0
2014 holiday invite
Topic: IA
1,881
Sep 22, 2014
Alexis is the Director of Web Collections at Internet Archive. She is particularly obsessed with jewelry, baking, costumes, and songs about things she hates - like pickles.
Topic: favorites
1.1M
Sep 17, 2014
This crawl of online resources of the 113th US Congress was performed on behalf of NARA.
Video tour of the Philadelphia Area Political Ads Pilot Project
(1 review)
Topic: tvnews
A video tour of the Philadelphia Area Political Ads Pilot Project
Topic: tvnews
182.4M
Aug 27, 2014
A daily crawl of more than 200,000 home pages of news sites, including the pages linked from those home pages. Site list provided by The GDELT Project.
Topics: GDELT, News
445.7M
Jun 6, 2014
Web wide crawl with initial seedlist and crawler configuration from June 2014.
244.9M
Apr 25, 2014
The seed for this crawl was a list of every host in the Wayback Machine. This crawl was run at level 1 (URLs including their embeds, plus the URLs of all outbound links including their embeds). The WARC files associated with this crawl are not currently available to the general public.
186,367
Apr 18, 2014
by
David Alpern
For Your Ears Only/On Air is a unique, fair and far-ranging weekend news program that veteran journalist/broadcaster David Alpern has produced and hosted since 1982. In addition to guests from across the right-left spectrum on critical policy and political issues, For Your Ears Only presents news and views on a broad range of other important and engaging topics: health, science, technology, the economy, food, arts and entertainment. Launched as Newsweek On Air, reflecting content from that...
Topics: For Your Ears Only, radio, news
111
Jan 17, 2014
data
eye 111
favorite 0
comment 0
3,421
Dec 20, 2013
data
eye 3,421
favorite 0
comment 0
List of companies that match charitable donations by employees.
Topics: ia, donations
13,354
Nov 27, 2013
texts
eye 13,354
favorite 0
comment 0
holiday letter 2013
Topic: ia
679
Nov 20, 2013
texts
eye 679
favorite 0
comment 0
invitation to holiday party 2013
Topic: ia
upload 1,596
ARossi
archivist for 12 years
Nov 8, 2006
archive.org account
person
comment 635
favorite 150
182.3M
Nov 15, 2013
Web wide crawl with initial seedlist and crawler configuration from February 2014.
346
Nov 14, 2013
image
eye 346
favorite 0
comment 0
Invitation for birthday of the Defensive Patent License, November 15, 2013.
Topics: DPL, internetarchivepresents
544
Nov 13, 2013
texts
eye 544
favorite 1
comment 0
On "13 November 2013, WikiLeaks released the secret negotiated draft text for the entire TPP (Trans-Pacific Partnership) Intellectual Property Rights Chapter. The TPP is the largest-ever economic treaty, encompassing nations representing more than 40 per cent of the world’s GDP. The WikiLeaks release of the text comes ahead of the decisive TPP Chief Negotiators summit in Salt Lake City, Utah, on 19-24 November 2013. The chapter published by WikiLeaks is perhaps the most controversial...
Topics: wikileaks, TPP, Trans-Pacific Partnership
230
Oct 25, 2013
movies
eye 230
favorite 0
comment 0
This was a movie that was part of the Internet Archive 2013 Celebration on October 24, 2013.
Topic: 1024
823
Oct 22, 2013
image
eye 823
favorite 0
comment 0
Images of web pages from the wayback machine.
Topic: wayback
645
Oct 1, 2013
texts
eye 645
favorite 0
comment 0
Invitation for October 24, 2013 evening event at Internet Archive.
Topic: invitation
371.9M
Sep 23, 2013
This is a collection of web page captures from links added to, or changed on, Wikipedia pages. The idea is to bring reliability to Wikipedia outlinks so that if the pages referenced by Wikipedia articles are changed, or go away, a reader can permanently find what was originally referred to. This is part of the Internet Archive's attempt to rid the web of broken links.
Topics: Wikipedia, Wikimedia
28,553
Sep 12, 2013
These documents from the NOAA Central Library are voluntary meteorological observations taken at the Tuskegee Institute in Tuskegee, Alabama from November 1899 through June 1954. Most of these observations were handwritten and signed by George Washington Carver until 1932 and his successors until 1954. The observations were submitted on U.S. Weather Bureau Form no. 1009. The observer recorded minimum and maximum temperature, precipitation, prevailing wind direction, general description of the...
771.1M
Sep 12, 2013
These crawls are part of an effort to archive pages as they are created and archive the pages that they refer to. That way, as the pages that are referenced are changed or taken from the web, a link to the version that was live when the page was written will be preserved. Then the Internet Archive hopes that references to these archived pages will be put in place of a link that would otherwise be broken, or a companion link to allow people to see what was originally intended by a page's...
181.5M
Sep 11, 2013
This is a collection of pages and embedded objects from WordPress blogs and the external pages they link to. Captures of these pages are made on a continuous basis, seeded from a feed of new or changed pages hosted by WordPress.com or by WordPress pages hosted by sites running a properly configured Jetpack WordPress plugin.
Topics: Wordpress.com, blogs, jetpack
57.5M
Aug 21, 2013
CDX Index shards for the Wayback Machine. The Wayback Machine works by looking for historic URLs based on a query. This is done by searching an index of all the web objects (pages, images, etc.) that have been archived over the years. This collection holds the index used for this purpose, which is broken up into 300 pieces so they fit into items more naturally and distribute the lookup load. Each of these 300 pieces is stored in at least 2 items, and then those are also stored on the backup...
367.3M
Jul 30, 2013
Web wide crawl with initial seedlist and crawler configuration from August 2013.
774,949
Jun 26, 2013
HD videos that require special deriving rules.