National Register and the National Archives: Difference between revisions
Revision as of 11:02, 18 April 2025
[[File:Excel thumbnail.jpg|none|thumb]]
File can be downloaded here: [[:File:NRHP NARA PDF download 2025.04.18.xlsx|https://openpreservation.xyz/wiki/File:NRHP_NARA_PDF_download_2025.04.18.xlsx]]
Steps taken:
For each listing in the national-register-everything-20240710.xlsx file available at https://www.nps.gov/subjects/nationalregister/data-downloads.htm, look up the corresponding NARA address.
For each NARA address, record the direct link to the PDF download, which is hosted on AWS servers (see the NARA PDF Links tab in the Excel file).
Example: listing 9001229 has the NARA address <nowiki>https://catalog.archives.gov/id/75320568</nowiki> and the direct PDF link <nowiki>https://s3.amazonaws.com/NARAprodstorage/lz/electronic-records/rg-079/NPS_NY/09001229.pdf</nowiki>
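The example pair above suggests a simple consistency check for the NARA PDF Links tab. The sketch below assumes, based only on that one example, that the PDF file name is the listing number zero-padded to eight digits. The function names are hypothetical, and the pattern may not hold for every listing.

```python
import posixpath
from urllib.parse import urlparse


def expected_pdf_name(listing_number: int) -> str:
    """Zero-pad the listing number to 8 digits (assumption inferred from one example)."""
    return f"{listing_number:08d}.pdf"


def link_matches_listing(listing_number: int, pdf_url: str) -> bool:
    """Return True if the file name at the end of pdf_url matches the listing number."""
    file_name = posixpath.basename(urlparse(pdf_url).path)
    return file_name == expected_pdf_name(listing_number)


# The example pair from this page.
example_url = (
    "https://s3.amazonaws.com/NARAprodstorage/lz/electronic-records/"
    "rg-079/NPS_NY/09001229.pdf"
)
print(link_matches_listing(9001229, example_url))
```

A check like this only flags rows worth a second look; a mismatch does not necessarily mean the recorded link is wrong.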
Automate the download of all PDFs of public, unrestricted National Register listings.
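The download step could be automated along the following lines. This is a minimal sketch using only the Python standard library, not necessarily the script that was actually used. It assumes the list of direct PDF links has already been filtered to public, unrestricted listings, and the names local_path_for and download_all are hypothetical.

```python
import os
import urllib.request
from urllib.parse import urlparse


def local_path_for(pdf_url: str, dest_dir: str) -> str:
    """Map a PDF URL to a local path, keeping the last folder (e.g. NPS_NY)."""
    parts = urlparse(pdf_url).path.split("/")
    # Last two path components: containing folder and file name.
    return os.path.join(dest_dir, *parts[-2:])


def download_all(pdf_urls, dest_dir, fetch=urllib.request.urlretrieve):
    """Download each URL once; skip existing files so an interrupted run can resume.

    Returns a list of (url, error message) pairs for downloads that failed.
    """
    failed = []
    for url in pdf_urls:
        target = local_path_for(url, dest_dir)
        if os.path.exists(target):
            continue
        os.makedirs(os.path.dirname(target), exist_ok=True)
        tmp = target + ".part"  # write to a temporary name first
        try:
            fetch(url, tmp)
            os.replace(tmp, target)  # only complete files get the final name
        except OSError as err:  # urllib.error.URLError is a subclass of OSError
            failed.append((url, str(err)))
    return failed
```

Writing to a temporary name and renaming afterwards means a partially downloaded file is never mistaken for a finished one on the next run.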
Once the PDFs are stored on a local disk, use a script to extract information about each one, including:
- file size
- page count
- other PDF metadata (title, subject, keywords, file creation and modification dates, etc.).
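The extraction step above might look like the following sketch. It assumes the third-party pypdf library, which is one option among several and is not confirmed as the tool used here. The helper names file_size_mib and pdf_info are hypothetical.

```python
import os


def file_size_mib(path: str) -> float:
    """File size in MiB (1 MiB = 1024 * 1024 bytes)."""
    return os.path.getsize(path) / (1024 * 1024)


def pdf_info(path: str) -> dict:
    """Collect size, page count, and document-information fields for one PDF."""
    # Third-party dependency (assumption); imported here so that
    # file_size_mib stays usable without it.
    from pypdf import PdfReader

    reader = PdfReader(path)
    meta = reader.metadata or {}  # None when the PDF has no information dictionary
    return {
        "path": path,
        "size_mib": file_size_mib(path),
        "page_count": len(reader.pages),
        # Standard PDF information-dictionary keys; any of them may be absent.
        "title": meta.get("/Title"),
        "subject": meta.get("/Subject"),
        "keywords": meta.get("/Keywords"),
        "created": meta.get("/CreationDate"),
        "modified": meta.get("/ModDate"),
    }
```

Calling pdf_info on every downloaded file and writing the resulting dictionaries to a spreadsheet would give one row per PDF. Damaged or encrypted files may raise errors and would need separate handling.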
The 'Calculations' tab includes a chart of the page count distribution, as well as some basic summary statistics:
- Total page count: 3,124,141
- Total documents: 76,092
- Average pages per document: 41
- Total file size: 3,759 GiB
- Average file size: 51 MiB
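The two averages follow directly from the totals listed above. The short check below recomputes them, assuming 1 GiB = 1024 MiB. The variable names are chosen for illustration only.

```python
# Totals as reported on this page.
total_pages = 3_124_141
total_documents = 76_092
total_size_gib = 3_759

# Average pages per document.
avg_pages = total_pages / total_documents

# Average file size in MiB (1 GiB = 1024 MiB).
avg_size_mib = total_size_gib * 1024 / total_documents

print(f"Average pages per document: {avg_pages:.2f}")
print(f"Average file size: {avg_size_mib:.2f} MiB")
```

Rounded to whole numbers, both results agree with the reported figures of 41 pages and 51 MiB.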
This page will be updated once a more in-depth analysis of the content of the PDFs is available.
Reach out to hello@openpreservation.xyz if you have data analysis ideas about the 76,092 PDFs of National Register nomination forms available from the NARA website.