In this blog post, I offer a way to extract large batches of email newsletters from Constant Contact for the purpose of creating an email archive, with each message saved as a PDF.
First, some background. I recently finished an email archiving project for the History & Archives of Front Runners New York. The club had been snail-mailing newsletters since the early 1980s, transitioned to email newsletters around 2004, and has used Constant Contact as its newsletter software since 2007. They had managed to retain all the messages in Constant Contact, but not all of the embedded images.
Constant Contact does not have an easy way to export sent messages in bulk, so I created a script that uses the Constant Contact API to export the messages and their related metadata. For each message, it creates a PDF that starts with a full-length image of the email, followed by a JSON export of the message metadata and the text version of the email (if available). This retains the look of the message while keeping it text-searchable.
The script uses some open-source software (like Ghostscript) and a very nice screenshot tool for Firefox called Grab Them All. The basic procedure for creating the collection of PDF email messages is as follows:
1. Set up the Constant Contact API key and access token. Information on doing this can be found on their website.
2. Install the required software: enscript, ps2pdf, convert, and gs. On a Mac with Homebrew, ps2pdf and gs come with the Ghostscript package and convert comes with ImageMagick, so you can install everything by typing:
brew install ghostscript
brew install enscript
brew install imagemagick
3. Run get_newsletter.php > urls.txt
This will create the metadata text files and dump all the URLs into urls.txt.
4. Run Grab Them All in Firefox, using urls.txt as the source. This will create full-length PNG screenshots of all the URLs.
5. Run fixup.sh, which will convert the PNG screenshots to PDF, convert the metadata files to PDF, and append them together into a single file per message, with the subject of the message as the name of the file.
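To give a feel for how the final conversion step fits together, here is a minimal sketch of the kind of commands involved. It is not the contents of fixup.sh: the helper names (missing_tools, safe_name, build_pdf), the argument order, and the way the subject is passed in are all assumptions made for illustration. It only relies on the tools listed in the install step.

```shell
#!/bin/sh
# Illustrative sketch only; NOT the actual fixup.sh from the download.
# All function names and file names below are hypothetical.

# Print the name of every required tool that is not on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Turn an email subject into a safe file name: every byte that is not a
# letter, digit, dot, underscore, or hyphen becomes an underscore.
safe_name() {
    printf '%s' "$1" | tr -c 'A-Za-z0-9._-' '_'
}

# Build one PDF from a screenshot, a metadata text file, and a subject.
# Usage (assumed): build_pdf screenshot.png metadata.txt "Subject line"
build_pdf() {
    png=$1
    meta=$2
    out="$(safe_name "$3").pdf"
    tmp=$(mktemp -d) || return 1

    # 1. Screenshot PNG -> PDF (ImageMagick).
    convert "$png" "$tmp/shot.pdf" &&
    # 2. Metadata text -> PostScript without page headers (enscript).
    enscript -q -B -p "$tmp/meta.ps" "$meta" &&
    # 3. PostScript -> PDF (Ghostscript's ps2pdf wrapper).
    ps2pdf "$tmp/meta.ps" "$tmp/meta.pdf" &&
    # 4. Append both PDFs into one file, screenshot first (Ghostscript).
    gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite \
       -sOutputFile="$out" "$tmp/shot.pdf" "$tmp/meta.pdf"
    status=$?

    rm -rf "$tmp"
    return $status
}
```

With these functions loaded, running missing_tools enscript ps2pdf convert gs prints nothing when everything is installed, and a call such as build_pdf 0001.png 0001.txt "June Newsletter" would write June_Newsletter.pdf to the current directory. Replacing unusual characters in the subject is a design choice of this sketch, made so that slashes or colons in a subject line cannot produce an invalid file name.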
You can download the script here: get_newsletter.zip. Be sure to add the access token and API key to the script.