Comparison of software saving Web pages for offline use
A number of software products, both free and proprietary, are available for saving Web pages for later offline use. They vary in the techniques used for saving, the types of content that can be saved, the format and compression of the saved files, the provision for working with already saved content, and in other ways.
HTML Content
| Name | Technology | Completeness of saved content | Support for collections | Ease of adding to existing collections | Navigable between saved pages offline | Format of saved files; open/proprietary | Compression | Notes |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| wget | Command-line application | Images and CSS (if the -p option is used), but no client-side generated HTML content | Yes | ? | Yes, if the -k option is used | Open (HTML or WARC) | Yes, if WARC files are used | |
| HTTrack | Command-line application; WinHTTrack (Windows) and WebHTTrack (Linux/BSD/Unix) are GUI front ends | ? | ? | ? | Yes. All links are rewritten so that the locally stored pages of the downloaded site can be browsed | Open. Standard HTML pages saved in a folder; open index.html to view the home page | No | Many options for refining what is saved |
| Tenmax's Teleport | Windows desktop application and scriptable tools for web crawling and archiving | Multimedia (except streaming files), CSS, limited support for JavaScript events and cookies; Shockwave/Flash content is downloaded but not crawled | ? | ? | Yes | Open. Standard HTML pages saved in a folder; open index.html to view the home page | No | Supports advanced filtering options and authentication |
| ScrapBook | Firefox extension | See note [ScrapBook 1][1] | Yes | Easy | Yes, if those pages were saved in ScrapBook | Proprietary catalog; regular HTML and content for each page | No | See note [ScrapBook 2] |
| Mozilla Archive Format | Firefox extension | Images, CSS and other static content; client-side generated HTML content is saved correctly | Yes | Impossible | No | MAFF (a ZIP of regular HTML and web content) | Always | The Mozilla Archive Format add-on has not been maintained since September 5, 2018.[2] |
| Read Later Fast | Google Chrome extension | Stylesheets are saved incompletely or not at all | No | N/A | No | Proprietary; restricted to the Google Chrome profile location | No | |
| PageArchiver | Google Chrome extension | Video and audio files (via Flash or HTML5) are not saved | Yes | Yes (import/export features) | No | Open; regular HTML for pages, regular ZIP file for the catalog | Yes, for the catalog | |
| Archia's Web Page Archiver[3] | E-mail-based online service | See note [Archia 1] | No | No | No | Open | Yes | |
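The wget entry in the table depends on two options: -p for fetching images and CSS, and -k for making the saved pages navigable offline. As a minimal sketch, the Python helper below assembles such a command line. The function name, its parameters and the example URL are illustrative assumptions, not part of wget; --warc-file is the wget option that additionally writes a WARC archive.

```python
import shlex


def build_wget_command(url, page_requisites=True, convert_links=True, warc_name=None):
    """Return an argument list for saving one page with wget.

    Hypothetical helper for illustration; only the wget options
    themselves (-p, -k, --warc-file) come from wget.
    """
    cmd = ["wget"]
    if page_requisites:
        cmd.append("-p")  # also fetch images, CSS and other page requisites
    if convert_links:
        cmd.append("-k")  # rewrite links so saved pages can be browsed offline
    if warc_name is not None:
        # additionally record the download in a WARC file with this base name
        cmd.append(f"--warc-file={warc_name}")
    cmd.append(url)
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(shlex.join(build_wget_command("https://example.org/")))
```

The list returned by the helper can be passed to subprocess.run on a system where wget is installed. As the table notes, content generated by client-side scripts would still be missing from the result.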
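The table describes MAFF as a ZIP of regular HTML and web content, and PageArchiver as using a regular ZIP file for its catalog. Under that description, such archives can be inspected with any ZIP library. The sketch below builds a small demonstration archive in memory and lists the HTML files inside it. The entry names (page1/index.html and so on) are made up for the example, and the exact layout of real MAFF or PageArchiver files may differ.

```python
import io
import zipfile


def list_html_entries(archive):
    """Return the sorted names of HTML files stored in a ZIP-based archive.

    `archive` may be a path or a file-like object.
    """
    with zipfile.ZipFile(archive) as zf:
        return sorted(
            name for name in zf.namelist()
            if name.lower().endswith((".html", ".htm"))
        )


def make_demo_archive():
    """Build an in-memory ZIP with hypothetical entry names for the demo."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("page1/index.html", "<html><body>Saved page</body></html>")
        zf.writestr("page1/style.css", "body { margin: 0; }")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(list_html_entries(make_demo_archive()))
```

Because the container is an ordinary ZIP file, the saved pages stay readable with standard tools even when the add-on that created them is no longer maintained.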
Notes
ScrapBook
[ScrapBook 1] Saved content. By default:
images, CSS and other static content; client-side generated HTML content (all saved correctly)
Optionally:
sound (MP3, WAV, RAM, WMA)
video (MPG, AVI, MOV, WMV)
archives (ZIP, LZH, RAR, JAR, XPI)
Java (can be problematic)
custom document extensions (e.g. PDF)
[ScrapBook 2] Extra features:
search across collections
Known issues:
saved pages that embed TED.com presentations (including pages on TED.com) cannot be played, even when online
selecting part of a page causes only the selection to be saved; this is inconvenient when text was selected merely to copy a quote into the page title
not compatible with Firefox Quantum (Firefox 57 and later)
Archia
[Archia 1] Saved content: images, CSS and other static content, sound (MP3, WAV, RAM, WMA), video (MPG, AVI, MOV, WMV), archives (ZIP, LZH, RAR, JAR, XPI), custom document extensions (e.g. PDF)
Video
Video embedded on websites (e.g. YouTube) can be saved with video download extensions for Firefox (including Video DownloadHelper) and Chrome.