Tuesday, February 24, 2015

How to download all files recursively from HTTP folders:



We often need to download a folder listed by Apache or IIS on the web, and end up downloading the files one by one. At best, we go to each folder and download all its links with a tool like DownThemAll or FlashGot, which fetches the files in one folder but not in its subfolders.
A tip on this page:
https://bmwieczorek.wordpress.com/2008/10/01/wget-recursively-download-all-files-from-certain-directory-listed-by-apache/

shows how you can use a command line application named Wget to accomplish this.

Example: recursively download all the files in the ‘ddd’ folder at the URL ‘http://hostname/aaa/bbb/ccc/ddd/’.
Solution:
wget -r -np -nH --cut-dirs=3 -R index.html http://hostname/aaa/bbb/ccc/ddd/
Explanation:
It will download all files and subfolders in the ddd directory:
recursively (-r),
without ascending to parent directories such as ccc/… (-np),
without saving files under a hostname folder (-nH),
saving them under ddd instead, by omitting the first 3 folders aaa, bbb, ccc (--cut-dirs=3),
excluding index.html files (-R index.html)
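
The number passed to --cut-dirs depends on how deep the folder sits in the URL, so here is a minimal sketch for working it out. The function name cut_dirs_for is my own hypothetical helper, not part of wget, and it assumes the URL points at the folder you want to keep. The script only prints the wget command rather than running it:

```shell
#!/bin/sh
# Hypothetical helper (not part of wget): count how many leading folders
# --cut-dirs must drop so that only the last folder of the URL is kept.
cut_dirs_for() {
    path=${1#*://}                  # strip the scheme, e.g. "http://"
    case $path in
        */*) path=${path#*/} ;;     # strip the hostname
        *)   path= ;;               # URL has no path at all
    esac
    path=${path%/}                  # strip one trailing slash
    # The slashes left in the path equal the number of folders to cut.
    slashes=$(printf '%s' "$path" | tr -cd '/' | wc -c)
    echo $(( $slashes + 0 ))        # arithmetic trims any padding from wc
}

url='http://hostname/aaa/bbb/ccc/ddd/'
n=$(cut_dirs_for "$url")

# Dry run: show the command instead of executing it.
echo wget -r -np -nH "--cut-dirs=$n" -R index.html "$url"
```

For the example URL this prints the same command as in the solution, with --cut-dirs=3. Remove the echo to actually run the download.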

In addition you may want to use a destination folder with the option
-P destDir
where destDir is your destination folder
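
To see how -nH, --cut-dirs and -P combine, the sketch below predicts where a remote file should end up locally. The function local_path_for and the file name sub/file.txt are hypothetical, made up for illustration, and the sketch assumes the URL has at least as many folders as you cut:

```shell
#!/bin/sh
# Hypothetical helper (not part of wget): predict the local path of a file
# downloaded with: wget -r -nH --cut-dirs=CUT -P DEST ...
local_path_for() {
    url=$1
    cut=$2
    dest=$3
    path=${url#*://}        # strip the scheme
    path=${path#*/}         # strip the hostname, as -nH does
    i=0
    while [ "$i" -lt "$cut" ]; do
        path=${path#*/}     # drop one leading folder, as --cut-dirs does
        i=$((i + 1))
    done
    echo "$dest/$path"      # -P puts everything under the destination folder
}

local_path_for 'http://hostname/aaa/bbb/ccc/ddd/sub/file.txt' 3 destDir
# prints: destDir/ddd/sub/file.txt
```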

You can download a GUI for this tool from here:
https://sites.google.com/site/visualwget/a-download-manager-gui-based-on-wget-for-windows

However, the GUI tool didn't work for me, and I ended up using the wget.exe file bundled with it from the command line, as explained above.
