How to set up a Wikileaks mirror site using wget and Github

There are millions of Wikileaks supporters all over the world, and you may be surprised to see how many mirror sites have been set up. Wikileaks has been seeking support, and by my count it has 1426 mirrors registered as of Dec 24, 2010. Of course, you too can support Wikileaks by hosting a mirror of the site. Wikileaks provides simple instructions on how to set up a mirror on its Mass-mirroring Wikileaks page. Basically, you need a Unix-based server running a web server, and you give the Wikileaks staff access to that server so that they can upload a copy of the Wikileaks site. That is how you serve the Wikileaks content from your own web server.

But don’t you think it is risky to host the Wikileaks content on your server? Your server may well suffer DDoS attacks. And what if the attackers find out who you are from your domain registration record or something similar? That may be frightening. So here I am going to introduce a way to set up a Wikileaks mirror site without exposing yourself to attackers in that way and without needing any hosting space of your own. I use wget and Github in a Linux environment (Ubuntu).
Github as web server
Basically, you only need to do 2 steps:
- 1. Mirror the website, with all markup, text, CSS, scripts, images, etc., to your local machine.
- 2. Save the content to a Github repository (project name: username.github.com).
Github Pages then lets you publish the content to the web as if you were publishing it on your own site. The Github Pages rule is very simple: if your Github username is ‘wikileaks-mirror-jp’ and you push the content to a repository named ‘wikileaks-mirror-jp.github.com’, the content becomes accessible at http://wikileaks-mirror-jp.github.com. In short, you can use Github as a web server to publish the mirrored Wikileaks content to the web.
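The naming rule above can be sketched as a few shell functions. This is only an illustration: the function names pages_repo_name, pages_url and pages_remote are hypothetical, and the remote format is simply the one used with git remote add later in this post.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the Github Pages naming rule.

# Repository name for a user's page: <username>.github.com
pages_repo_name() {
    printf '%s.github.com\n' "$1"
}

# Public URL under which the pushed content is served.
pages_url() {
    printf 'http://%s.github.com\n' "$1"
}

# SSH remote for that repository.
pages_remote() {
    printf 'git@github.com:%s/%s.github.com.git\n' "$1" "$1"
}

pages_repo_name wikileaks-mirror-jp
pages_url wikileaks-mirror-jp
pages_remote wikileaks-mirror-jp
```

Replace wikileaks-mirror-jp with your own username to see the names you will need in the steps below.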
1. Set up a Github user account for the mirror
You need to set up a Github user account so that you can push content to Github repositories. Let’s say your Github username is ‘wikileaks-mirror-jp’ and you have an existing Unix user named ‘wikileaks-mirror-jp’ on your Linux machine. First of all, you need to add wikileaks-mirror-jp’s SSH public key to the Github account. Below is how I created the SSH public key, id_rsa_github.pub. See also Generating SSH keys for more detail.
$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/wikileaks-mirror-jp/.ssh/id_rsa): /home/wikileaks-mirror-jp/.ssh/id_rsa_github
Created directory '/home/wikileaks-mirror-jp/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/wikileaks-mirror-jp/.ssh/id_rsa_github.
Your public key has been saved in /home/wikileaks-mirror-jp/.ssh/id_rsa_github.pub.
The key fingerprint is:
92:23:42:9e:70:c8:ef:65:68:d5:23:0d:62:af:da:87 wikileaks-mirror-jp@ubuntu
The key's randomart image is:
….
Then configure SSH as below (in ~/.ssh/config) so that the SSH key created above is used when accessing the github.com server over SSH. See also Multiple SSH Keys for more detail.
Host github.com
User wikileaks-mirror-jp
Port 22
Hostname github.com
IdentityFile ~/.ssh/id_rsa_github
TCPKeepAlive yes
IdentitiesOnly yes
Finally, add the SSH public key on your account settings page.
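Before moving on, it may help to confirm that the pieces from this step are in place. The following is a minimal sketch, assuming the key file name id_rsa_github used above; the function name check_github_ssh_setup is hypothetical and it only inspects local files, it does not contact Github.

```shell
#!/bin/sh
# check_github_ssh_setup DIR
# Hypothetical helper: verifies that DIR (normally ~/.ssh) holds the key
# pair and a config file that mentions the key. Prints what is missing
# and returns non-zero if anything is.
check_github_ssh_setup() {
    ssh_dir=$1
    rc=0
    for f in id_rsa_github id_rsa_github.pub config; do
        if [ ! -f "$ssh_dir/$f" ]; then
            echo "missing: $ssh_dir/$f"
            rc=1
        fi
    done
    if [ -f "$ssh_dir/config" ] && ! grep -q 'id_rsa_github' "$ssh_dir/config"; then
        echo "config does not reference id_rsa_github"
        rc=1
    fi
    return "$rc"
}

if check_github_ssh_setup "$HOME/.ssh"; then
    echo "SSH setup looks complete"
fi
```

Once the public key has been added to the account, running ssh -T git@github.com should answer with a greeting that contains your Github username, which confirms that the key is accepted.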
2. Create a Github repository for the mirror
Once the Github account is ready, you need to create a new repository for the Github user page, wikileaks-mirror-jp.github.com. First, create a new repository by entering the project name, description and homepage URL on the Create a New Repository page. The new project name is wikileaks-mirror-jp.github.com.
Then create an empty git repository for wikileaks-mirror-jp.github.com on your local machine and add a sample file to it as below.
$ mkdir ~/github
$ cd ~/github
$ mkdir wikileaks-mirror-jp.github.com
$ cd wikileaks-mirror-jp.github.com
$ git init
$ touch README
$ echo "wikileaks-mirror-jp.github.com" > README
$ git add README
$ git commit -m 'first commit'
$ git remote add origin git@github.com:wikileaks-mirror-jp/wikileaks-mirror-jp.github.com.git
$ git push origin master
You will see the sample file pushed to the repository on the project page.
3. Wikileaks mirroring with wget
Here you mirror the Wikileaks website to your local machine using the wget command. Let’s say the mirroring target is http://wikileaks.ch and you save all the files and subdirectories to the directory where you created the empty git repository for wikileaks-mirror-jp.github.com, ~/github/wikileaks-mirror-jp.github.com. By executing the following command line, you will have all the files from http://wikileaks.ch downloaded under the directory for the Github repository.
-P ~/github/wikileaks-mirror-jp.github.com http://wikileaks.ch/
--2010-12-23 22:39:40-- http://wikileaks.ch/
Resolving wikileaks.ch... 178.21.20.9, 213.251.145.96, 46.59.1.2, ...
Connecting to wikileaks.ch|178.21.20.9|:80... connected.
HTTP request sent, awaiting response... 200 OK
....
As for ‘-e robots=off’, you need to turn the robots setting off in order to download some script or CSS files located under directories that the site’s robots.txt tells web robots not to visit. Other key options are ‘--mirror’ and ‘--convert-links’. Please see the wget man page or the GNU manual for details on the wget options.
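For reference, here is a minimal sketch of a wget invocation built only from the options this post names (--mirror, --convert-links, -e robots=off) plus the -P prefix and URL shown above. Treat it as an assumption, not necessarily the exact command that produced the output above. The run helper and the DRY_RUN variable are hypothetical and are only there so that the command is printed for review before anything is downloaded.

```shell
#!/bin/sh
TARGET_URL="http://wikileaks.ch/"
DEST_DIR="$HOME/github/wikileaks-mirror-jp.github.com"

# Hypothetical switch: 1 prints the command, 0 actually runs it.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# --mirror         recursive download with timestamping, unlimited depth
# --convert-links  rewrite links so the local copy can be browsed
# -e robots=off    ignore the exclusions in robots.txt
# -P DIR           save everything under DIR
run wget --mirror --convert-links -e robots=off -P "$DEST_DIR" "$TARGET_URL"
```

Run it with DRY_RUN=0 to start the download. Note that wget normally saves files into a subdirectory named after the host (here wikileaks.ch/) under the -P prefix; the -nH (--no-host-directories) option turns that off if you want index.html to sit at the top of the repository.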
4. Push the content to the repository to publish it to the web
Finally, push the downloaded content to the repository as below.
$ git add *
$ git commit -m "added mirror site"
$ git push
Counting objects: 3043, done.
Compressing objects: 100% (3013/3013), done.
Writing objects: 100% (3042/3042), 20.37 MiB | 39 KiB/s, done.
Total 3042 (delta 2836), reused 0 (delta 0)
To git@github.com:wikileaks-mirror-jp/wikileaks-mirror-jp.github.com.git
aff2b93..1032dc2 master -> master
Now the content is accessible at http://wikileaks-mirror-jp.github.com. What’s more, if you want to automate the series of commands you executed above, put them in your crontab. That’s it!
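As a rough sketch of that automation, the mirror and push steps could be wrapped in one script and called from cron. The script name update-mirror.sh, its location, the schedule, the run helper and the DRY_RUN variable are all assumptions for illustration, and the wget options are only the ones named in this post.

```shell
#!/bin/sh
# update-mirror.sh (hypothetical name): refresh the mirror and publish it.
REPO_DIR="$HOME/github/wikileaks-mirror-jp.github.com"
TARGET_URL="http://wikileaks.ch/"

# Hypothetical switch: 1 prints each command, 0 actually runs it.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# The chain stops at the first step that fails, so nothing is pushed
# after a failed download or when there is nothing new to commit.
update_mirror() {
    run wget --mirror --convert-links -e robots=off -P "$REPO_DIR" "$TARGET_URL" &&
    run cd "$REPO_DIR" &&
    run git add . &&
    run git commit -m "updated mirror site" &&
    run git push origin master
}

update_mirror

# Example crontab entry (edit with `crontab -e`), once a day at 03:00;
# the path is an assumption:
# 0 3 * * * DRY_RUN=0 /home/wikileaks-mirror-jp/bin/update-mirror.sh
```

Running the script as is only prints the five commands, which is a safe way to check the paths before setting DRY_RUN=0 in the crontab entry.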



