Thank you for your interest in contributing to the project. If you want to run a client, expect the application to use up to 30 GB of hard disk space and to download up to 20 GB at the beginning. These numbers are only realistic if the English Wikipedia is dumped; for other languages, the amounts are much smaller. After the first download, the dump is processed in slices. You get the best results if you just leave the computer running. If you interrupt the process, it will restart the last unfinished slice.


  • Linux on an x86 computer
  • at most 30 GB of hard disk space
  • at most 20 GB download at every language change (infrequent)
  • about 20 MB upload per hour
  • Software: php5-cli php5-mysql mysql-server mysql-client curl pngcrush tidy texlive texlive-latex-extra texlive-math-extra dvipng

Please report any errors in the above list. Note that although a MySQL server is used, your system is not harmed in any way, because the client starts its own local instance of MySQL. Also note that the numbers given above depend on the number of contributors and the speed of your computer.
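Before starting, it can help to check that the required tools and enough disk space are actually available. The following is only a minimal sketch and not part of the client: the helper names missing_commands and free_gb are made up for illustration, and the command names are assumptions derived from the package list above (for example, php5-cli is assumed to provide a php binary). The MySQL server binary is left out because it often lives outside a regular user's PATH.

```shell
#!/bin/sh
# Pre-flight sketch (hypothetical helpers, not shipped with the client).

missing_commands() {
    # Print every argument that is not found on PATH, one per line.
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

free_gb() {
    # Free space, in whole GB, on the filesystem holding directory $1.
    # df -P guarantees one line per filesystem; column 4 is "Available" in KB.
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# Command names below are assumptions based on the package list.
missing=$(missing_commands php mysql curl pngcrush tidy latex dvipng)
if [ -n "$missing" ]; then
    echo "Missing commands:" $missing
fi

# 30 GB is the upper bound quoted for the English Wikipedia.
if [ "$(free_gb .)" -lt 30 ]; then
    echo "Less than 30 GB free in $(pwd)"
fi
```

Run it from the directory where you plan to extract the client. If it prints nothing, every listed command was found and at least 30 GB are free.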

Let's Start

If you have all of the above packages installed, grab the client (download client) and extract it to a newly created directory. Note that the client will not use or create any files outside of its own directory (except for temporary files in /tmp).

If you want your efforts to be visible in the statistics, put your name and your team in the respective files in the state directory:

echo -n "bill" > state/user
echo -n "DVAS" > state/team

If you prefer to stay anonymous, just don't touch these files (or delete them if you changed them).
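If you set a name or team earlier and want to become anonymous again, deleting the two files can be sketched as follows. The function name clear_identity is hypothetical. It assumes the state directory sits inside the client directory, as in the echo commands above.

```shell
# Sketch: remove a previously configured user and team.
# clear_identity is an illustrative helper, not part of the client.
clear_identity() {
    # $1: path to the state directory (defaults to "state").
    state_dir=${1:-state}
    # -f keeps rm quiet if a file was never created.
    rm -f "$state_dir/user" "$state_dir/team"
}
```

Called as clear_identity from the client directory, it removes state/user and state/team and leaves everything else in the state directory untouched.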

The client automatically uses all CPUs / CPU cores available on the system. If you want to change the number of parallel processes, simply edit state/processes.
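Changing the process count could look like the sketch below. The exact format of state/processes is not documented here, so this assumes the file holds a single positive integer with no trailing newline (mirroring the echo -n convention used for the user and team files). The helper name set_processes is made up for illustration.

```shell
# Sketch: write a process count to state/processes.
# Assumption: the file contains one positive integer and nothing else.
set_processes() {
    # $1: number of parallel processes, $2: state directory (default "state").
    count=$1
    state_dir=${2:-state}
    case $count in
        # Reject empty input, anything containing a non-digit, and plain 0.
        ''|*[!0-9]*|0)
            echo "process count must be a positive integer" >&2
            return 1
            ;;
    esac
    printf '%s' "$count" > "$state_dir/processes"
}
```

For example, set_processes 2 run from the client directory would limit the client to two parallel processes, provided the file format assumption holds.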

To start the dump process, call scripts/ and have fun watching the output.

If you have any questions, please contact me via mail.