Offline Dictionary Server

kiwix and wiktionary to the rescue

Copyright © 2016 by KV5R. All Rights Reserved. Rev. 3/21/2016

Note: If you simply want a way to look up words online from within any program, see my article, Look Up Words From Any Program.

About

This article describes installing a local server (Kiwix-serve) and a dictionary (Wiktionary) in Windows, providing an offline local dictionary server that may be queried with a web browser or any software that has a Dictionary Lookup command that outputs an http URL. The specific case here is to use it for lookups from the Calibre e-book software when the Internet is not available for online lookups.

In addition to the dictionary, you can also install a bunch of other offline wikis, as we’ll see by the end of this article.

This can also be done in OSX, Linux, and Android, but you’ll need to look elsewhere (start with the kiwix site) for details. And please don’t ask me, “How do I put this in my [whatever] reader?” because I haven’t the foggiest notion.

Why Do It?

After all, doesn’t everybody have Internet connectivity everywhere, all the time? Well, no. You might take a vacation at a remote mountain cabin. Or a summer-long missionary project in some remote place. Or be without Internet service for two weeks after a hurricane. Stocking up on some extra web-stuff and e-books, on a notebook or tablet that can run on a car battery, is just good prep!

Calibre

Calibre (pronounced kal-i-ber, not ka-leeb-ray) is a free open-source e-book manager and reader for Windows, OSX, and Linux. It has lots of great features, including the ability to store, convert, edit, and read all sorts of e-books.

One feature is that you can select a word in a book, right-click, then “Lookup in Dictionary.” The “lookup” simply sends a URL query to any online dictionary, which you can set in Calibre’s viewer preferences.

But what if you want to use a dictionary when offline? Unlike Kindle, Calibre (like most EPUB readers) doesn’t have a built-in dictionary. The answer is to run a local dictionary that you can access from Calibre.

Dictionary Software

There are several ways to do this. The first thing I found when searching is to use WordWeb or TheSage. These are free Windows dictionary programs that allow you to look up words from within most Windows programs. But their dictionaries are rather limited, and insufficient if you frequently look up scientific, technical, archaic, or foreign words.

The next thing I found is Wiktionary, the online dictionary made by the Wikimedia Foundation. You can download “dump” files of the various Wikimedia sites, including Wiktionary. But you need a program to navigate and view the dump files.

One is called bzReader, which reads and views Wikimedia’s bz2-compressed XML dump files. You get the latest Wiktionary dump and bzReader and you have a local copy of Wiktionary. It includes a server so you can query it with a web browser, but alas, bzReader picks a random port number every time you run it! It also displays the data with very plain formatting (no CSS).

Another possibility is to install a local LAMP/WAMP server stack (usually Apache, MySQL, and PHP), then install MediaWiki (the software that runs Wikipedia, etc.), then get a MySQL dump of Wiktionary, and set up a local instance. Complex and messy! Uses too many resources! Takes DAYS to configure!

Okay. What we really need is a lightweight local server that’s designed to query a compressed dump of Wiktionary, without MySQL, PHP, or 4.5 million files. Something like bzReader, but better.

Kiwix

Kiwix is a wiki reader with many nice features, including a lightweight http server with user-defined port, and enhanced output formatting to look like the online wikis. You can read your local wikis through the Kiwix program, or run its server, kiwix-serve.exe, and hit it with any web browser on your local network.

But Kiwix doesn’t directly use Wikimedia’s bz2-compressed XML dump files. The Kiwix project converts them to the ZIM format, so you have to download your desired wiki dumps from the Kiwix site in ZIM format. You can do this manually or from within the Kiwix program.

Wiktionary

The Wiktionary ZIM comes (from Kiwix) in several flavors:

  • Simplified English (~24,000 entries) without images, ~5MB
  • Simplified English (~24,000 entries) with images, ~45MB
  • English (~4.3 million entries) without images, ~750MB
  • English (~4.3 million entries) with images, ~1.1GB

Note: File sizes given herein were approximate at the time of writing (Mar. 2016), and they are always growing.

You’ll want the last one, which will take ~2 hours to download (on basic DSL), then several hours to build the index (Kiwix has full-text search indexing). The Wiktionary ZIM and its index files will end up occupying about 6.5GB. You’ll then have a modern dictionary of about 4.3 million entries from over 1550 languages, with English definitions, etymologies, translations, synonyms, etc.

Note: The top navigation links (Aa-Zz, etc.) on the Main Page, and the audio pronunciation links throughout, do not work. You need to enter a search word to get anywhere.

By default, Kiwix will bury its downloaded ZIMs in an obscure folder like:

c:\Users\[you]\AppData\Roaming\www.kiwix.org\Kiwix\Profiles\[bunchofletters].default\data\content\wiktionary_en_all_2015-11.ZIM

but you can change that to something simple like:

c:\kiwix-data\

by going to Kiwix menu Edit > Preferences > Data Directory > Browse, and then making your desired folder. If you’ve already downloaded a ZIM, you’ll need to move the folders “content,” “index,” and “library” there. Do NOT delete the \AppData\Roaming\www.kiwix.org\ tree, as Kiwix still stores much user profile stuff there, even if you’ve told it to store its data files elsewhere.
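
To see which ZIMs ended up in the data folder, a small script can list them. This is only a sketch, assuming Python 3 is installed (nothing else in this article needs it) and that the ZIM files sit directly in the “content” folder; the function name list_zims is made up for illustration.

```python
from pathlib import Path


def list_zims(data_dir):
    """Return the base names (no extension) of ZIM files under data_dir/content."""
    content = Path(data_dir) / "content"
    # Compare the extension case-insensitively, since the example file ends in ".ZIM".
    return sorted(p.stem for p in content.iterdir()
                  if p.is_file() and p.suffix.lower() == ".zim")


if __name__ == "__main__":
    # Assumed location: the simplified data folder suggested above.
    data_dir = r"c:\kiwix-data"
    if (Path(data_dir) / "content").is_dir():
        for name in list_zims(data_dir):
            print(name)
    else:
        print("No content folder found under", data_dir)
```

The names it prints (for example wiktionary_en_all_2015-11) are the long file names you would otherwise have to copy by hand.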

To query it with the http URLs that Calibre sends, you need to run the Kiwix server (kiwix-serve.exe). You can start it from within Kiwix (menu Tools > Server), or run it directly from a command line. In the latter case you only need the Kiwix program itself for downloading, indexing, and setup, or when you want to use it as a stand-alone dictionary.

Running Kiwix-serve

Kiwix-serve.exe (located in \kiwix\xulrunner) is a command-line utility. You need to run it with several parameters. To make it easy to run, create a shortcut (or run it as a Windows Service; see further below):

  • Right-click the desktop and select New > Shortcut
  • Program it like this (use your correct path to library.XML):
    • Target: "C:\Program Files\Kiwix\kiwix\xulrunner\kiwix-serve.exe" --port=8000 --daemon --library C:\Kiwix-data\library\library.XML
    • Start in: "C:\Program Files\Kiwix\kiwix\xulrunner"
    • Run: Minimized
  • OK the new shortcut.

Notes:

  • The "quotes" above are required because of the space in Program Files.
  • Adjust the above to your path to library.XML.
  • You can right-click the icon and “Pin to Start Menu”, or just drag it in. Put it right by your Calibre shortcut so you can start the server when starting Calibre.

If your “target” command string is correct, clicking the shortcut will create a minimized command prompt icon in the taskbar, running kiwix-serve, using the port and library file specified.

If it fails to start (icon appears and disappears), open the shortcut’s Properties and carefully check your spelling, verifying the paths to kiwix-serve.exe and library.XML.
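
If checking the paths by eye gets tedious, the same check can be scripted. The following is a rough sketch, assuming Python 3 is available; the two paths are copied from the shortcut above, and the helper names build_command and missing_paths are invented for this example.

```python
import subprocess
from pathlib import Path

# Paths copied from the shortcut above; adjust both to your installation.
KIWIX_SERVE = r"C:\Program Files\Kiwix\kiwix\xulrunner\kiwix-serve.exe"
LIBRARY = r"C:\Kiwix-data\library\library.XML"


def build_command(exe, library, port=8000):
    """Return the same arguments as the shortcut's Target line, as a list."""
    return [exe, f"--port={port}", "--daemon", "--library", library]


def missing_paths(*paths):
    """Return the paths that do not exist, so typos are easy to spot."""
    return [p for p in paths if not Path(p).exists()]


if __name__ == "__main__":
    missing = missing_paths(KIWIX_SERVE, LIBRARY)
    if missing:
        for p in missing:
            print("Not found:", p)
    else:
        # Popen starts the server and returns without waiting for it to exit.
        subprocess.Popen(build_command(KIWIX_SERVE, LIBRARY),
                         cwd=str(Path(KIWIX_SERVE).parent))
```

Passing the arguments as a list means Python takes care of the quoting that the space in Program Files would otherwise require.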

To stop the server, simply close its blank command window. To start it automatically, drag the shortcut into the Start > All Programs > Startup menu. (But Wait! Below, we’ll see how to install it as a Windows service!)

More Notes:

  • The first time you start it, Windows Firewall will pop up and ask if it is to be allowed. Obviously, yes.
  • The kiwix server will also work with any program that can output a configurable URL for dictionary lookups or “online” searches. Unfortunately, each search (from another program) will open a new tab in the browser, not re-use an existing one (you might be able to change this behavior in your browser).

Querying the Kiwix Server From a Browser

Query the server and wiktionary in a web browser with:

http://localhost:8000/wiktionary_en_all_2015-11/A/word.html

Adjust that line to your particular dictionary ZIM. “word” is the word you want to look up. If it contains spaces, replace each space with an underscore. The /A/ before and the .html after are required.
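
The URL rule above can be sketched in a few lines of Python. This is an illustration only: the function name lookup_url is made up, the defaults are the values used in this article, and the percent-encoding of accented characters is an extra precaution that browsers normally handle on their own.

```python
from urllib.parse import quote


def lookup_url(word, zim="wiktionary_en_all_2015-11", host="localhost", port=8000):
    """Build the kiwix-serve article URL for a word or phrase."""
    # Spaces become underscores, as described above.
    title = word.strip().replace(" ", "_")
    # Percent-encode anything else that is unsafe in a URL path (e.g. accents).
    return f"http://{host}:{port}/{zim}/A/{quote(title)}.html"


if __name__ == "__main__":
    print(lookup_url("ice cream"))
```

For “ice cream” this gives http://localhost:8000/wiktionary_en_all_2015-11/A/ice_cream.html.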

If you installed more than one ZIM, go to http://localhost:8000 (the “root” of kiwix-serve) and it’ll list your kiwix library with “Load” buttons, so you can pick the desired ZIM without having to remember its long filename.

On the local machine you can use localhost, or 127.0.0.1, or the machine’s local IP address (192.168.x.x) if you’re using fixed LAN addresses instead of DHCP-assigned ones. You can also query the server on a different machine on your LAN.

If you have a dedicated server machine on your LAN, you can just run the dictionary server there and leave it running all the time.
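
Before blaming the browser or Calibre, it can help to confirm that something is actually listening on the port. Here is a minimal sketch using only Python’s standard library; the function name server_reachable is made up, and port 8000 is simply the port used throughout this article.

```python
import socket


def server_reachable(host="localhost", port=8000, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # To test from another machine, replace "localhost" with the
    # server machine's LAN address (192.168.x.x).
    host = "localhost"
    state = "reachable" if server_reachable(host) else "not reachable"
    print(f"{host}:8000 is {state}")
```

A successful connection only shows that the port is open, not that kiwix-serve is the program answering, so treat it as a first check.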

Installing Kiwix-serve as a Windows Service

This isn’t necessary unless you just wanna get rid of the minimized icon that is running kiwix-serve. Installing it as a service means you’ll have a dictionary on port 8000 all the time, but don’t worry, it uses very few resources (~35MB of RAM) while idle. To install it as a Windows service, do thus:

Download NSSM - the Non-Sucking Service Manager. It doesn’t install. Unpack the zip file to a folder that is on your PATH (like c:\windows). Use the correct one for x86 or x64 versions.

Open a command prompt and type:

nssm install Kiwix-serve

It will open a dialog box where you can put in the particulars.

On the Application tab:

Application: C:\Program Files\Kiwix\kiwix\xulrunner\kiwix-serve.exe
Startup Directory: C:\Program Files\Kiwix\kiwix\xulrunner
Arguments:  --port=8000 --daemon --library C:\Kiwix-data\library\library.XML

Note: Use the correct path to your library.XML file.

On the Details tab:

Display Name: Kiwix-serve
Description: Provides an http server on port 8000 for ZIM archives. cf. kv5r.com/computers/offline-dictionary-server/
Startup Type: Automatic (Delayed Start)

Click “Install” and it should do so.
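
NSSM can also take the same settings on the command line instead of through the dialog. The sketch below only prints the commands so you can review them before running anything. It assumes Python 3, uses the paths from this article, and relies on the nssm install and nssm set forms (AppDirectory, DisplayName, Description) as I understand them from NSSM’s usage text, so check them against your NSSM version. It does not set the Startup Type.

```python
import subprocess

SERVICE = "Kiwix-serve"
EXE = r"C:\Program Files\Kiwix\kiwix\xulrunner\kiwix-serve.exe"
START_DIR = r"C:\Program Files\Kiwix\kiwix\xulrunner"
LIBRARY = r"C:\Kiwix-data\library\library.XML"


def nssm_commands(service, exe, start_dir, library, port=8000):
    """Return nssm command lines that mirror the dialog fields above."""
    return [
        ["nssm", "install", service, exe,
         f"--port={port}", "--daemon", "--library", library],
        ["nssm", "set", service, "AppDirectory", start_dir],
        ["nssm", "set", service, "DisplayName", service],
        ["nssm", "set", service, "Description",
         f"Provides an http server on port {port} for ZIM archives."],
    ]


if __name__ == "__main__":
    # Print only; nothing is installed by this script.
    for cmd in nssm_commands(SERVICE, EXE, START_DIR, LIBRARY):
        print(subprocess.list2cmdline(cmd))
```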

Now see if it will start without error. At the command prompt, type:

nssm start kiwix-serve

If it works, you’ll see:

The Kiwix-serve service is starting.
The Kiwix-serve service was started successfully.

If it fails to start, edit it with:

nssm edit kiwix-serve

and carefully check the Application and Arguments strings. Then try to start it again.

To see it in services, run services.msc and scroll down to Kiwix-serve. Double-click it to open its Properties. There you can read its description, start or stop it, and change its Startup Type.

If you move kiwix-serve or your ZIM file you’ll need to edit the parameters of the service. At a command prompt, type:

nssm edit Kiwix-serve

and it’ll open the same window as when you installed it, and you can edit and update the particulars as needed. After editing the service, you need to restart it. You also need to restart it after adding another ZIM to the library. Use:

nssm restart kiwix-serve

or run services.msc and stop-start it from there.

If you uninstall Kiwix, you’ll need to remove the service, lest Windows keep trying to start a program it can’t find. At a command prompt, type:

sc delete Kiwix-serve

or

nssm remove Kiwix-serve

Make Calibre Use Kiwix-serve as its Dictionary

Okay! Now we have a nice dictionary server running as a service, just waiting for us to query it. While we can manually access it with any browser at http://localhost:8000, typing or copy/pasting search words isn’t much fun. Right-clicking and selecting “Lookup” provides a much better workflow.

  • Run Calibre and open a book in the viewer
  • Click the Preferences button (4th from the bottom)
  • Click the Dictionaries tab, then Add Website, and paste in the following (2015-11 was the latest file at the time of writing, so be sure to use your own file name):
    http://localhost:8000/wiktionary_en_all_2015-11/A/{word}.html
  • OK, OK.

The literal “{word}” part is a Calibre variable; it gets replaced with the selected word when you “Lookup in Dictionary.”

Now drag-select or double-click a single word in the book, right-click and select “Lookup in Dictionary.” If it works, your default browser will open your local Wiktionary to the selected word.

Note that search strings are case-sensitive, so if you select “Misbehave” it will return a “not found” error. Simply lower-case the word in the browser’s address bar and hit Refresh (or F5), and “misbehave” will pop up in the browser. And some words have entries for both forms: “Malamute” the people; “malamute” the dog breed.
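
Since the lookup is case-sensitive, it is easy to list the spellings worth trying for a given selection. This is just a sketch; the function name case_variants is made up, and it only produces the candidate spellings without querying the server.

```python
def case_variants(word):
    """Return spellings worth trying, in order, without duplicates."""
    variants = [word, word.lower(), word.capitalize()]
    seen = []
    for v in variants:
        if v not in seen:
            seen.append(v)
    return seen


if __name__ == "__main__":
    for w in ("Misbehave", "malamute"):
        print(w, "->", case_variants(w))
```

For “Misbehave” the candidates are “Misbehave” and “misbehave”; for “malamute” they are “malamute” and “Malamute”, which covers both the dog breed and the people.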

Other Neato Offline Wikis

Now that you have Kiwix and an offline Wiktionary, you may want to add others. Given enough time and space, you can even install Wikipedia (52GB), Wikipedia for Schools, Wikipedia Simplified English, Wikipedia Medical, Wikibooks, Wikiquote, Wikiversity, TED Talks, Project Gutenberg (40GB), and more. Some of those (Wikipedia and Gutenberg) will take about three days (each) to download on a slow connection, so you should do them manually, and use a download manager with auto-recovery. But most ZIMs are just a gig or so.

All the Kiwix ZIMs are at: download.kiwix.org/ZIM/
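
To estimate how long a given ZIM will take on your own connection, the arithmetic is simple. The sketch below assumes a steady 1.5 megabits per second for “basic DSL”; that figure is my assumption, not a measurement, and real downloads will vary.

```python
def download_hours(size_gb, speed_mbps):
    """Hours needed to move size_gb gigabytes at speed_mbps megabits per second."""
    bits = size_gb * 1e9 * 8            # decimal gigabytes -> bits
    seconds = bits / (speed_mbps * 1e6)
    return seconds / 3600


if __name__ == "__main__":
    # Sizes taken from this article; 1.5 Mbps is an assumed line speed.
    for name, size_gb in (("Wiktionary", 1.1), ("Wikipedia", 52)):
        hours = download_hours(size_gb, 1.5)
        print(f"{name}: {hours:.1f} hours ({hours / 24:.1f} days)")
```

At that speed the 1.1GB Wiktionary works out to roughly 1.6 hours and the 52GB Wikipedia to roughly 3.2 days, which lines up with the rough figures given in this article.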

Conclusion

I hope this is helpful. It took me quite a while to stumble upon this solution, and it works quite well, so I wrote an article about it, since no one else has.

—KV5R

4 thoughts on “Offline Dictionary Server”
  1. Hi!,

    Thanks for documenting this, I’ve been looking for a way to achieve this.
    Can you reach kiwix-serve from another machine?

    I’m testing on a VM on the same network, but can’t make it work, added a rule on the firewall, but no luck.

    Any hint?

    • Hi Luis,

      Make sure you are using the server machine’s local IPv4 address (192.168.x.x) in the remote computer’s browser. Localhost or 127.0.0.1 (loopback) only works on the same machine.

      You can find the server machine’s LAN address with ipconfig at the command prompt (windows), or ifconfig on most Linux.

      If you’re running kiwix-serve on Windows, you don’t have to create a firewall rule. The first time you try to access its local IPv4 address from another computer, Windows Firewall will pop up and ask to Allow. With Private networks checked, click Allow. Firewall will create a rule, allowing kiwix-serve.exe inbound TCP on port 8000 (or whatever port you used).

      But if you clicked Cancel, it makes a block rule, and then you have to go into Control Panel, Windows Firewall, and delete the block rule from the inbound rules. Then hit the machine’s IPv4 (192.168.x.x) address again from the remote browser, and you get the firewall popup again (on the server machine) and can allow it. That’s much easier than making an inbound allow rule manually.

      You don’t need any firewall rule on the remote machine because all outbound connections are ‘allow’ by default.

      I don’t know how all this works on a VM. I guess its OS sees a virtual LAN interface that connects to the real interface, if you set up the VM right?

      Anyway, I took out Kiwix because the ZIMs don’t have wiki “landing” pages or working internal links. Currently writing a big article on setting up a local web server with PHP and MySQL/MariaDB, so stay tuned.

      73, –kv5r
