Happy New Year! Assuming it is still OK to say that at the end of January.

It is already shaping up to be a great 2015 for LogicalSpark, with exciting projects going live and interesting new requests coming in.

As a little present to kick off the year, we have pulled together a goodie based on two of our favourite projects, Apache Tika and Docker.

We have written previously about Apache Tika and its server component. As it is something used a lot by LogicalSpark and our clients, we are always keen to make it easier to manage and deploy. Therefore, we would like to introduce the Apache Tika Docker image here:


This image bundles the latest 1.7 release of Apache Tika running on Ubuntu 14.10 together with the dependencies for some new parsers, namely the OCR Parser and GDAL Parser.

If you already have Docker, getting started is as easy as pulling down our build from Docker Hub:

docker pull logicalspark/docker-tikaserver

Then run the container by executing the following command, which starts it and makes port 9998 available locally:

docker run -d -p 9998:9998 logicalspark/docker-tikaserver
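
Once the container is up, a small client can confirm that it responds and can send it a document to parse. The following is a minimal Python sketch, not part of the image itself. It assumes the server is reachable on localhost at the mapped port 9998 and that it exposes the usual Tika Server endpoints, /version (GET) and /tika (PUT). The function names and the file name used in demo() are hypothetical and only there for illustration.

```python
import urllib.request

# Assumption: the container started with the `docker run` command above
# is reachable on localhost at the mapped port 9998.
TIKA_BASE_URL = "http://localhost:9998"


def build_version_request(base_url=TIKA_BASE_URL):
    """Build a GET request for the server's /version endpoint."""
    return urllib.request.Request(base_url.rstrip("/") + "/version", method="GET")


def build_extract_request(data, base_url=TIKA_BASE_URL):
    """Build a PUT request that sends raw document bytes to /tika
    and asks for the extracted content as plain text."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/tika",
        data=data,
        method="PUT",
        headers={"Accept": "text/plain"},
    )


def send(request, timeout=30):
    """Send a prepared request and return the decoded response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def demo(path):
    """Print the server version, then the text extracted from a local file.

    Needs the container to be running; `path` is any local document.
    """
    print(send(build_version_request()))
    with open(path, "rb") as handle:
        print(send(build_extract_request(handle.read())))
```

With the container running, calling demo("sample.pdf") (a hypothetical file name) should print the server's version string followed by the text extracted from that file. Building the requests separately from sending them keeps the URL and header logic easy to check without a live server.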

As with other Tika Server examples, the endpoints are now available by browsing to:


All the code is hosted on GitHub, with a build on Docker Hub, allowing you to adapt or amend your configuration.

Happy Parsing!