NLP4DH round 2 notes and instructions

Update the tool

First, you want to update the tool. Start your Docker Desktop, open a terminal/command prompt, go to the directory where you put the tool, and run

docker pull annebeth/nlp4dhwebapp:latest

Then run the tool as usual – that is, run

docker-compose up

and then open Chrome and write localhost:8000 in the URL bar.

(Problems? Perhaps you want to review the instructions on getting the tool up and running – go to the Master Readme and scroll down for platform-specific instructions.)


This update mostly fixes crashes and errors. There is also one new feature: you can choose the interval of labels displayed in graphs over time. If a graph tries to show a large number of years on the X axis, the axis gets too crowded. That is hard to fix in the general case, so we made a workaround that lets you display labels for only some of the years or sequence numbers, e.g. 1940, 1945, 1950 instead of 1940, 1941, 1942… Do try it out.
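If it helps to picture what the interval setting does, here is a tiny Python illustration. It is not the tool's own code, just a sketch that assumes the interval means "label every Nth position and leave the rest blank"; the function name visible_labels is made up for this example.

```python
def visible_labels(years, interval):
    """Return one label per X-axis position.

    Every `interval`-th position keeps its year as the label; the other
    positions get an empty string, so the data point is still plotted
    but the axis is less crowded. (Illustrative assumption, not the
    tool's actual implementation.)
    """
    if interval < 1:
        raise ValueError("interval must be at least 1")
    return [str(year) if index % interval == 0 else ""
            for index, year in enumerate(years)]


# With an interval of 5, the years 1940..1950 keep only three labels.
labels = visible_labels(range(1940, 1951), 5)
print([label for label in labels if label])
```

Every year is still on the graph; only the text under the axis is thinned out.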

Testing the tool

The main purpose of this second round is to check whether we actually fixed the errors you pointed out, and to see whether you can find new ones.

Secondarily, we’d appreciate questions and suggestions that help us better figure out how to explain the tool and what kinds of things instructions and explanations should address.

Finally, it would be helpful if you tried a data set other than the Pride and Prejudice one. The original zip file you downloaded also contains the Wizard of Oz (also chapter by chapter) and files on the Vietnam War from the Foreign Relations of the United States collection (there’s a description in the zip file). To make one of the other data sets the one the tool reads, quit Docker, rename the “data” directory to something else (e.g. “austen”), rename one of the other data set directories to “data”, and rerun docker-compose up. The files you see in the tool should now be the new set in the current data directory.
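If you would rather not rename directories by hand, the swap can be sketched in Python, which works the same way on Windows, Mac and Linux. This is only a sketch: the function name switch_data_set is made up, and the directory name "oz" in the usage comment is a guess, so use whatever the directories in your copy of the zip file are actually called. Stop the tool before running it.

```python
from pathlib import Path


def switch_data_set(tool_dir, new_set, park_as):
    """Make the directory `new_set` the tool's "data" directory.

    The current "data" directory is first renamed to `park_as`
    (e.g. "austen") so nothing is overwritten or lost.
    """
    tool_dir = Path(tool_dir)
    data = tool_dir / "data"
    source = tool_dir / new_set
    parked = tool_dir / park_as

    if not source.is_dir():
        raise FileNotFoundError(f"No such data set directory: {source}")
    if parked.exists():
        raise FileExistsError(f"{parked} already exists; pick another name")

    if data.exists():
        data.rename(parked)   # "data" -> e.g. "austen"
    source.rename(data)       # e.g. "oz" -> "data"
    return data


# Example call (directory names are assumptions, adjust to your setup):
# switch_data_set(".", new_set="oz", park_as="austen")
```

After the swap, run docker-compose up again as described above.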

If you want to use your own files, feel free – just put them in the data directory instead. They should be plain text files. To use the graphs-over-time feature, they need to have a year (or some other sequential number, like a chapter number) as the first line, like the sample data files do. If you code at all, you can download this little Python script that I made to add the years and chapters to the sample data files and see if you can tweak it to make it work with your files.
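To give an idea of what adding the first line involves, here is a minimal Python sketch. It is not the script linked above, just an independent illustration. It assumes, purely for the example, that the year or chapter number can be read from the file name (e.g. letter_1965.txt); both function names are made up, and you would replace number_from_filename with whatever fits your own files.

```python
import re
from pathlib import Path


def number_from_filename(path):
    """Return the first run of digits in the file name, or None.

    Assumed naming convention for this example only, e.g.
    "letter_1965.txt" -> 1965 or "chapter_03.txt" -> 3.
    """
    match = re.search(r"\d+", Path(path).stem)
    return int(match.group()) if match else None


def add_sequence_line(path, number):
    """Put `number` on its own first line of a plain text file.

    Returns False and leaves the file alone if the first line is
    already a number, so running it twice does no harm.
    """
    path = Path(path)
    text = path.read_text(encoding="utf-8")
    first_line = text.split("\n", 1)[0].strip()
    if first_line.isdigit():
        return False
    path.write_text(f"{number}\n{text}", encoding="utf-8")
    return True


# Example: process every .txt file in the data directory.
# for txt_file in sorted(Path("data").glob("*.txt")):
#     number = number_from_filename(txt_file)
#     if number is not None:
#         add_sequence_line(txt_file, number)
```

Try it on a copy of your files first, since it rewrites them in place.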

Please note that we haven’t been able to implement all your suggestions yet, or to add significantly to the instructions, so that’s still to come. In the next version, we plan to have a tab within the tool that describes what SRL is, what this tool does, and how to use it.

Please use the same procedures for reporting errors and offering suggestions as you did last time. Here’s the page with the instructions for that.
