Running your own buzzword
buzzword is open-source, and designed to be shared. Server administrators are welcome to host their own instance of the tool. Individual users can also run the tool on their local machines, just like they would any other corpus tool.
If you want to run the tool with pre-existing datasets, you'll first need to have one parsed and ready to go. The buzz backend provides numerous ways of getting corpora parsed from your terminal, with comprehensive documentation available here. In short, however, you can parse a dataset in your terminal with:
# because buzzword does not use constituencies, turn them off python -m buzz.parse --cons-parser none ./path/to/data # or parse -cons-parser none ./path/to/data
Parsing will leave you with
./path/to/data-parsed, a directory that mimics the original corpus, but with CONLL-U files instead of plain text.
Next, you need to configure buzzword to load this corpus (and potentially others) on startup. To do this, you should create a
corpora.json, which tells the tool which corpora will be loaded into the tool. Copy
corpora.json, and modify it to contain the corpora you want to show in the app.
The final part of the configuration process is to add the tool's settings. This can be done by modifying
buzzword/settings.py. Most importantly, make sure
CORPORA_FILE is set to the path for your
Once your configuration files are set up, you can start the tool with:
python -m buzzword # or buzzword