Blacklight uses Apache Solr as its data index [1], but does not care how you run Solr or how you index data into your instance. For this workshop, we’re starting with the same set of configuration files that Blacklight uses for its own testing. In this section, we’ll take a look at some commonly used tools and approaches.
solr_wrapper
solr_wrapper is a tool that can be used to download, run, and configure a Solr instance, and is often used as part of an application’s test suite to ensure a clean slate.
In a separate terminal window within the application’s directory, run:
% bundle exec solr_wrapper
Solr should now be running on http://localhost:8983/. If we take a look at the Solr admin, we can see there’s an empty blacklight-core collection with our preferred Solr configuration.
solr_wrapper is a handy tool for development because it ensures the application starts from a clean slate. It’s configured in the application’s solr_wrapper.yml file.
# Place any default configuration for solr_wrapper here
# port: 8983
collection:
  dir: solr/conf/
  name: blacklight-core
The collection section tells solr_wrapper to use the files in the solr/conf directory as the configuration directory for the blacklight-core collection.
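As a rough illustration of how these settings are structured, the sketch below parses the same YAML with Ruby's standard library and reads the collection section. It is not how solr_wrapper itself loads its configuration; the `SAMPLE_CONFIG` constant and the `collection_settings` helper are hypothetical names used only for this example.

```ruby
require "yaml"

# The settings from solr_wrapper.yml, inlined so the example is
# self-contained (hypothetical constant name).
SAMPLE_CONFIG = <<~YAML
  # Place any default configuration for solr_wrapper here
  # port: 8983
  collection:
    dir: solr/conf/
    name: blacklight-core
YAML

# Hypothetical helper: return the "collection" section as a plain Hash,
# or an empty Hash when the file has no such section.
def collection_settings(yaml_text)
  config = YAML.safe_load(yaml_text) || {}
  config.fetch("collection", {})
end

settings = collection_settings(SAMPLE_CONFIG)
puts "collection name:  #{settings["name"]}"
puts "config directory: #{settings["dir"]}"
```

Because `dir` and `name` are nested under `collection`, the indentation in the YAML file matters: without it, they would be read as unrelated top-level keys.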
Now that we have Solr up and running, we can index some data. Blacklight provides some out-of-the-box fixture objects that allow us to quickly index data:
% bin/rake blacklight:index:seed
If we send a query using the Solr admin UI, we should see 30 freshly indexed documents.
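The same check can be scripted. The sketch below is a minimal example, assuming Solr is reachable at http://localhost:8983 and that the collection's select handler lives at the conventional `/solr/blacklight-core/select` path; the helper names (`count_query_uri`, `num_found`, `fetch_count`) are hypothetical. It builds a match-all query that asks for zero rows, so the response carries only the total count.

```ruby
require "json"
require "net/http"
require "uri"

# Assumed location of the select handler for the blacklight-core collection.
SOLR_SELECT_URL = "http://localhost:8983/solr/blacklight-core/select"

# Build a match-all query that returns no documents, only the total count.
def count_query_uri(base_url = SOLR_SELECT_URL)
  uri = URI(base_url)
  uri.query = URI.encode_www_form("q" => "*:*", "rows" => 0, "wt" => "json")
  uri
end

# Pull the total hit count out of a Solr JSON response body.
def num_found(json_body)
  JSON.parse(json_body).dig("response", "numFound")
end

# Run the count query against a live Solr. This is defined but not called
# here, so the example can be loaded without a running Solr instance.
def fetch_count(base_url = SOLR_SELECT_URL)
  num_found(Net::HTTP.get(count_query_uri(base_url)))
end

# Show the URL that would be requested.
puts count_query_uri
```

With Solr running and the fixtures indexed, calling `fetch_count` should report the same document count that the admin UI shows.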
The Blacklight web application is not strongly opinionated about how data are indexed into Solr. Many library-adjacent applications use a tool called traject, which we have available as part of the blacklight-marc gem dependencies.
In this example, we’ve pre-configured a traject indexer that handles MARC records in app/models/marc_indexer.rb [2].
The blacklight-marc gem also includes some example MARC data (coincidentally, this data is also the source of the fixture data above).
% find $( bundle show blacklight-marc ) -name "*.mrc"
/path/to/gems/blacklight-marc-8.0.0/test_support/data/test_data.utf8.mrc
% bin/rake solr:marc:index MARC_FILE=/path/to/gems/blacklight-marc-8.0.0/test_support/data/test_data.utf8.mrc
We can use this tool to index any MARC data we have available.
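For scripting, the file lookup can also be done from Ruby. The sketch below is one possible approach, not part of Blacklight or blacklight-marc: `marc_files_under` and `blacklight_marc_dir` are hypothetical helpers, and the second one relies on RubyGems being able to see the blacklight-marc gem (for example when run through `bundle exec`).

```ruby
# Hypothetical helper: list every .mrc file below a directory,
# similar to `find DIR -name "*.mrc"`.
def marc_files_under(root)
  Dir.glob(File.join(root, "**", "*.mrc")).sort
end

# Hypothetical helper: locate the installed blacklight-marc gem,
# returning nil when the gem is not available in this environment.
def blacklight_marc_dir
  Gem::Specification.find_by_name("blacklight-marc").gem_dir
rescue Gem::LoadError
  nil
end

gem_dir = blacklight_marc_dir
if gem_dir
  # Each path printed here can be passed to the rake task as MARC_FILE.
  marc_files_under(gem_dir).each { |path| puts path }
else
  puts "blacklight-marc is not installed in this environment"
end
```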
Now that we have some data in Solr, we can query Solr and get results back. Try some of these:
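A few illustrative requests can be built as in the sketch below. It assumes the select handler is at `http://localhost:8983/solr/blacklight-core/select`; the `solr_url` helper is hypothetical, and the field names `format` and `language_ssim` are assumptions about the index's schema, so substitute fields from your own configuration. The parameters themselves (`q`, `fq`, `rows`, `facet`, `facet.field`) are standard Solr query parameters.

```ruby
require "uri"

# Assumed select handler URL for the blacklight-core collection.
SOLR_SELECT = "http://localhost:8983/solr/blacklight-core/select"

# Hypothetical helper: turn a hash of Solr parameters into a request URL.
def solr_url(params, base = SOLR_SELECT)
  uri = URI(base)
  uri.query = URI.encode_www_form(params)
  uri.to_s
end

# Illustrative queries; the field names are assumptions about the schema.
EXAMPLE_QUERIES = {
  "everything" => { "q" => "*:*" },
  "keyword search, first 5 results" => { "q" => "history", "rows" => 5 },
  "filter by a field value" => { "q" => "*:*", "fq" => "format:Book" },
  "facet counts only" => {
    "q" => "*:*", "rows" => 0, "facet" => "true", "facet.field" => "language_ssim"
  }
}

# Print each URL so it can be pasted into a browser or passed to curl.
EXAMPLE_QUERIES.each do |label, params|
  puts "#{label}: #{solr_url(params)}"
end
```

Using `URI.encode_www_form` takes care of escaping characters such as `:` that are meaningful in Solr query syntax but need encoding in a URL.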
These query parameters, and more, are documented in the Solr Reference Guide. Blacklight, for the most part, just presents a framework and a basic user interface for querying Solr, so it is important to learn how to configure Solr for indexing and querying in order to provide a good search experience.
[1] Although some institutions have had luck creating adapters to work with Elasticsearch or bespoke search APIs, covering these use cases is outside the scope of this workshop.
[2] The details about the MARC format, traject, and this indexer are outside the scope of this tutorial; for now, it’s enough to know we have a way to get some data into the index.