2015-01-06

Drupal, Solr, Search API and external data

I've been playing with feeds, feeds_xpath_parser, views_xml_backend, Solr, Search API and other related bits for a week or so now.

There are some great modules being developed that work really nicely together.

Example 1: Importing XML files (from memory, so some bits might be unclear or missing...)

Let's say we have detailed XML results available for entities (records) via URL, each with a unique ID that starts at 1.

a) Start by creating the content type that will accept the imported fields. Make sure any date field formats match those supplied by the source file.

b) Then we need to add some modules: Feeds, Feeds Import, feeds_xpath_parser, feeds_crawler and some dependencies such as Job Scheduler (job_scheduler).

c) Create the new importer and configure all the bits: feeds_xpath_parser as the parser, the target type (nodes), etc. Then add the fields to import. When configuring feeds_xpath_parser, each XPath query needs to be mapped to the content type field it should populate. This is very laborious and boring, so grab a cuppa, and some sugar, if you have many fields!

d) My favorite discovery during this process was feeds_crawler. Basically it accepts the detailed-record URL with a replacement parameter, so you can define where to start importing, how many records to fetch and what value to increment the ID by, with a defined delay between queries so you don't kill your service if it's running on legacy hardware.

e) Then simply choose your importer, configure the start, end, delay and increment values, and go.

f) Once the data is imported into nodes, normal views and search manipulation can occur.
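To make steps c) to e) a bit more concrete, here is a rough Python sketch of what the crawl and the XPath mapping amount to. It is not how feeds_crawler or feeds_xpath_parser are implemented; the URL pattern, the "$index" placeholder, the field names and the XPath queries are all made up for illustration, and the fetch function is passed in so nothing here touches a real service.

```python
import time
import xml.etree.ElementTree as ET

# Hypothetical URL pattern; "$index" marks where the record ID is substituted.
URL_TEMPLATE = "http://example.com/records/detail.xml?ID=$index"

# Hypothetical mapping of node field name -> XPath query inside one record.
FIELD_XPATHS = {
    "title": "./name",
    "field_record_date": "./date",
}


def crawl_urls(template, start, count, increment=1):
    """Yield `count` URLs, starting at ID `start` and stepping by `increment`."""
    for step in range(count):
        yield template.replace("$index", str(start + step * increment))


def map_fields(xml_text, field_xpaths):
    """Apply each XPath query to one record and return {field_name: text}."""
    root = ET.fromstring(xml_text)
    return {field: root.findtext(xpath) for field, xpath in field_xpaths.items()}


def run_import(fetch, start, count, increment=1, delay=1.0):
    """Fetch and map each record, pausing `delay` seconds between requests."""
    records = []
    for url in crawl_urls(URL_TEMPLATE, start, count, increment):
        records.append(map_fields(fetch(url), FIELD_XPATHS))
        time.sleep(delay)  # be gentle with a service on legacy hardware
    return records
```

Here `fetch` is any function that takes a URL and returns the XML text for that record, so the same loop can be tried against canned XML before pointing it at the real service.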

CONS: If you have 40,000 records like this that get populated elsewhere via legacy tools, keeping the duplicate nodes in Drupal up to date is time-consuming, resource-heavy and unnecessary.

Example 2: Querying external XML directly from views

Instead of importing detailed records, we can use views_xml_backend to parse available XPath elements into fields that are then visible to views.

This works if your API offers a search URL that returns a result set in XML, as well as detailed records in XML, with an ID linking the two.

Two views need to be created:
   -  One to display the XML search results.
   -  One to display the detailed XML record.

Use regular Webforms, views, contextual filters, views_xml_backend and path substitution to send a query URL, parse the result set into view fields (ID, Name, Date), and rewrite the ID field as a link to the detailed record: <a href="detailed record URL?ID=%1">.
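Outside of Drupal, the same idea can be sketched in a few lines of Python: parse the search result set into (ID, Name, Date) rows and rewrite each ID as a link to the detailed record. The element names (record, id, name, date) and the detail path are assumptions for illustration, not something views_xml_backend prescribes.

```python
import xml.etree.ElementTree as ET
from html import escape
from urllib.parse import urlencode

# Hypothetical path of the page that shows the detailed-record view.
DETAIL_PATH = "/record/detail"


def parse_results(xml_text):
    """Turn an XML search result set into rows of (ID, Name, Date)."""
    root = ET.fromstring(xml_text)
    return [
        (r.findtext("id"), r.findtext("name"), r.findtext("date"))
        for r in root.findall("./record")
    ]


def id_link(record_id):
    """Rewrite an ID as a link to the detailed record, like the field rewrite."""
    href = DETAIL_PATH + "?" + urlencode({"ID": record_id})
    return '<a href="{}">{}</a>'.format(escape(href), escape(record_id))
```

The detail view then only needs the ID from the query string to build the URL of the detailed XML record.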

PROS: External data is visible in Drupal, with no importing required.
CONS: The data is neither indexed nor searchable, and the modules work well but are alpha/beta/dev at this time.

Note: Similar things can be done with views_json_backend.


The best of both worlds would be to index the external data into Solr and use views to display it. That's what's next on the radar...
