June 21, 2004

Amazon Products Feed: Display the Most Popular Items

Posted at 12:36 AM Gems, Instruction Manual, Toolbox

About apf_pop.cgi

MrRat's amazon_products_feed.cgi script (APF) is a marvellously useful script. Its robust and nimble interface provides rich opportunities for exploitation. Here's a script most webmasters of APF sites will find useful. The apf_pop.cgi script searches your website's access log for the specific kinds of APF calls that indicate a user clicking on an item and grabs the Amazon ASINs from them. The ASINs are counted and sorted, then one or more documents are constructed to display them. In addition to the data formatting templates available through APF, apf_pop adds a templating layer of its own, with its own variables. The types of document apf_pop can generate are virtually unlimited: HTML, SHTML, PHP, CGI, even an APF template, if you like. As this implies, through apf_pop you can display data generated by APF in either dynamic or static documents.
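To make the count-and-sort idea concrete, here is a minimal sketch in Python. It is not apf_pop's own code. It assumes an Apache-style access log in which each request is recorded as "GET <url> HTTP/x.y", and it applies the counting rule described in this post: an AsinSearch whose item_id holds exactly one ASIN counts as one click. The name rank_asins is made up for the illustration.

```python
import re
from collections import Counter
from urllib.parse import parse_qs

# Assumption: requests are logged as "GET <url> HTTP/x.y" (Apache-style).
# Anchoring on the opening quote and GET skips the referrer field, which is
# logged later on the same line and may also contain an APF URL.
REQUEST = re.compile(r'"GET \S*amazon_products_feed\.cgi\?(\S*)', re.IGNORECASE)


def rank_asins(log_lines):
    """Return ASINs ordered from most to least clicked (hypothetical helper)."""
    counts = Counter()
    for line in log_lines:
        match = REQUEST.search(line)
        if match is None:
            continue  # not an APF request
        params = parse_qs(match.group(1))
        if params.get("search_type") != ["AsinSearch"]:
            continue  # some other kind of APF call
        item_ids = params.get("item_id", [""])[0].split(",")
        # Only a single-ASIN search is treated as a click on an item.
        if len(item_ids) == 1 and item_ids[0]:
            counts[item_ids[0]] += 1
    return [asin for asin, _ in counts.most_common()]
```

Passing an open log file works, since a file object yields lines: rank_asins(open("/home/username/logs/access.log")).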


First things first: design an apf_pop template. It can be based on any type of document. Several template variables are available to help. We'll begin with the core variables.


There are two options for determining how ASIN data is to be placed in your files. Only one or the other of these variables can appear in a template file.


%%Popular_ASIN%%
In the output file, this variable is replaced by a comma-delimited list of ASINs suitable for an APF AsinSearch. Examples:
<!--apf &apf_include=amazon_products_feed.cgi?

<!--#include virtual="/cgi-bin/amazon_products_feed.cgi?
    search_type=AsinSearch&item_id=%%Popular_ASIN%%" -->

<a href="/cgi-bin/amazon_products_feed.cgi?
    search_type=AsinSearch&item_id=%%Popular_ASIN%%">The Top 10!</a>
The first example is, of course, useful for APF templates. It will expand to a formatted list each time the template is parsed. The second can be placed in any SSI-parsed file. The list of Amazon items will be generated each time the file is accessed. The third is a simple link you can place anywhere.


%%Popular_HTML%%
When apf_pop encounters this template variable it calls APF, passing it the ASINs. APF returns data formatted by the APF template set you specify in apf_pop. Finally, apf_pop parses the data, performing some error-checking and data reformatting, before writing the refined formatted data to the output file. PHEW!

This one requires a bit of tricky configuration to set up, but %%Popular_HTML%% offers some distinct advantages over %%Popular_ASIN%%.
  • Static Pages: Unless your APF templates are doing something unusual, the resulting formatted output is suitable for static HTML pages. This will reduce server overhead, bandwidth and user frustration. Yay!
  • Heavy data: You can use a 'heavy' version of APF rather than the default 'lite' version to generate these static pages. If your site has heavy traffic in DVDs and Videos, your users may appreciate the Director and Starring links apf_pop provides via APF 'heavy' calls.
  • Error checking: apf_pop pulls those annoying "There are no exact matches" and "Invalid Asin" messages right out of the data stream and discards them so your visitors never see them.
There is a niggly drawback to %%Popular_HTML%%:
  • Multiple APF scripts: Whether you elect to create a 'heavy' APF for apf_pop's use or not, you'll want apf_pop to call a copy of the APF script with a different name than the one called when users click on your site's links. This is due to the way apf_pop gathers ASINs from the log file. I'll explain why in some detail later on. The distribution zip contains some tools for diminishing the burden.
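The error-checking advantage listed above is easy to picture in code. The sketch below is Python, not apf_pop's actual routine, and it rests on an assumption: that each unwanted message sits on a line of its own in the data APF returns. The two message strings are the ones quoted in this post; strip_apf_errors is a made-up name.

```python
# Messages quoted in this post; how APF wraps them in markup is not shown
# there, so matching on the bare text is an assumption.
UNWANTED = ("There are no exact matches", "Invalid Asin")


def strip_apf_errors(formatted):
    """Drop every line that carries one of the unwanted APF messages."""
    kept = [
        line
        for line in formatted.splitlines()
        if not any(message in line for message in UNWANTED)
    ]
    return "\n".join(kept)
```

If APF emits a message inside a larger block of markup, whole-line removal like this would take the surrounding markup with it, so a real filter would need to know the template's structure.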
The remaining template variables can be used in both the ASIN and HTML context.
%%see_prev_popular%% and %%see_next_popular%%
These will place Next and Back buttons for navigating between pages created by the call to apf_pop.
This is replaced by a "Page 3 of 4" message.
More template variables are coming in future versions of this script.

Configuring apf_pop for %%Popular_ASIN%%

We'll begin with the simpler case, configuring apf_pop for a %%Popular_ASIN%% template. There are several variables which must be defined in the script.
General script setup
$max_pages = 20;
$items_per_page = 5;
Pretty straightforward. In this example apf_pop will create up to 20 pages listing 5 items per page.
$Popular_var_type = "ASIN";
Tells apf_pop to look for %%Popular_ASIN%% in the template file and replace it with a string of comma-delimited ASINs in the output files.
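Here is roughly how those three settings interact, as a short Python sketch rather than apf_pop's real code. It assumes the ASINs have already been ranked from most to least popular. The function name build_pages and the placeholder IDs are invented for the example; only %%Popular_ASIN%% and the two limits come from the settings above.

```python
def build_pages(ranked_asins, template, max_pages=20, items_per_page=5):
    """Render one document per page of ASINs (hypothetical helper)."""
    pages = []
    for start in range(0, len(ranked_asins), items_per_page):
        if len(pages) == max_pages:
            break  # never produce more than max_pages documents
        chunk = ranked_asins[start:start + items_per_page]
        # The template variable becomes a comma-delimited ASIN list.
        pages.append(template.replace("%%Popular_ASIN%%", ",".join(chunk)))
    return pages


# Twelve placeholder IDs standing in for real ASINs.
ranked = [f"ASIN{n:06d}" for n in range(1, 13)]
template = (
    '<a href="/cgi-bin/amazon_products_feed.cgi?'
    'search_type=AsinSearch&item_id=%%Popular_ASIN%%">The Top 10!</a>'
)
pages = build_pages(ranked, template)
```

With twelve items and five per page, the last page simply holds the two left over. Anything ranked beyond max_pages times items_per_page is dropped.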
logfile parsing
$logfile = "/home/username/logs/access.log";
Specifies the log file apf_pop will scan for ASINs. Include the full server directory path as necessary so the script knows where to find it.
$logfile_get_string = "\"get /cgi-bin/Amazon/amazon_products_feed.cgi";
This character string identifies the lines in your logfile that represent APF activity. apf_pop will parse these lines for an AsinSearch. It's important that this string does not match the log entry's referrer data, so that only a user's request to view an ASIN is counted, and not every later click on another link from that page. $logfile_get_string can be a regular expression; apf_pop will give the string a once-over to ensure special characters have been escaped properly. Logfiles can be quite large, so keep this string short to minimise the amount of processing apf_pop must do. The format of the string in the example above will work fine in most instances. "\"get" identifies the opening of a page and is followed by the URL of the page to be opened. Folks on Unix servers can test this out using grep, e.g.:
grep -i "\"get /your/path/amazon_products_feed.cgi" access.log > apf.log
The file "apf.log" will contain the results. It's likely to be big, so don't forget to delete it.
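The same check can be tried outside of grep. The Python sketch below is an illustration under two assumptions that are mine, not confirmed behaviour of apf_pop: the string is treated as literal text (every special character escaped), and matching ignores case so that "get" also finds the upper-case GET that web servers usually log. make_line_filter is a made-up name.

```python
import re


def make_line_filter(get_string):
    """Build a test for log lines whose request matches get_string."""
    # re.escape treats the whole string as literal text, which is a
    # simplification of the "once-over" described above.
    pattern = re.compile(re.escape(get_string), re.IGNORECASE)
    return lambda line: pattern.search(line) is not None


is_apf_request = make_line_filter('"get /cgi-bin/Amazon/amazon_products_feed.cgi')
```

Because the string begins with the opening quote and the request method, a line that only mentions the APF URL in its referrer field (logged as "http://...") does not pass the filter.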
apf_pop template setup
$template_file = "/home/bv126070/public_html/cgi-bin/Amazon/popular_list.template.html";
Identifies the template used by apf_pop to generate the output pages. Include directory path as necessary.
$nav_target = "_top";
Specifies the link "target" for links generated for %%see_prev_popular%% and %%see_next_popular%%. $nav_target = "_self" is useful for popular pages placed within FRAMESETS or IFRAMES; otherwise, "_top" works well. If you specify an empty string, then the links will have no target.
apf_pop scans the access log for entries containing an APF AsinSearch. If the item_id field contains only one ASIN, then apf_pop counts it as a click and grabs the ASIN.

Posted by Patrick at June 21, 2004 12:36 AM