Determining Most Shared Content

Our client, Human Rights Watch, publishes a large volume of content, and with the redevelopment of HRW.org they wanted to curate that content in more meaningful ways. One way we did this was by showing which content is shared most frequently across a variety of social networks.
Keeping up with the number of shares across multiple social networks seemed like an uphill battle. A system like this would require creating a background process that looks up and stores the number of shares for HRW.org URLs. It would also require writing separate API integrations for each social network HRW wants to target and maintaining them over time.
Since we didn't want to create and maintain a custom social share count aggregation system if we didn't have to, we started looking at third-party solutions. HRW.org is a multilingual site, so some of the options weren't viable: they report the most shared items for a domain without taking the language of the content into account. We selected SharedCount.com, whose simple API allowed us to quickly build a system to determine the most shared content. The system sends SharedCount a batch of URLs, and it returns a JSON object with the number of shares for each URL across a variety of social networks.
Here is a sample JSON response from the bulk API (documented at https://docs.sharedcount.com/v1.0/docs/bulk-1):

```json
{
    "data": {
        "http://google.com/": {
            "StumbleUpon": null,
            "Pinterest": 1003223,
            "Twitter": 11400,
            "LinkedIn": 95,
            "Facebook": {
                "commentsbox_count": 10117,
                "click_count": 265614,
                "total_count": 9476803,
                "comment_count": 1793601,
                "like_count": 1500762,
                "share_count": 6182440
            },
            "GooglePlusOne": 7710780
        },
        "http://stackoverflow.com/": {
            "...snip...": "98 URLs not shown for brevity..(snip).."
        }
    },
    "_meta": {
        "urls_completed": 100,
        "bulk_id": "a4f8f0fd436995987dbef98bbff9accc61282c63",
        "completed": true,
        "urls_queued": 100
    }
}
```
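
Each network comes back either as a plain number or, in Facebook's case, as a breakdown that includes its own total_count. As a rough illustration of how a response like this can be reduced to a single number per URL, here is a minimal PHP sketch. The sharedcount_totals() helper name and the strategy of summing every network into one total are our own simplifications for illustration, not necessarily how the live site weighs the networks.

```php
<?php

/**
 * Reduce a SharedCount bulk response to one total per URL.
 *
 * $json is a raw JSON body like the sample above. Facebook supplies its
 * own "total_count", so we use that and skip its sub-counts; the other
 * networks are plain integers (or null, which we skip).
 */
function sharedcount_totals($json) {
  $response = json_decode($json, TRUE);
  $totals = array();

  foreach ($response['data'] as $url => $networks) {
    $total = 0;
    foreach ($networks as $network => $count) {
      if ($network === 'Facebook' && is_array($count)) {
        // Facebook returns a breakdown; its "total_count" covers the rest.
        $total += (int) $count['total_count'];
      }
      elseif (is_numeric($count)) {
        // Twitter, LinkedIn, Pinterest, Google+ and friends are plain numbers.
        $total += (int) $count;
      }
      // Null values (e.g. StumbleUpon above) are simply skipped.
    }
    $totals[$url] = $total;
  }

  return $totals;
}
```

For the sample above, this credits http://google.com/ with its Facebook total_count plus the Twitter, LinkedIn, Pinterest, and Google+ numbers, and ignores the null StumbleUpon entry.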

How it Works

HRW.org is built with Drupal. In a custom module, we set up several Drush commands that can run at intervals. We use Jenkins to run them every 10 minutes, so we can provide a nearly real-time look at what is happening socially with HRW's content. The workflow looks like this:

  1. Query a batch of URLs in Drupal. Currently, it sends any URL published in the last 30 days. To make this easy for HRW to adjust, we used Drupal Views to determine which URLs to send to the SharedCount API.
  2. Send the batch of URLs to the SharedCount bulk API (https://docs.sharedcount.com/v1.0/docs/bulk); a sketch of this fetch-and-store step follows the list.
  3. Process a JSON object from SharedCount, which includes the number of shares for each URL.
  4. Store the results in a simple database table with the URL, number of shares and language of the content.
  5. Query the custom database table to show the most shared content in a custom Drupal block on the HRW.org homepage (a sketch of this query also follows below).
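
To make the middle of that workflow concrete, here is a minimal sketch in the style of a Drupal 7 Drush command covering steps 1 through 4. The Drupal version, module name (hrw_shares), table name (hrw_share_counts), API key variable, and the exact SharedCount endpoint and request format are all placeholders for illustration (the real request contract is in the bulk API docs linked above), and the production site selects its URLs through a View rather than the plain node query used here.

```php
<?php

/**
 * Implements hook_drush_command().
 */
function hrw_shares_drush_command() {
  $items['update-share-counts'] = array(
    'description' => 'Fetch SharedCount totals for recently published content.',
  );
  return $items;
}

/**
 * Drush callback for update-share-counts: steps 1-4 of the workflow.
 */
function drush_hrw_shares_update_share_counts() {
  // Step 1: collect absolute URLs (and languages) for nodes published in
  // the last 30 days. The real site drives this selection through Views.
  $result = db_query(
    'SELECT nid, language FROM {node} WHERE status = 1 AND created > :since',
    array(':since' => REQUEST_TIME - 30 * 24 * 60 * 60)
  );
  $urls = array();
  $languages = array();
  foreach ($result as $row) {
    $url = url('node/' . $row->nid, array('absolute' => TRUE));
    $urls[] = $url;
    $languages[$url] = $row->language;
  }
  if (empty($urls)) {
    drush_log('No recent URLs to send.', 'ok');
    return;
  }

  // Step 2: POST the batch to the SharedCount bulk API. The endpoint URL
  // and newline-separated body are assumptions; see the bulk API docs.
  $endpoint = 'https://api.sharedcount.com/bulk?apikey=' . variable_get('hrw_sharedcount_api_key', '');
  $response = drupal_http_request($endpoint, array(
    'method' => 'POST',
    'data' => implode("\n", $urls),
    'headers' => array('Content-Type' => 'text/plain'),
  ));
  if ($response->code != 200) {
    return drush_set_error('SHAREDCOUNT_HTTP', 'SharedCount request failed with HTTP ' . $response->code);
  }

  // Steps 3 and 4: decode the JSON and upsert one row per URL, using the
  // sharedcount_totals() helper sketched earlier.
  foreach (sharedcount_totals($response->data) as $url => $total) {
    db_merge('hrw_share_counts')
      ->key(array('url' => $url))
      ->fields(array(
        'shares' => $total,
        'langcode' => isset($languages[$url]) ? $languages[$url] : LANGUAGE_NONE,
        'updated' => REQUEST_TIME,
      ))
      ->execute();
  }
}
```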

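For the display side (step 5), a query like the one below could feed the homepage block, filtering by the visitor's language to satisfy the multilingual requirement mentioned earlier. The table and column names match the hypothetical schema from the Drush sketch, so treat this as an outline of the idea rather than the production code.

```php
<?php

/**
 * Load the most shared URLs for a given language, highest first.
 *
 * Returns an array of url => share count pairs, ready to be themed as a
 * simple item list inside a custom block.
 */
function hrw_shares_most_shared($langcode, $limit = 5) {
  return db_query_range(
    'SELECT url, shares FROM {hrw_share_counts} WHERE langcode = :lang ORDER BY shares DESC',
    0,
    $limit,
    array(':lang' => $langcode)
  )->fetchAllKeyed();
}
```

Keeping the aggregated totals in their own small table means the homepage block runs a single database query at render time instead of a round of API calls, which helps the every-10-minutes refresh feel close to real time without slowing page loads.
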
Hooray for leveraging all that social data in a simple but meaningful way.
